Database Object Types – Some Design Considerations

There’s now a wide range of objects that a database developer can use to solve problems. For this post or two I’m considering certain objects (as implemented in MS SQL Server) and how I think they are probably best used .

Stored Procedure
A stored procedure has the full range of DML operations and you can raise errors in the procedure, useful if you want to make run-time checks on data. It will return data to a client program, but within the database you can’t re-use the result set. That is to say, you wouldn’t be able to SELECT the proc results and join them to another table. In MS SQL Server you can EXECUTE the stored procedure into a temporary table (or possibly a permanent table) and join to that, but it isn’t standard SQL and was a bit clumsy last time I used the technique.

If your result set is a “foundation” element that will be re-used in other queries, it will probably be better to use a view, which can be treated just like a table in your SQL, or a function. The view or function can still be wrapped in a stored procedure to offer a consistent interface for client programs and allowing you to raise errors as above.

Trigger
A trigger is the best solution for auditing changes to a table. Consider alternatives if you’re willing to trust that all developers will write comprehensive auditing code for their programs, that it will be possible to merge the data from these individual audit systems, and that no-one apart from dbas will have direct access to the data so as to bypass the auditing.

In the systems I work on that’s never the case, so I use audit triggers. They’re the only way to make sure that changes are recorded whatever program or access method is in use. Seems paranoic? No doubt, but if you can’t be 100% sure that your audit records are available and complete, why bother auditing at all?

I try not to use triggers for any other purpose. You can’t pass in parameters to a trigger so it’s not as natural to code as a stored procedure. A trigger is easily forgotten and may only come to mind when it’s slowing down a big UPDATE or INSERT (probably in a script you’re running to solve a critical production issue). As a maintenance programmer I really don’t like the idea that code could suddenly jump out at me without warning. I’m sure most others feel the same, so if you want your application to survive and thrive during the maintenance phase, don’t get “trigger-happy”…

There are traps in the coding of the trigger. You have to consider whether the trigger itself would generate further trigger activity and whether you might get into an endless loop. I think the question to ask is – are there really no alternatives to using a trigger here? I’ve almost always found one, even for validating multiple rows of data being inserted into a table – one of the times when a trigger would be a “natural” solution.

View

These are virtual tables based on select statements. They can include multiple tables or views and subject to restrictions they can span partitions. INSERTS, UPDATES and DELETES can be done on views provided the SQL behind meets certain restrictions (with additional restrictions in the case of partitioned views).

It may be possible to bypass the DDL restrictions on some views by using an INSTEAD OF trigger. This can be attached to a view – even if the view isn’t defined with the SCHEMABINDING option (see below). I think the same caveats apply here as with ordinary triggers (see above). An INSTEAD OF trigger might be helpful if some client applications make direct inserts into tables – and the schema has changed so that the tables needed are no longer available. Apart from that they should probably be avoided, especially if access to the underlying tables is still allowed. Object views use INSTEAD OF triggers to keep the underlying relational data in synch with the object properties. That’s another debate which I won’t enter into here.

You can’t ORDER a view except if there is a TOP clause. One trick to force ordering is to create a view with a TOP 100 PERCENT clause, but apparently the ordering in that case is still not guaranteed.

Oracle is more liberal as to where it allows sorting. MS SQL Server’s prohibition on view and function ordering is probably a stricter implementation of relational theory which states that the definition of a table (or a virtual table I presume) does not include ordering of its rows. Lack of sorting isn’t a major issue. Explicit ordering should always be added to code released to Production – it’s just a convenience when you’re developing to have your data “ready-sorted”.

Note: Schemabinding means binding the view to the database schema. This means that any changes to the schema must leave the view in a consistent state. An error will prevent the dropping of any tables or columns which appear in the view.

Materialised Views
These are known as Indexed Views in MS SQL Server, and are used to boost performance. When a query references a view, its SQL is expanded by having the definition of the view written. This happens at run time, and if it imposes a big overhead, the view can be bound to the schema. The view can then be indexed and much of the work is therefore done before the data is queried.

The materialized view is an advanced technical feature, apparently more commonly used in data warehousing than in the kind of conventional reporting systems that I work on. It needs careful use. I’m going to consider Oracle as well as SQL Server in this posting; I’m more familiar with materialized views in Oracle and they seem to be more configurable (with more opportunity to get into trouble). There are a lot of technical details regarding how materialized views can be built and restrictions that apply. It seems to me that the two most important factors are:

1.Data refreshes
The Materialized view can speed up performance for SELECT queries, but the trade-off is added overhead for inserts, updates and deletes. So a materialized view will not help if all of the tables are volatile, with lots of DML going on.

2.Query Rewrites
This feature allows the optimizer to look at queries submitted and check them against the materialized view. Where the materialized view can satisfy part or all of the query, it will be “written in” to the query plan.

Point 1 is a normal design issue, and the worst penalty for a bad decision might seem to be a loss of performance. You could get into serious trouble in a live system though, if you work around the performance issue by taking control of the data refreshes. Oracle allows you to specify when these will happen, which is OK as long as there is guaranteed to be no change to the data between refreshes – or you’re happy to defer data changes to a batch process occuring at set times (and everyone else feels the same way).

Point 2 seems to be the feature that really needs a lot of thinking about, and I worked on a system where this` issue cropped up. Remember that once query rewrite is enabled, the optimiser will patch the materialized view into any submitted query as it sees fit.

This means that by enabling query rewrite on your materialized view, you could be redeveloping code that you’ve never seen. The optimizer considers the query predicate in making its decision. If there’s a filter on the materialized view it shouldn’t get used unless it’s compatible with the submitted query – but will you take that on trust, or will you test it? It won’t be easy to find the code that might be affected, and clearly you can’t test code that hasn’t been deployed yet, but such code could also be affected.

There’s a final nasty twist if you’ve got query rewrite enabled and the data refresh isn’t automatic. I assume that query rewrite doesn’t force a refresh (or the performance benefits would disappear) so – you could be patching old data into queries instead of the up-to-date information available in underlying tables. It’s possible to disable or re-enable the materialized view in line with the updating policy but this would need careful management.

For me the overall message is that as long as the environment is well controlled the materialized view could be a useful tool. For example, a good security model would help since there’ll be much more control over where the view could be used. Someone with oversight of the whole system at a good level of detail could manage the decision on whether data refreshes should be controlled instead of automatic. Such was not the case in the system I worked on…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.