Agile and Continuous Delivery techniques have been applied to databases for some time, and it’s clear they offer big advantages. Some types of application are particularly suited. They’ll be well-represented among the successes and case studies. Others, such as those including a complex relational database, don’t fit the mould quite so easily. These would be in the “Agility at Scale” category, and often in use within large enterprises.
How does our database project fit in? If Agile wouldn’t be an easy win, can we still get some benefit? I’ve been thinking about that, because the data warehouses I work on are definitely not easy wins. I’m jotting down some notes here, not a formal document, and I hope to develop my thinking in future blogs. If you want to comment or criticise you’d be welcome. By the way, there are two podcasts worth listening to: see the Links (Topical) page under “Database\Continuous Delivery”, and I should acknowledge that a number of the points I make below are drawn from them.
It seems to me that it’s important not to start by looking at the issue in technical terms. There are more fundamental questions to be answered first. The questions below may be some help.
What Kind of Database Do We Have?
(We’ll assume that data from the old database version has to be preserved) Is there plenty of down time in the normal schedule to deploy a full copy of the new database? If “yes”, your application may be a mainstream candidate after all. The database can probably be treated as just one more file deployed with the application.
If “no”, we’ll need to take measures that permit the application to stay live while we do the release. We’ll need to be able to fix bugs without losing any changes applied since the release. This would include our data loads as well as client updates. It may mean de-coupling database changes from client application changes, and fixing bugs by moving forward instead of backing out the new version.
If it’s an “application database” (per Martin Fowler, see “Vocabulary for definitions)) that’s helpful because you don’t have to think about other users. An “integration database”, on the other hand, has a number of client applications, some of whom you may not even know about.
How Much Do Bugs or Failures Cost?
This is an important consideration of course. Bear in mind that if you’re using a “NoSql” offering “eventual consistency”, a decision to relax some of the consistency of a relational database has already been made. This should inform your thinking. I see the following:
An estimate of the range of costs in time and money. This includes damage claims by customers or fines from government
The above, plus a range estimate of the costs in lost business from existing and future customers
For example the cost of an occasional mishandled order as against a quarter of an hour’s website downtime.
Will Our Testing Stand Up?
Since the studies are typically based on mainstream cases, it’s often said that a short deployment cycle means we can fix bugs as we find them, instead of doing exhaustive testing before release. This assumes that bugs will show up immediately. This may be so, but I have worked on applications where we were coding for new types of transaction that we might not see in Production for months.
So, if inputs vary, and especially if execution paths change with them, testing will have to deal with it, if the cost of bugs and failures is significant. Otherwise, a bug may sit there undetected while other changes go in. When the fix is done it could have a knock-on effect on those changes. In an integration database, users of some client applications may only go online intermittently; they may not even know who to contact if something goes wrong.
How Much Change is Going On?
There are two models of database deployment, the “state-based” model and the “transformation-based” model. “State-based” is more like a normal application deployment, and it’s most suitable where changes are relatively infrequent and concentrated on the source code. If you are making rapid changes and they often affect the schema, you will probably opt for transformation-based deployment, which works with the “delta”, the changes that are applied with each new version.
Are There Tools You Can Use?
There are many tools that are available for the deployment of database source code. You’ll need to know what deployment mode you are going to use before you make a choice. What tools are available for automating testing of the data? It would be much better to avoid “rolling your own”.
Who Has to be Convinced?
Your management will need to be convinced of course, as they’ll be providing the budget and approving the design. Naturally the DBAs will have opinions, and if there is a separate team handling deployments they’ll have to be comfortable that the new process can handle a failure gracefully. You need to speak to all of these people and the timing needs careful thought.
How Much Do You Need?
Continuous Delivery reminds me a bit of “Just-In-Time” manufacturing, which arrived here from Japan some time around the early Eighties. To make JIT work properly needed an end-to-end view of the supply and production process. The contracts and the relationships between companies were as important in making it work as any of the practical arrangements.
Motor manufacturing was ideally suited, which is why it started there. Was JIT fully adopted in continuous process industries, where a stop in production can in some cases destroy the production line? Where I worked it certainly wasn’t – but there were probably benefits to be gained by adopting JIT techniques selectively. It seems Agile and Continuous Delivery ideas can also be useful even where they can’t be applied fully.