Category Archives: Data Warehouse

“Test-First” Database Development (1)

I’m looking at the practicalities of using Continuous Delivery/Agile techniques where the database is a “high scale” project in Agile terms, and therefore relatively difficult (for more detail about this Agile terminology refer to Scott Ambler’s writings).

As well as thinking about the big issues, I want to start putting together some of the pieces to make it work. I’m doing this in Sql Server 2014 Transact-Sql. I’ve started with automated testing of reporting procedures. There may be utilities that can do this; finding suitable tools is part of the research. It still seems worthwhile having alternatives in Transact-Sql if possible. There’s no purchase or installation to approve, and the code is easy to understand and to edit. Continue reading “Test-First” Database Development (1)

Agile Continuous Delivery of Databases (1)

Agile and Continuous Delivery techniques have been applied to databases for some time, and it’s clear they offer big advantages. Some types of application are particularly suited. They’ll be well-represented among the successes and case studies. Others, such as those including a complex relational database, don’t fit the mould quite so easily. These would be in the “Agility at Scale” category, and often in use within large enterprises.

How does our database project fit in? If Agile wouldn’t be an easy win, can we still get some benefit? I’ve been thinking about that, because the data warehouses I work on are definitely not easy wins. I’m jotting down some notes here, not a formal document, and I hope to develop my thinking in future blogs. Continue reading Agile Continuous Delivery of Databases (1)

Sql Server Collation Cheat Sheet

This post is not written for an architect or DBA who has to choose a collation and wants an in-depth explanation, but for a developer (perhaps one with an error to fix) who needs a primer on the subject.

Put very simply, collations have to do with languages and with the rules used to compare characters. This is relevant in joins and sorting. You might think that a collation ought to be specified for the query; that in your sql statement you should determine the rules to use for comparing values. You can do that (although you’ll set it at column, not query level), but the columns in the tables have a collation property which is what the query uses by default. This is a convenience as it standardises comparisons and saves you having to specify the rules over and over again. It’s when the two conflict that you have an error which may make it necessary to override one or more default collations. Continue reading Sql Server Collation Cheat Sheet

OPENROWSET and BULK INSERT (3)

{Previous Article in Series}
{Next}
This is the promised post about using a Powershell script to inspect large files. It won’t be possible to open a very large file in Excel of course, and some text readers may struggle under the weight of data. This script is an alternative that runs under most if not all versions of Windows, and doesn’t need an installation or a licence. Continue reading OPENROWSET and BULK INSERT (3)

OPENROWSET and BULK INSERT (2)

{Previous Article in Series}
{Next}

As a data warehouse developer you’ll probably have to load big files. Even if you’re not responsible for loads in PROD, you may well have to import data into the DEV database. There’s often an expectation that this shouldn’t take long, so if you hit any problems you could come under time pressure.

A toolkit and a methodical approach can help you to grind out a predictable result every time. I hope these notes may be some help. Continue reading OPENROWSET and BULK INSERT (2)

Stored Procedure Performance – What You Can Do

There are lots of things you can do to address stored procedure performance issues in Sql Server (this is based on version 2014 which is what I’m currently working in). In fact, there are so many techniques that they seemed worth listing in a blog. Of course, correct indexing is fundamental; I’m assuming you’ve looked at that already. What is listed below would be more appropriate for problems such as execution plans being compiled with specific parameters and not working well when other parameters are passed in. There’s no detail here though: please bear in mind that some of the commands listed are well-known to be buggy. You’ll need to Google for more information on issues and trade-offs. Continue reading Stored Procedure Performance – What You Can Do