You may sometimes want to do a lookup and return it as a column in your query results – but you don’t want to reference the source table in the FROM clause. In this case you may decide to use a column-level query. I’m not arguing the rights or wrongs of column-level queries, either on performance or theoretical grounds. The problem is that if this approach is applied naively it’s a bug.
Please note – these scripts run in Oracle: if you want to use them in Sql Server, run the following script:
CREATE TABLE DUAL(DUMMY CHAR(1) PRIMARY KEY(DUMMY) CHECK(DUMMY = 'X'));INSERT INTO DUAL VALUES('X');
Obviously this won’t work:
WITH Duplis As
(
SELECT 'X' As DUMMY FROM DUAL
UNION ALL
SELECT 'Y' FROM DUAL
)
SELECT DUMMY
,(SELECT DUMMY FROM Duplis HAVING COUNT(DISTINCT DUMMY)= 1) As SINGLE_VALUE_LOOKUP
FROM Duplis
So, to ensure that only one lookup value is plugged into the row the typical approach is to wrap the column in a MAX function. Now it does work:
WITH Duplis As
(
SELECT 'X' As DUMMY FROM DUAL
UNION ALL
SELECT 'Y' FROM DUAL
)
SELECT DUMMY
,(SELECT MAX(DUMMY) FROM Duplis) As SINGLE_VALUE_LOOKUP
FROM Duplis
Job done? Only if you’re in such a rush you don’t mind risking what’s shown above. The assumption was that there will only be one value in the lookup table, so it was safe to use the MAX function just to keep the sql parser happy. If that assumption doesn’t hold, then as shown above the larger (or higher in an alphabetical sort) value will be selected arbitrarily. Let’s try a rewrite. This time we’ll use a HAVING clause to check that there is only one value:
WITH Duplis As
(
SELECT * FROM DUAL
UNION ALL
SELECT * FROM DUAL
)
SELECT DUMMY
,(SELECT MAX(DUMMY) FROM Duplis HAVING COUNT(DISTINCT DUMMY) = 1) As SINGLE_VALUE_LOOKUP
FROM Duplis
That still does what we want it to do where, as expected, we have only one value in the lookup table. What if there’s more than one?
WITH Duplis As
(
SELECT 'X' As DUMMY FROM DUAL
UNION ALL
SELECT 'Y' FROM DUAL
)
SELECT DUMMY
,(SELECT MAX(DUMMY) FROM Duplis HAVING COUNT(DISTINCT DUMMY) = 1) As SINGLE_VALUE_LOOKUP
FROM Duplis
As you can see, you get a Null back. because there was >1 distinct value in the source data. Anyone who’s nervous about the HAVING clause isn’t really confident of the assumption they’re making when they use the MAX function. Of course, you wouldn’t want to break the application if the SELECT belongs to an INSERT statement and the target column isn’t nullable. In that case, wrap the whole thing in a COALESCE statement so as to return a missing value token, as below:
COALESCE((SELECT MAX(DUMMY) FROM Duplis HAVING COUNT(DISTINCT DUMMY) = 1), 'N/A') As SINGLE_VALUE_LOOKUP