Sql JOIN Cheat Sheet

Someone accidentally found my Sql Collation Cheat Sheet while searching for a “Sql JOIN Cheat Sheet” so I thought I’d knock out a post on the topic. Towards the end, you’ll find a script that demonstrates them. The script was run in Azure Sql. Some joins may be Microsoft extensions to the ANSI standard. Look at the documentation for fuller information.

When I was a beginner I had thought there was a deep meaning to the words “left” and “right”, as in LEFT OUTER JOIN. It was a mild let-down to find out that it only refers to placement of the table name in relation to the “JOIN” keyword; switching a LEFT OUTER to a RIGHT OUTER join defines the table named after the join keyword as the outer table.

(INNER) JOIN: “All rows from two tables matched according to the join conditions”
Rows are returned only where both keys satisfy the join condition. The keywords LEFT and RIGHT don’t apply here because the join condition by itself determines which rows will be returned , as in “tableA.Column1 = tableB.Column1” (equi-join) or “tableA.Column1 > tableB.Column1” *theta join).

LEFT (OUTER) JOIN: “All rows from this table – and any matches from that one”
All rows are returned from the left table, with any matching rows from the right table. Columns in unmatched rows are padded with Nulls. Omitting the “OUTER” keyword is possible, but I think it’s bad style, since the word carries a lot of meaning. The outer table is the one that has all rows returned. If you want to apply a filter on the inner table, put it into the join. Don’t filter in the “WHERE” clause, you’ll effectively convert it to an inner join.

RIGHT (OUTER) JOIN {as above}
As above, but the outer table is on the right of the JOIN keyword. The left join is the standard. There is never any need to switch to a right join for a two-table join, and I’ve never had to use then in a multi-table join either. In most if not all cases you can do the same job more readably by other means.

FULL JOIN “All rows from both tables; match them where you can”
Like a combination of right and left outer joins: you get all the rows from both tables, and any rows which can’t be matched have columns padded with Nulls. One use for this would be in reconciliations: you’ll see the rows that match off, and all of the unmatched rows on both sides. One point to beware of is performance: last time I wrote a rec I found that using GROUP BY was much faster, but over time it may have been optimised.

CROSS JOIN “No matching – all possible row combinations from two tables”
You don’t strictly need this; it’s a Cartesian product. To get the same effect just SELECT from both tables with no join clause. One reason to use an explicit CROSS JOIN is to document your intentions; you didn’t just forget to include the join.

NATURAL JOIN “Rows from two tables where columns with matching names have identical values”
It’s implemented in Oracle but not in Sql Server. This creates a match on all columns which have the same name. You might just use it if you were in a terrible rush and needed to bash out a lot of throwaway Sql. The obvious problem is that renaming columns will alter the rowset returned – the stuff of nightmares for a maintenance programmer.

NESTED JOIN
I’ve seen this capability in more than one RDBMS, but I don’t know if it’s part of the standard. The best way to understand it might be to look at the example in the sample code. It’s used with multi-table joins, and if you want “all rows from table A, plus matching rows from table B, but only where the conditions of a join between Table B and table C are satisfied”, the nested join will do it. In the sample code I’ve also shown an alternative, using a derived table, as I think this may be a clearer way to write it.

UNION JOIN
Whoa! Now we’re getting really exotic – keep this one for showing off at job interviews. As Joe Celko says in this discussion, “The …UNION JOIN…is part of the infixed JOINs we defined in SQL-99. Nobody uses it, but it was easy to define, so it got into the Standards. It can stick two totally unrelated tables together blindly.” (Not implemented in Sql Server, and possibly not anywhere else either)

OTHER JOIN TYPES
The above joins can be recognised by the keywords that you use to define them (if I’ve missed anything, please leave a comment). The following joins cannot:

SELF JOIN
This doesn’t have any special syntax; it just means that a single table appears twice in the join. Columns in the table are tested against other columns from the same table.

SEMI-JOINS
I’m not aware of any database which implements an explicit semi-join. They don’t really look like joins, because they’re usually constructed with the second table in a sub-query (probably wrapped in an EXISTS function). The “semi” join title refers to the fact that rows are only returned from the main table, not (of course) from the sub-query. This means that a row in the main table appears only once, even if it’s matched several times in the sub-query. You’ll find references to “left” and “right” semi-joins, but this seems pointless if a sub-query is used to build it.

ANTI-JOIN
An anti-join is a specialised type of semi-join. Rows in the main table only appear if they are not matched in the sub-query.

IMPLEMENTATION JOINS
Merge joins, hash joins etc., are types that you won’t normally specify to solve business problems. They are different methods of executing the explicit JOIN that you did specify. For performance tuning, you can add “hints” to your query suggesting that the optimiser utilise one of these join types.

RUNNING THE SAMPLE CODE
Set your sql client to return rows as text so that the PRINT statements output text above each set of results.

Code Samples
/*
DROP TABLE #Left
DROP TABLE #RIGHT
DROP TABLE #RightChild
*/

SET NOCOUNT ON
SELECT @@VERSION

CREATE TABLE #Left
(
KeyColumn char(1)
,DataColumn int
)

CREATE TABLE #Right
(
KeyColumn char(1)
,DataColumn int
)

CREATE TABLE #RightChild
(
KeyColumn char(1)
,DataColumn int
)

INSERT INTO #Left
(
KeyColumn
,DataColumn
)
VALUES
(
'a'
,1
)
,('b', 2)
,('c', 3)
,('d', 4)
,('e', 5)
,('f', 6)

INSERT INTO #Right
(
KeyColumn
,DataColumn
)
VALUES
(
'a'
,10
)
,('b', 20)
,('c', 30)
,('g', 40)
,('h', 50)
,('i', 60)

INSERT INTO #RightChild
(
KeyColumn
,DataColumn
)
VALUES
('a', 400)
,('b', 500)

--left join--
PRINT '(INNER) JOIN'
SELECT #L.*
,#R.*
FROM
#Left #L
JOIN #Right #R
ON #L.KeyColumn = #R.KeyColumn

PRINT 'LEFT OUTER JOIN'
SELECT #L.*
,#R.*
FROM
#Left #L
LEFT OUTER JOIN #Right #R
ON #L.KeyColumn = #R.KeyColumn

PRINT 'RIGHT OUTER JOIN'
SELECT #L.*
,#R.*
FROM
#Left #L
RIGHT OUTER JOIN #Right #R
ON #L.KeyColumn = #R.KeyColumn

PRINT 'CROSS JOIN'
SELECT #L.*
,#R.*
FROM
#Left #L
CROSS JOIN #Right #R

PRINT 'FULL JOIN'
SELECT #L.*
,#R.*
FROM #Left #L
FULL JOIN #Right #R
ON #L.KeyColumn = #R.KeyColumn

BEGIN TRY
EXEC('SELECT *FROM #L NATURAL JOIN #R')
END TRY
BEGIN CATCH
PRINT '---NATURAL JOIN NOT IMPLEMENTED IN SQL SERVER---'
PRINT ''
END CATCH
--SELF JOIN

PRINT 'MULTI-LEVEL JOINS******'
PRINT 'LEFT OUTER JOIN WITH INNER JOIN BELOW'
SELECT #L.*
,#R.*
,#RC.*
FROM #Left #L
LEFT OUTER JOIN #Right #R
ON #L.KeyColumn = #R.KeyColumn
JOIN #RightChild #RC
ON #R.KeyColumn = #RC.KeyColumn

PRINT 'LEFT OUTER JOIN WITH LEFT OUTER JOIN BELOW'
SELECT #L.*
,#R.*
,#RC.*
FROM #Left #L
LEFT OUTER JOIN #Right #R
ON #L.KeyColumn = #R.KeyColumn
LEFT OUTER JOIN #RightChild #RC
ON #R.KeyColumn = #RC.KeyColumn

PRINT 'LEFT OUTER JOIN WITH NESTED INNER JOIN'
SELECT #L.*
,#R.*
,#RC.*
FROM #Left #L
LEFT OUTER JOIN #Right #R
JOIN #RightChild #RC
ON #R.KeyColumn = #RC.KeyColumn
ON #L.KeyColumn = #R.KeyColumn

PRINT 'LEFT OUTER JOIN WITH DERIVED TABLE'
SELECT #L.*
,DT.*
FROM #Left #L
LEFT OUTER JOIN
(
SELECT #R.KeyColumn
,#R.DataColumn
,#RC.DataColumn As ChildDataColumn
FROM
#Right #R
JOIN #RightChild #RC
ON #R.KeyColumn = #RC.KeyColumn
) DT
ON #L.KeyColumn = DT.KeyColumn

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.