T-SQL - Left Outer Joins - Filters in the where clause versus the on clause.
- by Greg Potter
I am trying to compare two tables to find rows in each table that is not in the other. Table 1 has a groupby column to create 2 sets of data within table one.
groupby number
----------- -----------
1 1
1 2
2 1
2 2
2 4
Table 2 has only one column.
number
-----------
1
3
4
So Table 1 has the values 1,2,4 in group 2 and Table 2 has the values 1,3,4.
I expect the following result when joining for Group 2:
`Table 1 LEFT OUTER Join Table 2`
T1_Groupby T1_Number T2_Number
----------- ----------- -----------
2 2 NULL
`Table 2 LEFT OUTER Join Table 1`
T1_Groupby T1_Number T2_Number
----------- ----------- -----------
NULL NULL 3
The only way I can get this to work is if I put a where clause for the first join:
PRINT 'Table 1 LEFT OUTER Join Table 2, with WHERE clause'
select table1.groupby as [T1_Groupby],
table1.number as [T1_Number],
table2.number as [T2_Number]
from table1
LEFT OUTER join table2
--******************************
on table1.number = table2.number
--******************************
WHERE table1.groupby = 2
AND table2.number IS NULL
and a filter in the ON for the second:
PRINT 'Table 2 LEFT OUTER Join Table 1, with ON clause'
select table1.groupby as [T1_Groupby],
table1.number as [T1_Number],
table2.number as [T2_Number]
from table2
LEFT OUTER join table1
--******************************
on table2.number = table1.number
AND table1.groupby = 2
--******************************
WHERE table1.number IS NULL
Can anyone come up with a way of not using the filter in the on clause but in the where clause?
The context of this is I have a staging area in a database and I want to identify new records and records that have been deleted. The groupby field is the equivalent of a batchid for an extract and I am comparing the latest extract in a temp table to a the batch from yesterday stored in a partioneds table, which also has all the previously extracted batches as well. Code to create table 1 and 2:
create table table1 (number int, groupby int)
create table table2 (number int)
insert into table1 (number, groupby) values (1, 1)
insert into table1 (number, groupby) values (2, 1)
insert into table1 (number, groupby) values (1, 2)
insert into table2 (number) values (1)
insert into table1 (number, groupby) values (2, 2)
insert into table2 (number) values (3)
insert into table1 (number, groupby) values (4, 2)
insert into table2 (number) values (4)
EDIT:
A bit more context - depending on where I put the filter I different results. As stated above the where clause gives me the correct result in one state and the ON in the other. I am looking for a consistent way of doing this.
Where -
select table1.groupby as [T1_Groupby],
table1.number as [T1_Number],
table2.number as [T2_Number]
from table1
LEFT OUTER join table2
--******************************
on table1.number = table2.number
--******************************
WHERE table1.groupby = 2
AND table2.number IS NULL
Result:
T1_Groupby T1_Number T2_Number
----------- ----------- -----------
2 2 NULL
On -
select table1.groupby as [T1_Groupby],
table1.number as [T1_Number],
table2.number as [T2_Number]
from table1
LEFT OUTER join table2
--******************************
on table1.number = table2.number
AND table1.groupby = 2
--******************************
WHERE table2.number IS NULL
Result:
T1_Groupby T1_Number T2_Number
----------- ----------- -----------
1 1 NULL
2 2 NULL
1 2 NULL
Where (table 2 this time) -
select table1.groupby as [T1_Groupby],
table1.number as [T1_Number],
table2.number as [T2_Number]
from table2
LEFT OUTER join table1
--******************************
on table2.number = table1.number
AND table1.groupby = 2
--******************************
WHERE table1.number IS NULL
Result:
T1_Groupby T1_Number T2_Number
----------- ----------- -----------
NULL NULL 3
On -
select table1.groupby as [T1_Groupby],
table1.number as [T1_Number],
table2.number as [T2_Number]
from table2
LEFT OUTER join table1
--******************************
on table2.number = table1.number
--******************************
WHERE table1.number IS NULL
AND table1.groupby = 2
Result:
T1_Groupby T1_Number T2_Number
----------- ----------- -----------
(0) rows returned