SQL excels at operating on dataset. In this post, I will discuss how to implement basic set operations in transact SQL (TSQL). The operations that I am going to discuss are union, intersection and complement (subtraction).
Union
Intersection
Complement (subtraction)
Implementing set operations using union, intersect and except
We can use TSQL keywords union, intersect and except to implement set operations. Since we are in an election year, I will use voter records of propositions as an example. We create the following table and insert 6 records into the table.
declare @votes table (VoterId int, PropId int)
insert into @votes values (1, 30)
insert into @votes values (2, 30)
insert into @votes values (3, 30)
insert into @votes values (4, 30)
insert into @votes values (4, 31)
insert into @votes values (5, 31)
Voters 1, 2, 3 and 4 voted for proposition 30 and voters 4 and 5 voted for proposition 31.
The following TSQL statement implements union using the union keyword. The union returns voters who voted for either proposition 30 or 31.
select VoterId from @votes where PropId = 30
union
select VoterId from @votes where PropId = 31
The following TSQL statement implements intersection using the intersect keyword. The intersection will return voters who voted only for both proposition 30 and 31.
select VoterId from @votes where PropId = 30
intersect
select VoterId from @votes where PropId = 31
The following TSQL statement implements complement using the except keyword. The complement will return voters who voted for proposition 30 but not 31.
select VoterId from @votes where PropId = 30
except
select VoterId from @votes where PropId = 31 Implementing set operations using join
An alternative way to implement set operation in TSQL is to use full outer join, inner join and left outer join.
The following TSQL statement implements union using full outer join.
select Coalesce(A.VoterId, B.VoterId)
from (select VoterId from @votes where PropId = 30) A
full outer join (select VoterId from @votes where PropId = 31) B on A.VoterId = B.VoterId
The following TSQL statement implements intersection using inner join.
select Coalesce(A.VoterId, B.VoterId)
from (select VoterId from @votes where PropId = 30) A
inner join (select VoterId from @votes where PropId = 31) B on A.VoterId = B.VoterId
The following TSQL statement implements complement using left outer join.
select Coalesce(A.VoterId, B.VoterId)
from (select VoterId from @votes where PropId = 30) A
left outer join (select VoterId from @votes where PropId = 31) B on A.VoterId = B.VoterId
where B.VoterId is null Which one to choose?
To choose which technique to use, just keep two things in mind:
The union, intersect and except technique treats an entire record as a member.
The join technique allows the member to be specified in the “on” clause. However, it is necessary to use Coalesce function to project sets on the two sides of the join into a single set.