Getting the first of a GROUP BY clause in SQL
- by Michael Bleigh
I'm trying to implement single-column regionalization for a Rails application and I'm running into some major headaches with a complex SQL need. For this system, a region can be represented by a country code (e.g. us) a continent code that is uppercase (e.g. NA) or it can be NULL indicating the "default" information. I need to group these items by some relevant information such as a foreign key (we'll call it external_id).
Given a country and its continent, I need to be able to select only the most specific region available. So if records exist with the country code, I select them. If, not I want a records with the continent code. If not that, I want records with a NULL code so I can receive the default values.
So far I've figured that I may be able to use a generated CASE statement to get an arbitrary sort order. Something like this:
SELECT *, CASE region
WHEN 'us' THEN 1
WHEN 'NA' THEN 2
ELSE 3
END AS region_sort
FROM my_table
WHERE region IN ('us','NA') OR region IS NULL
GROUP BY external_id
ORDER BY region_sort
The problem is that without an aggregate function the actual data returned by the GROUP BY for a given row seems to be untameable. How can I massage this query to make it return only the first record of the region_sort ordered groups?