Complex SQL query help on aggregating values for nested subquery
- by François Beausoleil
Hi!
I have people, companies, employees, events and event kinds. I'm making a report/followup sheet where people, companies and employees are the rows, and the columns are event kinds.
Event kinds are simple values describing: "Promised Donation", "Received Donation", "Phoned", "Followed up" and such. Event kinds are ordered:
CREATE TABLE event_kinds (
id,
name,
position);
Events hold the actual reference to the event:
CREATE TABLE events (
id,
person_id,
company_id,
referrer_id,
event_kind_id,
created_at);
referrer_id is another reference to people. It is the person which sent the information/tip along, and is an optional field, although I sometimes want to filter on an event_kind that has a specific referrer, while I don't for other event kinds.
Notice I don't have an employee ID reference. The reference exists, but is implied. I have application code to validate that person_id and company_id really reference an employee record. The other tables are pretty basic:
CREATE TABLE people (
id, name);
CREATE TABLE companies (
id, name);
CREATE TABLE employees (
id, person_id, company_id);
I'm trying to achieve the following report:
Referrer Phoned Promised Donated
Francois Feb 16th Feb 20th Mar 1st
Apple (Steve Jobs) Steve Ballmer Mar 3rd
IBM Bill Gates Mar 7th
The first row is a people record, the 2nd is an employee, and the 3rd is a company. If I asked for referrer Bill Gates for Phoned event kinds, I'd only see the 3rd row, while asking for Steve and Phoned would return no rows.
Right now, I do 3 queries, one for companies, one for people and a last one for employees. I want the event kind columns to be ordered, but I do that in application code and show it properly there. Here's where I'm at so far:
SELECT companies.id,
companies.name,
(SELECT events.id FROM events WHERE events.referrer_id = 1470 AND events.company_id = companies.id AND events.person_id IS NULL AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9,
(SELECT events.id FROM events WHERE events.company_id = companies.id AND events.person_id IS NULL AND events.event_kind_id = 10 ORDER BY created_at DESC LIMIT 1) event_kind_10,
(SELECT events.created_at FROM events WHERE events.referrer_id = 1470 AND events.company_id = companies.id AND events.person_id IS NULL AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9_order
FROM "companies"
SELECT people.id,
people.name,
(SELECT events.id FROM events WHERE events.referrer_id = 1470 AND events.company_id IS NULL AND events.person_id = people.id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9,
(SELECT events.id FROM events WHERE events.company_id IS NULL AND events.person_id = people.id AND events.event_kind_id = 10 ORDER BY created_at DESC LIMIT 1) event_kind_10,
(SELECT events.created_at FROM events WHERE events.referrer_id = 1470 AND events.company_id IS NULL AND events.person_id = people.id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9_order
FROM "people"
SELECT employees.id,
employees.company_id,
employees.person_id,
(SELECT events.id FROM events WHERE events.referrer_id = 1470 AND events.company_id = employees.company_id AND events.person_id = employees.person_id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9,
(SELECT events.id FROM events WHERE events.company_id = employees.company_id AND events.person_id = employees.person_id AND events.event_kind_id = 10 ORDER BY created_at DESC LIMIT 1) event_kind_10,
(SELECT events.created_at FROM events WHERE events.referrer_id = 1470 AND events.company_id = employees.company_id AND events.person_id = employees.person_id AND events.event_kind_id = 9 ORDER BY created_at DESC LIMIT 1) event_kind_9_order
FROM "employees"
I rather suspect I'm doing this wrong. There should be an "easier" way to do it.
One other filter criteria would be to filter on people/company names: WHERE LOWER(companies.name) LIKE '%apple%'.
Note that I'm ordering by the dates of event_kind_9 here, and a secondary sort is by person/company name.
To summarize: I want to paginate the result set, find the latest event for each cell, order the result set by the date of the latest event, and by company/person name, filter by referrer in some event kinds, but not others.
For reference, I'm using PostgreSQL, from Ruby, ActiveRecord/Rails. The solution is pure SQL though.