Optimization of running total calculation in SQL for multiple values per join condition
- by Kiril
I have the following table (test_table):
date value
---------------
d1 10.0
d1 20.0
d2 60.0
d2 10.0
d2 -20.0
d3 40.0
I calculate the running total as follows. I use the same query twice, because first I need to calculate the values for a specifi date, and afterwards I can calculate the running total. Otherwise, joining the two tables where date is not unique, I would get too many results from the join:
SELECT t1.date, SUM(t2.value) AS total
FROM
(SELECT date, SUM(value) AS value FROM test_table GROUP BY date) AS t1
JOIN
(SELECT date, SUM(value) AS value FROM test_table GROUP BY date) AS t2
ON t1.date >= t2.date
GROUP BY t1.date
ORDER BY t1.date
This gives me (which is fine):
date total
-------------
d1 30.0
d2 80.0
d3 120.0
BUT, this query isn't very efficient, because I need to change conditions in two places, if necessary. In production, the test_table is a lot bigger ( 4 Mio. rows), and the query takes too much time to complete. Question: How can I avoid using the same query twice?