Remove redundant SQL code
- by Dave Jarvis
Code
The following code calculates the slope and intercept for a linear regression against a slathering of data. It then applies the equation y = mx + b against the same result set to calculate the value of the regression line for each row.
Can the two separate sub-selects be joined so that the data and its slope/intercept are calculated without executing the data gathering part of the query twice?
SELECT
AVG(D.AMOUNT) as AMOUNT,
Y.YEAR * ymxb.SLOPE + ymxb.INTERCEPT as REGRESSION_LINE,
Y.YEAR as YEAR,
MAKEDATE(Y.YEAR,1) as AMOUNT_DATE
FROM
CITY C,
STATION S,
YEAR_REF Y,
MONTH_REF M,
DAILY D,
(SELECT
((avg(t.AMOUNT * t.YEAR)) - avg(t.AMOUNT) * avg(t.YEAR)) /
(stddev( t.AMOUNT ) * stddev( t.YEAR )) as CORRELATION,
((sum(t.YEAR) * sum(t.AMOUNT)) - (count(1) * sum(t.YEAR * t.AMOUNT))) /
(power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as SLOPE,
((sum( t.YEAR ) * sum( t.YEAR * t.AMOUNT )) -
(sum( t.AMOUNT ) * sum(power(t.YEAR, 2)))) /
(power(sum(t.YEAR), 2) - count(1) * sum(power(t.YEAR, 2))) as INTERCEPT
FROM (
SELECT
AVG(D.AMOUNT) as AMOUNT,
Y.YEAR as YEAR,
MAKEDATE(Y.YEAR,1) as AMOUNT_DATE
FROM
CITY C,
STATION S,
YEAR_REF Y,
MONTH_REF M,
DAILY D
WHERE
$X{ IN, C.ID, CityCode } AND
SQRT(
POW( C.LATITUDE - S.LATITUDE, 2 ) +
POW( C.LONGITUDE - S.LONGITUDE, 2 ) ) < $P{Radius} AND
S.STATION_DISTRICT_ID = Y.STATION_DISTRICT_ID AND
Y.YEAR BETWEEN 1900 AND 2009 AND
M.YEAR_REF_ID = Y.ID AND
M.CATEGORY_ID = $P{CategoryCode} AND
M.ID = D.MONTH_REF_ID AND
D.DAILY_FLAG_ID <> 'M'
GROUP BY
Y.YEAR
) t
) ymxb
WHERE
$X{ IN, C.ID, CityCode } AND
SQRT(
POW( C.LATITUDE - S.LATITUDE, 2 ) +
POW( C.LONGITUDE - S.LONGITUDE, 2 ) ) < $P{Radius} AND
S.STATION_DISTRICT_ID = Y.STATION_DISTRICT_ID AND
Y.YEAR BETWEEN 1900 AND 2009 AND
M.YEAR_REF_ID = Y.ID AND
M.CATEGORY_ID = $P{CategoryCode} AND
M.ID = D.MONTH_REF_ID AND
D.DAILY_FLAG_ID <> 'M'
GROUP BY
Y.YEAR
Question
How do I execute the duplicate bits only once per query, instead of twice? The duplicate bit is the WHERE clause:
$X{ IN, C.ID, CityCode } AND
SQRT(
POW( C.LATITUDE - S.LATITUDE, 2 ) +
POW( C.LONGITUDE - S.LONGITUDE, 2 ) ) < $P{Radius} AND
S.STATION_DISTRICT_ID = Y.STATION_DISTRICT_ID AND
Y.YEAR BETWEEN 1900 AND 2009 AND
M.YEAR_REF_ID = Y.ID AND
M.CATEGORY_ID = $P{CategoryCode} AND
M.ID = D.MONTH_REF_ID AND
D.DAILY_FLAG_ID <> 'M'
Related
http://stackoverflow.com/questions/1595659/how-to-eliminate-duplicate-calculation-in-sql
Thank you!