Building Stored Procedure to group data into ranges with roughly equal results in each bucket
Posted
by Len
on Stack Overflow
See other posts from Stack Overflow
or by Len
Published on 2010-03-28T21:24:32Z
Indexed on
2010/03/28
21:43 UTC
Read the original article
Hit count: 281
sql-server
|stored-procedures
I am trying to build one procedure to take a large amount of data and create 5 range buckets to display the data. the buckets ranges will have to be set according to the results.
Here is my existing SP
GO
/****** Object: StoredProcedure [dbo].[sp_GetRangeCounts] Script Date: 03/28/2010 19:50:45 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[sp_GetRangeCounts]
@idMenu int
AS
declare
@myMin decimal(19,2),
@myMax decimal(19,2),
@myDif decimal(19,2),
@range1 decimal(19,2),
@range2 decimal(19,2),
@range3 decimal(19,2),
@range4 decimal(19,2),
@range5 decimal(19,2),
@range6 decimal(19,2)
SELECT @myMin=Min(modelpropvalue), @myMax=Max(modelpropvalue)
FROM xmodelpropertyvalues where modelPropUnitDescriptionID=@idMenu
set @myDif=(@myMax-@myMin)/5
set @range1=@myMin
set @range2=@myMin+@myDif
set @range3=@range2+@myDif
set @range4=@range3+@myDif
set @range5=@range4+@myDif
set @range6=@range5+@myDif
select @myMin,@myMax,@myDif,@range1,@range2,@range3,@range4,@range5,@range6
select t.range as myRange, count(*) as myCount
from (
select case
when modelpropvalue between @range1 and @range2 then 'range1'
when modelpropvalue between @range2 and @range3 then 'range2'
when modelpropvalue between @range3 and @range4 then 'range3'
when modelpropvalue between @range4 and @range5 then 'range4'
when modelpropvalue between @range5 and @range6 then 'range5'
end as range
from xmodelpropertyvalues where modelpropunitDescriptionID=@idmenu) t
group by t.range order by t.range
This calculates the min and max value from my table, works out the difference between the two and creates 5 buckets. The problem is that if there are a small amount of very high (or very low) values then the buckets will appear very distorted - as in these results...
range1 2806
range2 296
range3 75
range5 1
Basically I want to rebuild the SP so it creates buckets with equal amounts of results in each. I have played around with some of the following approaches without quite nailing it...
SELECT modelpropvalue, NTILE(5) OVER (ORDER BY modelpropvalue) FROM xmodelpropertyvalues - this creates a new column with either 1,2,3,4 or 5 in it
ROW_NUMBER()OVER (ORDER BY modelpropvalue) between @range1 and @range2
ROW_NUMBER()OVER (ORDER BY modelpropvalue) between @range2 and @range3
- or maybe i could allocate every record a row number then divide into ranges from this?
© Stack Overflow or respective owner