Normalizing a table

Posted by Alex on Stack Overflow See other posts from Stack Overflow or by Alex
Published on 2012-03-20T20:59:26Z Indexed on 2012/03/20 23:30 UTC
Read the original article Hit count: 191

Filed under:
|

I have a legacy table, which I can't change. The values in it can be modified from legacy application (application also can't be changed). Due to a lot of access to the table from new application (new requirement), I'd like to create a temporary table, which would hopefully speed up the queries.

The actual requirement, is to calculate number of business days from X to Y. For example, give me all business days from Jan 1'st 2001 until Dec 24'th 2004. The table is used to mark which days are off, as different companies may have different days off - it isn't just Saturday + Sunday)

The temporary table would be created from a .NET program, each time user enters the screen for this query (user may run query multiple times, with different values, table is created once), so I'd like it to be as fast as possible. Approach below runs in under a second, but I only tested it with a small dataset, and still it takes probably close to half a second, which isn't great for UI - even though it's just the overhead for first query.

The legacy table looks like this:

CREATE TABLE [business_days](
    [country_code] [char](3) ,
    [state_code] [varchar](4) ,
    [calendar_year] [int] ,
    [calendar_month] [varchar](31) ,
    [calendar_month2] [varchar](31) ,
    [calendar_month3] [varchar](31) ,
    [calendar_month4] [varchar](31) ,
    [calendar_month5] [varchar](31) ,
    [calendar_month6] [varchar](31) ,
    [calendar_month7] [varchar](31) ,
    [calendar_month8] [varchar](31) ,
    [calendar_month9] [varchar](31) ,
    [calendar_month10] [varchar](31) ,
    [calendar_month11] [varchar](31) ,
    [calendar_month12] [varchar](31) ,
misc.
)

Each month has 31 characters, and any day off (Saturday + Sunday + holiday) is marked with X. Each half day is marked with an 'H'. For example, if a month starts on a Thursday, than it will look like (Thursday+Friday workdays, Saturday+Sunday marked with X):

'  XX     XX ..'

I'd like the new table to look like so:

create table #Temp (country varchar(3), state varchar(4), date datetime, hours int)

And I'd like to only have rows for days which are off (marked with X or H from previous query)

What I ended up doing, so far is this: Create a temporary-intermediate table, that looks like this:

create table #Temp_2 (country_code varchar(3), state_code varchar(4), calendar_year int, calendar_month varchar(31), month_code int)

To populate it, I have a union which basically unions calendar_month, calendar_month2, calendar_month3, etc.

Than I have a loop which loops through all the rows in #Temp_2, after each row is processed, it is removed from #Temp_2. To process the row there is a loop from 1 to 31, and substring(calendar_month, counter, 1) is checked for either X or H, in which case there is an insert into #Temp table. [edit added code]

Declare @country_code char(3)
Declare @state_code varchar(4)
Declare @calendar_year int
Declare @calendar_month varchar(31)
Declare @month_code int
Declare @calendar_date datetime
Declare @day_code int
WHILE EXISTS(SELECT * From #Temp_2) -- where processed = 0)
BEGIN
    Select Top 1 @country_code = t2.country_code, @state_code = t2.state_code, @calendar_year = t2.calendar_year, @calendar_month = t2.calendar_month, @month_code = t2.month_code From #Temp_2 t2 -- where processed = 0

    set @day_code = 1
    while @day_code <= 31
    begin
        if substring(@calendar_month, @day_code, 1) = 'X'
        begin
            set @calendar_date = convert(datetime, (cast(@month_code as varchar) + '/' + cast(@day_code as varchar) + '/' + cast(@calendar_year as varchar)))
            insert into #Temp (country, state, date, hours) values (@country_code, @state_code, @calendar_date, 8)
        end
        if substring(@calendar_month, @day_code, 1) = 'H'
        begin
            set @calendar_date = convert(datetime, (cast(@month_code as varchar) + '/' + cast(@day_code as varchar) + '/' + cast(@calendar_year as varchar)))
            insert into #Temp (country, state, date, hours) values (@country_code, @state_code, @calendar_date, 4)
        end

        set @day_code = @day_code + 1
    end
    delete from #Temp_2 where @country_code = country_code AND @state_code = state_code AND @calendar_year = calendar_year AND @calendar_month = calendar_month AND @month_code = month_code
    --update #Temp_2 set processed = 1 where @country_code = country_code AND @state_code = state_code AND @calendar_year = calendar_year AND @calendar_month = calendar_month AND @month_code = month_code
END

I am not an expert in SQL, so I'd like to get some input on my approach, and maybe even a much better approach suggestion.

After having the temp table, I'm planning to do (dates would be coming from a table):

select cast(convert(datetime, ('01/31/2012'), 101) -convert(datetime, ('01/17/2012'), 101) as int) -  ((select sum(hours) from #Temp where date between convert(datetime, ('01/17/2012'), 101) and convert(datetime, ('01/31/2012'), 101)) / 8)

Besides the solution of normalizing the table, the other solution I implemented for now, is a function which does all this logic of getting the business days by scanning the current table. It runs pretty fast, but I'm hesitant to call a function, if I can instead add a simpler query to get result.

(I'm currently trying this on MSSQL, but I would need to do same for Sybase ASE and Oracle)

© Stack Overflow or respective owner

Related posts about sql-server

Related posts about tsql