Resolving data redundancy up front
- by okeofs
Introduction
As all of us
do when confronted with a problem, the resource of choice is to ‘Google it’.
This is
where the plot thickens. Recently I was asked to stage data from numerous
databases which were to be loaded into a data warehouse. To make a long story
short, I was looking for a manner in which to obtain the table names from each
database, to ascertain potential overlap.
As the
source data comes from a SQL database created from dumps of a third party
product, one could say that there were
+/- 95 tables for each database.
Yes I know
that first instinct is to use the system stored procedure “exec sp_msforeachdb 'select "?" AS db,
* from [?].sys.tables'”.
However, if one
stops to think about this, it would be nice to have all the results in a temporary
or disc based table; which in itself , implies
additional labour.
This said, I decided to ‘re-invent’ the wheel. The full
code sample may be found at the bottom of this article.
Define a few temporary tables and variables
declare @SQL varchar(max);
declare
@databasename varchar(75)
/*
drop table
##rawdata3
drop table
#rawdata1
drop table
#rawdata11
*/
-- A temp table
to hold the names of my databases
CREATE TABLE #rawdata1
(
database_name varchar(50) ,
database_size varchar(50),
remarks Varchar(50)
)
--A temp table
with the same database names as above, HOWEVER using an
--Identity
number (recNO) as a loop variable.
--You will
note below that I loop through until I reach 25 (see below) as at
--that point the
system databases, the reporting server database etc begin.
--1- 24 are
user databases. These are really what I was looking for.
--Whilst NOT
the best solution,it works and the code was meant as a quick
--and dirty.
CREATE TABLE #rawdata11
(
recNo int identity(1,1),
database_name varchar(50) ,
database_size varchar(50),
remarks Varchar(50)
)
--My output
table showing the database name and table name
CREATE TABLE ##rawdata3
(
database_name varchar(75) ,
table_name varchar(75),
)
Insert the database names into a
temporary table
I pull the
database names using the system stored procedure sp_databases
INSERT INTO #rawdata1
EXEC sp_databases
Go
Insert the results
from #rawdata1 into a table containing a record number #rawdata11 so that I can LOOP through the
extract
INSERT into #rawdata11
select * from #rawdata1
We now declare 3 more variables:
@kounter
is used to keep track of our position within the loop.
@databasename is used to keep track of the’ current
‘ database name being used in the current pass of the loop; as inorder to obtain the tables for that
database we need to issue a ‘USE’
statement, an insert command and other related code parts. This is the challenging
part.
@sql is a varchar(max) variable used to contain
the ‘USE’ statement PLUS the’ insert ‘ code statements.
We now initalize @kounter to 1 .
declare @kounter int;
declare
@databasename varchar(75);
declare @sql varchar(max);
set @kounter =
1
The Loop
The astute reader
will remember that the temporary table #rawdata11 contains our database names and each ‘database row’ has a record number
(recNo). I am only interested in record numbers under 25. I now set the value
of the temporary variable @DatabaseName (see below) .Note that I used the row
number as a part of the predicate.
Now, knowing the
database name, I can create dynamic T-SQL to be executed using the sp_sqlexec
stored procedure (see the code in red below).
Finally, after all
the tables for that given database have been placed in temporary table ##rawdata3,
I increment the counter and continue on.
Note that I used a
global temporary table to ensure that the result set persists after the termination
of the run.
At some stage, I
plan to redo this part of the code, as global temporary tables are not really
an ideal solution.
WHILE (@kounter < 25)
BEGIN
select
@DatabaseName = database_name from #rawdata11 where
recNo = @kounter
set @SQL = 'Use ' + @DatabaseName +
' Insert into
##rawdata3 ' +
+ ' SELECT table_catalog,Table_name FROM
information_schema.tables'
exec sp_sqlexec @Sql
SET
@kounter =
@kounter + 1
END
The full code extract
Here is the
full code sample.
declare @SQL varchar(max);
declare
@databasename varchar(75)
/*
drop table
##rawdata3
drop table
#rawdata1
drop table
#rawdata11
*/
CREATE TABLE #rawdata1
(
database_name varchar(50) ,
database_size varchar(50),
remarks Varchar(50)
)
CREATE TABLE #rawdata11
(
recNo int identity(1,1),
database_name varchar(50) ,
database_size varchar(50),
remarks Varchar(50)
)
CREATE TABLE ##rawdata3
(
database_name varchar(75) ,
table_name varchar(75),
)
INSERT INTO #rawdata1
EXEC sp_databases
go
INSERT into #rawdata11
select * from #rawdata1
declare @kounter int;
declare
@databasename varchar(75);
declare @sql varchar(max);
set @kounter = 1
WHILE (@kounter < 25)
BEGIN
select
@databasename = database_name from #rawdata11 where
recNo = @kounter
set @SQL = 'Use ' + @DatabaseName +
' Insert into
##rawdata3 ' +
+ ' SELECT table_catalog,Table_name FROM
information_schema.tables'
exec sp_sqlexec @Sql
SET
@kounter =
@kounter + 1
END
select * from ##rawdata3
where
table_name like '%SalesOrderHeader%'