Categorize data without consolidating?
Posted
by sqlnoob
on Stack Overflow
See other posts from Stack Overflow
or by sqlnoob
Published on 2010-05-27T01:49:22Z
Indexed on
2010/05/27
10:11 UTC
Read the original article
Hit count: 106
I have a table with about 1000 records and 2000 columns. What I want to do is categorize each row such that all records with equal column values for all columns except 'ID' are given a category ID. My final answer would look like:
ID A B C ..... Category ID 1 1 0 3 1 2 2 1 3 2 3 1 0 3 1 4 2 1 3 2 5 4 5 6 3 6 4 5 6 3
where all columns (besides ID) are equal for IDs 1,3 so they get the same category ID and so on.
I guess my thought was to just write a SQL query that does a group by on every single column besides 'ID' and assign a number to each group and then join back to my original table. My current input is a text file, and I have SAS, MS Access, and Excel to work with. (I could use proc sql from within SAS).
Before I go this route and construct the whole query, I was just wondering if there was a better way to do this? It will take some work just to write the query, and I'm not even sure if it is practical to join on 2000 columns (never tried), so I thought I'd ask for ideas before I got too far down the wrong path.
EDIT: I just realized my title doesn't really make sense. What I was originally thinking was "Is there a way I can group by and categorize at the same time without actually consolidating into groups?"
© Stack Overflow or respective owner