Best way to randomly select rows *per* column in SQL Server

Posted by LesterDove on Stack Overflow
Published on 2010-04-28T16:47:57Z Indexed on 2010/04/28 17:23 UTC

A search of SO yields many results describing how to select random rows of data from a database table. My requirement is a bit different, though, in that I'd like to select individual columns from across random rows in the most efficient/random/interesting way possible.

To better illustrate: I have a large Customers table, and from it I'd like to generate a batch of fictitious demo Customer records that don't correspond to real people. I'm thinking of just querying randomly from the Customers table, and then randomly pairing FirstNames with LastNames, Addresses, Cities, States, etc.

So if this is my real Customer data (simplified):

FirstName  LastName  State  
==========================
Sally      Simpson   SD
Will       Warren    WI    
Mike       Malone    MN
Kelly      Kline     KS

Then I'd generate several records that look like this:

FirstName  LastName  State  
==========================
Sally      Warren    MN
Kelly      Malone    SD

Etc.

My initial approach works, but it lacks the elegance that I'm hoping the final answer will provide. (I'm particularly unhappy with the repetitiveness of the subqueries, and the fact that this solution requires a known/fixed number of fields and therefore isn't reusable.)

SELECT 
    FirstName = (SELECT TOP 1 FirstName FROM Customer ORDER BY NEWID()),
    LastName  = (SELECT TOP 1 LastName FROM Customer ORDER BY NEWID()),
    State     = (SELECT TOP 1 State FROM Customer ORDER BY NEWID())
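
For comparison, here is a minimal sketch of one possible alternative: shuffle each column independently with ROW_NUMBER() OVER (ORDER BY NEWID()), then join the shuffled columns back together on the row number. It assumes SQL Server 2005 or later (for CTEs and ROW_NUMBER). The CTE names f, l, s, the rn alias, and the TOP (10) row count are illustrative choices, not anything from the original query.

```sql
-- Sketch only: each CTE assigns a random ordering to one column.
-- f, l, s and rn are hypothetical names chosen for this example.
WITH f AS (
    SELECT FirstName, ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn
    FROM Customer
),
l AS (
    SELECT LastName, ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn
    FROM Customer
),
s AS (
    SELECT State, ROW_NUMBER() OVER (ORDER BY NEWID()) AS rn
    FROM Customer
)
-- Pair the n-th shuffled FirstName with the n-th shuffled LastName and State.
SELECT TOP (10) f.FirstName, l.LastName, s.State
FROM f
JOIN l ON l.rn = f.rn
JOIN s ON s.rn = f.rn;
```

Two caveats. Unlike the TOP 1 subqueries, this uses each source value at most once per column, so it can return no more rows than the table holds. It also sorts the whole table once per column, and it still needs a fixed column list, so it addresses the repetition per generated row but not the reusability concern.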

Thanks!

© Stack Overflow or respective owner