How to obtain a random sub-datatable from another data table
- by developerit
Introduction
In this article, I’ll show how to get a random subset of data from a DataTable. This is useful when you already have queries that are filtered correctly but returns all the rows.
Analysis
I came across this situation when I wanted to display a random tag cloud. I already had the query to get the keywords ordered by number of clicks and I wanted to created a tag cloud. Tags that are the most popular should have more chance to get picked and should be displayed larger than less popular ones.
Implementation
In this code snippet, there is everything you need.
' Min size, in pixel for the tag
Private Const MIN_FONT_SIZE As Integer = 9
' Max size, in pixel for the tag
Private Const MAX_FONT_SIZE As Integer = 14
' Basic function that retreives Tags from a DataBase
Public Shared Function GetTags() As MediasTagsDataTable
	' Simple call to the TableAdapter, to get the Tags ordered by number of clicks
	Dim dt As MediasTagsDataTable = taMediasTags.GetDataValide
	' If the query returned no result, return an empty DataTable
	If dt Is Nothing OrElse dt.Rows.Count < 1 Then
		Return New MediasTagsDataTable
	End If
	' Set the font-size of the group of data
	' We are dividing our results into sub set, according to their number of clicks
	' Example: 10 results -> [0,2] will get font size 9, [3,5] will get font size 10, [6,8] wil get 11, ...
	' This is the number of elements in one group
	Dim groupLenth As Integer = CType(Math.Floor(dt.Rows.Count / (MAX_FONT_SIZE - MIN_FONT_SIZE)), Integer)
	' Counter of elements in the same group
	Dim counter As Integer = 0
	' Counter of groups
	Dim groupCounter As Integer = 0
	' Loop througt the list
	For Each row As MediasTagsRow In dt
		' Set the font-size in a custom column
		row.c_FontSize = MIN_FONT_SIZE + groupCounter
		' Increment the counter
		counter += 1
		' If the group counter is less than the counter
		If groupLenth <= counter Then
			' Start a new group
			counter = 0
			groupCounter += 1
		End If
	Next
	' Return the new DataTable with font-size
	Return dt
End Function
' Function that generate the random sub set
Public Shared Function GetRandomSampleTags(ByVal KeyCount As Integer) As MediasTagsDataTable
	' Get the data
	Dim dt As MediasTagsDataTable = GetTags()
	' Create a new DataTable that will contains the random set
	Dim rep As MediasTagsDataTable = New MediasTagsDataTable
	' Count the number of row in the new DataTable
	Dim count As Integer = 0
	' Random number generator
	Dim rand As New Random()
	While count < KeyCount
		Randomize()
		' Pick a random row
		Dim r As Integer = rand.Next(0, dt.Rows.Count - 1)
		Dim tmpRow As MediasTagsRow = dt(r)
		' Import it into the new DataTable
		rep.ImportRow(tmpRow)
		' Remove it from the old one, to be sure not to pick it again
		dt.Rows.RemoveAt(r)
		' Increment the counter
		count += 1
	End While
	' Return the new sub set
	Return rep
End Function
Pro’s
This method is good because it doesn’t require much work to get it work fast. It is a good concept when you are working with small tables, let says less than 100 records.
Con’s
If you have more than 100 records, out of memory exception may occur since we are coping and duplicating rows. I would consider using a stored procedure instead.