SQL SERVER – Select and Delete Duplicate Records – SQL in Sixty Seconds #036 – Video

Posted by pinaldave on SQL Authority See other posts from SQL Authority or by pinaldave
Published on Wed, 19 Dec 2012 01:30:09 +0000 Indexed on 2012/12/19 5:06 UTC
Read the original article Hit count: 733

Filed under:

database

|

Pinal Dave

|

PostADay

|

sql

|

SQL Authority

|

SQL in Sixty Seconds

|

SQL Query

|

SQL Scripts

|

SQL Server

|

SQL Server Management Stu

|

SQL Tips and Tricks

|

T SQL

|

Technology

|

video

|

excel

Developers often face situations when they find their column have duplicate records and they want to delete it. A good developer will never delete any data without observing it and making sure that what is being deleted is the absolutely fine to delete. Before deleting duplicate data, one should select it and see if the data is really duplicate.

In this video we are demonstrating two scripts – 1) selects duplicate records 2) deletes duplicate records.

We are assuming that the table has a unique incremental id. Additionally, we are assuming that in the case of the duplicate records we would like to keep the latest record. If there is really a business need to keep unique records, one should consider to create a unique index on the column. Unique index will prevent users entering duplicate data into the table from the beginning. This should be the best solution. However, deleting duplicate data is also a very valid request. If user realizes that they need to keep only unique records in the column and if they are willing to create unique constraint, the very first requirement of creating a unique constraint is to delete the duplicate records.

Let us see how to connect the values in Sixty Seconds:

Here is the script which is used in the video.

USE tempdb GO CREATE TABLE TestTable (ID INT, NameCol VARCHAR(100)) GO INSERT INTO TestTable (ID, NameCol) SELECT 1, 'First' UNION ALL SELECT 2, 'Second' UNION ALL SELECT 3, 'Second' UNION ALL SELECT 4, 'Second' UNION ALL SELECT 5, 'Second' UNION ALL SELECT 6, 'Third' GO -- Selecting Data SELECT * FROM TestTable GO -- Detecting Duplicate SELECT NameCol, COUNT(*) TotalCount FROM TestTable GROUP BY NameCol HAVING COUNT(*) > 1 ORDER BY COUNT(*) DESC GO -- Deleting Duplicate DELETE FROM TestTable WHERE ID NOT IN ( SELECT MAX(ID) FROM TestTable GROUP BY NameCol) GO -- Selecting Data SELECT * FROM TestTable GO DROP TABLE TestTable GO

Related Tips in SQL in Sixty Seconds:

What would you like to see in the next SQL in Sixty Seconds video?

Reference: Pinal Dave (http://blog.sqlauthority.com)

Filed under: Database, Pinal Dave, PostADay, SQL, SQL Authority, SQL in Sixty Seconds, SQL Query, SQL Scripts, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL, Technology, Video Tagged: Excel

Developer IT