Ghost Records, Backups, and Database Compression…With a Pinch of Security Considerations
- by Argenis
Today Jeffrey Langdon (@jlangdon) posed on #SQLHelp the following questions: So I set to answer his question, and I said to myself: “Hey, I haven’t blogged in a while, how about I blog about this particular topic?”. Thus, this post was born. (If you have never heard of Ghost Records and/or the Ghost Cleanup Task, go see this blog post by Paul Randal) 1) Do ghost records get copied over in a backup? If you guessed yes, you guessed right. The backup process in SQL Server takes all data as it is on disk – it doesn’t crack the pages open to selectively pick which slots have actual data and which ones do not. The whole page is backed up, regardless of its contents. Even if ghost cleanup has run and processed the ghost records, the slots are not overwritten immediately, but rather until another DML operation comes along and uses them. As a matter of fact, all of the allocated space for a database will be included in a full backup. So, this poses a bit of a security/compliance problem for some of you DBA folk: if you want to take a full backup of a database after you’ve purged sensitive data, you should rebuild all of your indexes (with FILLFACTOR set to 100%). But the empty space on your data file(s) might still contain sensitive data! A SHRINKFILE might help get rid of that (not so) empty space, but that might not be the end of your troubles. You might _STILL_ have (not so) empty space on your files! One approach that you can follow is to export all of the data on your database to another SQL Server instance that does NOT have Instant File Initialization enabled. This can be a tedious and time-consuming process, though. So you have to weigh in your options and see what makes sense for you. Snapshot Replication is another idea that comes to mind. 2) Does Compression get rid of ghost records (2008)? The answer to this is no. The Ghost Records/Ghost Cleanup Task mechanism is alive and well on compressed tables and indexes. You can prove this running a simple script: CREATE DATABASE GhostRecordsTest GO USE GhostRecordsTest GO CREATE TABLE myTable (myPrimaryKey int IDENTITY(1,1) PRIMARY KEY CLUSTERED, myWideColumn varchar(1000) NOT NULL DEFAULT 'Default string value') ALTER TABLE myTable REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = PAGE) GO INSERT INTO myTable DEFAULT VALUES GO 10 DELETE myTable WHERE myPrimaryKey % 2 = 0 DBCC TRACEON(2514) DBCC CHECKTABLE(myTable) TraceFlag 2514 will make DBCC CHECKTABLE give you an extra tidbit of information on its output. For the above script: “Ghost Record count = 5” Until next time, -Argenis