How do large blobs affect SQL delete performance, and how can I mitigate the impact?
- by Max Pollack
I'm currently experiencing a strange issue that my understanding of SQL Server doesn't quite mesh with. We use SQL as our file storage for our internal storage service, and our database has about half a million rows in it. Most of the files (86%) are 1mb or under, but even on fresh copies of our database where we simply populate the table with data for the purposes of a test, it appears that rows with large amounts of data stored in a BLOB frequently cause timeouts when our SQL Server is under load.
My understanding of how SQL Server deletes rows is that it's a garbage collection process, i.e. the row is marked as a ghost and the row is later deleted by the ghost cleanup process after the changes are copied to the transaction log. This suggests to me that regardless of the size of the data in the blob, row deletion should be close to instantaneous. However when deleting these rows we are definitely experiencing large numbers of timeouts and astoundingly low performance.
In our test data set, its files over 30mb that cause this issue. This is an edge case, we don't frequently encounter these, and even though we're looking into SQL filestream as a solution to some of our problems, we're trying to narrow down where these issues are originating from.
We ARE performing our deletes inside of a transaction. We're also performing updates to metadata such as file size stats, but these exist in a separate table away from the file data itself. Hierarchy data is stored in the table that contains the file information.
Really, in the end it's not so much what we're doing around the deletes that matters, we just can't find any references to low delete performance on rows that contain a large amount of data in a BLOB. We are trying to determine if this is even an avenue worth exploring, or if it has to be one of our processes around the delete that's causing the issue.
Are there any situations in which this could occur? Is it common for a database server to come to the point of complete timeouts when many of these deletes are occurring simultaneously? Is there a way to combat this issue if it exists?
(cross-posted from StackOverflow )