SQL 2005 Transaction Rollback Hung–unresolved deadlock
- by steveh99999
Encountered an interesting issue recently with a SQL 2005 sp3 Enterprise Edition system. Every weekend, a full database reindex was being run on this system – normally this took around one and a half hours. Then, one weekend, the job ran for over 17 hours - and had yet to complete... At this point, DBA cancelled the job. Job status is now cancelled – issue over… However, cancelling the job had not killed the reindex transaction – DBCC OPENTRAN was still showing the transaction being open. The oldest open transaction in the database was now over 17 hours old. Consequently, transaction log % used growing dramatically and locks still being held in the database... Further attempts to kill the transaction did nothing. ie we had a transaction which could not be killed. In sysprocesses, it was apparent the SPID was in rollback status, but the spid was not accumulating CPU or IO. Was the SPID stuck ? On examination of the SQL errorlog – shortly after the reindex had started, a whole bunch of deadlock output had been produced by trace flag 1222. Then this :- spid5s ***Stack Dump being sent to xxxxxxx\SQLDump0042.txt spid5s * ******************************************************************************* spid5s * spid5s * BEGIN STACK DUMP: spid5s * 12/05/10 01:04:47 spid 5 spid5s * spid5s * Unresolved deadlock spid5s * spid5s * spid5s * ******************************************************************************* spid5s * ------------------------------------------------------------------------------- spid5s * Short Stack Dump spid5s Stack Signature for the dump is 0x000001D7 spid5s External dump process return code 0x20000001. Unresolved deadlock – don’t think I’ve ever seen one of these before…. A quick call to Microsoft support confirmed the following bug had been hit :- http://support.microsoft.com/kb/961479 So, only option to get rid of the hung spid – to restart SQL Server… Fortunately SQL Server restarted without any issues. I was pleasantly surprised to see that recovery on this particular database was fast. However, restarting SQL Server to fix an issue is not something I would normally rush to do... Short term fix – the reindex was changed to use MAXDOP of 1. Longer term fix will be to apply the correct CU, or wait for SQL 2005 sp 4 ?? This should be released any day soon I hope..