ZFS - destroying a deduplicated zvol or dataset stalls the server. How to recover?

Posted by ewwhite on Server Fault
Published on 2011-02-11T16:51:15Z Indexed on 2011/02/12 7:27 UTC

I'm running NexentaStor on a secondary storage server, an HP ProLiant DL180 G6 with 12 Midline (7200 RPM) SAS drives. The system has an E5620 CPU and 8GB RAM. There is no dedicated ZIL (log) or L2ARC (cache) device.

Last week, I created a 750GB sparse zvol with dedup and compression enabled and shared it via iSCSI to a VMware ESX host. I then created a Windows 2008 file server image and copied ~300GB of user data to the VM. Once happy with the system, I moved the virtual machine to an NFS store on the same pool.
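
For reference, a zvol like this can be created from a raw shell roughly as follows. This is a minimal sketch, not necessarily the exact command used here: NexentaStor normally handles this through NMC or the web interface, and the name vol1/filesystem is assumed from the destroy command quoted in this post. The script only prints the command instead of running it.

```shell
#!/bin/sh
# Assumption: the zvol name is taken from the "zfs destroy" command in the post;
# the exact creation options used on the real system are not known.
ZVOL="vol1/filesystem"
SIZE="750G"

# -s         create a sparse volume (no space reservation)
# -V SIZE    create a volume (zvol) of the given size instead of a filesystem
# -o k=v     set a property at creation time (dedup and compression here)
CREATE_CMD="zfs create -s -V $SIZE -o dedup=on -o compression=on $ZVOL"

# Dry run: print the command. Drop the echo to run it on a real system.
echo "$CREATE_CMD"
```

Because dedup is a per-dataset property, every block written to this zvol while dedup=on is tracked in the pool-wide dedup table, even after the VM is later moved elsewhere on the same pool.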

Once my VMs were up and running on the NFS datastore, I decided to remove the original 750GB zvol. Doing so stalled the system. Access to the Nexenta web interface and NMC halted. I was eventually able to get to a raw shell. Most OS operations were fine, but the system was hanging on the zfs destroy -r vol1/filesystem command. Ugly. I found the following two OpenSolaris Bugzilla entries and now understand that the machine will be unusable for an unknown period of time. It's been 14 hours, so I need a plan to regain access to the server.

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6924390

and

http://bugs.opensolaris.org/bugdatabase/view_bug.do;jsessionid=593704962bcbe0743d82aa339988?bug_id=6924824

In the future, I'll probably take the advice given in one of the Bugzilla workarounds:

Workaround
    Do not use dedupe, and do not attempt to destroy zvols that had dedupe enabled.

Update: I had to force the system to power off. Upon reboot, the system stalls at "Importing zfs filesystems". It's been that way for 2 hours now.

© Server Fault or respective owner