Windows NT Service shutdown issues
- by Jeremiah Gowdy
I have developed middleware that provides RPC functionality to multiple client applications on multiple platforms within our organization. The middleware is written in C# and runs as a Windows NT Service. It handles things like file access to network shares, database access, etc. The middleware is hosted on two high end systems running Windows Server 2008 R2.
When one of our server administrators goes to reboot the machine, primarily to do Windows Updates, there are serious problems with how the system behaves in regards to my NT Service. My service is designed to immediately stop listening for new connections, immediately start refusing new requests on existing connections, and otherwise shut down as rapidly as possible in the case of an OnStop or OnShutdown request from the SCM. Still, to maintain system integrity, operations that are currently in progress are allowed to continue for a reasonable time. Usually the server shuts down inside of 30 seconds (when the service is manually stopped for example). However, when the system is instructed to restart, my service immediately loses access to network drives and UNC paths, causing data integrity problems for any open files and partial writes to those locations. My service does list Workstation (and thus SMB Redirector) as a dependency, so I would think that my service would need to be stopped prior to Workstation/Redirector being stopped if Windows were honoring those dependencies.
Basically, my application is forced to crash and burn, failing remote procedure calls and eventually being forced to terminate by the operating system after a timeout period has elapsed (seems to be on the order of 20-30 seconds).
Unlike a Windows application, my Windows NT Service doesn't seem to have any power to stop a system shutdown in progress, delay the system shutdown, or even just the opportunity to save out any pending network share disk writes before being forcibly disconnected and shutdown. How is an NT Service developer supposed to have any kind of application integrity in this environment? Why is it that Forms Applications get all of the opportunity to finish their business prior to shutdown, while services seem to get no such benefits?
I have tried:
Calling SetProcessShutdownParameters via p/invoke to try to notify my application of the shutdown sooner to avoid Redirector shutting down before I do.
Calling ServiceBase.RequestAdditionalTime with a value less than or equal to the two minute limit.
Tweaking the WaitToKillServiceTimeout
Everything I can think of to make my service shutdown faster.
But in the end, I still get ~30 seconds of problematic time in which my service doesn't even seem to have been notified of an OnShutdown event yet, but requests are failing due to redirector no longer servicing my network share requests.
How is this issue meant to be resolved? What can I do to delay or stop the shutdown, or at least be allowed to shut down my active tasks without Redirector services disappearing out from under me? I can understand what Microsoft is trying to do to prevent services from dragging their feet and showing shutdowns, but that seems like a great goal for Windows client operating systems, not for servers. I don't want my servers to shutdown fast, I want operational integrity and graceful shutdowns.
Thanks in advance for any help you can provide.
PS in regards to writing my own middleware, this is for a telephony application with sub-second "soft-realtime" response time requirements. It does make sense, and it's not a point I'm looking to debate. :)