Problem with WCF-SQL Adapter
- by Paul Petrov
When using WCF receive adapter with SQL binding in Polling mode please be aware of the following problem.
Problem:
At some regular but seemingly random intervals the application stops processing new requests, places a lock on the database and prevent other application from accessing it. Initially it looked like DTC issue, as it was distributed transaction that stalled most of the time.
Symptoms:
Orchestration instances in Dehydrated state, receive location not picking up new messages, exclusive locks on database tables, errors in DTC trace.
Cause:
Microsoft has confirmed that there is a bug in the WCF-SQL adapter. In the receive adapter binding configuration there's receiveTimeout property set to 10 minutes by default. If during this period data is not found in the table the adapter would start new thread and allocate more memory without releasing old resources. Thus if there's no new data in the table for a long time a new thread will be created in the host instance every 10 minutes until it reaches threshold (1000) and then there's no threads left for this host instance and it can't start/complete any tasks. Then this host instance won't be able to do anything. If other artifacts are hosted in the instance they will suffer consequences as well.
Solution:
- Set receiveTimeout to the maximum time 24.20:31:23.6470000.
- Place WCF-SQL receive locations in separate host to provide its own thread pool and eliminate impact on other processes
- Ensure WCF-SQL dedicated host instances are restarted at interval less or equal to receiveTimeout to flush threads and memory
- Monitor performance counters Process/Thread Count/BTSNTSvc{n} for thread count trend and respond to alert if it grows by restarting host instance
If you use WCF-SQL Adapter in the Notification mode then make sure to remove sqlAdapterInboundTransactionBehavior otherwise this location will exhibit the same issue. In this case though, setting receiveTimeout doesn't help and new thread will be created at default intervals (10 min) ignoring maximum setting.