Subterranean IL: The ThreadLocal type
Posted
by Simon Cooper
on Simple Talk
Published on Thu, 03 May 2012 11:11:00 GMT
Indexed on
2012/05/30
16:56 UTC
I came across ThreadLocal<T>
while I was researching ConcurrentBag
. At first glance, it doesn't make much sense. What are all those extra Cn
classes doing in there? Why is there a GenericHolder<T,U,V,W>
class? What's going on? However, digging deeper, it's a rather ingenious solution to a tricky problem.
Thread statics
Declaring that a variable is thread static, that is, that values assigned to and read from the field are specific to the accessing thread, is quite easy in .NET:
[ThreadStatic] private static string s_ThreadStaticField;
ThreadStaticAttribute
is not a pseudo-custom attribute; it is compiled as a normal attribute, but the CLR has in-built magic, activated by that attribute, to redirect accesses to the field based on the executing thread's identity.
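To see that redirection in action, here is a minimal sketch. The class, field and method names (ThreadStaticDemo, s_name, Run) are made up for illustration. The calling thread assigns the field, and a second thread that never assigned it reads its own, still-null copy.

```csharp
using System;
using System.Threading;

public static class ThreadStaticDemo
{
    // Each thread sees its own copy of this field.
    [ThreadStatic] private static string s_name;

    // Assigns the field on the calling thread, then reads it from a worker thread.
    // Returns { value seen by the caller, value seen by the worker }.
    public static string[] Run()
    {
        s_name = "caller";

        string seenByWorker = "unset";
        var worker = new Thread(() => seenByWorker = s_name);
        worker.Start();
        worker.Join();

        return new[] { s_name, seenByWorker };
    }
}
```

The worker never assigned the field, so it reads the default value (null) rather than the caller's string.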
ThreadStaticAttribute
provides a simple solution when you want to use a single field as thread-static. But what if you want to create an arbitrary number of thread-static variables at runtime? Thread-static fields can only be declared, and are fixed, at compile time. Prior to .NET 4, the only option was thread-local data slots. This is a lesser-known function of Thread
that has existed since .NET 1.1:
LocalDataStoreSlot threadSlot = Thread.AllocateNamedDataSlot("slot1");
string value = "foo";
Thread.SetData(threadSlot, value);
string retrievedValue = (string)Thread.GetData(threadSlot);
Each instance of LocalDataStoreSlot
mediates access to a single slot, and each slot acts like a separate thread-static field.
As you can see, using thread data slots is quite cumbersome. You need to keep track of LocalDataStoreSlot
objects, it's not obvious how instances of LocalDataStoreSlot
correspond to individual thread-static variables, and it's not type safe. It's also relatively slow and complicated; the internal implementation consists of a whole series of classes hanging off a single thread-static field in Thread
itself, using various arrays, lists, and locks for synchronization. ThreadLocal<T>
is far simpler and easier to use.
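For comparison, here is a small sketch of the same idea with ThreadLocal<T>. The class and method names (ThreadLocalDemo, CountPerThread) are invented for this example. The value is strongly typed, needs no casts, and each thread increments its own copy.

```csharp
using System;
using System.Threading;

public static class ThreadLocalDemo
{
    // Two threads increment the same ThreadLocal<int> a different number of times.
    // Returns { count seen by thread 1, count seen by thread 2, count seen by the caller }.
    public static int[] CountPerThread()
    {
        // The factory delegate supplies the initial value on each thread.
        using (var counter = new ThreadLocal<int>(() => 0))
        {
            int first = 0, second = 0;
            var t1 = new Thread(() => { for (int i = 0; i < 3; i++) counter.Value++; first = counter.Value; });
            var t2 = new Thread(() => { for (int i = 0; i < 5; i++) counter.Value++; second = counter.Value; });
            t1.Start(); t2.Start();
            t1.Join(); t2.Join();

            // The calling thread never incremented its copy.
            return new[] { first, second, counter.Value };
        }
    }
}
```

The using block disposes the instance once both threads have finished with it.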
ThreadLocal
ThreadLocal
provides an abstraction around thread-static fields that can be used just like any other class: it can replace a thread-static field, it can be stored in a List<ThreadLocal<T>>
, and you can create as many as you need at runtime. So what does it do? It can't just have an instance-specific thread-static field, because thread-static fields have to be declared as static
, and so are shared between all instances of the declaring type. There's something else going on here.
The values stored in instances of ThreadLocal<T>
are stored in instantiations of the GenericHolder<T,U,V,W>
class, which contains a single ThreadStatic
field (s_value
) to store the actual value. This class is then instantiated with various combinations of the Cn
types for generic arguments.
In .NET, each separate instantiation of a generic type has its own static state. For example, GenericHolder<int,C0,C1,C2>
has a completely separate s_value
field to GenericHolder<int,C1,C14,C1>
. This feature is (ab)used by ThreadLocal
to emulate instance thread-static fields.
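The per-instantiation static state is easy to check for yourself. In this sketch, Holder<T,U>, MarkerA and MarkerB are hypothetical stand-ins for GenericHolder and the Cn types, not the framework's own classes.

```csharp
using System;

// Empty marker types, used only to produce distinct generic instantiations.
public sealed class MarkerA { }
public sealed class MarkerB { }

public static class Holder<T, U>
{
    // One thread-static field per closed generic type.
    [ThreadStatic] public static T Value;
}

public static class GenericStaticsDemo
{
    // Writes to two different instantiations and reads both back.
    public static int[] Run()
    {
        Holder<int, MarkerA>.Value = 1;
        Holder<int, MarkerB>.Value = 2; // does not overwrite Holder<int, MarkerA>.Value
        return new[] { Holder<int, MarkerA>.Value, Holder<int, MarkerB>.Value };
    }
}
```

Both values survive because Holder<int, MarkerA> and Holder<int, MarkerB> are different closed types, each with its own field.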
Every time an instance of ThreadLocal
is constructed, it is assigned a unique number from the static s_currentTypeId
field using Interlocked.Increment
, in the FindNextTypeIndex
method. The hexadecimal representation of that number then defines the specific Cn
types that instantiate the GenericHolder
class. That instantiation is therefore 'owned' by that instance of ThreadLocal
.
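The mapping from a number to a type instantiation could look something like the sketch below. It is an illustration only: SketchHolder, TypeIndexSketch, the use of MakeGenericType and the most-significant-digit-first ordering are all assumptions, and the marker classes here are local stand-ins for the real Cn types.

```csharp
using System;

// Sixteen empty marker types, one per hexadecimal digit (stand-ins for Cn).
public class C0 { }  public class C1 { }  public class C2 { }  public class C3 { }
public class C4 { }  public class C5 { }  public class C6 { }  public class C7 { }
public class C8 { }  public class C9 { }  public class C10 { } public class C11 { }
public class C12 { } public class C13 { } public class C14 { } public class C15 { }

public static class SketchHolder<T, U, V, W>
{
    [ThreadStatic] public static T s_value;
}

public static class TypeIndexSketch
{
    private static readonly Type[] s_markers =
    {
        typeof(C0),  typeof(C1),  typeof(C2),  typeof(C3),
        typeof(C4),  typeof(C5),  typeof(C6),  typeof(C7),
        typeof(C8),  typeof(C9),  typeof(C10), typeof(C11),
        typeof(C12), typeof(C13), typeof(C14), typeof(C15)
    };

    // Splits an index in [0, 4095] into three hex digits, most significant first.
    public static int[] HexDigits(int index)
    {
        if (index < 0 || index >= 16 * 16 * 16)
            throw new ArgumentOutOfRangeException("index");
        return new[] { (index >> 8) & 0xF, (index >> 4) & 0xF, index & 0xF };
    }

    // Builds the closed holder type that the given index selects for a value type.
    public static Type HolderTypeFor(Type valueType, int index)
    {
        int[] d = HexDigits(index);
        return typeof(SketchHolder<,,,>).MakeGenericType(
            valueType, s_markers[d[0]], s_markers[d[1]], s_markers[d[2]]);
    }
}
```

With this ordering, index 0x1E1 selects SketchHolder<int, C1, C14, C1> for an int value, and every index in range selects a different closed type.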
This gives each instance of ThreadLocal
its own ThreadStatic
field through a specific unique instantiation of the GenericHolder
class. Although GenericHolder
has four type variables, the first one is always instantiated to the type stored in the ThreadLocal<T>
. This gives three free type variables, each of which can be instantiated to one of 16 types (C0
to C15
). This puts an upper limit of 4096 (16^3) on the number of ThreadLocal<T>
instances that can be created for each value of T. That is, there can be a maximum of 4096 instances of ThreadLocal<string>
, and separately a maximum of 4096 instances of ThreadLocal<object>
, etc.
However, there is an upper limit of 16384 enforced on the total number of ThreadLocal
instances in the AppDomain. This is to stop too much memory being used by thousands of instantiations of GenericHolder<T,U,V,W>
, as once a type is loaded into an AppDomain it cannot be unloaded, and will continue to sit there taking up memory until the AppDomain is unloaded. The total number of ThreadLocal
instances created is tracked by the ThreadLocalGlobalCounter
class.
So what happens when either limit is reached? Firstly, to make the limits harder to reach, ThreadLocal recycles the GenericHolder
type indexes of ThreadLocal
instances that get disposed using the s_availableIndices
concurrent stack. This allows GenericHolder
instantiations of disposed ThreadLocal
instances to be re-used. But if there aren't any available instantiations, then ThreadLocal
falls back on a standard thread local slot using TLSHolder
. This makes it very important to dispose of your ThreadLocal
instances if you'll be using lots of them, so the type instantiations can be recycled.
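The recycle-then-fall-back pattern can be sketched roughly as follows. IndexAllocatorSketch and its members are hypothetical names, and returning -1 to signal "use the slower fallback" is a convention chosen for this example, not something taken from the framework source.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class IndexAllocatorSketch
{
    private readonly int _limit;
    private int _next = -1;
    private readonly ConcurrentStack<int> _available = new ConcurrentStack<int>();

    public IndexAllocatorSketch(int limit) { _limit = limit; }

    // Prefers a recycled index, then a fresh one.
    // Returns -1 when none is left, meaning the caller must fall back to slower storage.
    public int Acquire()
    {
        int index;
        if (_available.TryPop(out index)) return index;

        index = Interlocked.Increment(ref _next);
        return index < _limit ? index : -1;
    }

    // Called on dispose: hands the index back so a later instance can reuse it.
    public void Release(int index)
    {
        if (index >= 0) _available.Push(index);
    }
}
```

If owners never release their indexes, every request past the limit gets the fallback, which is the situation that disposing avoids.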
The previous way of creating arbitrary thread-static variables, thread data slots, was slow, clunky, and hard to use. In comparison, ThreadLocal
can be used just like any other type, and each instance appears from the outside to be a non-static thread-static variable. It does this by using the CLR type system to assign each instance of ThreadLocal
its own instantiated type containing a thread-static field, and so delegating a lot of the bookkeeping that thread data slots had to do to the CLR type system itself! That's a very clever use of the CLR type system.
© Simple Talk or respective owner