Anatomy of a .NET Assembly - Signature encodings
Posted by Simon Cooper on Simple Talk
Published on Fri, 27 May 2011 11:31:00 GMT
If you've just joined this series, I highly recommend you read the previous posts, starting here, or at least these posts, covering the CLR metadata tables.
Before we look at custom attribute encoding, we first need to have a brief look at how signatures are encoded in an assembly in general.
Signature types
There are several types of signatures in an assembly, all of which share a common base representation, and are all stored as binary blobs in the #Blob heap, referenced by an offset from various metadata tables.
The types of signatures are:
- Method definition and method reference signatures.
- Field signatures
- Property signatures
- Method local variables. These are referenced from the StandAloneSig table, which is then referenced by method body headers.
- Generic type specifications. These represent a particular instantiation of a generic type.
- Generic method specifications. Similarly, these represent a particular instantiation of a generic method.
Representing a type
All metadata signatures are based around the ELEMENT_TYPE structure. This assigns a number to each 'built-in' type in the framework; for example, UInt16 is 0x07, String is 0x0e, and Object is 0x1c. Byte codes are also used to indicate SzArrays, multi-dimensional arrays, custom types, and generic type and method variables. However, these require some further information.
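As a rough illustration, a subset of the ELEMENT_TYPE values can be written down as a lookup table. The sketch below uses Python purely for convenience; the byte values are those listed in the CLI specification (ECMA-335), while the names ELEMENT_TYPE, ELEMENT_NAME, STANDALONE and needs_more_data are made up for this example.

```python
# Subset of the ELEMENT_TYPE values from the CLI specification (ECMA-335).
ELEMENT_TYPE = {
    "VOID": 0x01, "BOOLEAN": 0x02, "CHAR": 0x03,
    "I1": 0x04, "U1": 0x05, "I2": 0x06, "U2": 0x07,
    "I4": 0x08, "U4": 0x09, "I8": 0x0A, "U8": 0x0B,
    "R4": 0x0C, "R8": 0x0D, "STRING": 0x0E,
    "BYREF": 0x10, "VALUETYPE": 0x11, "CLASS": 0x12,
    "VAR": 0x13, "ARRAY": 0x14, "GENERICINST": 0x15,
    "I": 0x18, "U": 0x19,
    "OBJECT": 0x1C, "SZARRAY": 0x1D, "MVAR": 0x1E,
}

# Reverse lookup: byte value -> name.
ELEMENT_NAME = {value: name for name, value in ELEMENT_TYPE.items()}

# Element types in this table that are complete on their own.
STANDALONE = frozenset(range(0x01, 0x0F)) | {0x18, 0x19, 0x1C}


def needs_more_data(code):
    """Return True if the element type must be followed by further data."""
    if code not in ELEMENT_NAME:
        raise KeyError("element type not in this (partial) table: 0x%02x" % code)
    return code not in STANDALONE
```

For instance, needs_more_data(ELEMENT_TYPE["SZARRAY"]) is true because the array item type has to follow, whereas I4 stands alone.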
Firstly, custom types (i.e. not one of the built-in types). These require you to specify the TypeDefOrRef coded token after the CLASS (0x12) or VALUETYPE (0x11) element type. The coded token is a 4-byte value, but it is stored in a compressed format when written out to disk (for more excruciating details, you can refer to the CLI specification).
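To give a feel for the compression, here is a minimal Python sketch of the scheme as described in the CLI specification (ECMA-335, Partition II): the row index is shifted left by two bits, a 2-bit tag identifying the table is placed in the low bits, and the result is written as a 1-, 2- or 4-byte big-endian compressed integer. The function and variable names are invented for this example.

```python
def compress_uint(value):
    """Compress an unsigned integer into 1, 2 or 4 bytes (big-endian)."""
    if value < 0:
        raise ValueError("value must be non-negative")
    if value <= 0x7F:
        return bytes([value])                          # 0xxxxxxx
    if value <= 0x3FFF:
        return bytes([0x80 | (value >> 8), value & 0xFF])  # 10xxxxxx xxxxxxxx
    if value <= 0x1FFFFFFF:
        return (0xC0000000 | value).to_bytes(4, "big")     # 110xxxxx + 3 bytes
    raise ValueError("value too large to compress")


# Metadata table id (top byte of a token) -> 2-bit tag used in signatures.
TABLE_TAG = {
    0x02: 0,  # TypeDef
    0x01: 1,  # TypeRef
    0x1B: 2,  # TypeSpec
}


def encode_type_token(token):
    """Encode a 4-byte metadata token as it appears after CLASS or VALUETYPE."""
    table = token >> 24
    row = token & 0x00FFFFFF
    return compress_uint((row << 2) | TABLE_TAG[table])
```

With this sketch, the TypeRef token 0x01000012 becomes the single byte 0x49, so most references to custom types cost one or two bytes rather than four.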
SzArrays simply have the array item type after the SZARRAY byte (0x1d). Multidimensional arrays follow the ARRAY element type (0x14) with the array item type, then a series of compressed integers indicating the number of dimensions, and the sizes and lower bounds of the dimensions.
Generic variables are simply followed by the index of the generic variable they refer to.
There are other additions as well; for example, a specific byte value indicates a method parameter passed by reference (BYREF), and other values indicate custom modifiers.
Some examples...
To demonstrate, here are a few examples and what the resulting blobs in the #Blob heap will look like. Each name in capitals corresponds to a particular byte value in the ELEMENT_TYPE or CALLCONV structure, and coded tokens to custom types are represented by the type name in curly brackets.
- A simple field:
  int intField;
  FIELD I4
- A field of an array of a generic type parameter (assuming T is the first generic parameter of the containing type):
  T[] genArrayField;
  FIELD SZARRAY VAR 0
- An instance method signature (note how the number of parameters does not include the return type):
  instance string MyMethod(MyType, int&, bool[][]);
  HASTHIS DEFAULT 3 STRING CLASS {MyType} BYREF I4 SZARRAY SZARRAY BOOLEAN
- A generic type instantiation:
  MyGenericType<MyType, MyStruct>
  GENERICINST CLASS {MyGenericType} 2 CLASS {MyType} VALUETYPE {MyStruct}
- For a more complicated example, in the following C# type declaration:
  GenericType<T> : GenericBaseType<object[], T, GenericType<T>> { ... }
  the Extends field of the TypeDef for GenericType will point to a TypeSpec with the following blob:
  GENERICINST CLASS {GenericBaseType} 3 SZARRAY OBJECT VAR 0 GENERICINST CLASS {GenericType} 1 VAR 0
- And a static generic method signature (generic parameters on types are referenced using VAR, generic parameters on methods using MVAR):
  TResult[] GenericMethod<TInput, TResult>(TInput, System.Converter<TInput, TResult>);
  GENERIC 2 2 SZARRAY MVAR 1 MVAR 0 GENERICINST CLASS {System.Converter} 2 MVAR 0 MVAR 1
As you can see, complicated signatures are recursively built up out of quite simple building blocks to represent all the possible variations in a .NET assembly.
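The recursive structure is perhaps easiest to see in a small decoder. The Python sketch below turns raw blob bytes back into the capitalised notation used in the examples above. It is an illustration under several assumptions rather than a complete parser: it handles only a few element types, ignores custom modifiers and ARRAY, treats BYREF as a simple prefix, and takes a hypothetical names dictionary mapping coded token values to type names (a real tool would resolve these against the metadata tables).

```python
SIMPLE = {0x01: "VOID", 0x02: "BOOLEAN", 0x08: "I4", 0x0E: "STRING", 0x1C: "OBJECT"}
PREFIX = {0x10: "BYREF", 0x1D: "SZARRAY"}   # followed by another type
TOKEN = {0x11: "VALUETYPE", 0x12: "CLASS"}  # followed by a coded token
GENVAR = {0x13: "VAR", 0x1E: "MVAR"}        # followed by an index


def read_uint(blob, pos):
    """Read a compressed unsigned integer; return (value, new position)."""
    first = blob[pos]
    if first & 0x80 == 0:
        return first, pos + 1
    if first & 0xC0 == 0x80:
        return ((first & 0x3F) << 8) | blob[pos + 1], pos + 2
    return int.from_bytes(blob[pos:pos + 4], "big") & 0x1FFFFFFF, pos + 4


def read_type(blob, pos, names, out):
    """Append the description of one type to out; return the new position."""
    code = blob[pos]
    pos += 1
    if code in SIMPLE:
        out.append(SIMPLE[code])
    elif code in PREFIX:
        out.append(PREFIX[code])
        pos = read_type(blob, pos, names, out)
    elif code in TOKEN:
        coded, pos = read_uint(blob, pos)
        out += [TOKEN[code], "{%s}" % names.get(coded, hex(coded))]
    elif code in GENVAR:
        index, pos = read_uint(blob, pos)
        out += [GENVAR[code], str(index)]
    elif code == 0x15:  # GENERICINST: generic type, argument count, arguments
        out.append("GENERICINST")
        pos = read_type(blob, pos, names, out)
        count, pos = read_uint(blob, pos)
        out.append(str(count))
        for _ in range(count):
            pos = read_type(blob, pos, names, out)
    else:
        raise ValueError("element type not handled in this sketch: 0x%02x" % code)
    return pos


def describe_field(blob, names=None):
    """Describe a field signature blob (leading byte FIELD = 0x06)."""
    if blob[0] != 0x06:
        raise ValueError("not a field signature")
    out = ["FIELD"]
    read_type(blob, 1, names or {}, out)
    return " ".join(out)


def describe_method(blob, names=None):
    """Describe a method signature blob: flags, counts, return type, params."""
    names = names or {}
    flags, pos, out = blob[0], 1, []
    if flags & 0x20:
        out.append("HASTHIS")
    if flags & 0x10:  # GENERIC is followed by the generic parameter count
        generic_count, pos = read_uint(blob, pos)
        out += ["GENERIC", str(generic_count)]
    else:
        out.append("DEFAULT")
    param_count, pos = read_uint(blob, pos)
    out.append(str(param_count))
    for _ in range(param_count + 1):  # return type first, then each parameter
        pos = read_type(blob, pos, names, out)
    return " ".join(out)
```

Feeding it the bytes 06 1d 13 00 gives back the text FIELD SZARRAY VAR 0, matching the generic array field example. Every nested type is handled by the same read_type function calling itself, which is the recursion the examples rely on.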
Now that we've looked at the basics of normal method signatures, in my next post I'll look at custom attribute application signatures, and how they are different to normal signatures.
© Simple Talk or respective owner