PostSharp, Obfuscation, and IL
- by Simon Cooper
Aspect-oriented programming (AOP) is a relatively new programming paradigm. Originating at Xerox PARC in 1994, the paradigm was first made available for general-purpose development as an extension to Java in 2001. Since then, it has quickly been adapted for use in most of the common languages used today. In the .NET world, one of the primary AOP toolkits is PostSharp.
Attributes and AOP
Normally, attributes in .NET are entirely a metadata construct. Apart from a few special attributes in the .NET framework, they have no effect whatsoever on how a class or method executes within the CLR. Only by using reflection at runtime can you access any attributes declared on a type or type member.
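To make that concrete, here is a minimal sketch of an ordinary attribute being read back through reflection. AuditedAttribute and Worker are hypothetical names invented for this illustration; they are not part of the .NET framework or PostSharp.

```csharp
using System;
using System.Reflection;

// Hypothetical marker attribute: on its own it changes nothing about how Run() executes.
[AttributeUsage(AttributeTargets.Method)]
public sealed class AuditedAttribute : Attribute
{
    public string Reason { get; }
    public AuditedAttribute(string reason) { Reason = reason; }
}

public class Worker
{
    [Audited("demo")]
    public int Run() { return 42; }
}

public static class Program
{
    public static void Main()
    {
        // The method body runs exactly as written; the attribute is ignored by the CLR.
        Console.WriteLine(new Worker().Run());

        // The attribute only becomes visible when we explicitly ask for it via reflection.
        MethodInfo run = typeof(Worker).GetMethod("Run");
        var audited = (AuditedAttribute)Attribute.GetCustomAttribute(run, typeof(AuditedAttribute));
        Console.WriteLine(audited.Reason);
    }
}
```

Nothing in Run() ever sees the attribute; the only way to reach it is the explicit reflection call.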
PostSharp changes this. By declaring a custom attribute that derives from PostSharp.Aspects.Aspect, applying it to types and type members, and running the resulting assembly through the PostSharp postprocessor, you can essentially declare 'clever' attributes that change the runtime behaviour of whatever the aspect has been applied to.
A simple example of this is logging. By declaring a TraceAttribute that derives from OnMethodBoundaryAspect, you can automatically log when a method has been executed:
using System;
using System.Reflection;
using PostSharp.Aspects;

// PostSharp serializes aspect instances at build time, so aspects are marked [Serializable]
[Serializable]
public class TraceAttribute : OnMethodBoundaryAspect
{
    // Called before the body of the target method runs
    public override void OnEntry(MethodExecutionArgs args)
    {
        MethodBase method = args.Method;
        System.Diagnostics.Trace.WriteLine(
            String.Format(
                "Entering {0}.{1}.",
                method.DeclaringType.FullName,
                method.Name));
    }

    // Called when the target method exits
    public override void OnExit(MethodExecutionArgs args)
    {
        MethodBase method = args.Method;
        System.Diagnostics.Trace.WriteLine(
            String.Format(
                "Leaving {0}.{1}.",
                method.DeclaringType.FullName,
                method.Name));
    }
}
[Trace]
public void MethodToLog() { ... }
Now, whenever MethodToLog is executed, the aspect will automatically log entry and exit, without having to add the logging code to MethodToLog itself.
PostSharp Performance
This does introduce a performance overhead. As you can see, the aspect has access to the MethodBase of the method it has been applied to. If you were limited to C#, you would be forced to retrieve each MethodBase instance using Type.GetMethod(), matching on the method name and signature. This is slow. Fortunately, PostSharp is not limited to C#. It can use any instruction available in IL. And in IL, you can do some very neat things.
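For reference, the C#-only lookup described above looks roughly like this. It is a small sketch using String.EndsWith(string) as the target; the name is matched as a string and the signature as an array of parameter types.

```csharp
using System;
using System.Reflection;

public static class Program
{
    public static void Main()
    {
        // Look up String.EndsWith(string) by name and parameter types.
        // The runtime has to search the type's methods for a matching name and signature.
        MethodInfo endsWith = typeof(string).GetMethod("EndsWith", new[] { typeof(string) });

        Console.WriteLine(endsWith.DeclaringType.FullName + "." + endsWith.Name);

        // The resulting MethodInfo can be invoked like any other reflected method.
        bool result = (bool)endsWith.Invoke("PostSharp", new object[] { "Sharp" });
        Console.WriteLine(result);
    }
}
```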
Ldtoken
C# allows you to get the Type object corresponding to a specific type name using the typeof operator:
Type t = typeof(Random);
The C# compiler compiles this operator to the following IL:
ldtoken [mscorlib]System.Random
call class [mscorlib]System.Type
    [mscorlib]System.Type::GetTypeFromHandle(
        valuetype [mscorlib]System.RuntimeTypeHandle)
The ldtoken instruction obtains a special handle to a type called a RuntimeTypeHandle, and from that, the Type object can be obtained using GetTypeFromHandle. These are both relatively fast operations - no string lookup is required, only direct assembly and CLR constructs are used.
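The same two steps can be spelled out by hand in C#. This is only an illustration of the handle round trip; in real code you would simply write typeof(Random).

```csharp
using System;

public static class Program
{
    public static void Main()
    {
        // Step 1: the handle that ldtoken would push onto the stack.
        RuntimeTypeHandle handle = typeof(Random).TypeHandle;

        // Step 2: the call the compiler emits to turn the handle into a Type.
        Type fromHandle = Type.GetTypeFromHandle(handle);

        Console.WriteLine(fromHandle.FullName);
        Console.WriteLine(fromHandle == typeof(Random));
    }
}
```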
However, a little-known feature is that ldtoken is not just limited to types; it can also get information on methods and fields, encapsulated in a RuntimeMethodHandle or RuntimeFieldHandle:
// get a MethodBase for String.EndsWith(string)
ldtoken method instance bool [mscorlib]System.String::EndsWith(string)
call class [mscorlib]System.Reflection.MethodBase
    [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle(
        valuetype [mscorlib]System.RuntimeMethodHandle)

// get a FieldInfo for the String.Empty field
ldtoken field string [mscorlib]System.String::Empty
call class [mscorlib]System.Reflection.FieldInfo
    [mscorlib]System.Reflection.FieldInfo::GetFieldFromHandle(
        valuetype [mscorlib]System.RuntimeFieldHandle)
These usages of ldtoken aren't usable from C# or VB, and aren't likely to be added anytime soon (Eric Lippert's done a blog post on the possibility of adding infoof, methodof or fieldof operators to C#). However, PostSharp deals directly with IL, and so can use ldtoken to get MethodBase objects quickly and cheaply, without having to resort to string lookups.
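Although C# cannot emit ldtoken for a method or field, the handle types and the GetMethodFromHandle / GetFieldFromHandle calls are ordinary public APIs. The sketch below shows the second half of each IL sequence. Note the catch: from C#, the only way to get the handle in the first place is through a reflection lookup, which is exactly the cost ldtoken avoids.

```csharp
using System;
using System.Reflection;

public static class Program
{
    public static void Main()
    {
        // In IL, ldtoken would produce this handle directly.
        // In C#, we have to go through a string-based lookup to obtain it.
        MethodInfo endsWith = typeof(string).GetMethod("EndsWith", new[] { typeof(string) });
        RuntimeMethodHandle methodHandle = endsWith.MethodHandle;

        // Same call as in the IL above: handle in, MethodBase out.
        MethodBase method = MethodBase.GetMethodFromHandle(methodHandle);
        Console.WriteLine(method.Name);

        // The field equivalent, using String.Empty.
        RuntimeFieldHandle fieldHandle = typeof(string).GetField("Empty").FieldHandle;
        FieldInfo field = FieldInfo.GetFieldFromHandle(fieldHandle);
        Console.WriteLine(field.Name);
    }
}
```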
The kicker
However, there are problems. Because ldtoken for methods or fields isn't accessible from C# or VB, it hasn't been as well-tested as ldtoken for types. This has resulted in various obscure bugs in most versions of the CLR when dealing with ldtoken and methods, and specifically, generic methods and methods of generic types. This means that PostSharp was behaving incorrectly, or just plain crashing, when aspects were applied to methods that were generic in some way.
So, PostSharp has to work around this. Without using the metadata tokens directly, the only way to get the MethodBase of generic methods is to use reflection: Type.GetMethod(), passing in the method name as a string along with information on the signature.
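As a rough idea of what such a reflection lookup involves, here is a sketch for a generic method. ContainingClass and DoFoo are hypothetical names used only for illustration, and this is not the code PostSharp actually generates.

```csharp
using System;
using System.Reflection;

public class ContainingClass
{
    // Hypothetical generic method that an aspect might be applied to.
    public T DoFoo<T>(T value) { return value; }
}

public static class Program
{
    public static void Main()
    {
        // The lookup is driven entirely by the method name, stored as a string.
        MethodInfo open = typeof(ContainingClass).GetMethod("DoFoo");
        Console.WriteLine(open.IsGenericMethodDefinition);

        // To invoke it, the generic definition is closed over a concrete type argument.
        MethodInfo closed = open.MakeGenericMethod(typeof(int));
        object result = closed.Invoke(new ContainingClass(), new object[] { 42 });
        Console.WriteLine(result);
    }
}
```

The string "DoFoo" is the weak point: it is just data, so nothing ties it to the method if the method is later renamed.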
Now, this works fine. It's slower than using ldtoken directly, but it works, and this only has to be done for generic methods. Unfortunately, this poses problems when the assembly is obfuscated.
PostSharp and Obfuscation
When using ldtoken, obfuscators don't affect how PostSharp operates. Because the ldtoken instruction directly references the type, method or field within the assembly, it is unaffected if the name of the object is changed by an obfuscator. However, the indirect loading used for generic methods was breaking. The PostSharp postprocessor records the name the method has at the time the assembly is processed, and that name is then used to look up the MethodBase at runtime. If the name changes afterwards, PostSharp can't find the method anymore, and the assembly breaks.
So, PostSharp needs to know about any changes an obfuscator makes to an assembly. The way PostSharp does this is by adding another layer of indirection. When PostSharp obfuscation support is enabled, it includes an extra 'name table' resource in the assembly, consisting of a series of method and type names. When PostSharp needs to look up a method using reflection, instead of encoding the method name directly, it looks up the method name at a fixed offset inside that name table:
MethodBase genericMethod = typeof(ContainingClass).GetMethod(GetNameAtIndex(22));
PostSharp.NameTable resource:
...
20: get_Prop1
21: set_Prop1
22: DoFoo
23: GetWibble
When the assembly is later processed by an obfuscator, the obfuscator can replace all the method and type names within the name table with their new names. That way, the reflection lookups performed by PostSharp will use the new names, and everything will work as expected:
MethodBase genericMethod = typeof(#kGy).GetMethod(GetNameAtIndex(22));
PostSharp.NameTable resource:
...
20: #kkA
21: #zAb
22: #EF5a
23: #2tg
As you can see, this requires direct support by an obfuscator in order to perform these rewrites. Dotfuscator supports it, and now, starting with SmartAssembly 6.6.4, SmartAssembly does too.
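To show why the extra indirection helps, here is a toy version of the idea. Everything in it is an assumption made for illustration: the real name table is an embedded resource whose format isn't described here, and NameTableSketch, its array and this GetNameAtIndex are hypothetical stand-ins.

```csharp
using System;
using System.Reflection;

public class ContainingClass
{
    // Hypothetical generic method that needs a reflection-based lookup.
    public T DoFoo<T>(T value) { return value; }
}

public static class NameTableSketch
{
    // Stand-in for the embedded name table resource (indices here are arbitrary).
    private static readonly string[] Names = { "get_Prop1", "set_Prop1", "DoFoo", "GetWibble" };

    // The lookup code only ever stores an index, never the name itself.
    public static string GetNameAtIndex(int index)
    {
        return Names[index];
    }
}

public static class Program
{
    public static void Main()
    {
        // The call site refers to slot 2; whatever name sits in that slot is used.
        MethodBase genericMethod =
            typeof(ContainingClass).GetMethod(NameTableSketch.GetNameAtIndex(2));
        Console.WriteLine(genericMethod.Name);
    }
}
```

If an obfuscator renamed DoFoo, it would only need to rewrite the entry in slot 2; the call site, which holds nothing but the index, would keep working unchanged.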
So, a relatively simple solution to a tricky problem, with some CLR bugs thrown in for good measure. You don't see those every day!