Search Results

Search found 847 results on 34 pages for 'simon carpentier'.

Page 5/34 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Gnome-terminal doesn't close at end of SSH session

- by Simón

I have defined in gnome-terminal that it closes at end of shell. When I press Control-D or I execute exit, the terminal closes. But if I open the SSH session with gnome-terminal -x ssh server and I execute reboot in the SSH session (to reboot the remote server), it hangs and it doesn't close. What's happening? This worked to me before but I have to reinstall my Ubuntu (in local) and now gnome-terminal doesn't close itself when SSH session ends.

Read the article
Inside Red Gate - Be Reasonable!

- by Simon Cooper

As I discussed in my previous posts, divisions and project teams within Red Gate are allowed a lot of autonomy to manage themselves. It's not just the teams though, there's an awful lot of freedom given to individual employees within the company as well. Reasonableness How Red Gate treats it's employees is embodied in the phrase 'You will be reasonable with us, and we will be reasonable with you'. As an employee, you are trusted to do your job to the best of you ability. There's no one looking over your shoulder, no one clocking you in and out each day. Everyone is working at the company because they want to, and one of the core ideas of Red Gate is that the company exists to 'let people do the best work of their lives'. Everything is geared towards that. To help you do your job, office services and the IT department are there. If you need something to help you work better (a third or fourth monitor, footrests, or a new keyboard) then ask people in Information Systems (IS) or Office Services and you will be given it, no questions asked. Everyone has administrator access to their own machines, and you can install whatever you want on it. If there's a particular bit of software you need, then ask IS and they will buy it. As an example, last year I wanted to replace my main hard drive with an SSD; I had a summer job at school working in a computer repair shop, so knew what to do. I went to IS and asked for 'an SSD, a SATA cable, and a screwdriver'. And I got it there and then, even the screwdriver. Awesome. I screwed it in myself, copied all my main drive files across, and I was good to go. Of course, if you're not happy doing that yourself, then IS will sort it all out for you, no problems. If you need something that the company doesn't have (say, a book off Amazon, or you need some specifications printing off & bound), then everyone has a expense limit of £100 that you can use without any sign-off needed from your managers. If you need a company credit card for whatever reason, then you can get it. This freedom extends to working hours and holiday; you're expected to be in the office 11am-3pm each day, but outside those times you can work whenever you want. If you need a half-day holiday on a days notice, or even the same day, then you'll get it, unless there's a good reason you're needed that day. If you need to work from home for a day or so for whatever reason, then you can. If it's reasonable, then it's allowed. Trust issues? A lot of trust, and a lot of leeway, is given to all the people in Red Gate. Everyone is expected to work hard, do their jobs to the best of their ability, and there will be a minimum of bureaucratic obstacles that stop you doing your work. What happens if you abuse this trust? Well, an example is company trip expenses. You're free to expense what you like; food, drink, transport, etc, but if you expenses are not reasonable, then you will never travel with the company again. Simple as that. Everyone knows when they're abusing the system, so simply don't do it. Along with reasonableness, another phrase used is 'Don't be a ***'. If you act like a ***, and abuse any of the trust placed in you, even if you're the best tester, salesperson, dev, or manager in the company, then you won't be a part of the company any more. From what I know about other companies, employee trust is highly variable between companies, all the way up to CCTV trained on employee's monitors. As a dev, I want to produce well-written & useful code that solves people's problems. Being able to get whatever I need - install whatever tools I need, get time off when I need to, obtain reference books within a day - all let me do my job, and so let Red Gate help other people do their own jobs through the tools we produce. Plus, I don't think I would like working for a company that doesn't allow admin access to your own machine and blocks Facebook!

Read the article
Anatomy of a .NET Assembly - PE Headers

- by Simon Cooper

Today, I'll be starting a look at what exactly is inside a .NET assembly - how the metadata and IL is stored, how Windows knows how to load it, and what all those bytes are actually doing. First of all, we need to understand the PE file format. PE files .NET assemblies are built on top of the PE (Portable Executable) file format that is used for all Windows executables and dlls, which itself is built on top of the MSDOS executable file format. The reason for this is that when .NET 1 was released, it wasn't a built-in part of the operating system like it is nowadays. Prior to Windows XP, .NET executables had to load like any other executable, had to execute native code to start the CLR to read & execute the rest of the file. However, starting with Windows XP, the operating system loader knows natively how to deal with .NET assemblies, rendering most of this legacy code & structure unnecessary. It still is part of the spec, and so is part of every .NET assembly. The result of this is that there are a lot of structure values in the assembly that simply aren't meaningful in a .NET assembly, as they refer to features that aren't needed. These are either set to zero or to certain pre-defined values, specified in the CLR spec. There are also several fields that specify the size of other datastructures in the file, which I will generally be glossing over in this initial post. Structure of a PE file Most of a PE file is split up into separate sections; each section stores different types of data. For instance, the .text section stores all the executable code; .rsrc stores unmanaged resources, .debug contains debugging information, and so on. Each section has a section header associated with it; this specifies whether the section is executable, read-only or read/write, whether it can be cached... When an exe or dll is loaded, each section can be mapped into a different location in memory as the OS loader sees fit. In order to reliably address a particular location within a file, most file offsets are specified using a Relative Virtual Address (RVA). This specifies the offset from the start of each section, rather than the offset within the executable file on disk, so the various sections can be moved around in memory without breaking anything. The mapping from RVA to file offset is done using the section headers, which specify the range of RVAs which are valid within that section. For example, if the .rsrc section header specifies that the base RVA is 0x4000, and the section starts at file offset 0xa00, then an RVA of 0x401d (offset 0x1d within the .rsrc section) corresponds to a file offset of 0xa1d. Because each section has its own base RVA, each valid RVA has a one-to-one mapping with a particular file offset. PE headers As I said above, most of the header information isn't relevant to .NET assemblies. To help show what's going on, I've created a diagram identifying all the various parts of the first 512 bytes of a .NET executable assembly. I've highlighted the relevant bytes that I will refer to in this post: Bear in mind that all numbers are stored in the assembly in little-endian format; the hex number 0x0123 will appear as 23 01 in the diagram. The first 64 bytes of every file is the DOS header. This starts with the magic number 'MZ' (0x4D, 0x5A in hex), identifying this file as an executable file of some sort (an .exe or .dll). Most of the rest of this header is zeroed out. The important part of this header is at offset 0x3C - this contains the file offset of the PE signature (0x80). Between the DOS header & PE signature is the DOS stub - this is a stub program that simply prints out 'This program cannot be run in DOS mode.\r\n' to the console. I will be having a closer look at this stub later on. The PE signature starts at offset 0x80, with the magic number 'PE\0\0' (0x50, 0x45, 0x00, 0x00), identifying this file as a PE executable, followed by the PE file header (also known as the COFF header). The relevant field in this header is in the last two bytes, and it specifies whether the file is an executable or a dll; bit 0x2000 is set for a dll. Next up is the PE standard fields, which start with a magic number of 0x010b for x86 and AnyCPU assemblies, and 0x20b for x64 assemblies. Most of the rest of the fields are to do with the CLR loader stub, which I will be covering in a later post. After the PE standard fields comes the NT-specific fields; again, most of these are not relevant for .NET assemblies. The one that is is the highlighted Subsystem field, and specifies if this is a GUI or console app - 0x20 for a GUI app, 0x30 for a console app. Data directories & section headers After the PE and COFF headers come the data directories; each directory specifies the RVA (first 4 bytes) and size (next 4 bytes) of various important parts of the executable. The only relevant ones are the 2nd (Import table), 13th (Import Address table), and 15th (CLI header). The Import and Import Address table are only used by the startup stub, so we will look at those later on. The 15th points to the CLI header, where the CLR-specific metadata begins. After the data directories comes the section headers; one for each section in the file. Each header starts with the section's ASCII name, null-padded to 8 bytes. Again, most of each header is irrelevant, but I've highlighted the base RVA and file offset in each header. In the diagram, you can see the following sections: .text: base RVA 0x2000, file offset 0x200 .rsrc: base RVA 0x4000, file offset 0xa00 .reloc: base RVA 0x6000, file offset 0x1000 The .text section contains all the CLR metadata and code, and so is by far the largest in .NET assemblies. The .rsrc section contains the data you see in the Details page in the right-click file properties page, but is otherwise unused. The .reloc section contains address relocations, which we will look at when we study the CLR startup stub. What about the CLR? As you can see, most of the first 512 bytes of an assembly are largely irrelevant to the CLR, and only a few bytes specify needed things like the bitness (AnyCPU/x86 or x64), whether this is an exe or dll, and the type of app this is. There are some bytes that I haven't covered that affect the layout of the file (eg. the file alignment, which determines where in a file each section can start). These values are pretty much constant in most .NET assemblies, and don't affect the CLR data directly. Conclusion To summarize, the important data in the first 512 bytes of a file is: DOS header. This contains a pointer to the PE signature. DOS stub, which we'll be looking at in a later post. PE signature PE file header (aka COFF header). This specifies whether the file is an exe or a dll. PE standard fields. This specifies whether the file is AnyCPU/32bit or 64bit. PE NT-specific fields. This specifies what type of app this is, if it is an app. Data directories. The 15th entry (at offset 0x168) contains the RVA and size of the CLI header inside the .text section. Section headers. These are used to map between RVA and file offset. The important one is .text, which is where all the CLR data is stored. In my next post, we'll start looking at the metadata used by the CLR directly, which is all inside the .text section.

Read the article
SortedDictionary and SortedList

- by Simon Cooper

Apart from Dictionary<TKey, TValue>, there's two other dictionaries in the BCL - SortedDictionary<TKey, TValue> and SortedList<TKey, TValue>. On the face of it, these two classes do the same thing - provide an IDictionary<TKey, TValue> interface where the iterator returns the items sorted by the key. So what's the difference between them, and when should you use one rather than the other? (as in my previous post, I'll assume you have some basic algorithm & datastructure knowledge) SortedDictionary We'll first cover SortedDictionary. This is implemented as a special sort of binary tree called a red-black tree. Essentially, it's a binary tree that uses various constraints on how the nodes of the tree can be arranged to ensure the tree is always roughly balanced (for more gory algorithmical details, see the wikipedia link above). What I'm concerned about in this post is how the .NET SortedDictionary is actually implemented. In .NET 4, behind the scenes, the actual implementation of the tree is delegated to a SortedSet<KeyValuePair<TKey, TValue>>. One example tree might look like this: Each node in the above tree is stored as a separate SortedSet<T>.Node object (remember, in a SortedDictionary, T is instantiated to KeyValuePair<TKey, TValue>): class Node { public bool IsRed; public T Item; public SortedSet<T>.Node Left; public SortedSet<T>.Node Right; } The SortedSet only stores a reference to the root node; all the data in the tree is accessed by traversing the Left and Right node references until you reach the node you're looking for. Each individual node can be physically stored anywhere in memory; what's important is the relationship between the nodes. This is also why there is no constructor to SortedDictionary or SortedSet that takes an integer representing the capacity; there are no internal arrays that need to be created and resized. This may seen trivial, but it's an important distinction between SortedDictionary and SortedList that I'll cover later on. And that's pretty much it; it's a standard red-black tree. Plenty of webpages and datastructure books cover the algorithms behind the tree itself far better than I could. What's interesting is the comparions between SortedDictionary and SortedList, which I'll cover at the end. As a side point, SortedDictionary has existed in the BCL ever since .NET 2. That means that, all through .NET 2, 3, and 3.5, there has been a bona-fide sorted set class in the BCL (called TreeSet). However, it was internal, so it couldn't be used outside System.dll. Only in .NET 4 was this class exposed as SortedSet. SortedList Whereas SortedDictionary didn't use any backing arrays, SortedList does. It is implemented just as the name suggests; two arrays, one containing the keys, and one the values (I've just used random letters for the values): The items in the keys array are always guarenteed to be stored in sorted order, and the value corresponding to each key is stored in the same index as the key in the values array. In this example, the value for key item 5 is 'z', and for key item 8 is 'm'. Whenever an item is inserted or removed from the SortedList, a binary search is run on the keys array to find the correct index, then all the items in the arrays are shifted to accomodate the new or removed item. For example, if the key 3 was removed, a binary search would be run to find the array index the item was at, then everything above that index would be moved down by one: and then if the key/value pair {7, 'f'} was added, a binary search would be run on the keys to find the index to insert the new item, and everything above that index would be moved up to accomodate the new item: If another item was then added, both arrays would be resized (to a length of 10) before the new item was added to the arrays. As you can see, any insertions or removals in the middle of the list require a proportion of the array contents to be moved; an O(n) operation. However, if the insertion or removal is at the end of the array (ie the largest key), then it's only O(log n); the cost of the binary search to determine it does actually need to be added to the end (excluding the occasional O(n) cost of resizing the arrays to fit more items). As a side effect of using backing arrays, SortedList offers IList Keys and Values views that simply use the backing keys or values arrays, as well as various methods utilising the array index of stored items, which SortedDictionary does not (and cannot) offer. The Comparison So, when should you use one and not the other? Well, here's the important differences: Memory usage SortedDictionary and SortedList have got very different memory profiles. SortedDictionary... has a memory overhead of one object instance, a bool, and two references per item. On 64-bit systems, this adds up to ~40 bytes, not including the stored item and the reference to it from the Node object. stores the items in separate objects that can be spread all over the heap. This helps to keep memory fragmentation low, as the individual node objects can be allocated wherever there's a spare 60 bytes. In contrast, SortedList... has no additional overhead per item (only the reference to it in the array entries), however the backing arrays can be significantly larger than you need; every time the arrays are resized they double in size. That means that if you add 513 items to a SortedList, the backing arrays will each have a length of 1024. To conteract this, the TrimExcess method resizes the arrays back down to the actual size needed, or you can simply assign list.Capacity = list.Count. stores its items in a continuous block in memory. If the list stores thousands of items, this can cause significant problems with Large Object Heap memory fragmentation as the array resizes, which SortedDictionary doesn't have. Performance Operations on a SortedDictionary always have O(log n) performance, regardless of where in the collection you're adding or removing items. In contrast, SortedList has O(n) performance when you're altering the middle of the collection. If you're adding or removing from the end (ie the largest item), then performance is O(log n), same as SortedDictionary (in practice, it will likely be slightly faster, due to the array items all being in the same area in memory, also called locality of reference). So, when should you use one and not the other? As always with these sort of things, there are no hard-and-fast rules. But generally, if you: need to access items using their index within the collection are populating the dictionary all at once from sorted data aren't adding or removing keys once it's populated then use a SortedList. But if you: don't know how many items are going to be in the dictionary are populating the dictionary from random, unsorted data are adding & removing items randomly then use a SortedDictionary. The default (again, there's no definite rules on these sort of things!) should be to use SortedDictionary, unless there's a good reason to use SortedList, due to the bad performance of SortedList when altering the middle of the collection.

Read the article
Subterranean IL: Constructor constraints

- by Simon Cooper

The constructor generic constraint is a slightly wierd one. The ECMA specification simply states that it: constrains [the type] to being a concrete reference type (i.e., not abstract) that has a public constructor taking no arguments (the default constructor), or to being a value type. There seems to be no reference within the spec to how you actually create an instance of a generic type with such a constraint. In non-generic methods, the normal way of creating an instance of a class is quite different to initializing an instance of a value type. For a reference type, you use newobj: newobj instance void IncrementableClass::.ctor() and for value types, you need to use initobj: .locals init ( valuetype IncrementableStruct s1 ) ldloca 0 initobj IncrementableStruct But, for a generic method, we need a consistent method that would work equally well for reference or value types. Activator.CreateInstance<T> To solve this problem the CLR designers could have chosen to create something similar to the constrained. prefix; if T is a value type, call initobj, and if it is a reference type, call newobj instance void !!0::.ctor(). However, this solution is much more heavyweight than constrained callvirt. The newobj call is encoded in the assembly using a simple reference to a row in a metadata table. This encoding is no longer valid for a call to !!0::.ctor(), as different constructor methods occupy different rows in the metadata tables. Furthermore, constructors aren't virtual, so we would have to somehow do a dynamic lookup to the correct method at runtime without using a MethodTable, something which is completely new to the CLR. Trying to do this in IL results in the following verification error: newobj instance void !!0::.ctor() [IL]: Error: Unable to resolve token. This is where Activator.CreateInstance<T> comes in. We can call this method to return us a new T, and make the whole issue Somebody Else's Problem. CreateInstance does all the dynamic method lookup for us, and returns us a new instance of the correct reference or value type (strangely enough, Activator.CreateInstance<T> does not itself have a .ctor constraint on its generic parameter): .method private static !!0 CreateInstance<.ctor T>() { call !!0 [mscorlib]System.Activator::CreateInstance<!!0>() ret } Going further: compiler enhancements Although this method works perfectly well for solving the problem, the C# compiler goes one step further. If you decompile the C# version of the CreateInstance method above: private static T CreateInstance() where T : new() { return new T(); } what you actually get is this (edited slightly for space & clarity): .method private static !!T CreateInstance<.ctor T>() { .locals init ( [0] !!T CS$0$0000, [1] !!T CS$0$0001 ) DetectValueType: ldloca.s 0 initobj !!T ldloc.0 box !!T brfalse.s CreateInstance CreateValueType: ldloca.s 1 initobj !!T ldloc.1 ret CreateInstance: call !!0 [mscorlib]System.Activator::CreateInstance<T>() ret } What on earth is going on here? Looking closer, it's actually quite a clever performance optimization around value types. So, lets dissect this code to see what it does. The CreateValueType and CreateInstance sections should be fairly self-explanatory; using initobj for value types, and Activator.CreateInstance for reference types. How does the DetectValueType section work? First, the stack transition for value types: ldloca.s 0 // &[!!T(uninitialized)] initobj !!T // ldloc.0 // !!T box !!T // O[!!T] brfalse.s // branch not taken When the brfalse.s is hit, the top stack entry is a non-null reference to a boxed !!T, so execution continues to to the CreateValueType section. What about when !!T is a reference type? Remember, the 'default' value of an object reference (type O) is zero, or null. ldloca.s 0 // &[!!T(null)] initobj !!T // ldloc.0 // null box !!T // null brfalse.s // branch taken Because box on a reference type is a no-op, the top of the stack at the brfalse.s is null, and so the branch to CreateInstance is taken. For reference types, Activator.CreateInstance is called which does the full dynamic lookup using reflection. For value types, a simple initobj is called, which is far faster, and also eliminates the unboxing that Activator.CreateInstance has to perform for value types. However, this is strictly a performance optimization; Activator.CreateInstance<T> works for value types as well as reference types. Next... That concludes the initial premise of the Subterranean IL series; to cover the details of generic methods and generic code in IL. I've got a few other ideas about where to go next; however, if anyone has any itching questions, suggestions, or things you've always wondered about IL, do let me know.

Read the article
Developing Schema Compare for Oracle (Part 3): Ghost Objects

- by Simon Cooper

In the previous blog post, I covered how we solved the problem of dependencies between objects and between schemas. However, that isn’t the end of the issue. The dependencies algorithm I described works when you’re querying live databases and you can get dependencies for a particular schema direct from the server, and that’s all well and good. To throw a (rather large) spanner in the works, Schema Compare also has the concept of a snapshot, which is a read-only compressed XML representation of a selection of schemas that can be compared in the same way as a live database. This can be useful for keeping historical records or a baseline of a database schema, or comparing a schema on a computer that doesn’t have direct access to the database. So, how do snapshots interact with dependencies? Inter-database dependencies don't pose an issue as we store the dependencies in the snapshot. However, comparing a snapshot to a live database with cross-schema dependencies does cause a problem; what if the live database has a dependency to an object that does not exist in the snapshot? Take a basic example schema, where you’re only populating SchemaA: SOURCE TARGET (using snapshot) CREATE TABLE SchemaA.Table1 ( Col1 NUMBER REFERENCES SchemaB.Table1(col1)); CREATE TABLE SchemaA.Table1 ( Col1 VARCHAR2(100)); CREATE TABLE SchemaB.Table1 ( Col1 NUMBER PRIMARY KEY); CREATE TABLE SchemaB.Table1 ( Col1 VARCHAR2(100)); In this case, we want to generate a sync script to synchronize SchemaA.Table1 on the database represented by the snapshot. When taking a snapshot, database dependencies are followed, but because you’re not comparing it to anything at the time, the comparison dependencies algorithm described in my last post cannot be used. So, as you only take a snapshot of SchemaA on the target database, SchemaB.Table1 will not be in the snapshot. If this snapshot is then used to compare against the above source schema, SchemaB.Table1 will be included in the source, but the object will not be found in the target snapshot. This is the same problem that was solved with comparison dependencies, but here we cannot use the comparison dependencies algorithm as the snapshot has not got any information on SchemaB! We've now hit quite a big problem - we’re trying to include SchemaB.Table1 in the target, but we simply do not know the status of this object on the database the snapshot was taken from; whether it exists in the database at all, whether it’s the same as the target, whether it’s different... What can we do about this sorry state of affairs? Well, not a lot, it would seem. We can’t query the original database, as it may not be accessible, and we cannot assume any default state as it could be wrong and break the script (and we currently do not have a roll-back mechanism for failed synchronizes). The only way to fix this properly is for the user to go right back to the start and re-create the snapshot, explicitly including the schemas of these 'ghost' objects. So, the only thing we can do is flag up dependent ghost objects in the UI, and ask the user what we should do with it – assume it doesn’t exist, assume it’s the same as the target, or specify a definition for it. Unfortunately, such functionality didn’t make the cut for v1 of Schema Compare (as this is very much an edge case for a non-critical piece of functionality), so we simply flag the ghost objects up in the sync wizard as unsyncable, and let the user sort out what’s going on and edit the sync script as appropriate. There are some things that we do do to alleviate somewhat this rather unhappy situation; if a user creates a snapshot from the source or target of a database comparison, we include all the objects registered from the database, not just the ones in the schemas originally selected for comparison. This includes any extra dependent objects registered through the comparison dependencies algorithm. If the user then compares the resulting snapshot against the same database they were comparing against when it was created, the extra dependencies will be included in the snapshot as required and everything will be good. Fortunately, this problem will come up quite rarely, and only when the user uses snapshots and tries to sync objects with unknown cross-schema dependencies. However, the solution is not an easy one, and lead to some difficult architecture and design decisions within the product. And all this pain follows from the simple decision to allow schema pre-filtering! Next: why adding a column to a table isn't as easy as you would think...

Read the article
Anatomy of a .NET Assembly - CLR metadata 1

- by Simon Cooper

Before we look at the bytes comprising the CLR-specific data inside an assembly, we first need to understand the logical format of the metadata (For this post I only be looking at simple pure-IL assemblies; mixed-mode assemblies & other things complicates things quite a bit). Metadata streams Most of the CLR-specific data inside an assembly is inside one of 5 streams, which are analogous to the sections in a PE file. The name of each section in a PE file starts with a ., and the name of each stream in the CLR metadata starts with a #. All but one of the streams are heaps, which store unstructured binary data. The predefined streams are: #~ Also called the metadata stream, this stream stores all the information on the types, methods, fields, properties and events in the assembly. Unlike the other streams, the metadata stream has predefined contents & structure. #Strings This heap is where all the namespace, type & member names are stored. It is referenced extensively from the #~ stream, as we'll be looking at later. #US Also known as the user string heap, this stream stores all the strings used in code directly. All the strings you embed in your source code end up in here. This stream is only referenced from method bodies. #GUID This heap exclusively stores GUIDs used throughout the assembly. #Blob This heap is for storing pure binary data - method signatures, generic instantiations, that sort of thing. Items inside the heaps (#Strings, #US, #GUID and #Blob) are indexed using a simple binary offset from the start of the heap. At that offset is a coded integer giving the length of that item, then the item's bytes immediately follow. The #GUID stream is slightly different, in that GUIDs are all 16 bytes long, so a length isn't required. Metadata tables The #~ stream contains all the assembly metadata. The metadata is organised into 45 tables, which are binary arrays of predefined structures containing information on various aspects of the metadata. Each entry in a table is called a row, and the rows are simply concatentated together in the file on disk. For example, each row in the TypeRef table contains: A reference to where the type is defined (most of the time, a row in the AssemblyRef table). An offset into the #Strings heap with the name of the type An offset into the #Strings heap with the namespace of the type. in that order. The important tables are (with their table number in hex): 0x2: TypeDef 0x4: FieldDef 0x6: MethodDef 0x14: EventDef 0x17: PropertyDef Contains basic information on all the types, fields, methods, events and properties defined in the assembly. 0x1: TypeRef The details of all the referenced types defined in other assemblies. 0xa: MemberRef The details of all the referenced members of types defined in other assemblies. 0x9: InterfaceImpl Links the types defined in the assembly with the interfaces that type implements. 0xc: CustomAttribute Contains information on all the attributes applied to elements in this assembly, from method parameters to the assembly itself. 0x18: MethodSemantics Links properties and events with the methods that comprise the get/set or add/remove methods of the property or method. 0x1b: TypeSpec 0x2b: MethodSpec These tables provide instantiations of generic types and methods for each usage within the assembly. There are several ways to reference a single row within a table. The simplest is to simply specify the 1-based row index (RID). The indexes are 1-based so a value of 0 can represent 'null'. In this case, which table the row index refers to is inferred from the context. If the table can't be determined from the context, then a particular row is specified using a token. This is a 4-byte value with the most significant byte specifying the table, and the other 3 specifying the 1-based RID within that table. This is generally how a metadata table row is referenced from the instruction stream in method bodies. The third way is to use a coded token, which we will look at in the next post. So, back to the bytes Now we've got a rough idea of how the metadata is logically arranged, we can now look at the bytes comprising the start of the CLR data within an assembly: The first 8 bytes of the .text section are used by the CLR loader stub. After that, the CLR-specific data starts with the CLI header. I've highlighted the important bytes in the diagram. In order, they are: The size of the header. As the header is a fixed size, this is always 0x48. The CLR major version. This is always 2, even for .NET 4 assemblies. The CLR minor version. This is always 5, even for .NET 4 assemblies, and seems to be ignored by the runtime. The RVA and size of the metadata header. In the diagram, the RVA 0x20e4 corresponds to the file offset 0x2e4 Various flags specifying if this assembly is pure-IL, whether it is strong name signed, and whether it should be run as 32-bit (this is how the CLR differentiates between x86 and AnyCPU assemblies). A token pointing to the entrypoint of the assembly. In this case, 06 (the last byte) refers to the MethodDef table, and 01 00 00 refers to to the first row in that table. (after a gap) RVA of the strong name signature hash, which comes straight after the CLI header. The RVA 0x2050 corresponds to file offset 0x250. The rest of the CLI header is mainly used in mixed-mode assemblies, and so is zeroed in this pure-IL assembly. After the CLI header comes the strong name hash, which is a SHA-1 hash of the assembly using the strong name key. After that comes the bodies of all the methods in the assembly concatentated together. Each method body starts off with a header, which I'll be looking at later. As you can see, this is a very small assembly with only 2 methods (an instance constructor and a Main method). After that, near the end of the .text section, comes the metadata, containing a metadata header and the 5 streams discussed above. We'll be looking at this in the next post. Conclusion The CLI header data doesn't have much to it, but we've covered some concepts that will be important in later posts - the logical structure of the CLR metadata and the overall layout of CLR data within the .text section. Next, I'll have a look at the contents of the #~ stream, and how the table data is arranged on disk.

Read the article
Inside Red Gate - Experimenting In Public

- by Simon Cooper

Over the next few weeks, we'll be performing experiments on SmartAssembly to confirm or refute various hypotheses we have about how people use the product, what is stopping them from using it to its full extent, and what we can change to make it more useful and easier to use. Some of these experiments can be done within the team, some within Red Gate, and some need to be done on external users. External testing Some external testing can be done by standard usability tests and surveys, however, there are some hypotheses that can only be tested by building a version of SmartAssembly with some things in the UI or implementation changed. We'll then be able to look at how the experimental build is used compared to the 'mainline' build, which forms our baseline or control group, and use this data to confirm or refute the relevant hypotheses. However, there are several issues we need to consider before running experiments using separate builds: Ideally, the user wouldn't know they're running an experimental SmartAssembly. We don't want users to use the experimental build like it's an experimental build, we want them to use it like it's the real mainline build. Only then will we get valid, useful, and informative data concerning our hypotheses. There's no point running the experiments if we can't find out what happens after the download. To confirm or refute some of our hypotheses, we need to find out how the tool is used once it is installed. Fortunately, we've applied feature usage reporting to the SmartAssembly codebase itself to provide us with that information. Of course, this then makes the experimental data conditional on the user agreeing to send that data back to us in the first place. Unfortunately, even though this does limit the amount of useful data we'll be getting back, and possibly skew the data, there's not much we can do about this; we don't collect feature usage data without the user's consent. Looks like we'll simply have to live with this. What if the user tries to buy the experiment? This is something that isn't really covered by the Lean Startup book; how do you support users who give you money for an experiment? If the experiment is a new feature, and the user buys a license for SmartAssembly based on that feature, then what do we do if we later decide to pivot & scrap that feature? We've either got to spend time and money bringing that feature up to production quality and into the mainline anyway, or we've got disgruntled customers. Either way is bad. Again, there's not really any good solution to this. Similarly, what if we've removed some features for an experiment and a potential new user downloads the experimental build? (As I said above, there's no indication the build is an experimental build, as we want to see what users really do with it). The crucial feature they need is missing, causing a bad trial experience, a lost potential customer, and a lost chance to help the customer with their problem. Again, this is something not really covered by the Lean Startup book, and something that doesn't have a good solution. So, some tricky issues there, not all of them with nice easy answers. Turns out the practicalities of running Lean Startup experiments are more complicated than they first seem!

Read the article
Subterranean IL: Custom modifiers

- by Simon Cooper

In IL, volatile is an instruction prefix used to set a memory barrier at that instruction. However, in C#, volatile is applied to a field to indicate that all accesses on that field should be prefixed with volatile. As I mentioned in my previous post, this means that the field definition needs to store this information somehow, as such a field could be accessed from another assembly. However, IL does not have a concept of a 'volatile field'. How is this information stored? Attributes The standard way of solving this is to apply a VolatileAttribute or similar to the field; this extra metadata notifies the C# compiler that all loads and stores to that field should use the volatile prefix. However, there is a problem with this approach, namely, the .NET C++ compiler. C++ allows methods to be overloaded using properties, like volatile or const, on the parameters; this is perfectly legal C++: public ref class VolatileMethods { void Method(int *i) {} void Method(volatile int *i) {} } If volatile was specified using a custom attribute, then the VolatileMethods class wouldn't be compilable to IL, as there is nothing to differentiate the two methods from each other. This is where custom modifiers come in. Custom modifiers Custom modifiers are similar to custom attributes, but instead of being applied to an IL element separately to its declaration, they are embedded within the field or parameter's type signature itself. The VolatileMethods class would be compiled to the following IL: .class public VolatileMethods { .method public instance void Method(int32* i) {} .method public instance void Method( int32 modreq( [mscorlib]System.Runtime.CompilerServices.IsVolatile)* i) {} } The modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile) is the custom modifier. This adds a TypeDef or TypeRef token to the signature of the field or parameter, and even though they are mostly ignored by the CLR when it's executing the program, this allows methods and fields to be overloaded in ways that wouldn't be allowed using attributes. Because the modifiers are part of the signature, they need to be fully specified when calling such a method in IL: call instance void Method( int32 modreq([mscorlib]System.Runtime.CompilerServices.IsVolatile)*) There are two ways of applying modifiers; modreq specifies required modifiers (like IsVolatile), and modopt specifies optional modifiers that can be ignored by compilers (like IsLong or IsConst). The type specified as the modifier argument are simple placeholders; if you have a look at the definitions of IsVolatile and IsLong they are completely empty. They exist solely to be referenced by a modifier. Custom modifiers are used extensively by the C++ compiler to specify concepts that aren't expressible in IL, but still need to be taken into account when calling method overloads. C++ and C# That's all very well and good, but how does this affect C#? Well, the C++ compiler uses modreq(IsVolatile) to specify volatility on both method parameters and fields, as it would be slightly odd to have the same concept represented using a modifier or attribute depending on what it was applied to. Once you've compiled your C++ project, it can then be referenced and used from C#, so the C# compiler has to recognise the modreq(IsVolatile) custom modifier applied to fields, and vice versa. So, even though you can't overload fields or parameters with volatile using C#, volatile needs to be expressed using a custom modifier rather than an attribute to guarentee correct interoperability and behaviour with any C++ dlls that happen to come along. Next up: a closer look at attributes, and how certain attributes compile in unexpected ways.

Read the article
Inside Red Gate - Project teams

- by Simon Cooper

Within each division in Red Gate, development effort is structured around one or more project teams; currently, each division contains 2-3 separate teams. These are self contained units responsible for a particular development project. Project team structure The typical size of a development team varies, but is normally around 4-7 people - one project manager, two developers, one or two testers, a technical author (who is responsible for the text within the application, website content, and help documentation) and a user experience designer (who designs and prototypes the UIs) . However, team sizes can vary from 3 up to 12, depending on the division and project. As an rule, all the team sits together in the same area of the office. (Again, this is my experience of what happens. I haven't worked in the DBA division, and SQL Tools might have changed completely since I moved to .NET. As I mentioned in my previous post, each division is free to structure itself as it sees fit.) Depending on the project, and the other needs in the division, the tech author and UX designer may be shared between several projects. Generally, developers and testers work on one project at a time. If the project is a simple point release, then it might not need a UX designer at all. However, if it's a brand new product, then a UX designer and tech author will be involved right from the start. Developers, testers, and the project manager will normally stay together in the same team as they work on different projects, unless there's a good reason to split or merge teams for a particular project. Technical authors and UX designers will normally go wherever they are needed in the division, depending on what each project needs at the time. In my case, I was working with more or less the same people for over 2 years, all the way through SQL Compare 7, 8, and Schema Compare for Oracle. This helped to build a great sense of camaraderie wihin the team, and helped to form and maintain a team identity. This, in turn, meant we worked very well together, and so the final result was that much better (as well as making the work more fun). How is a project started and run? The product manager within each division collates user feedback and ideas, does lots of research, throws in a few ideas from people within the company, and then comes up with a list of what the division should work on in the next few years. This is split up into projects, and after each project is greenlit (I'll be discussing this later on) it is then assigned to a project team, as and when they become available (I'm sure there's lots of discussions and meetings at this point that I'm not aware of!). From that point, it's entirely up to the project team. Just as divisions are autonomous, project teams are also given a high degree of autonomy. All the teams in Red Gate use some sort of vaguely agile methodology; most use some variations on SCRUM, some have experimented with Kanban. Some store the project progress on a whiteboard, some use our bug tracker, others use different methods. It all depends on what the team members think will work best for them to get the best result at the end. From that point, the project proceeds as you would expect; code gets written, tests pass and fail, discussions about how to resolve various problems are had and decided upon, and out pops a new product, new point release, new internal tool, or whatever the project's goal was. The project manager ensures that everyone works together without too much bloodshed and that thrown missiles are constrained to Nerf bullets, the developers write the code, the testers ensure it actually works, and the tech author and UX designer ensure that people will be able to use the final product to solve their problem (after all, developers make lousy UI designers and technical authors). Projects in Red Gate last a relatively short amount of time; most projects are less than 6 months. The longest was 18 months. This has evolved as the company has grown, and I suspect is a side effect of the type of software Red Gate produces. As an ISV, we sell packaged software; we only get revenue when customers purchase the ready-made tools. As a result, we only get a sellable piece of software right at the end of a project. Therefore, the longer the project lasts, the more time and money has to be invested by the company before we get any revenue from it, and the riskier the project becomes. This drives the average project time down. Small project teams are the core of how Red Gate produces software, and are what the whole development effort of the company is built around. In my next post, I'll be looking at the office itself, and how all 200 of us manage to fit on two floors of a small office building.

Read the article
what is best book to learn optimized programming in java [closed]

- by Abhishek Simon

Possible Duplicate: Is there a canonical book for learning Java as an experienced developer? Let me elaborate a little: I used to be a C/C++ programmer where I used data structure concept like trees, queues stack etc and tried to optimize as much as possible, minimum no. of loops, variables and tried to make it efficient. It's been a couple of years that I started writing java codes, but it is simply not that efficient in terms of performance, memory intensive etc. To the point: I want to enter programming challenges using java so I need to improve my approach at things I program. So please suggest me some books that can help me learn to program better and have a chance in solving challenges in programming.

Read the article
Can an agile shop really score 12 on the Joel Test?

- by Simon

I really like the Joel test, use it myself, and encourage my staff and interviewees to consider it carefully. However I don't think I can ever score more than 9 because a few points seem to contradict the Agile Manifesto, XP and TDD, which are the bedrocks of my world. Specifically: the questions about schedule, specs, testers and quiet working conditions run counter to what we are trying to create and the values that we have adopted in being genuinely agile. So my question is: is it possible for a true Agile shop to score 12?

Read the article
PostSharp, Obfuscation, and IL

- by Simon Cooper

Aspect-oriented programming (AOP) is a relatively new programming paradigm. Originating at Xerox PARC in 1994, the paradigm was first made available for general-purpose development as an extension to Java in 2001. From there, it has quickly been adapted for use in all the common languages used today. In the .NET world, one of the primary AOP toolkits is PostSharp. Attributes and AOP Normally, attributes in .NET are entirely a metadata construct. Apart from a few special attributes in the .NET framework, they have no effect whatsoever on how a class or method executes within the CLR. Only by using reflection at runtime can you access any attributes declared on a type or type member. PostSharp changes this. By declaring a custom attribute that derives from PostSharp.Aspects.Aspect, applying it to types and type members, and running the resulting assembly through the PostSharp postprocessor, you can essentially declare 'clever' attributes that change the behaviour of whatever the aspect has been applied to at runtime. A simple example of this is logging. By declaring a TraceAttribute that derives from OnMethodBoundaryAspect, you can automatically log when a method has been executed: public class TraceAttribute : PostSharp.Aspects.OnMethodBoundaryAspect { public override void OnEntry(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Entering {0}.{1}.", method.DeclaringType.FullName, method.Name)); } public override void OnExit(MethodExecutionArgs args) { MethodBase method = args.Method; System.Diagnostics.Trace.WriteLine( String.Format( "Leaving {0}.{1}.", method.DeclaringType.FullName, method.Name)); } } [Trace] public void MethodToLog() { ... } Now, whenever MethodToLog is executed, the aspect will automatically log entry and exit, without having to add the logging code to MethodToLog itself. PostSharp Performance Now this does introduce a performance overhead - as you can see, the aspect allows access to the MethodBase of the method the aspect has been applied to. If you were limited to C#, you would be forced to retrieve each MethodBase instance using Type.GetMethod(), matching on the method name and signature. This is slow. Fortunately, PostSharp is not limited to C#. It can use any instruction available in IL. And in IL, you can do some very neat things. Ldtoken C# allows you to get the Type object corresponding to a specific type name using the typeof operator: Type t = typeof(Random); The C# compiler compiles this operator to the following IL: ldtoken [mscorlib]System.Random call class [mscorlib]System.Type [mscorlib]System.Type::GetTypeFromHandle( valuetype [mscorlib]System.RuntimeTypeHandle) The ldtoken instruction obtains a special handle to a type called a RuntimeTypeHandle, and from that, the Type object can be obtained using GetTypeFromHandle. These are both relatively fast operations - no string lookup is required, only direct assembly and CLR constructs are used. However, a little-known feature is that ldtoken is not just limited to types; it can also get information on methods and fields, encapsulated in a RuntimeMethodHandle or RuntimeFieldHandle: // get a MethodBase for String.EndsWith(string) ldtoken method instance bool [mscorlib]System.String::EndsWith(string) call class [mscorlib]System.Reflection.MethodBase [mscorlib]System.Reflection.MethodBase::GetMethodFromHandle( valuetype [mscorlib]System.RuntimeMethodHandle) // get a FieldInfo for the String.Empty field ldtoken field string [mscorlib]System.String::Empty call class [mscorlib]System.Reflection.FieldInfo [mscorlib]System.Reflection.FieldInfo::GetFieldFromHandle( valuetype [mscorlib]System.RuntimeFieldHandle) These usages of ldtoken aren't usable from C# or VB, and aren't likely to be added anytime soon (Eric Lippert's done a blog post on the possibility of adding infoof, methodof or fieldof operators to C#). However, PostSharp deals directly with IL, and so can use ldtoken to get MethodBase objects quickly and cheaply, without having to resort to string lookups. The kicker However, there are problems. Because ldtoken for methods or fields isn't accessible from C# or VB, it hasn't been as well-tested as ldtoken for types. This has resulted in various obscure bugs in most versions of the CLR when dealing with ldtoken and methods, and specifically, generic methods and methods of generic types. This means that PostSharp was behaving incorrectly, or just plain crashing, when aspects were applied to methods that were generic in some way. So, PostSharp has to work around this. Without using the metadata tokens directly, the only way to get the MethodBase of generic methods is to use reflection: Type.GetMethod(), passing in the method name as a string along with information on the signature. Now, this works fine. It's slower than using ldtoken directly, but it works, and this only has to be done for generic methods. Unfortunately, this poses problems when the assembly is obfuscated. PostSharp and Obfuscation When using ldtoken, obfuscators don't affect how PostSharp operates. Because the ldtoken instruction directly references the type, method or field within the assembly, it is unaffected if the name of the object is changed by an obfuscator. However, the indirect loading used for generic methods was breaking, because that uses the name of the method when the assembly is put through the PostSharp postprocessor to lookup the MethodBase at runtime. If the name then changes, PostSharp can't find it anymore, and the assembly breaks. So, PostSharp needs to know about any changes an obfuscator does to an assembly. The way PostSharp does this is by adding another layer of indirection. When PostSharp obfuscation support is enabled, it includes an extra 'name table' resource in the assembly, consisting of a series of method & type names. When PostSharp needs to lookup a method using reflection, instead of encoding the method name directly, it looks up the method name at a fixed offset inside that name table: MethodBase genericMethod = typeof(ContainingClass).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: get_Prop1 21: set_Prop1 22: DoFoo 23: GetWibble When the assembly is later processed by an obfuscator, the obfuscator can replace all the method and type names within the name table with their new name. That way, the reflection lookups performed by PostSharp will now use the new names, and everything will work as expected: MethodBase genericMethod = typeof(#kGy).GetMethod(GetNameAtIndex(22)); PostSharp.NameTable resource: ... 20: #kkA 21: #zAb 22: #EF5a 23: #2tg As you can see, this requires direct support by an obfuscator in order to perform these rewrites. Dotfuscator supports it, and now, starting with SmartAssembly 6.6.4, SmartAssembly does too. So, a relatively simple solution to a tricky problem, with some CLR bugs thrown in for good measure. You don't see those every day!

Read the article
Subterranean IL: Fault exception handlers

- by Simon Cooper

Fault event handlers are one of the two handler types that aren't available in C#. It behaves exactly like a finally, except it is only run if control flow exits the block due to an exception being thrown. As an example, take the following method: .method public static void FaultExample(bool throwException) { .try { ldstr "Entering try block" call void [mscorlib]System.Console::WriteLine(string) ldarg.0 brfalse.s NormalReturn ThrowException: ldstr "Throwing exception" call void [mscorlib]System.Console::WriteLine(string) newobj void [mscorlib]System.Exception::.ctor() throw NormalReturn: ldstr "Leaving try block" call void [mscorlib]System.Console::WriteLine(string) leave.s Return } fault { ldstr "Fault handler" call void [mscorlib]System.Console::WriteLine(string) endfault } Return: ldstr "Returning from method" call void [mscorlib]System.Console::WriteLine(string) ret } If we pass true to this method the following gets printed: Entering try block Throwing exception Fault handler and the exception gets passed up the call stack. So, the exception gets thrown, the fault handler gets run, and the exception propagates up the stack afterwards in the normal way. If we pass false, we get the following: Entering try block Leaving try block Returning from method Because we are leaving the .try using a leave.s instruction, and not throwing an exception, the fault handler does not get called. Fault handlers and C# So why were these not included in C#? It seems a pretty simple feature; one extra keyword that compiles in exactly the same way, and with the same semantics, as a finally handler. If you think about it, the same behaviour can be replicated using a normal catch block: try { throw new Exception(); } catch { // fault code goes here throw; } The catch block only gets run if an exception is thrown, and the exception gets rethrown and propagates up the call stack afterwards; exactly like a fault block. The only complications that occur is when you want to add a fault handler to a try block with existing catch handlers. Then, you either have to wrap the try in another try: try { try { // ... } catch (DirectoryNotFoundException) { // ... // leave.s as normal... } catch (IOException) { // ... throw; } } catch { // fault logic throw; } or separate out the fault logic into another method and call that from the appropriate handlers: try { // ... } catch (DirectoryNotFoundException ) { // ... } catch (IOException ioe) { // ... HandleFaultLogic(); throw; } catch (Exception e) { HandleFaultLogic(); throw; } To be fair, the number of times that I would have found a fault handler useful is minimal. Still, it's quite annoying knowing such functionality exists, but you're not able to access it from C#. Fortunately, there are some easy workarounds one can use instead. Next time: filter handlers.

Read the article
Subterranean IL: Exception handler semantics

- by Simon Cooper

In my blog posts on fault and filter exception handlers, I said that the same behaviour could be replicated using normal catch blocks. Well, that isn't entirely true... Changing the handler semantics Consider the following: .try { .try { .try { newobj instance void [mscorlib]System.Exception::.ctor() // IL for: // e.Data.Add("DictKey", true) throw } fault { ldstr "1: Fault handler" call void [mscorlib]System.Console::WriteLine(string) endfault } } filter { ldstr "2a: Filter logic" call void [mscorlib]System.Console::WriteLine(string) // IL for: // (bool)((Exception)e).Data["DictKey"] endfilter }{ ldstr "2b: Filter handler" call void [mscorlib]System.Console::WriteLine(string) leave.s Return } } catch object { ldstr "3: Catch handler" call void [mscorlib]System.Console::WriteLine(string) leave.s Return } Return: // rest of method If the filter handler is engaged (true is inserted into the exception dictionary) then the filter handler gets engaged, and the following gets printed to the console: 2a: Filter logic 1: Fault handler 2b: Filter handler and if the filter handler isn't engaged, then the following is printed: 2a:Filter logic 1: Fault handler 3: Catch handler Filter handler execution The filter handler is executed first. Hmm, ok. Well, what happens if we replaced the fault block with the C# equivalent (with the exception dictionary value set to false)? .try { // throw exception } catch object { ldstr "1: Fault handler" call void [mscorlib]System.Console::WriteLine(string) rethrow } we get this: 1: Fault handler 2a: Filter logic 3: Catch handler The fault handler is executed first, instead of the filter block. Eh? This change in behaviour is due to the way the CLR searches for exception handlers. When an exception is thrown, the CLR stops execution of the thread, and searches up the stack for an exception handler that can handle the exception and stop it propagating further - catch or filter handlers. It checks the type clause of catch clauses, and executes the code in filter blocks to see if the filter can handle the exception. When the CLR finds a valid handler, it saves the handler's location, then goes back to where the exception was thrown and executes fault and finally blocks between there and the handler location, discarding stack frames in the process, until it reaches the handler. So? By replacing a fault with a catch, we have changed the semantics of when the filter code is executed; by using a rethrow instruction, we've split up the exception handler search into two - one search to find the first catch, then a second when the rethrow instruction is encountered. This is only really obvious when mixing C# exception handlers with fault or filter handlers, so this doesn't affect code written only in C#. However it could cause some subtle and hard-to-debug effects with object initialization and ordering when using and calling code written in a language that can compile fault and filter handlers.

Read the article
Anatomy of a .NET Assembly - Signature encodings

- by Simon Cooper

If you've just joined this series, I highly recommend you read the previous posts in this series, starting here, or at least these posts, covering the CLR metadata tables. Before we look at custom attribute encoding, we first need to have a brief look at how signatures are encoded in an assembly in general. Signature types There are several types of signatures in an assembly, all of which share a common base representation, and are all stored as binary blobs in the #Blob heap, referenced by an offset from various metadata tables. The types of signatures are: Method definition and method reference signatures. Field signatures Property signatures Method local variables. These are referenced from the StandAloneSig table, which is then referenced by method body headers. Generic type specifications. These represent a particular instantiation of a generic type. Generic method specifications. Similarly, these represent a particular instantiation of a generic method. All these signatures share the same underlying mechanism to represent a type Representing a type All metadata signatures are based around the ELEMENT_TYPE structure. This assigns a number to each 'built-in' type in the framework; for example, Uint16 is 0x07, String is 0x0e, and Object is 0x1c. Byte codes are also used to indicate SzArrays, multi-dimensional arrays, custom types, and generic type and method variables. However, these require some further information. Firstly, custom types (ie not one of the built-in types). These require you to specify the 4-byte TypeDefOrRef coded token after the CLASS (0x12) or VALUETYPE (0x11) element type. This 4-byte value is stored in a compressed format before being written out to disk (for more excruciating details, you can refer to the CLI specification). SzArrays simply have the array item type after the SZARRAY byte (0x1d). Multidimensional arrays follow the ARRAY element type with a series of compressed integers indicating the number of dimensions, and the size and lower bound of each dimension. Generic variables are simply followed by the index of the generic variable they refer to. There are other additions as well, for example, a specific byte value indicates a method parameter passed by reference (BYREF), and other values indicating custom modifiers. Some examples... To demonstrate, here's a few examples and what the resulting blobs in the #Blob heap will look like. Each name in capitals corresponds to a particular byte value in the ELEMENT_TYPE or CALLCONV structure, and coded tokens to custom types are represented by the type name in curly brackets. A simple field: int intField; FIELD I4 A field of an array of a generic type parameter (assuming T is the first generic parameter of the containing type): T[] genArrayField FIELD SZARRAY VAR 0 An instance method signature (note how the number of parameters does not include the return type): instance string MyMethod(MyType, int&, bool[][]); HASTHIS DEFAULT 3 STRING CLASS {MyType} BYREF I4 SZARRAY SZARRAY BOOLEAN A generic type instantiation: MyGenericType<MyType, MyStruct> GENERICINST CLASS {MyGenericType} 2 CLASS {MyType} VALUETYPE {MyStruct} For more complicated examples, in the following C# type declaration: GenericType<T> : GenericBaseType<object[], T, GenericType<T>> { ... } the Extends field of the TypeDef for GenericType will point to a TypeSpec with the following blob: GENERICINST CLASS {GenericBaseType} 3 SZARRAY OBJECT VAR 0 GENERICINST CLASS {GenericType} 1 VAR 0 And a static generic method signature (generic parameters on types are referenced using VAR, generic parameters on methods using MVAR): TResult[] GenericMethod<TInput, TResult>( TInput, System.Converter<TInput, TOutput>); GENERIC 2 2 SZARRAY MVAR 1 MVAR 0 GENERICINST CLASS {System.Converter} 2 MVAR 0 MVAR 1 As you can see, complicated signatures are recursively built up out of quite simple building blocks to represent all the possible variations in a .NET assembly. Now we've looked at the basics of normal method signatures, in my next post I'll look at custom attribute application signatures, and how they are different to normal signatures.

Read the article
What encoding is used by javax.xml.transform.Transformer?

- by Simon Reeves

Please can you answer a couple of questions based on the code below (excludes the try/catch blocks), which transforms input XML and XSL files into an output XSL-FO file: File xslFile = new File("inXslFile.xsl"); File xmlFile = new File("sourceXmlFile.xml"); TransformerFactory tFactory = TransformerFactory.newInstance(); Transformer transformer = tFactory.newTransformer(new StreamSource(xslFile)); FileOutputStream fos = new FileOutputStream(new File("outFoFile.fo"); transformer.transform(new StreamSource(xmlFile), new StreamResult(fos)); inXslFile is encoded using UTF-8 - however there are no tags in file which states this. sourceXmlFile is UTF-8 encoded and there may be a metatag at start of file indicating this. am currently using Java 6 with intention of upgrading in the future. What encoding is used when reading the xslFile? What encoding is used when reading the xmlFile? What encoding will be applied to the FO outfile? How can I obtain the info (properties) for 1 - 3? Is there a method call? How can the properties in 4 be altered - using configuration and dynamically? if known - Where is there info (web site) on this that I can read - I have looked without much success.

Read the article
Is code that terminates on a random condition guaranteed to terminate?

- by Simon Campbell

If I had a code which terminated based on if a random number generator returned a result (as follows), would it be 100% certain that the code would terminate if it was allowed to run forever. while (random(MAX_NUMBER) != 0): // random returns a random number between 0 and MAX_NUMBER print('Hello World') I am also interested in any distinctions between purely random and the deterministic random that computers generally use. Assume the seed is not able to be known in the case of the deterministic random. Naively it could be suggested that the code will exit, after all every number has some possibility and all of time for that possibility to be exercised. On the other hand it could be argued that there is the random chance it may not ever meet the exit condition-- the generator could generate 1 'randomly' until infinity. (I suppose one would question the validity of the random number generator if it was a deterministic generator returning only 1's 'randomly' though)

Read the article
Security of logging people in automatically from another app?

- by Simon

I have 2 apps. They both have accounts, and each account has users. These apps are going to share the same users and accounts and they will always be in sync. I want to be able to login automatically from one app to the other. So my solution is to generate a login_key, for example: 2sa7439e-a570-ac21-a2ao-z1qia9ca6g25 once a day. And provide a automated login link to the other app... for example if the user clicks on: https://account_name.securityhole.io/login/2sa7439e-a570-ac21-a2ao-z1qia9ca6g25/user/123 They are logged in automatically, session created. So here we have 3 things that a intruder has to get right in order to gain access; account name, login key, and the user id. Bad idea? Or should I can down the path of making one app an oauth provider? Or is there a better way?

Read the article
The SmartAssembly Rearchitecture

- by Simon Cooper

You may have noticed that not a lot has happened to SmartAssembly in the past few months. However, the team has been very busy behind the scenes working on an entirely new version of SmartAssembly. SmartAssembly 6.5 Over the past few releases of SmartAssembly, the team had come to the realisation that the current 'architecture' - grown organically, way before RedGate bought it, from a simple name obfuscator over the years into a full-featured obfuscator and assembly instrumentation tool - was simply not up to the task. Not for what we wanted to do with it at the time, and not what we have planned for the future. Not only was it not up to what we wanted it to do, but it was severely limiting our development capabilities; long-standing bugs in the root architecture that couldn't be fixed, some rather...interesting...design decisions, and convoluted logic that increased the complexity of any bugfix or new feature tenfold. So, we set out to fix this. Earlier this year, a new engine was written on which SmartAssembly would be based. Over the following few months, each feature was ported over to the new engine and extensively tested by our existing unit and integration tests. The engine was linked into the existing UI (no easy task, due to the tight coupling between the UI and old engine), and existing RedGate products were tested on the new SmartAssembly to ensure the new engine acted in the same way. The result is SmartAssembly 6.5. The risks of a rearchitecture Are there risks to rearchitecting a product like SmartAssembly? Of course. There was a lot of undocumented behaviour in the old engine, and as part of the rearchitecture we had to find this behaviour, define it, and document it. In the process we found some behaviour of the old engine that simply did not make sense; hence the changes in pruning & obfuscation behaviour in the release notes. All the special edge cases we had to find, document, and re-implement. There was a chance that these special cases would not be found until near the end of the project, when everything is functionally complete and interacting together. By that stage, it would be hard to go back and change anything without a whole lot of extra work, delaying the release by months. We always knew this was a possibility; our initial estimate of the time required was '4 months, ± 4 months'. And that was including various mitigation strategies to reduce the likelihood of these issues being found right at the end. Fortunately, this worst-case did not happen. However, the rearchitecture did produce some benefits. As well as numerous bug fixes that we could not fix any other way, we've also added logging that lets you find out exactly why a particular field or property wasn't pruned or obfuscated. There's a new command line interface, we've tested it with WP7.1 and Silverlight 5, and we've added a new option to error reporting to improve the performance of instrumented apps by ~10%, at the cost of inaccurate line numbers in reports. So? What differences will I see? Largely none. SmartAssembly 6.5 produces the same output as SmartAssembly 6.2. The performance of 6.5 will be much faster for some users, and generally the same as 6.2 for the remaining. If you've encountered a bug with previous versions of SmartAssembly, I encourage you to try 6.5, as it has most likely been fixed in the rearchitecture. If you encounter a bug with 6.5, please do tell us; we'll be doing another release quite soon, so we'll aim to fix any issues caused by 6.5 in that release. Most importantly, the new architecture finally allows us to implement some Big Things with SmartAssembly we've been planning for many months; these will fundamentally change how you build, release and monitor your application. Stay tuned for further updates!

Read the article
Why you shouldn't add methods to interfaces in APIs

- by Simon Cooper

It is an oft-repeated maxim that you shouldn't add methods to a publically-released interface in an API. Recently, I was hit hard when this wasn't followed. As part of the work on ApplicationMetrics, I've been implementing auto-reporting of MVC action methods; whenever an action was called on a controller, ApplicationMetrics would automatically report it without the developer needing to add manual ReportEvent calls. Fortunately, MVC provides easy hook when a controller is created, letting me log when it happens - the IControllerFactory interface. Now, the dll we provide to instrument an MVC webapp has to be compiled against .NET 3.5 and MVC 1, as the lowest common denominator. This MVC 1 dll will still work when used in an MVC 2, 3 or 4 webapp because all MVC 2+ webapps have a binding redirect redirecting all references to previous versions of System.Web.Mvc to the correct version, and type forwards taking care of any moved types in the new assemblies. Or at least, it should. IControllerFactory In MVC 1 and 2, IControllerFactory was defined as follows: public interface IControllerFactory { IController CreateController(RequestContext requestContext, string controllerName); void ReleaseController(IController controller); } So, to implement the logging controller factory, we simply wrap the existing controller factory: internal sealed class LoggingControllerFactory : IControllerFactory { private readonly IControllerFactory m_CurrentController; public LoggingControllerFactory(IControllerFactory currentController) { m_CurrentController = currentController; } public IController CreateController( RequestContext requestContext, string controllerName) { // log the controller being used FeatureSessionData.ReportEvent("Controller used:", controllerName); return m_CurrentController.CreateController(requestContext, controllerName); } public void ReleaseController(IController controller) { m_CurrentController.ReleaseController(controller); } } Easy. This works as expected in MVC 1 and 2. However, in MVC 3 this type was throwing a TypeLoadException, saying a method wasn't implemented. It turns out that, in MVC 3, the definition of IControllerFactory was changed to this: public interface IControllerFactory { IController CreateController(RequestContext requestContext, string controllerName); SessionStateBehavior GetControllerSessionBehavior( RequestContext requestContext, string controllerName); void ReleaseController(IController controller); } There's a new method in the interface. So when our MVC 1 dll was redirected to reference System.Web.Mvc v3, LoggingControllerFactory tried to implement version 3 of IControllerFactory, was missing the GetControllerSessionBehaviour method, and so couldn't be loaded by the CLR. Implementing the new method Fortunately, there was a workaround. Because interface methods are normally implemented implicitly in the CLR, if we simply declare a virtual method matching the signature of the new method in MVC 3, then it will be ignored in MVC 1 and 2 and implement the extra method in MVC 3: internal sealed class LoggingControllerFactory : IControllerFactory { ... public virtual SessionStateBehaviour GetControllerSessionBehaviour( RequestContext requestContext, string controllerName) {} ... } However, this also has problems - the SessionStateBehaviour type only exists in .NET 4, and we're limited to .NET 3.5 by support for MVC 1 and 2. This means that the only solutions to support all MVC versions are: Construct the LoggingControllerFactory type at runtime using reflection Produce entirely separate dlls for MVC 1&2 and MVC 3. Ugh. And all because of that blasted extra method! Another solution? Fortunately, in this case, there is a third option - System.Web.Mvc also provides a DefaultControllerFactory type that can provide the implementation of GetControllerSessionBehaviour for us in MVC 3, while still allowing us to override CreateController and ReleaseController. However, this does mean that LoggingControllerFactory won't be able to wrap any calls to GetControllerSessionBehaviour. This is an acceptable bug, given the other options, as very few developers will be overriding GetControllerSessionBehaviour in their own custom controller factory. So, if you're providing an interface as part of an API, then please please please don't add methods to it. Especially if you don't provide a 'default' implementing type. Any code compiled against the previous version that can't be updated will have some very tough decisions to make to support both versions.

Read the article
How can we distribute a client app to other non-US businesses?

- by Simon

I'm working on an app which acts as a client for our web service. We sell this service to businesses, and we want to distribute the app to their employees for free. The app will be customised for each client. If we were in the US, my understanding is that we'd ask them to enrol in the volume purchasing program, and submit a version of our app for each business, for enterprise distribution at the free price point However, the businesses aren't all in the US, so they can't enrol in the VPP. They have thousands of employees, so promo codes won't be sufficient. What are our alternatives?

Read the article
Inserting multiple links to one image in Confluence

- by Simon

I am setting up a Wiki in Confluence v3.5.1 I have added a visio diagram (JPG) to a page (this diagram will take up most of the page) - This diagram depicts the workflow between developers and support and clients. I envisage users being able to click on different parts of the diagram and it to open up child pages with more details about that particular process (with videos on 'how-to' do that specific task, like log issues in Jira) However, from what I can see, there is no way from the Confluence editor to add multiple links to the one image, right? I looked at Anchors, but this does not look like it will do the job. So, what is the best option? I remember Dreamweaver having these sorts of tools built in, and there appears to be other utilities that can help put in image map HTML tags, but I cannot see a way of easily editing the HTML in Confluence editor. Also worried about the headache this could cause with managing future changes of the page.

Read the article
How do I boot into console mode (redux)

- by Leo Simon

I'm running Ubuntu 12.04. This question was asked some time ago How do I disable the boot splash screen? but the answers didn't work for me. The standard way to boot into console mode used to be to edit /etc/default/grub and set GRUB_CMDLINE_LINUX_DEFAULT="text" This worked fine until I ran the fix proposed in https://help.ubuntu.com/community/SoundTroubleshootingProcedure in order to get sound to work. Since then, I have disabled the boot-splash-screen, but I can avoid what I presume is the lightdm login prompt screen. All I want to do is disable this gui and be prompted with a console login prompt. (Shouldnt be so hard should it???) I read in three 33416 mentioned above that there was a bug in lightdm (it wasn't recognizing "text" properly as an option for GRUB_CMDLINE_LINUX_DEFAULT.) But this discussion happened more than a year ago, and it's surely been fixed. Yet my lightdm is uptodate (so I'm told when I try to update it with apt-get). As suggested in one of the above, I tried sudo update-rc.d -f lightdm remove which resulted in a hung machine. I managed to recover using recovery mode, but now I still get the gui again. Another suggestion is to edit /etc/init/lightdm.override. I've done this and set it to "manual" as suggested, but lightdm simply ignores this. Could somebody suggest how to proceed please? Thanks very much, Leo

Read the article
Why enumerator structs are a really bad idea

- by Simon Cooper

If you've ever poked around the .NET class libraries in Reflector, I'm sure you would have noticed that the generic collection classes all have implementations of their IEnumerator as a struct rather than a class. As you will see, this design decision has some rather unfortunate side effects... As is generally known in the .NET world, mutable structs are a Very Bad Idea; and there are several other blogs around explaining this (Eric Lippert's blog post explains the problem quite well). In the BCL, the generic collection enumerators are all mutable structs, as they need to keep track of where they are in the collection. This bit me quite hard when I was coding a wrapper around a LinkedList<int>.Enumerator. It boils down to this code: sealed class EnumeratorWrapper : IEnumerator<int> { private readonly LinkedList<int>.Enumerator m_Enumerator; public EnumeratorWrapper(LinkedList<int> linkedList) { m_Enumerator = linkedList.GetEnumerator(); } public int Current { get { return m_Enumerator.Current; } } object System.Collections.IEnumerator.Current { get { return Current; } } public bool MoveNext() { return m_Enumerator.MoveNext(); } public void Reset() { ((System.Collections.IEnumerator)m_Enumerator).Reset(); } public void Dispose() { m_Enumerator.Dispose(); } } The key line here is the MoveNext method. When I initially coded this, I thought that the call to m_Enumerator.MoveNext() would alter the enumerator state in the m_Enumerator class variable and so the enumeration would proceed in an orderly fashion through the collection. However, when I ran this code it went into an infinite loop - the m_Enumerator.MoveNext() call wasn't actually changing the state in the m_Enumerator variable at all, and my code was looping forever on the first collection element. It was only after disassembling that method that I found out what was going on The MoveNext method above results in the following IL: .method public hidebysig newslot virtual final instance bool MoveNext() cil managed { .maxstack 1 .locals init ( [0] bool CS$1$0000, [1] valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator CS$0$0001) L_0000: nop L_0001: ldarg.0 L_0002: ldfld valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator EnumeratorWrapper::m_Enumerator L_0007: stloc.1 L_0008: ldloca.s CS$0$0001 L_000a: call instance bool [System]System.Collections.Generic.LinkedList`1/Enumerator::MoveNext() L_000f: stloc.0 L_0010: br.s L_0012 L_0012: ldloc.0 L_0013: ret } Here, the important line is 0002 - m_Enumerator is accessed using the ldfld operator, which does the following: Finds the value of a field in the object whose reference is currently on the evaluation stack. So, what the MoveNext method is doing is the following: public bool MoveNext() { LinkedList<int>.Enumerator CS$0$0001 = this.m_Enumerator; bool CS$1$0000 = CS$0$0001.MoveNext(); return CS$1$0000; } The enumerator instance being modified by the call to MoveNext is the one stored in the CS$0$0001 variable on the stack, and not the one in the EnumeratorWrapper class instance. Hence why the state of m_Enumerator wasn't getting updated. Hmm, ok. Well, why is it doing this? If you have a read of Eric Lippert's blog post about this issue, you'll notice he quotes a few sections of the C# spec. In particular, 7.5.4: ...if the field is readonly and the reference occurs outside an instance constructor of the class in which the field is declared, then the result is a value, namely the value of the field I in the object referenced by E. And my m_Enumerator field is readonly! Indeed, if I remove the readonly from the class variable then the problem goes away, and the code works as expected. The IL confirms this: .method public hidebysig newslot virtual final instance bool MoveNext() cil managed { .maxstack 1 .locals init ( [0] bool CS$1$0000) L_0000: nop L_0001: ldarg.0 L_0002: ldflda valuetype [System]System.Collections.Generic.LinkedList`1/Enumerator EnumeratorWrapper::m_Enumerator L_0007: call instance bool [System]System.Collections.Generic.LinkedList`1/Enumerator::MoveNext() L_000c: stloc.0 L_000d: br.s L_000f L_000f: ldloc.0 L_0010: ret } Notice on line 0002, instead of the ldfld we had before, we've got a ldflda, which does this: Finds the address of a field in the object whose reference is currently on the evaluation stack. Instead of loading the value, we're loading the address of the m_Enumerator field. So now the call to MoveNext modifies the enumerator stored in the class rather than on the stack, and everything works as expected. Previously, I had thought enumerator structs were an odd but interesting feature of the BCL that I had used in the past to do linked list slices. However, effects like this only underline how dangerous mutable structs are, and I'm at a loss to explain why the enumerators were implemented as structs in the first place. (interestingly, the SortedList<TKey, TValue> enumerator is a struct but is private, which makes it even more odd - the only way it can be accessed is as a boxed IEnumerator!). I would love to hear people's theories as to why the enumerators are implemented in such a fashion. And bonus points if you can explain why LinkedList<int>.Enumerator.Reset is an explicit implementation but Dispose is implicit... Note to self: never ever ever code a mutable struct.

Read the article

Search Results

Search found 847 results on 34 pages for 'simon carpentier'.

Page 5/34 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Simón

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Abhishek Simon

- by Simon

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Reeves

- by Simon Campbell

- by Simon

- by Simon Cooper

- by Simon Cooper

- by Simon

- by Simon

- by Leo Simon

- by Simon Cooper

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >