Search Results

Search found 1375 results on 55 pages for 'asymptotic complexity'.

Page 55/55 | < Previous Page | 51 52 53 54 55

Using WeakReference to resolve issue with .NET unregistered event handlers causing memory leaks.

- by Eric

The problem: Registered event handlers create a reference from the event to the event handler's instance. If that instance fails to unregister the event handler (via Dispose, presumably), then the instance memory will not be freed by the garbage collector. Example: class Foo { public event Action AnEvent; public void DoEvent() { if (AnEvent != null) AnEvent(); } } class Bar { public Bar(Foo l) { l.AnEvent += l_AnEvent; } void l_AnEvent() { } } If I instantiate a Foo, and pass this to a new Bar constructor, then let go of the Bar object, it will not be freed by the garbage collector because of the AnEvent registration. I consider this a memory leak, and seems just like my old C++ days. I can, of course, make Bar IDisposable, unregister the event in the Dispose() method, and make sure to call Dispose() on instances of it, but why should I have to do this? I first question why events are implemented with strong references? Why not use weak references? An event is used to abstractly notify an object of changes in another object. It seems to me that if the event handler's instance is no longer in use (i.e., there are no non-event references to the object), then any events that it is registered with should automatically be unregistered. What am I missing? I have looked at WeakEventManager. Wow, what a pain. Not only is it very difficult to use, but its documentation is inadequate (see http://msdn.microsoft.com/en-us/library/system.windows.weakeventmanager.aspx -- noticing the "Notes to Inheritors" section that has 6 vaguely described bullets). I have seen other discussions in various places, but nothing I felt I could use. I propose a simpler solution based on WeakReference, as described here. My question is: Does this not meet the requirements with significantly less complexity? To use the solution, the above code is modified as follows: class Foo { public WeakReferenceEvent AnEvent = new WeakReferenceEvent(); internal void DoEvent() { AnEvent.Invoke(); } } class Bar { public Bar(Foo l) { l.AnEvent += l_AnEvent; } void l_AnEvent() { } } Notice two things: 1. The Foo class is modified in two ways: The event is replaced with an instance of WeakReferenceEvent, shown below; and the invocation of the event is changed. 2. The Bar class is UNCHANGED. No need to subclass WeakEventManager, implement IWeakEventListener, etc. OK, so on to the implementation of WeakReferenceEvent. This is shown here. Note that it uses the generic WeakReference that I borrowed from here: http://damieng.com/blog/2006/08/01/implementingweakreferencet I had to add Equals() and GetHashCode() to his class, which I include below for reference. class WeakReferenceEvent { public static WeakReferenceEvent operator +(WeakReferenceEvent wre, Action handler) { wre._delegates.Add(new WeakReference<Action>(handler)); return wre; } public static WeakReferenceEvent operator -(WeakReferenceEvent wre, Action handler) { foreach (var del in wre._delegates) if (del.Target == handler) { wre._delegates.Remove(del); return wre; } return wre; } HashSet<WeakReference<Action>> _delegates = new HashSet<WeakReference<Action>>(); internal void Invoke() { HashSet<WeakReference<Action>> toRemove = null; foreach (var del in _delegates) { if (del.IsAlive) del.Target(); else { if (toRemove == null) toRemove = new HashSet<WeakReference<Action>>(); toRemove.Add(del); } } if (toRemove != null) foreach (var del in toRemove) _delegates.Remove(del); } } public class WeakReference<T> : IDisposable { private GCHandle handle; private bool trackResurrection; public WeakReference(T target) : this(target, false) { } public WeakReference(T target, bool trackResurrection) { this.trackResurrection = trackResurrection; this.Target = target; } ~WeakReference() { Dispose(); } public void Dispose() { handle.Free(); GC.SuppressFinalize(this); } public virtual bool IsAlive { get { return (handle.Target != null); } } public virtual bool TrackResurrection { get { return this.trackResurrection; } } public virtual T Target { get { object o = handle.Target; if ((o == null) || (!(o is T))) return default(T); else return (T)o; } set { handle = GCHandle.Alloc(value, this.trackResurrection ? GCHandleType.WeakTrackResurrection : GCHandleType.Weak); } } public override bool Equals(object obj) { var other = obj as WeakReference<T>; return other != null && Target.Equals(other.Target); } public override int GetHashCode() { return Target.GetHashCode(); } } It's functionality is trivial. I override operator + and - to get the += and -= syntactic sugar matching events. These create WeakReferences to the Action delegate. This allows the garbage collector to free the event target object (Bar in this example) when nobody else is holding on to it. In the Invoke() method, simply run through the weak references and call their Target Action. If any dead (i.e., garbage collected) references are found, remove them from the list. Of course, this only works with delegates of type Action. I tried making this generic, but ran into the missing where T : delegate in C#! As an alternative, simply modify class WeakReferenceEvent to be a WeakReferenceEvent, and replace the Action with Action. Fix the compiler errors and you have a class that can be used like so: class Foo { public WeakReferenceEvent<int> AnEvent = new WeakReferenceEvent<int>(); internal void DoEvent() { AnEvent.Invoke(5); } } Hopefully this will help someone else when they run into the mystery .NET event memory leak!

Read the article
Approaches for generic, compile-time safe lazy-load methods

- by Aaronaught

Suppose I have created a wrapper class like the following: public class Foo : IFoo { private readonly IFoo innerFoo; public Foo(IFoo innerFoo) { this.innerFoo = innerFoo; } public int? Bar { get; set; } public int? Baz { get; set; } } The idea here is that the innerFoo might wrap data-access methods or something similarly expensive, and I only want its GetBar and GetBaz methods to be invoked once. So I want to create another wrapper around it, which will save the values obtained on the first run. It's simple enough to do this, of course: int IFoo.GetBar() { if ((Bar == null) && (innerFoo != null)) Bar = innerFoo.GetBar(); return Bar ?? 0; } int IFoo.GetBaz() { if ((Baz == null) && (innerFoo != null)) Baz = innerFoo.GetBaz(); return Baz ?? 0; } But it gets pretty repetitive if I'm doing this with 10 different properties and 30 different wrappers. So I figured, hey, let's make this generic: T LazyLoad<T>(ref T prop, Func<IFoo, T> loader) { if ((prop == null) && (innerFoo != null)) prop = loader(innerFoo); return prop; } Which almost gets me where I want, but not quite, because you can't ref an auto-property (or any property at all). In other words, I can't write this: int IFoo.GetBar() { return LazyLoad(ref Bar, f => f.GetBar()); // <--- Won't compile } Instead, I'd have to change Bar to have an explicit backing field and write explicit getters and setters. Which is fine, except for the fact that I end up writing even more redundant code than I was writing in the first place. Then I considered the possibility of using expression trees: T LazyLoad<T>(Expression<Func<T>> propExpr, Func<IFoo, T> loader) { var memberExpression = propExpr.Body as MemberExpression; if (memberExpression != null) { // Use Reflection to inspect/set the property } } This plays nice with refactoring - it'll work great if I do this: return LazyLoad(f => f.Bar, f => f.GetBar()); But it's not actually safe, because someone less clever (i.e. myself in 3 days from now when I inevitably forget how this is implemented internally) could decide to write this instead: return LazyLoad(f => 3, f => f.GetBar()); Which is either going to crash or result in unexpected/undefined behaviour, depending on how defensively I write the LazyLoad method. So I don't really like this approach either, because it leads to the possibility of runtime errors which would have been prevented in the first attempt. It also relies on Reflection, which feels a little dirty here, even though this code is admittedly not performance-sensitive. Now I could also decide to go all-out and use DynamicProxy to do method interception and not have to write any code, and in fact I already do this in some applications. But this code is residing in a core library which many other assemblies depend on, and it seems horribly wrong to be introducing this kind of complexity at such a low level. Separating the interceptor-based implementation from the IFoo interface by putting it into its own assembly doesn't really help; the fact is that this very class is still going to be used all over the place, must be used, so this isn't one of those problems that could be trivially solved with a little DI magic. The last option I've already thought of would be to have a method like: T LazyLoad<T>(Func<T> getter, Action<T> setter, Func<IFoo, T> loader) { ... } This option is very "meh" as well - it avoids Reflection but is still error-prone, and it doesn't really reduce the repetition that much. It's almost as bad as having to write explicit getters and setters for each property. Maybe I'm just being incredibly nit-picky, but this application is still in its early stages, and it's going to grow substantially over time, and I really want to keep the code squeaky-clean. Bottom line: I'm at an impasse, looking for other ideas. Question: Is there any way to clean up the lazy-loading code at the top, such that the implementation will: Guarantee compile-time safety, like the ref version; Actually reduce the amount of code repetition, like the Expression version; and Not take on any significant additional dependencies? In other words, is there a way to do this just using regular C# language features and possibly a few small helper classes? Or am I just going to have to accept that there's a trade-off here and strike one of the above requirements from the list?

Read the article
Does there exist an "idea checkout system" on the Internet?

- by TimeSpace Traveller

Greetings. I would like to ask the following question: is there anything on the Internet like an "idea checkout" system? Situation: I'm a software developer. Since my current job has started 2 years ago, my mentor at that time has pointed me to the open source world. I have only put little time to look at some of the open source projects, let alone any contribution. However, it is my wish to start developing something outside of the work. Well, except a little problem. I don't know what to develop! It is not about the technical knowledge; the problem is that, I am not a creative person. I am very good at analytical thinking, as well as debugging skills. When being told by my work partners to develop a solution, I could get it done without a problem. However, outside of work, I have no idea what to develop. When I look at the Internet, it seems that so many people have already been developing on so many interesting stuff, making me wonder what I could develop, so that I would not reinvent something already existed. That starts to make me wonder. On the Internet, is there anything like an "idea checkout" system or society? For example, some people would throw in as a software idea, and the system would keep it as an "inventory"; later, a potential software developer would "check out" the idea, just a how people would check out a book from the library. Then, the developer would check the "idea" back in, with a certain kind of work-in-progress or developed software, thus becoming an open-source project. I have just noticed that here at stackoverflow, there is a "Project-Ideas" tag, so perhaps that can provide me some ideas on what to develop; still, my wonder is about a system that people would provide ideas, and people would check out ideas to develop / implement into actual solution. Is there such a system or society existing anywhere on the Internet? Any input is welcome! Thank you very much. Update: Thank you for everyone who has answered my question. Certainly, "getting idea" is part of my problem; as a software developer, however, I'm concerned more than just "getting idea". What I am concerned more, as I have commented, is about the existence of such an idea exchanging ecosystem, capable to initiate open-source projects. I'll put an example here. Say, person A has an idea of music search program, but not search by the attributes of the music (composer, singer, publisher, lyrics, etc.); instead, he wants a program (and a database) to search a piece of music by melody. Very often, people only remember a piece of music by its melody, not even the name of the music (e.g. the music he wants was only once heard in a bookstore, but the melody just gets stuck in his head!). In order to find that piece, normally he would just need to blindly search for it, and spent a long time to do so. A search by melody would enable person A to find the piece much quicker. However, he would not want to personally work on it, not just because of the complexity (he is not a musician and/or programmer, knowing almost nothing about music systems in computer, search algorithms, etc.), but also legal issues (RIAA??), thus he would just like to keep the idea at some place, and let other people to work on that. Now, a developer (person B) may be at the same stage as I am right now, wishing to find something to develop, but not having an idea. With the idea exchanging ecosystem, person B will search, and somehow discover person A's music search idea, and feeling interested enough to work on it. So he "checks out" the idea, start working on it (at least a skeleton), and checks back in with the progress. An open-source project starts from here, fulfilling person A's wish, and person B's programming desire. The above is just an example, because there are already such systems exist on the Internet, but it illustrates what I think about the idea exchange system in my mind. My main concern is about idea exchanging ecosystem, not at personal and unorganized level, but at a semi-organized protocol that's specifically for software developers, having actual projects coming out as the fruits. Not about "projects", but about "ideas and product of ideas". Hopefully that would clear up some of the original idea of this question. Any input is welcome; in fact, I would like to hear as many people as possible how everyone thinks about this. Thank you very much!

Read the article
Help needed with Javascript Variable Scope / OOP and Call Back Functions

- by gargantaun

I think this issue goes beyond typical variable scope and closure stuff, or maybe I'm an idiot. Here goes anyway... I'm creating a bunch of objects on the fly in a jQuery plugin. The object look something like this function WedgePath(canvas){ this.targetCanvas = canvas; this.label; this.logLabel = function(){ console.log(this.label) } } the jQuery plugin looks something like this (function($) { $.fn.myPlugin = function() { return $(this).each(function() { // Create Wedge Objects for(var i = 1; i <= 30; i++){ var newWedge = new WedgePath(canvas); newWedge.label = "my_wedge_"+i; globalFunction(i, newWedge]); } }); } })(jQuery); So... the plugin creates a bunch of wedgeObjects, then calls 'globalFunction' for each one, passing in the latest WedgePath instance. Global function looks like this. function globalFunction(indicator_id, pWedge){ var targetWedge = pWedge; targetWedge.logLabel(); } What happens next is that the console logs each wedges label correctly. However, I need a bit more complexity inside globalFunction. So it actually looks like this... function globalFunction(indicator_id, pWedge){ var targetWedge = pWedge; someSql = "SELECT * FROM myTable WHERE id = ?"; dbInterface.executeSql(someSql, [indicator_id], function(transaction, result){ targetWedge.logLabel(); }) } There's a lot going on here so i'll explain. I'm using client side database storage (WebSQL i call it). 'dbInterface' an instance of a simple javascript object I created which handles the basics of interacting with a client side database [shown at the end of this question]. the executeSql method takes up to 4 arguments The SQL String an optional arguments array an optional onSuccess handler an optional onError handler (not used in this example) What I need to happen is: When the WebSQL query has completed, it takes some of that data and manipulates some attribute of a particular wedge. But, when I call 'logLabel' on an instance of WedgePath inside the onSuccess handler, I get the label of the very last instance of WedgePath that was created way back in the plugin code. Now I suspect that the problem lies in the var newWedge = new WedgePath(canvas); line. So I tried pushing each newWedge into an array, which I thought would prevent that line from replacing or overwriting the WedgePath instance at every iteration... wedgeArray = []; // Inside the plugin... for(var i = 1; i <= 30; i++){ var newWedge = new WedgePath(canvas); newWedge.label = "my_wedge_"+i; wedgeArray.push(newWedge); } for(var i = 0; i < wedgeArray.length; i++){ wedgeArray[i].logLabel() } But again, I get the last instance of WedgePath to be created. This is driving me nuts. I apologise for the length of the question but I wanted to be as clear as possible. END ============================================================== Also, here's the code for dbInterface object should it be relevant. function DatabaseInterface(db){ var DB = db; this.sql = function(sql, arr, pSuccessHandler, pErrorHandler){ successHandler = (pSuccessHandler) ? pSuccessHandler : this.defaultSuccessHandler; errorHandler = (pErrorHandler) ? pErrorHandler : this.defaultErrorHandler; DB.transaction(function(tx){ if(!arr || arr.length == 0){ tx.executeSql(sql, [], successHandler, errorHandler); }else{ tx.executeSql(sql,arr, successHandler, errorHandler) } }); } // ---------------------------------------------------------------- // A Default Error Handler // ---------------------------------------------------------------- this.defaultErrorHandler = function(transaction, error){ // error.message is a human-readable string. // error.code is a numeric error code console.log('WebSQL Error: '+error.message+' (Code '+error.code+')'); // Handle errors here var we_think_this_error_is_fatal = true; if (we_think_this_error_is_fatal) return true; return false; } // ---------------------------------------------------------------- // A Default Success Handler // This doesn't do anything except log a success message // ---------------------------------------------------------------- this.defaultSuccessHandler = function(transaction, results) { console.log("WebSQL Success. Default success handler. No action taken."); } }

Read the article
C# Reading and Writing a Char[] to and from a Byte[] - Updated with Solution

- by Simon G

Hi, I have a byte array of around 10,000 bytes which is basically a blob from delphi that contains char, string, double and arrays of various types. This need to be read in and updated via C#. I've created a very basic reader that gets the byte array from the db and converts the bytes to the relevant object type when accessing the property which works fine. My problem is when I try to write to a specific char[] item, it doesn't seem to update the byte array. I've created the following extensions for reading and writing: public static class CharExtension { public static byte ToByte( this char c ) { return Convert.ToByte( c ); } public static byte ToByte( this char c, int position, byte[] blob ) { byte b = c.ToByte(); blob[position] = b; return b; } } public static class CharArrayExtension { public static byte[] ToByteArray( this char[] c ) { byte[] b = new byte[c.Length]; for ( int i = 1; i < c.Length; i++ ) { b[i] = c[i].ToByte(); } return b; } public static byte[] ToByteArray( this char[] c, int positon, int length, byte[] blob ) { byte[] b = c.ToByteArray(); Array.Copy( b, 0, blob, positon, length ); return b; } } public static class ByteExtension { public static char ToChar( this byte[] b, int position ) { return Convert.ToChar( b[position] ); } } public static class ByteArrayExtension { public static char[] ToCharArray( this byte[] b, int position, int length ) { char[] c = new char[length]; for ( int i = 0; i < length; i++ ) { c[i] = b.ToChar( position ); position += 1; } return c; } } to read and write chars and char arrays my code looks like: Byte[] _Blob; // set from a db field public char ubin { get { return _tariffBlob.ToChar( 14 ); } set { value.ToByte( 14, _Blob ); } } public char[] usercaplas { get { return _tariffBlob.ToCharArray( 2035, 10 ); } set { value.ToByteArray( 2035, 10, _Blob ); } } So to write to the objects I can do: ubin = 'C'; // this will update the byte[] usercaplas = new char[10] { 'A', 'B', etc. }; // this will update the byte[] usercaplas[3] = 'C'; // this does not update the byte[] I know the reason is that the setter property is not being called but I want to know is there a way around this using code similar to what I already have? I know a possible solution is to use a private variable called _usercaplas that I set and update as needed however as the byte array is nearly 10,000 bytes in length the class is already long and I would like a simpler approach as to reduce the overall code length and complexity. Thank Solution Here's my solution should anyone want it. If you have a better way of doing then let me know please. First I created a new class for the array: public class CharArrayList : ArrayList { char[] arr; private byte[] blob; private int length = 0; private int position = 0; public CharArrayList( byte[] blob, int position, int length ) { this.blob = blob; this.length = length; this.position = position; PopulateInternalArray(); SetArray(); } private void PopulateInternalArray() { arr = blob.ToCharArray( position, length ); } private void SetArray() { foreach ( char c in arr ) { this.Add( c ); } } private void UpdateInternalArray() { this.Clear(); SetArray(); } public char this[int i] { get { return arr[i]; } set { arr[i] = value; UpdateInternalArray(); } } } Then I created a couple of extension methods to help with converting to a byte[] public static byte[] ToByteArray( this CharArrayList c ) { byte[] b = new byte[c.Count]; for ( int i = 0; i < c.Count; i++ ) { b[i] = Convert.ToChar( c[i] ).ToByte(); } return b; } public static byte[] ToByteArray( this CharArrayList c, byte[] blob, int position, int length ) { byte[] b = c.ToByteArray(); Array.Copy( b, 0, blob, position, length ); return b; } So to read and write to the object: private CharArrayList _usercaplass; public CharArrayList usercaplas { get { if ( _usercaplass == null ) _usercaplass = new CharArrayList( _tariffBlob, 2035, 100 ); return _usercaplass; } set { _usercaplass = value; _usercaplass.ToByteArray( _tariffBlob, 2035, 100 ); } } As mentioned before its not an ideal solutions as I have to have private variables and extra code in the setter but I couldnt see a way around it.

Read the article
Design advice for avoiding change in several classes

- by Anders Svensson

Hi, I'm trying to figure out how to design a small application more elegantly, and make it more resistant to change. Basically it is a sort of project price calculator, and the problem is that there are many parameters that can affect the pricing. I'm trying to avoid cluttering the code with a lot of if-clauses for each parameter, but still I have e.g. if-clauses in two places checking for the value of the size parameter. I have the Head First Design Patterns book, and have tried to find ideas there, but the closest I got was the decorator pattern, which has an example where starbuzz coffee sets prices depending first on condiments added, and then later in an exercise by adding a size parameter (Tall, Grande, Venti). But that didn't seem to help, because adding that parameter still seemed to add if-clause complexity in a lot of places (and this being an exercise they didn't explain that further). What I am trying to avoid is having to change several classes if a parameter were to change or a new parameter added, or at least change in as few places as possible (there's some fancy design principle word for this that I don't rememeber :-)). Here below is the code. Basically it calculates the price for a project that has the tasks "Writing" and "Analysis" with a size parameter and different pricing models. There will be other parameters coming in later too, like "How new is the product?" (New, 1-5 years old, 6-10 years old), etc. Any advice on the best design would be greatly appreciated, whether a "design pattern" or just good object oriented principles that would make it resistant to change (e.g. adding another size, or changing one of the size values, and only have to change in one place rather than in several if-clauses): public class Project { private readonly int _numberOfProducts; protected Size _size; public Task Analysis { get; set; } public Task Writing { get; set; } public Project(int numberOfProducts) { _numberOfProducts = numberOfProducts; _size = GetSize(); Analysis = new AnalysisTask(numberOfProducts, _size); Writing = new WritingTask(numberOfProducts, _size); } private Size GetSize() { if (_numberOfProducts <= 2) return Size.small; if (_numberOfProducts <= 8) return Size.medium; return Size.large; } public double GetPrice() { return Analysis.GetPrice() + Writing.GetPrice(); } } public abstract class Task { protected readonly int _numberOfProducts; protected Size _size; protected double _pricePerHour; protected Dictionary<Size, int> _hours; public abstract int TotalHours { get; } public double Price { get; set; } protected Task(int numberOfProducts, Size size) { _numberOfProducts = numberOfProducts; _size = size; } public double GetPrice() { return _pricePerHour * TotalHours; } } public class AnalysisTask : Task { public AnalysisTask(int numberOfProducts, Size size) : base(numberOfProducts, size) { _pricePerHour = 850; _hours = new Dictionary<Size, int>() { { Size.small, 56 }, { Size.medium, 104 }, { Size.large, 200 } }; } public override int TotalHours { get { return _hours[_size]; } } } public class WritingTask : Task { public WritingTask(int numberOfProducts, Size size) : base(numberOfProducts, size) { _pricePerHour = 650; _hours = new Dictionary<Size, int>() { { Size.small, 125 }, { Size.medium, 100 }, { Size.large, 60 } }; } public override int TotalHours { get { if (_size == Size.small) return _hours[_size] * _numberOfProducts; if (_size == Size.medium) return (_hours[Size.small] * 2) + (_hours[Size.medium] * (_numberOfProducts - 2)); return (_hours[Size.small] * 2) + (_hours[Size.medium] * (8 - 2)) + (_hours[Size.large] * (_numberOfProducts - 8)); } } } public enum Size { small, medium, large } public partial class Form1 : Form { public Form1() { InitializeComponent(); List<int> quantities = new List<int>(); for (int i = 0; i < 100; i++) { quantities.Add(i); } comboBoxNumberOfProducts.DataSource = quantities; } private void comboBoxNumberOfProducts_SelectedIndexChanged(object sender, EventArgs e) { Project project = new Project((int)comboBoxNumberOfProducts.SelectedItem); labelPrice.Text = project.GetPrice().ToString(); labelWriterHours.Text = project.Writing.TotalHours.ToString(); labelAnalysisHours.Text = project.Analysis.TotalHours.ToString(); } } At the end is a simple current calling code in the change event for a combobox that set size... (BTW, I don't like the fact that I have to use several dots to get to the TotalHours at the end here either, as far as I can recall, that violates the "principle of least knowledge" or "the law of demeter", so input on that would be appreciated too, but it's not the main point of the question) Regards, Anders

Read the article
More elegant way to make a C++ member function change different member variables based on template p

- by Eric Moyer

Today, I wrote some code that needed to add elements to different container variables depending on the type of a template parameter. I solved it by writing a friend helper class specialized on its own template parameter which had a member variable of the original class. It saved me a few hundred lines of repeating myself without adding much complexity. However, it seemed kludgey. I would like to know if there is a better, more elegant way. The code below is a greatly simplified example illustrating the problem and my solution. It compiles in g++. #include <vector> #include <algorithm> #include <iostream> namespace myNS{ template<class Elt> struct Container{ std::vector<Elt> contents; template<class Iter> void set(Iter begin, Iter end){ contents.erase(contents.begin(), contents.end()); std::copy(begin, end, back_inserter(contents)); } }; struct User; namespace WkNS{ template<class Elt> struct Worker{ User& u; Worker(User& u):u(u){} template<class Iter> void set(Iter begin, Iter end); }; }; struct F{ int x; explicit F(int x):x(x){} }; struct G{ double x; explicit G(double x):x(x){} }; struct User{ Container<F> a; Container<G> b; template<class Elt> void doIt(Elt x, Elt y){ std::vector<Elt> v; v.push_back(x); v.push_back(y); Worker<Elt>(*this).set(v.begin(), v.end()); } }; namespace WkNS{ template<class Elt> template<class Iter> void Worker<Elt>::set(Iter begin, Iter end){ std::cout << "Set a." << std::endl; u.a.set(begin, end); } template<> template<class Iter> void Worker<G>::set(Iter begin, Iter end){ std::cout << "Set b." << std::endl; u.b.set(begin, end); } }; }; int main(){ using myNS::F; using myNS::G; myNS::User u; u.doIt(F(1),F(2)); u.doIt(G(3),G(4)); } User is the class I was writing. Worker is my helper class. I have it in its own namespace because I don't want it causing trouble outside myNS. Container is a container class whose definition I don't want to modify, but is used by User in its instance variables. doIt<F> should modify a. doIt<G> should modify b. F and G are open to limited modification if that would produce a more elegant solution. (As an example of one such modification, in the real application F's constructor takes a dummy parameter to make it look like G's constructor and save me from repeating myself.) In the real code, Worker is a friend of User and member variables are private. To make the example simpler to write, I made everything public. However, a solution that requires things to be public really doesn't answer my question. Given all these caveats, is there a better way to write User::doIt?

Read the article
How should I implement simple caches with concurrency on Redis?

- by solublefish

Background I have a 2-tier web service - just my app server and an RDBMS. I want to move to a pool of identical app servers behind a load balancer. I currently cache a bunch of objects in-process. I hope to move them to a shared Redis. I have a dozen or so caches of simple, small-sized business objects. For example, I have a set of Foos. Each Foo has a unique FooId and an OwnerId. One "owner" may own multiple Foos. In a traditional RDBMS this is just a table with an index on the PK FooId and one on OwnerId. I'm caching this in one process simply: Dictionary<int,Foo> _cacheFooById; Dictionary<int,HashSet<int>> _indexFooIdsByOwnerId; Reads come straight from here, and writes go here and to the RDBMS. I usually have this invariant: "For a given group [say by OwnerId], the whole group is in cache or none of it is." So when I cache miss on a Foo, I pull that Foo and all the owner's other Foos from the RDBMS. Updates make sure to keep the index up to date and respect the invariant. When an owner calls GetMyFoos I never have to worry that some are cached and some aren't. What I did already The first/simplest answer seems to be to use plain ol' SET and GET with a composite key and json value: SET( "ServiceCache:Foo:" + theFoo.Id, JsonSerialize(theFoo)); I later decided I liked: HSET( "ServiceCache:Foo", theFoo.FooId, JsonSerialize(theFoo)); That lets me get all the values in one cache as HVALS. It also felt right - I'm literally moving hashtables to Redis, so perhaps my top-level items should be hashes. This works to first order. If my high-level code is like: UpdateCache(myFoo); AddToIndex(myFoo); That translates into: HSET ("ServiceCache:Foo", theFoo.FooId, JsonSerialize(theFoo)); var myFoos = JsonDeserialize( HGET ("ServiceCache:FooIndex", theFoo.OwnerId) ); myFoos.Add(theFoo.OwnerId); HSET ("ServiceCache:FooIndex", theFoo.OwnerId, JsonSerialize(myFoos)); However, this is broken in two ways. Two concurrent operations can read/modify/write at the same time. The latter "wins" the final HSET and the former's index update is lost. Another operation could read the index in between the first and second lines. It would miss a Foo that it should find. So how do I index properly? I think I could use a Redis set instead of a json-encoded value for the index. That would solve part of the problem since the "add-to-index-if-not-already-present" would be atomic. I also read about using MULTI as a "transaction" but it doesn't seem like it does what I want. Am I right that I can't really MULTI; HGET; {update}; HSET; EXEC since it doesn't even do the HGET before I issue the EXEC? I also read about using WATCH and MULTI for optimistic concurrency, then retrying on failure. But WATCH only works on top-level keys. So it's back to SET/GET instead of HSET/HGET. And now I need a new index-like-thing to support getting all the values in a given cache. If I understand it right, I can combine all these things to do the job. Something like: while(!succeeded) { WATCH( "ServiceCache:Foo:" + theFoo.FooId ); WATCH( "ServiceCache:FooIndexByOwner:" + theFoo.OwnerId ); WATCH( "ServiceCache:FooIndexAll" ); MULTI(); SET ("ServiceCache:Foo:" + theFoo.FooId, JsonSerialize(theFoo)); SADD ("ServiceCache:FooIndexByOwner:" + theFoo.OwnerId, theFoo.FooId); SADD ("ServiceCache:FooIndexAll", theFoo.FooId); EXEC(); //TODO somehow set succeeded properly } Finally I'd have to translate this pseudocode into real code depending how my client library uses WATCH/MULTI/EXEC; it looks like they need some sort of context to hook them together. All in all this seems like a lot of complexity for what has to be a very common case; I can't help but think there's a better, smarter, Redis-ish way to do things that I'm just not seeing. How do I lock properly? Even if I had no indexes, there's still a (probably rare) race condition. A: HGET - cache miss B: HGET - cache miss A: SELECT B: SELECT A: HSET C: HGET - cache hit C: UPDATE C: HSET B: HSET ** this is stale data that's clobbering C's update. Note that C could just be a really-fast A. Again I think WATCH, MULTI, retry would work, but... ick. I know in some places people use special Redis keys as locks for other objects. Is that a reasonable approach here? Should those be top-level keys like ServiceCache:FooLocks:{Id} or ServiceCache:Locks:Foo:{Id}? Or make a separate hash for them - ServiceCache:Locks with subkeys Foo:{Id}, or ServiceCache:Locks:Foo with subkeys {Id} ? How would I work around abandoned locks, say if a transaction (or a whole server) crashes while "holding" the lock?

Read the article
Microsoft and jQuery

- by Rick Strahl

The jQuery JavaScript library has been steadily getting more popular and with recent developments from Microsoft, jQuery is also getting ever more exposure on the ASP.NET platform including now directly from Microsoft. jQuery is a light weight, open source DOM manipulation library for JavaScript that has changed how many developers think about JavaScript. You can download it and find more information on jQuery on www.jquery.com. For me jQuery has had a huge impact on how I develop Web applications and was probably the main reason I went from dreading to do JavaScript development to actually looking forward to implementing client side JavaScript functionality. It has also had a profound impact on my JavaScript skill level for me by seeing how the library accomplishes things (and often reviewing the terse but excellent source code). jQuery made an uncomfortable development platform (JavaScript + DOM) a joy to work on. Although jQuery is by no means the only JavaScript library out there, its ease of use, small size, huge community of plug-ins and pure usefulness has made it easily the most popular JavaScript library available today. As a long time jQuery user, I’ve been excited to see the developments from Microsoft that are bringing jQuery to more ASP.NET developers and providing more integration with jQuery for ASP.NET’s core features rather than relying on the ASP.NET AJAX library. Microsoft and jQuery – making Friends jQuery is an open source project but in the last couple of years Microsoft has really thrown its weight behind supporting this open source library as a supported component on the Microsoft platform. When I say supported I literally mean supported: Microsoft now offers actual tech support for jQuery as part of their Product Support Services (PSS) as jQuery integration has become part of several of the ASP.NET toolkits and ships in several of the default Web project templates in Visual Studio 2010. The ASP.NET MVC 3 framework (still in Beta) also uses jQuery for a variety of client side support features including client side validation and we can look forward toward more integration of client side functionality via jQuery in both MVC and WebForms in the future. In other words jQuery is becoming an optional but included component of the ASP.NET platform. PSS support means that support staff will answer jQuery related support questions as part of any support incidents related to ASP.NET which provides some piece of mind to some corporate development shops that require end to end support from Microsoft. In addition to including jQuery and supporting it, Microsoft has also been getting involved in providing development resources for extending jQuery’s functionality via plug-ins. Microsoft’s last version of the Microsoft Ajax Library – which is the successor to the native ASP.NET AJAX Library – included some really cool functionality for client templates, databinding and localization. As it turns out Microsoft has rebuilt most of that functionality using jQuery as the base API and provided jQuery plug-ins of these components. Very recently these three plug-ins were submitted and have been approved for inclusion in the official jQuery plug-in repository and been taken over by the jQuery team for further improvements and maintenance. Even more surprising: The jQuery-templates component has actually been approved for inclusion in the next major update of the jQuery core in jQuery V1.5, which means it will become a native feature that doesn’t require additional script files to be loaded. Imagine this – an open source contribution from Microsoft that has been accepted into a major open source project for a core feature improvement. Microsoft has come a long way indeed! What the Microsoft Involvement with jQuery means to you For Microsoft jQuery support is a strategic decision that affects their direction in client side development, but nothing stopped you from using jQuery in your applications prior to Microsoft’s official backing and in fact a large chunk of developers did so readily prior to Microsoft’s announcement. Official support from Microsoft brings a few benefits to developers however. jQuery support in Visual Studio 2010 means built-in support for jQuery IntelliSense, automatically added jQuery scripts in many projects types and a common base for client side functionality that actually uses what most developers are already using. If you have already been using jQuery and were worried about straying from the Microsoft line and their internal Microsoft Ajax Library – worry no more. With official support and the change in direction towards jQuery Microsoft is now following along what most in the ASP.NET community had already been doing by using jQuery, which is likely the reason for Microsoft’s shift in direction in the first place. ASP.NET AJAX and the Microsoft AJAX Library weren’t bad technology – there was tons of useful functionality buried in these libraries. However, these libraries never got off the ground, mainly because early incarnations were squarely aimed at control/component developers rather than application developers. For all the functionality that these controls provided for control developers they lacked in useful and easily usable application developer functionality that was easily accessible in day to day client side development. The result was that even though Microsoft shipped support for these tools in the box (in .NET 3.5 and 4.0), other than for the internal support in ASP.NET for things like the UpdatePanel and the ASP.NET AJAX Control Toolkit as well as some third party vendors, the Microsoft client libraries were largely ignored by the developer community opening the door for other client side solutions. Microsoft seems to be acknowledging developer choice in this case: Many more developers were going down the jQuery path rather than using the Microsoft built libraries and there seems to be little sense in continuing development of a technology that largely goes unused by the majority of developers. Kudos for Microsoft for recognizing this and gracefully changing directions. Note that even though there will be no further development in the Microsoft client libraries they will continue to be supported so if you’re using them in your applications there’s no reason to start running for the exit in a panic and start re-writing everything with jQuery. Although that might be a reasonable choice in some cases, jQuery and the Microsoft libraries work well side by side so that you can leave existing solutions untouched even as you enhance them with jQuery. The Microsoft jQuery Plug-ins – Solid Core Features One of the most interesting developments in Microsoft’s embracing of jQuery is that Microsoft has started contributing to jQuery via standard mechanism set for jQuery developers: By submitting plug-ins. Microsoft took some of the nicest new features of the unpublished Microsoft Ajax Client Library and re-wrote these components for jQuery and then submitted them as plug-ins to the jQuery plug-in repository. Accepted plug-ins get taken over by the jQuery team and that’s exactly what happened with the three plug-ins submitted by Microsoft with the templating plug-in even getting slated to be published as part of the jQuery core in the next major release (1.5). The following plug-ins are provided by Microsoft: jQuery Templates – a client side template rendering engine jQuery Data Link – a client side databinder that can synchronize changes without code jQuery Globalization – provides formatting and conversion features for dates and numbers The first two are ports of functionality that was slated for the Microsoft Ajax Library while functionality for the globalization library provides functionality that was already found in the original ASP.NET AJAX library. To me all three plug-ins address a pressing need in client side applications and provide functionality I’ve previously used in other incarnations, but with more complete implementations. Let’s take a close look at these plug-ins. jQuery Templates http://api.jquery.com/category/plugins/templates/ Client side templating is a key component for building rich JavaScript applications in the browser. Templating on the client lets you avoid from manually creating markup by creating DOM nodes and injecting them individually into the document via code. Rather you can create markup templates – similar to the way you create classic ASP server markup – and merge data into these templates to render HTML which you can then inject into the document or replace existing content with. Output from templates are rendered as a jQuery matched set and can then be easily inserted into the document as needed. Templating is key to minimize client side code and reduce repeated code for rendering logic. Instead a single template can be used in many places for updating and adding content to existing pages. Further if you build pure AJAX interfaces that rely entirely on client rendering of the initial page content, templates allow you to a use a single markup template to handle all rendering of each specific HTML section/element. I’ve used a number of different client rendering template engines with jQuery in the past including jTemplates (a PHP style templating engine) and a modified version of John Resig’s MicroTemplating engine which I built into my own set of libraries because it’s such a commonly used feature in my client side applications. jQuery templates adds a much richer templating model that allows for sub-templates and access to the data items. Like John Resig’s original Micro Template engine, the core basics of the templating engine create JavaScript code which means that templates can include JavaScript code. To give you a basic idea of how templates work imagine I have an application that downloads a set of stock quotes based on a symbol list then displays them in the document. To do this you can create an ‘item’ template that describes how each of the quotes is renderd as a template inside of the document: <script id="stockTemplate" type="text/x-jquery-tmpl"> <div id="divStockQuote" class="errordisplay" style="width: 500px;"> <div class="label">Company:</div><div><b>${Company}(${Symbol})</b></div> <div class="label">Last Price:</div><div>${LastPrice}</div> <div class="label">Net Change:</div><div> {{if NetChange > 0}} <b style="color:green" >${NetChange}</b> {{else}} <b style="color:red" >${NetChange}</b> {{/if}} </div> <div class="label">Last Update:</div><div>${LastQuoteTimeString}</div> </div> </script> The ‘template’ is little more than HTML with some markup expressions inside of it that define the template language. Notice the embedded ${} expressions which reference data from the quote objects returned from an AJAX call on the server. You can embed any JavaScript or value expression in these template expressions. There are also a number of structural commands like {{if}} and {{each}} that provide for rudimentary logic inside of your templates as well as commands ({{tmpl}} and {{wrap}}) for nesting templates. You can find more about the full set of markup expressions available in the documentation. To load up this data you can use code like the following: <script type="text/javascript"> //var Proxy = new ServiceProxy("../PageMethods/PageMethodsService.asmx/"); $(document).ready(function () { $("#btnGetQuotes").click(GetQuotes); }); function GetQuotes() { var symbols = $("#txtSymbols").val().split(","); $.ajax({ url: "../PageMethods/PageMethodsService.asmx/GetStockQuotes", data: JSON.stringify({ symbols: symbols }), // parameter map type: "POST", // data has to be POSTed contentType: "application/json", timeout: 10000, dataType: "json", success: function (result) { var quotes = result.d; var jEl = $("#stockTemplate").tmpl(quotes); $("#quoteDisplay").empty().append(jEl); }, error: function (xhr, status) { alert(status + "\r\n" + xhr.responseText); } }); }; </script> In this case an ASMX AJAX service is called to retrieve the stock quotes. The service returns an array of quote objects. The result is returned as an object with the .d property (in Microsoft service style) that returns the actual array of quotes. The template is applied with: var jEl = $("#stockTemplate").tmpl(quotes); which selects the template script tag and uses the .tmpl() function to apply the data to it. The result is a jQuery matched set of elements that can then be appended to the quote display element in the page. The template is merged against an array in this example. When the result is an array the template is automatically applied to each each array item. If you pass a single data item – like say a stock quote – the template works exactly the same way but is applied only once. Templates also have access to a $data item which provides the current data item and information about the tempalte that is currently executing. This makes it possible to keep context within the context of the template itself and also to pass context from a parent template to a child template which is very powerful. Templates can be evaluated by using the template selector and calling the .tmpl() function on the jQuery matched set as shown above or you can use the static $.tmpl() function to provide a template as a string. This allows you to dynamically create templates in code or – more likely – to load templates from the server via AJAX calls. In short there are options The above shows off some of the basics, but there’s much for functionality available in the template engine. Check the documentation link for more information and links to additional examples. The plug-in download also comes with a number of examples that demonstrate functionality. jQuery templates will become a native component in jQuery Core 1.5, so it’s definitely worthwhile checking out the engine today and get familiar with this interface. As much as I’m stoked about templating becoming part of the jQuery core because it’s such an integral part of many applications, there are also a couple shortcomings in the current incarnation: Lack of Error Handling Currently if you embed an expression that is invalid it’s simply not rendered. There’s no error rendered into the template nor do the various template functions throw errors which leaves finding of bugs as a runtime exercise. I would like some mechanism – optional if possible – to be able to get error info of what is failing in a template when it’s rendered. No String Output Templates are always rendered into a jQuery matched set and there’s no way that I can see to directly render to a string. String output can be useful for debugging as well as opening up templating for creating non-HTML string output. Limited JavaScript Access Unlike John Resig’s original MicroTemplating Engine which was entirely based on JavaScript code generation these templates are limited to a few structured commands that can ‘execute’. There’s no code execution inside of script code which means you’re limited to calling expressions available in global objects or the data item passed in. This may or may not be a big deal depending on the complexity of your template logic. Error handling has been discussed quite a bit and it’s likely there will be some solution to that particualar issue by the time jQuery templates ship. The others are relatively minor issues but something to think about anyway. jQuery Data Link http://api.jquery.com/category/plugins/data-link/ jQuery Data Link provides the ability to do two-way data binding between input controls and an underlying object’s properties. The typical scenario is linking a textbox to a property of an object and have the object updated when the text in the textbox is changed and have the textbox change when the value in the object or the entire object changes. The plug-in also supports converter functions that can be applied to provide the conversion logic from string to some other value typically necessary for mapping things like textbox string input to say a number property and potentially applying additional formatting and calculations. In theory this sounds great, however in reality this plug-in has some serious usability issues. Using the plug-in you can do things like the following to bind data: person = { firstName: "rick", lastName: "strahl"}; $(document).ready( function() { // provide for two-way linking of inputs $("form").link(person); // bind to non-input elements explicitly $("#objFirst").link(person, { firstName: { name: "objFirst", convertBack: function (value, source, target) { $(target).text(value); } } }); $("#objLast").link(person, { lastName: { name: "objLast", convertBack: function (value, source, target) { $(target).text(value); } } }); }); This code hooks up two-way linking between a couple of textboxes on the page and the person object. The first line in the .ready() handler provides mapping of object to form field with the same field names as properties on the object. Note that .link() does NOT bind items into the textboxes when you call .link() – changes are mapped only when values change and you move out of the field. Strike one. The two following commands allow manual binding of values to specific DOM elements which is effectively a one-way bind. You specify the object and a then an explicit mapping where name is an ID in the document. The converter is required to explicitly assign the value to the element. Strike two. You can also detect changes to the underlying object and cause updates to the input elements bound. Unfortunately the syntax to do this is not very natural as you have to rely on the jQuery data object. To update an object’s properties and get change notification looks like this: function updateFirstName() { $(person).data("firstName", person.firstName + " (code updated)"); } This works fine in causing any linked fields to be updated. In the bindings above both the firstName input field and objFirst DOM element gets updated. But the syntax requires you to use a jQuery .data() call for each property change to ensure that the changes are tracked properly. Really? Sure you’re binding through multiple layers of abstraction now but how is that better than just manually assigning values? The code savings (if any) are going to be minimal. As much as I would like to have a WPF/Silverlight/Observable-like binding mechanism in client script, this plug-in doesn’t help much towards that goal in its current incarnation. While you can bind values, the ‘binder’ is too limited to be really useful. If initial values can’t be assigned from the mappings you’re going to end up duplicating work loading the data using some other mechanism. There’s no easy way to re-bind data with a different object altogether since updates trigger only through the .data members. Finally, any non-input elements have to be bound via code that’s fairly verbose and frankly may be more voluminous than what you might write by hand for manual binding and unbinding. Two way binding can be very useful but it has to be easy and most importantly natural. If it’s more work to hook up a binding than writing a couple of lines to do binding/unbinding this sort of thing helps very little in most scenarios. In talking to some of the developers the feature set for Data Link is not complete and they are still soliciting input for features and functionality. If you have ideas on how you want this feature to be more useful get involved and post your recommendations. As it stands, it looks to me like this component needs a lot of love to become useful. For this component to really provide value, bindings need to be able to be refreshed easily and work at the object level, not just the property level. It seems to me we would be much better served by a model binder object that can perform these binding/unbinding tasks in bulk rather than a tool where each link has to be mapped first. I also find the choice of creating a jQuery plug-in questionable – it seems a standalone object – albeit one that relies on the jQuery library – would provide a more intuitive interface than the current forcing of options onto a plug-in style interface. Out of the three Microsoft created components this is by far the least useful and least polished implementation at this point. jQuery Globalization http://github.com/jquery/jquery-global Globalization in JavaScript applications often gets short shrift and part of the reason for this is that natively in JavaScript there’s little support for formatting and parsing of numbers and dates. There are a number of JavaScript libraries out there that provide some support for globalization, but most are limited to a particular portion of globalization. As .NET developers we’re fairly spoiled by the richness of APIs provided in the framework and when dealing with client development one really notices the lack of these features. While you may not necessarily need to localize your application the globalization plug-in also helps with some basic tasks for non-localized applications: Dealing with formatting and parsing of dates and time values. Dates in particular are problematic in JavaScript as there are no formatters whatsoever except the .toString() method which outputs a verbose and next to useless long string. With the globalization plug-in you get a good chunk of the formatting and parsing functionality that the .NET framework provides on the server. You can write code like the following for example to format numbers and dates: var date = new Date(); var output = $.format(date, "MMM. dd, yy") + "\r\n" + $.format(date, "d") + "\r\n" + // 10/25/2010 $.format(1222.32213, "N2") + "\r\n" + $.format(1222.33, "c") + "\r\n"; alert(output); This becomes even more useful if you combine it with templates which can also include any JavaScript expressions. Assuming the globalization plug-in is loaded you can create template expressions that use the $.format function. Here’s the template I used earlier for the stock quote again with a couple of formats applied: <script id="stockTemplate" type="text/x-jquery-tmpl"> <div id="divStockQuote" class="errordisplay" style="width: 500px;"> <div class="label">Company:</div><div><b>${Company}(${Symbol})</b></div> <div class="label">Last Price:</div> <div>${$.format(LastPrice,"N2")}</div> <div class="label">Net Change:</div><div> {{if NetChange > 0}} <b style="color:green" >${NetChange}</b> {{else}} <b style="color:red" >${NetChange}</b> {{/if}} </div> <div class="label">Last Update:</div> <div>${$.format(LastQuoteTime,"MMM dd, yyyy")}</div> </div> </script> There are also parsing methods that can parse dates and numbers from strings into numbers easily: alert($.parseDate("25.10.2010")); alert($.parseInt("12.222")); // de-DE uses . for thousands separators As you can see culture specific options are taken into account when parsing. The globalization plugin provides rich support for a variety of locales: Get a list of all available cultures Query cultures for culture items (like currency symbol, separators etc.) Localized string names for all calendar related items (days of week, months) Generated off of .NET’s supported locales In short you get much of the same functionality that you already might be using in .NET on the server side. The plugin includes a huge number of locales and an Globalization.all.min.js file that contains the text defaults for each of these locales as well as small locale specific script files that define each of the locale specific settings. It’s highly recommended that you NOT use the huge globalization file that includes all locales, but rather add script references to only those languages you explicitly care about. Overall this plug-in is a welcome helper. Even if you use it with a single locale (like en-US) and do no other localization, you’ll gain solid support for number and date formatting which is a vital feature of many applications. Changes for Microsoft It’s good to see Microsoft coming out of its shell and away from the ‘not-built-here’ mentality that has been so pervasive in the past. It’s especially good to see it applied to jQuery – a technology that has stood in drastic contrast to Microsoft’s own internal efforts in terms of design, usage model and… popularity. It’s great to see that Microsoft is paying attention to what customers prefer to use and supporting the customer sentiment – even if it meant drastically changing course of policy and moving into a more open and sharing environment in the process. The additional jQuery support that has been introduced in the last two years certainly has made lives easier for many developers on the ASP.NET platform. It’s also nice to see Microsoft submitting proposals through the standard jQuery process of plug-ins and getting accepted for various very useful projects. Certainly the jQuery Templates plug-in is going to be very useful to many especially since it will be baked into the jQuery core in jQuery 1.5. I hope we see more of this type of involvement from Microsoft in the future. Kudos!© Rick Strahl, West Wind Technologies, 2005-2010Posted in jQuery ASP.NET

Read the article
C#/.NET Little Wonders: The Concurrent Collections (1 of 3)

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. In the next few weeks, we will discuss the concurrent collections and how they have changed the face of concurrent programming. This week’s post will begin with a general introduction and discuss the ConcurrentStack<T> and ConcurrentQueue<T>. Then in the following post we’ll discuss the ConcurrentDictionary<T> and ConcurrentBag<T>. Finally, we shall close on the third post with a discussion of the BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. A brief history of collections In the beginning was the .NET 1.0 Framework. And out of this framework emerged the System.Collections namespace, and it was good. It contained all the basic things a growing programming language needs like the ArrayList and Hashtable collections. The main problem, of course, with these original collections is that they held items of type object which means you had to be disciplined enough to use them correctly or you could end up with runtime errors if you got an object of a type you weren't expecting. Then came .NET 2.0 and generics and our world changed forever! With generics the C# language finally got an equivalent of the very powerful C++ templates. As such, the System.Collections.Generic was born and we got type-safe versions of all are favorite collections. The List<T> succeeded the ArrayList and the Dictionary<TKey,TValue> succeeded the Hashtable and so on. The new versions of the library were not only safer because they checked types at compile-time, in many cases they were more performant as well. So much so that it's Microsoft's recommendation that the System.Collections original collections only be used for backwards compatibility. So we as developers came to know and love the generic collections and took them into our hearts and embraced them. The problem is, thread safety in both the original collections and the generic collections can be problematic, for very different reasons. Now, if you are only doing single-threaded development you may not care – after all, no locking is required. Even if you do have multiple threads, if a collection is “load-once, read-many” you don’t need to do anything to protect that container from multi-threaded access, as illustrated below: 1: public static class OrderTypeTranslator 2: { 3: // because this dictionary is loaded once before it is ever accessed, we don't need to synchronize 4: // multi-threaded read access 5: private static readonly Dictionary<string, char> _translator = new Dictionary<string, char> 6: { 7: {"New", 'N'}, 8: {"Update", 'U'}, 9: {"Cancel", 'X'} 10: }; 11: 12: // the only public interface into the dictionary is for reading, so inherently thread-safe 13: public static char? Translate(string orderType) 14: { 15: char charValue; 16: if (_translator.TryGetValue(orderType, out charValue)) 17: { 18: return charValue; 19: } 20: 21: return null; 22: } 23: } Unfortunately, most of our computer science problems cannot get by with just single-threaded applications or with multi-threading in a load-once manner. Looking at today's trends, it's clear to see that computers are not so much getting faster because of faster processor speeds -- we've nearly reached the limits we can push through with today's technologies -- but more because we're adding more cores to the boxes. With this new hardware paradigm, it is even more important to use multi-threaded applications to take full advantage of parallel processing to achieve higher application speeds. So let's look at how to use collections in a thread-safe manner. Using historical collections in a concurrent fashion The early .NET collections (System.Collections) had a Synchronized() static method that could be used to wrap the early collections to make them completely thread-safe. This paradigm was dropped in the generic collections (System.Collections.Generic) because having a synchronized wrapper resulted in atomic locks for all operations, which could prove overkill in many multithreading situations. Thus the paradigm shifted to having the user of the collection specify their own locking, usually with an external object: 1: public class OrderAggregator 2: { 3: private static readonly Dictionary<string, List<Order>> _orders = new Dictionary<string, List<Order>>(); 4: private static readonly _orderLock = new object(); 5: 6: public void Add(string accountNumber, Order newOrder) 7: { 8: List<Order> ordersForAccount; 9: 10: // a complex operation like this should all be protected 11: lock (_orderLock) 12: { 13: if (!_orders.TryGetValue(accountNumber, out ordersForAccount)) 14: { 15: _orders.Add(accountNumber, ordersForAccount = new List<Order>()); 16: } 17: 18: ordersForAccount.Add(newOrder); 19: } 20: } 21: } Notice how we’re performing several operations on the dictionary under one lock. With the Synchronized() static methods of the early collections, you wouldn’t be able to specify this level of locking (a more macro-level). So in the generic collections, it was decided that if a user needed synchronization, they could implement their own locking scheme instead so that they could provide synchronization as needed. The need for better concurrent access to collections Here’s the problem: it’s relatively easy to write a collection that locks itself down completely for access, but anything more complex than that can be difficult and error-prone to write, and much less to make it perform efficiently! For example, what if you have a Dictionary that has frequent reads but in-frequent updates? Do you want to lock down the entire Dictionary for every access? This would be overkill and would prevent concurrent reads. In such cases you could use something like a ReaderWriterLockSlim which allows for multiple readers in a lock, and then once a writer grabs the lock it blocks all further readers until the writer is done (in a nutshell). This is all very complex stuff to consider. Fortunately, this is where the Concurrent Collections come in. The Parallel Computing Platform team at Microsoft went through great pains to determine how to make a set of concurrent collections that would have the best performance characteristics for general case multi-threaded use. Now, as in all things involving threading, you should always make sure you evaluate all your container options based on the particular usage scenario and the degree of parallelism you wish to acheive. This article should not be taken to understand that these collections are always supperior to the generic collections. Each fills a particular need for a particular situation. Understanding what each container is optimized for is key to the success of your application whether it be single-threaded or multi-threaded. General points to consider with the concurrent collections The MSDN points out that the concurrent collections all support the ICollection interface. However, since the collections are already synchronized, the IsSynchronized property always returns false, and SyncRoot always returns null. Thus you should not attempt to use these properties for synchronization purposes. Note that since the concurrent collections also may have different operations than the traditional data structures you may be used to. Now you may ask why they did this, but it was done out of necessity to keep operations safe and atomic. For example, in order to do a Pop() on a stack you have to know the stack is non-empty, but between the time you check the stack’s IsEmpty property and then do the Pop() another thread may have come in and made the stack empty! This is why some of the traditional operations have been changed to make them safe for concurrent use. In addition, some properties and methods in the concurrent collections achieve concurrency by creating a snapshot of the collection, which means that some operations that were traditionally O(1) may now be O(n) in the concurrent models. I’ll try to point these out as we talk about each collection so you can be aware of any potential performance impacts. Finally, all the concurrent containers are safe for enumeration even while being modified, but some of the containers support this in different ways (snapshot vs. dirty iteration). Once again I’ll highlight how thread-safe enumeration works for each collection. ConcurrentStack<T>: The thread-safe LIFO container The ConcurrentStack<T> is the thread-safe counterpart to the System.Collections.Generic.Stack<T>, which as you may remember is your standard last-in-first-out container. If you think of algorithms that favor stack usage (for example, depth-first searches of graphs and trees) then you can see how using a thread-safe stack would be of benefit. The ConcurrentStack<T> achieves thread-safe access by using System.Threading.Interlocked operations. This means that the multi-threaded access to the stack requires no traditional locking and is very, very fast! For the most part, the ConcurrentStack<T> behaves like it’s Stack<T> counterpart with a few differences: Pop() was removed in favor of TryPop() Returns true if an item existed and was popped and false if empty. PushRange() and TryPopRange() were added Allows you to push multiple items and pop multiple items atomically. Count takes a snapshot of the stack and then counts the items. This means it is a O(n) operation, if you just want to check for an empty stack, call IsEmpty instead which is O(1). ToArray() and GetEnumerator() both also take snapshots. This means that iteration over a stack will give you a static view at the time of the call and will not reflect updates. Pushing on a ConcurrentStack<T> works just like you’d expect except for the aforementioned PushRange() method that was added to allow you to push a range of items concurrently. 1: var stack = new ConcurrentStack<string>(); 2: 3: // adding to stack is much the same as before 4: stack.Push("First"); 5: 6: // but you can also push multiple items in one atomic operation (no interleaves) 7: stack.PushRange(new [] { "Second", "Third", "Fourth" }); For looking at the top item of the stack (without removing it) the Peek() method has been removed in favor of a TryPeek(). This is because in order to do a peek the stack must be non-empty, but between the time you check for empty and the time you execute the peek the stack contents may have changed. Thus the TryPeek() was created to be an atomic check for empty, and then peek if not empty: 1: // to look at top item of stack without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (stack.TryPeek(out item)) 5: { 6: Console.WriteLine("Top item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Stack was empty."); 11: } Finally, to remove items from the stack, we have the TryPop() for single, and TryPopRange() for multiple items. Just like the TryPeek(), these operations replace Pop() since we need to ensure atomically that the stack is non-empty before we pop from it: 1: // to remove items, use TryPop or TryPopRange to get multiple items atomically (no interleaves) 2: if (stack.TryPop(out item)) 3: { 4: Console.WriteLine("Popped " + item); 5: } 6: 7: // TryPopRange will only pop up to the number of spaces in the array, the actual number popped is returned. 8: var poppedItems = new string[2]; 9: int numPopped = stack.TryPopRange(poppedItems); 10: 11: foreach (var theItem in poppedItems.Take(numPopped)) 12: { 13: Console.WriteLine("Popped " + theItem); 14: } Finally, note that as stated before, GetEnumerator() and ToArray() gets a snapshot of the data at the time of the call. That means if you are enumerating the stack you will get a snapshot of the stack at the time of the call. This is illustrated below: 1: var stack = new ConcurrentStack<string>(); 2: 3: // adding to stack is much the same as before 4: stack.Push("First"); 5: 6: var results = stack.GetEnumerator(); 7: 8: // but you can also push multiple items in one atomic operation (no interleaves) 9: stack.PushRange(new [] { "Second", "Third", "Fourth" }); 10: 11: while(results.MoveNext()) 12: { 13: Console.WriteLine("Stack only has: " + results.Current); 14: } The only item that will be printed out in the above code is "First" because the snapshot was taken before the other items were added. This may sound like an issue, but it’s really for safety and is more correct. You don’t want to enumerate a stack and have half a view of the stack before an update and half a view of the stack after an update, after all. In addition, note that this is still thread-safe, whereas iterating through a non-concurrent collection while updating it in the old collections would cause an exception. ConcurrentQueue<T>: The thread-safe FIFO container The ConcurrentQueue<T> is the thread-safe counterpart of the System.Collections.Generic.Queue<T> class. The concurrent queue uses an underlying list of small arrays and lock-free System.Threading.Interlocked operations on the head and tail arrays. Once again, this allows us to do thread-safe operations without the need for heavy locks! The ConcurrentQueue<T> (like the ConcurrentStack<T>) has some departures from the non-concurrent counterpart. Most notably: Dequeue() was removed in favor of TryDequeue(). Returns true if an item existed and was dequeued and false if empty. Count does not take a snapshot It subtracts the head and tail index to get the count. This results overall in a O(1) complexity which is quite good. It’s still recommended, however, that for empty checks you call IsEmpty instead of comparing Count to zero. ToArray() and GetEnumerator() both take snapshots. This means that iteration over a queue will give you a static view at the time of the call and will not reflect updates. The Enqueue() method on the ConcurrentQueue<T> works much the same as the generic Queue<T>: 1: var queue = new ConcurrentQueue<string>(); 2: 3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: queue.Enqueue("Second"); 6: queue.Enqueue("Third"); For front item access, the TryPeek() method must be used to attempt to see the first item if the queue. There is no Peek() method since, as you’ll remember, we can only peek on a non-empty queue, so we must have an atomic TryPeek() that checks for empty and then returns the first item if the queue is non-empty. 1: // to look at first item in queue without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (queue.TryPeek(out item)) 5: { 6: Console.WriteLine("First item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Queue was empty."); 11: } Then, to remove items you use TryDequeue(). Once again this is for the same reason we have TryPeek() and not Peek(): 1: // to remove items, use TryDequeue. If queue is empty returns false. 2: if (queue.TryDequeue(out item)) 3: { 4: Console.WriteLine("Dequeued first item " + item); 5: } Just like the concurrent stack, the ConcurrentQueue<T> takes a snapshot when you call ToArray() or GetEnumerator() which means that subsequent updates to the queue will not be seen when you iterate over the results. Thus once again the code below will only show the first item, since the other items were added after the snapshot. 1: var queue = new ConcurrentQueue<string>(); 2: 3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: 6: var iterator = queue.GetEnumerator(); 7: 8: queue.Enqueue("Second"); 9: queue.Enqueue("Third"); 10: 11: // only shows First 12: while (iterator.MoveNext()) 13: { 14: Console.WriteLine("Dequeued item " + iterator.Current); 15: } Using collections concurrently You’ll notice in the examples above I stuck to using single-threaded examples so as to make them deterministic and the results obvious. Of course, if we used these collections in a truly multi-threaded way the results would be less deterministic, but would still be thread-safe and with no locking on your part required! For example, say you have an order processor that takes an IEnumerable<Order> and handles each other in a multi-threaded fashion, then groups the responses together in a concurrent collection for aggregation. This can be done easily with the TPL’s Parallel.ForEach(): 1: public static IEnumerable<OrderResult> ProcessOrders(IEnumerable<Order> orderList) 2: { 3: var proxy = new OrderProxy(); 4: var results = new ConcurrentQueue<OrderResult>(); 5: 6: // notice that we can process all these in parallel and put the results 7: // into our concurrent collection without needing any external locking! 8: Parallel.ForEach(orderList, 9: order => 10: { 11: var result = proxy.PlaceOrder(order); 12: 13: results.Enqueue(result); 14: }); 15: 16: return results; 17: } Summary Obviously, if you do not need multi-threaded safety, you don’t need to use these collections, but when you do need multi-threaded collections these are just the ticket! The plethora of features (I always think of the movie The Three Amigos when I say plethora) built into these containers and the amazing way they acheive thread-safe access in an efficient manner is wonderful to behold. Stay tuned next week where we’ll continue our discussion with the ConcurrentBag<T> and the ConcurrentDictionary<TKey,TValue>. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. Tweet Technorati Tags: C#,.NET,Concurrent Collections,Collections,Multi-Threading,Little Wonders,BlackRabbitCoder,James Michael Hare

Read the article
Toorcon 15 (2013)

- by danx

The Toorcon gang (senior staff): h1kari (founder), nfiltr8, and Geo Introduction to Toorcon 15 (2013) A Tale of One Software Bypass of MS Windows 8 Secure Boot Breaching SSL, One Byte at a Time Running at 99%: Surviving an Application DoS Security Response in the Age of Mass Customized Attacks x86 Rewriting: Defeating RoP and other Shinanighans Clowntown Express: interesting bugs and running a bug bounty program Active Fingerprinting of Encrypted VPNs Making Attacks Go Backwards Mask Your Checksums—The Gorry Details Adventures with weird machines thirty years after "Reflections on Trusting Trust" Introduction to Toorcon 15 (2013) Toorcon 15 is the 15th annual security conference held in San Diego. I've attended about a third of them and blogged about previous conferences I attended here starting in 2003. As always, I've only summarized the talks I attended and interested me enough to write about them. Be aware that I may have misrepresented the speaker's remarks and that they are not my remarks or opinion, or those of my employer, so don't quote me or them. Those seeking further details may contact the speakers directly or use The Google. For some talks, I have a URL for further information. A Tale of One Software Bypass of MS Windows 8 Secure Boot Andrew Furtak and Oleksandr Bazhaniuk Yuri Bulygin, Oleksandr ("Alex") Bazhaniuk, and (not present) Andrew Furtak Yuri and Alex talked about UEFI and Bootkits and bypassing MS Windows 8 Secure Boot, with vendor recommendations. They previously gave this talk at the BlackHat 2013 conference. MS Windows 8 Secure Boot Overview UEFI (Unified Extensible Firmware Interface) is interface between hardware and OS. UEFI is processor and architecture independent. Malware can replace bootloader (bootx64.efi, bootmgfw.efi). Once replaced can modify kernel. Trivial to replace bootloader. Today many legacy bootkits—UEFI replaces them most of them. MS Windows 8 Secure Boot verifies everything you load, either through signatures or hashes. UEFI firmware relies on secure update (with signed update). You would think Secure Boot would rely on ROM (such as used for phones0, but you can't do that for PCs—PCs use writable memory with signatures DXE core verifies the UEFI boat loader(s) OS Loader (winload.efi, winresume.efi) verifies the OS kernel A chain of trust is established with a root key (Platform Key, PK), which is a cert belonging to the platform vendor. Key Exchange Keys (KEKs) verify an "authorized" database (db), and "forbidden" database (dbx). X.509 certs with SHA-1/SHA-256 hashes. Keys are stored in non-volatile (NV) flash-based NVRAM. Boot Services (BS) allow adding/deleting keys (can't be accessed once OS starts—which uses Run-Time (RT)). Root cert uses RSA-2048 public keys and PKCS#7 format signatures. SecureBoot — enable disable image signature checks SetupMode — update keys, self-signed keys, and secure boot variables CustomMode — allows updating keys Secure Boot policy settings are: always execute, never execute, allow execute on security violation, defer execute on security violation, deny execute on security violation, query user on security violation Attacking MS Windows 8 Secure Boot Secure Boot does NOT protect from physical access. Can disable from console. Each BIOS vendor implements Secure Boot differently. There are several platform and BIOS vendors. It becomes a "zoo" of implementations—which can be taken advantage of. Secure Boot is secure only when all vendors implement it correctly. Allow only UEFI firmware signed updates protect UEFI firmware from direct modification in flash memory protect FW update components program SPI controller securely protect secure boot policy settings in nvram protect runtime api disable compatibility support module which allows unsigned legacy Can corrupt the Platform Key (PK) EFI root certificate variable in SPI flash. If PK is not found, FW enters setup mode wich secure boot turned off. Can also exploit TPM in a similar manner. One is not supposed to be able to directly modify the PK in SPI flash from the OS though. But they found a bug that they can exploit from User Mode (undisclosed) and demoed the exploit. It loaded and ran their own bootkit. The exploit requires a reboot. Multiple vendors are vulnerable. They will disclose this exploit to vendors in the future. Recommendations: allow only signed updates protect UEFI fw in ROM protect EFI variable store in ROM Breaching SSL, One Byte at a Time Yoel Gluck and Angelo Prado Angelo Prado and Yoel Gluck, Salesforce.com CRIME is software that performs a "compression oracle attack." This is possible because the SSL protocol doesn't hide length, and because SSL compresses the header. CRIME requests with every possible character and measures the ciphertext length. Look for the plaintext which compresses the most and looks for the cookie one byte-at-a-time. SSL Compression uses LZ77 to reduce redundancy. Huffman coding replaces common byte sequences with shorter codes. US CERT thinks the SSL compression problem is fixed, but it isn't. They convinced CERT that it wasn't fixed and they issued a CVE. BREACH, breachattrack.com BREACH exploits the SSL response body (Accept-Encoding response, Content-Encoding). It takes advantage of the fact that the response is not compressed. BREACH uses gzip and needs fairly "stable" pages that are static for ~30 seconds. It needs attacker-supplied content (say from a web form or added to a URL parameter). BREACH listens to a session's requests and responses, then inserts extra requests and responses. Eventually, BREACH guesses a session's secret key. Can use compression to guess contents one byte at-a-time. For example, "Supersecret SupersecreX" (a wrong guess) compresses 10 bytes, and "Supersecret Supersecret" (a correct guess) compresses 11 bytes, so it can find each character by guessing every character. To start the guess, BREACH needs at least three known initial characters in the response sequence. Compression length then "leaks" information. Some roadblocks include no winners (all guesses wrong) or too many winners (multiple possibilities that compress the same). The solutions include: lookahead (guess 2 or 3 characters at-a-time instead of 1 character). Expensive rollback to last known conflict check compression ratio can brute-force first 3 "bootstrap" characters, if needed (expensive) block ciphers hide exact plain text length. Solution is to align response in advance to block size Mitigations length: use variable padding secrets: dynamic CSRF tokens per request secret: change over time separate secret to input-less servlets Future work eiter understand DEFLATE/GZIP HTTPS extensions Running at 99%: Surviving an Application DoS Ryan Huber Ryan Huber, Risk I/O Ryan first discussed various ways to do a denial of service (DoS) attack against web services. One usual method is to find a slow web page and do several wgets. Or download large files. Apache is not well suited at handling a large number of connections, but one can put something in front of it Can use Apache alternatives, such as nginx How to identify malicious hosts short, sudden web requests user-agent is obvious (curl, python) same url requested repeatedly no web page referer (not normal) hidden links. hide a link and see if a bot gets it restricted access if not your geo IP (unless the website is global) missing common headers in request regular timing first seen IP at beginning of attack count requests per hosts (usually a very large number) Use of captcha can mitigate attacks, but you'll lose a lot of genuine users. Bouncer, goo.gl/c2vyEc and www.github.com/rawdigits/Bouncer Bouncer is software written by Ryan in netflow. Bouncer has a small, unobtrusive footprint and detects DoS attempts. It closes blacklisted sockets immediately (not nice about it, no proper close connection). Aggregator collects requests and controls your web proxies. Need NTP on the front end web servers for clean data for use by bouncer. Bouncer is also useful for a popularity storm ("Slashdotting") and scraper storms. Future features: gzip collection data, documentation, consumer library, multitask, logging destroyed connections. Takeaways: DoS mitigation is easier with a complete picture Bouncer designed to make it easier to detect and defend DoS—not a complete cure Security Response in the Age of Mass Customized Attacks Peleus Uhley and Karthik Raman Peleus Uhley and Karthik Raman, Adobe ASSET, blogs.adobe.com/asset/ Peleus and Karthik talked about response to mass-customized exploits. Attackers behave much like a business. "Mass customization" refers to concept discussed in the book Future Perfect by Stan Davis of Harvard Business School. Mass customization is differentiating a product for an individual customer, but at a mass production price. For example, the same individual with a debit card receives basically the same customized ATM experience around the world. Or designing your own PC from commodity parts. Exploit kits are another example of mass customization. The kits support multiple browsers and plugins, allows new modules. Exploit kits are cheap and customizable. Organized gangs use exploit kits. A group at Berkeley looked at 77,000 malicious websites (Grier et al., "Manufacturing Compromise: The Emergence of Exploit-as-a-Service", 2012). They found 10,000 distinct binaries among them, but derived from only a dozen or so exploit kits. Characteristics of Mass Malware: potent, resilient, relatively low cost Technical characteristics: multiple OS, multipe payloads, multiple scenarios, multiple languages, obfuscation Response time for 0-day exploits has gone down from ~40 days 5 years ago to about ~10 days now. So the drive with malware is towards mass customized exploits, to avoid detection There's plenty of evicence that exploit development has Project Manager bureaucracy. They infer from the malware edicts to: support all versions of reader support all versions of windows support all versions of flash support all browsers write large complex, difficult to main code (8750 lines of JavaScript for example Exploits have "loose coupling" of multipe versions of software (adobe), OS, and browser. This allows specific attacks against specific versions of multiple pieces of software. Also allows exploits of more obscure software/OS/browsers and obscure versions. Gave examples of exploits that exploited 2, 3, 6, or 14 separate bugs. However, these complete exploits are more likely to be buggy or fragile in themselves and easier to defeat. Future research includes normalizing malware and Javascript. Conclusion: The coming trend is that mass-malware with mass zero-day attacks will result in mass customization of attacks. x86 Rewriting: Defeating RoP and other Shinanighans Richard Wartell Richard Wartell The attack vector we are addressing here is: First some malware causes a buffer overflow. The malware has no program access, but input access and buffer overflow code onto stack Later the stack became non-executable. The workaround malware used was to write a bogus return address to the stack jumping to malware Later came ASLR (Address Space Layout Randomization) to randomize memory layout and make addresses non-deterministic. The workaround malware used was to jump t existing code segments in the program that can be used in bad ways "RoP" is Return-oriented Programming attacks. RoP attacks use your own code and write return address on stack to (existing) expoitable code found in program ("gadgets"). Pinkie Pie was paid $60K last year for a RoP attack. One solution is using anti-RoP compilers that compile source code with NO return instructions. ASLR does not randomize address space, just "gadgets". IPR/ILR ("Instruction Location Randomization") randomizes each instruction with a virtual machine. Richard's goal was to randomize a binary with no source code access. He created "STIR" (Self-Transofrming Instruction Relocation). STIR disassembles binary and operates on "basic blocks" of code. The STIR disassembler is conservative in what to disassemble. Each basic block is moved to a random location in memory. Next, STIR writes new code sections with copies of "basic blocks" of code in randomized locations. The old code is copied and rewritten with jumps to new code. the original code sections in the file is marked non-executible. STIR has better entropy than ASLR in location of code. Makes brute force attacks much harder. STIR runs on MS Windows (PEM) and Linux (ELF). It eliminated 99.96% or more "gadgets" (i.e., moved the address). Overhead usually 5-10% on MS Windows, about 1.5-4% on Linux (but some code actually runs faster!). The unique thing about STIR is it requires no source access and the modified binary fully works! Current work is to rewrite code to enforce security policies. For example, don't create a *.{exe,msi,bat} file. Or don't connect to the network after reading from the disk. Clowntown Express: interesting bugs and running a bug bounty program Collin Greene Collin Greene, Facebook Collin talked about Facebook's bug bounty program. Background at FB: FB has good security frameworks, such as security teams, external audits, and cc'ing on diffs. But there's lots of "deep, dark, forgotten" parts of legacy FB code. Collin gave several examples of bountied bugs. Some bounty submissions were on software purchased from a third-party (but bounty claimers don't know and don't care). We use security questions, as does everyone else, but they are basically insecure (often easily discoverable). Collin didn't expect many bugs from the bounty program, but they ended getting 20+ good bugs in first 24 hours and good submissions continue to come in. Bug bounties bring people in with different perspectives, and are paid only for success. Bug bounty is a better use of a fixed amount of time and money versus just code review or static code analysis. The Bounty program started July 2011 and paid out $1.5 million to date. 14% of the submissions have been high priority problems that needed to be fixed immediately. The best bugs come from a small % of submitters (as with everything else)—the top paid submitters are paid 6 figures a year. Spammers like to backstab competitors. The youngest sumitter was 13. Some submitters have been hired. Bug bounties also allows to see bugs that were missed by tools or reviews, allowing improvement in the process. Bug bounties might not work for traditional software companies where the product has release cycle or is not on Internet. Active Fingerprinting of Encrypted VPNs Anna Shubina Anna Shubina, Dartmouth Institute for Security, Technology, and Society (I missed the start of her talk because another track went overtime. But I have the DVD of the talk, so I'll expand later) IPsec leaves fingerprints. Using netcat, one can easily visually distinguish various crypto chaining modes just from packet timing on a chart (example, DES-CBC versus AES-CBC) One can tell a lot about VPNs just from ping roundtrips (such as what router is used) Delayed packets are not informative about a network, especially if far away from the network More needed to explore about how TCP works in real life with respect to timing Making Attacks Go Backwards Fuzzynop FuzzyNop, Mandiant This talk is not about threat attribution (finding who), product solutions, politics, or sales pitches. But who are making these malware threats? It's not a single person or group—they have diverse skill levels. There's a lot of fat-fingered fumblers out there. Always look for low-hanging fruit first: "hiding" malware in the temp, recycle, or root directories creation of unnamed scheduled tasks obvious names of files and syscalls ("ClearEventLog") uncleared event logs. Clearing event log in itself, and time of clearing, is a red flag and good first clue to look for on a suspect system Reverse engineering is hard. Disassembler use takes practice and skill. A popular tool is IDA Pro, but it takes multiple interactive iterations to get a clean disassembly. Key loggers are used a lot in targeted attacks. They are typically custom code or built in a backdoor. A big tip-off is that non-printable characters need to be printed out (such as "[Ctrl]" "[RightShift]") or time stamp printf strings. Look for these in files. Presence is not proof they are used. Absence is not proof they are not used. Java exploits. Can parse jar file with idxparser.py and decomile Java file. Java typially used to target tech companies. Backdoors are the main persistence mechanism (provided externally) for malware. Also malware typically needs command and control. Application of Artificial Intelligence in Ad-Hoc Static Code Analysis John Ashaman John Ashaman, Security Innovation Initially John tried to analyze open source files with open source static analysis tools, but these showed thousands of false positives. Also tried using grep, but tis fails to find anything even mildly complex. So next John decided to write his own tool. His approach was to first generate a call graph then analyze the graph. However, the problem is that making a call graph is really hard. For example, one problem is "evil" coding techniques, such as passing function pointer. First the tool generated an Abstract Syntax Tree (AST) with the nodes created from method declarations and edges created from method use. Then the tool generated a control flow graph with the goal to find a path through the AST (a maze) from source to sink. The algorithm is to look at adjacent nodes to see if any are "scary" (a vulnerability), using heuristics for search order. The tool, called "Scat" (Static Code Analysis Tool), currently looks for C# vulnerabilities and some simple PHP. Later, he plans to add more PHP, then JSP and Java. For more information see his posts in Security Innovation blog and NRefactory on GitHub. Mask Your Checksums—The Gorry Details Eric (XlogicX) Davisson Eric (XlogicX) Davisson Sometimes in emailing or posting TCP/IP packets to analyze problems, you may want to mask the IP address. But to do this correctly, you need to mask the checksum too, or you'll leak information about the IP. Problem reports found in stackoverflow.com, sans.org, and pastebin.org are usually not masked, but a few companies do care. If only the IP is masked, the IP may be guessed from checksum (that is, it leaks data). Other parts of packet may leak more data about the IP. TCP and IP checksums both refer to the same data, so can get more bits of information out of using both checksums than just using one checksum. Also, one can usually determine the OS from the TTL field and ports in a packet header. If we get hundreds of possible results (16x each masked nibble that is unknown), one can do other things to narrow the results, such as look at packet contents for domain or geo information. With hundreds of results, can import as CSV format into a spreadsheet. Can corelate with geo data and see where each possibility is located. Eric then demoed a real email report with a masked IP packet attached. Was able to find the exact IP address, given the geo and university of the sender. Point is if you're going to mask a packet, do it right. Eric wouldn't usually bother, but do it correctly if at all, to not create a false impression of security. Adventures with weird machines thirty years after "Reflections on Trusting Trust" Sergey Bratus Sergey Bratus, Dartmouth College (and Julian Bangert and Rebecca Shapiro, not present) "Reflections on Trusting Trust" refers to Ken Thompson's classic 1984 paper. "You can't trust code that you did not totally create yourself." There's invisible links in the chain-of-trust, such as "well-installed microcode bugs" or in the compiler, and other planted bugs. Thompson showed how a compiler can introduce and propagate bugs in unmodified source. But suppose if there's no bugs and you trust the author, can you trust the code? Hell No! There's too many factors—it's Babylonian in nature. Why not? Well, Input is not well-defined/recognized (code's assumptions about "checked" input will be violated (bug/vunerabiliy). For example, HTML is recursive, but Regex checking is not recursive. Input well-formed but so complex there's no telling what it does For example, ELF file parsing is complex and has multiple ways of parsing. Input is seen differently by different pieces of program or toolchain Any Input is a program input executes on input handlers (drives state changes & transitions) only a well-defined execution model can be trusted (regex/DFA, PDA, CFG) Input handler either is a "recognizer" for the inputs as a well-defined language (see langsec.org) or it's a "virtual machine" for inputs to drive into pwn-age ELF ABI (UNIX/Linux executible file format) case study. Problems can arise from these steps (without planting bugs): compiler linker loader ld.so/rtld relocator DWARF (debugger info) exceptions The problem is you can't really automatically analyze code (it's the "halting problem" and undecidable). Only solution is to freeze code and sign it. But you can't freeze everything! Can't freeze ASLR or loading—must have tables and metadata. Any sufficiently complex input data is the same as VM byte code Example, ELF relocation entries + dynamic symbols == a Turing Complete Machine (TM). @bxsays created a Turing machine in Linux from relocation data (not code) in an ELF file. For more information, see Rebecca "bx" Shapiro's presentation from last year's Toorcon, "Programming Weird Machines with ELF Metadata" @bxsays did same thing with Mach-O bytecode Or a DWARF exception handling data .eh_frame + glibc == Turning Machine X86 MMU (IDT, GDT, TSS): used address translation to create a Turning Machine. Page handler reads and writes (on page fault) memory. Uses a page table, which can be used as Turning Machine byte code. Example on Github using this TM that will fly a glider across the screen Next Sergey talked about "Parser Differentials". That having one input format, but two parsers, will create confusion and opportunity for exploitation. For example, CSRs are parsed during creation by cert requestor and again by another parser at the CA. Another example is ELF—several parsers in OS tool chain, which are all different. Can have two different Program Headers (PHDRs) because ld.so parses multiple PHDRs. The second PHDR can completely transform the executable. This is described in paper in the first issue of International Journal of PoC. Conclusions trusting computers not only about bugs! Bugs are part of a problem, but no by far all of it complex data formats means bugs no "chain of trust" in Babylon! (that is, with parser differentials) we need to squeeze complexity out of data until data stops being "code equivalent" Further information See and langsec.org. USENIX WOOT 2013 (Workshop on Offensive Technologies) for "weird machines" papers and videos.

Read the article
Ancillary Objects: Separate Debug ELF Files For Solaris

- by Ali Bahrami

We introduced a new object ELF object type in Solaris 11 Update 1 called the Ancillary Object. This posting describes them, using material originally written during their development, the PSARC arc case, and the Solaris Linker and Libraries Manual. ELF objects contain allocable sections, which are mapped into memory at runtime, and non-allocable sections, which are present in the file for use by debuggers and observability tools, but which are not mapped or used at runtime. Typically, all of these sections exist within a single object file. Ancillary objects allow them to instead go into a separate file. There are different reasons given for wanting such a feature. One can debate whether the added complexity is worth the benefit, and in most cases it is not. However, one important case stands out — customers with very large 32-bit objects who are not ready or able to make the transition to 64-bits. We have customers who build extremely large 32-bit objects. Historically, the debug sections in these objects have used the stabs format, which is limited, but relatively compact. In recent years, the industry has transitioned to the powerful but verbose DWARF standard. In some cases, the size of these debug sections is large enough to push the total object file size past the fundamental 4GB limit for 32-bit ELF object files. The best, and ultimately only, solution to overly large objects is to transition to 64-bits. However, consider environments where: Hundreds of users may be executing the code on large shared systems. (32-bits use less memory and bus bandwidth, and on sparc runs just as fast as 64-bit code otherwise). Complex finely tuned code, where the original authors may no longer be available. Critical production code, that was expensive to qualify and bring online, and which is otherwise serving its intended purpose without issue. Users in these risk adverse and/or high scale categories have good reasons to push 32-bits objects to the limit before moving on. Ancillary objects offer these users a longer runway. Design The design of ancillary objects is intended to be simple, both to help human understanding when examining elfdump output, and to lower the bar for debuggers such as dbx to support them. The primary and ancillary objects have the same set of section headers, with the same names, in the same order (i.e. each section has the same index in both files). A single added section of type SHT_SUNW_ANCILLARY is added to both objects, containing information that allows a debugger to identify and validate both files relative to each other. Given one of these files, the ancillary section allows you to identify the other. Allocable sections go in the primary object, and non-allocable ones go into the ancillary object. A small set of non-allocable objects, notably the symbol table, are copied into both objects. As noted above, most sections are only written to one of the two objects, but both objects have the same section header array. The section header in the file that does not contain the section data is tagged with the SHF_SUNW_ABSENT section header flag to indicate its placeholder status. Compiler writers and others who produce objects can set the SUNW_SHF_PRIMARY section header flag to mark non-allocable sections that should go to the primary object rather than the ancillary. If you don't request an ancillary object, the Solaris ELF format is unchanged. Users who don't use ancillary objects do not pay for the feature. This is important, because they exist to serve a small subset of our users, and must not complicate the common case. If you do request an ancillary object, the runtime behavior of the primary object will be the same as that of a normal object. There is no added runtime cost. The primary and ancillary object together represent a logical single object. This is facilitated by the use of a single set of section headers. One can easily imagine a tool that can merge a primary and ancillary object into a single file, or the reverse. (Note that although this is an interesting intellectual exercise, we don't actually supply such a tool because there's little practical benefit above and beyond using ld to create the files). Among the benefits of this approach are: There is no need for per-file symbol tables to reflect the contents of each file. The same symbol table that would be produced for a standard object can be used. The section contents are identical in either case — there is no need to alter data to accommodate multiple files. It is very easy for a debugger to adapt to these new files, and the processing involved can be encapsulated in input/output routines. Most of the existing debugger implementation applies without modification. The limit of a 4GB 32-bit output object is now raised to 4GB of code, and 4GB of debug data. There is also the future possibility (not currently supported) to support multiple ancillary objects, each of which could contain up to 4GB of additional debug data. It must be noted however that the 32-bit DWARF debug format is itself inherently 32-bit limited, as it uses 32-bit offsets between debug sections, so the ability to employ multiple ancillary object files may not turn out to be useful. Using Ancillary Objects (From the Solaris Linker and Libraries Guide) By default, objects contain both allocable and non-allocable sections. Allocable sections are the sections that contain executable code and the data needed by that code at runtime. Non-allocable sections contain supplemental information that is not required to execute an object at runtime. These sections support the operation of debuggers and other observability tools. The non-allocable sections in an object are not loaded into memory at runtime by the operating system, and so, they have no impact on memory use or other aspects of runtime performance no matter their size. For convenience, both allocable and non-allocable sections are normally maintained in the same file. However, there are situations in which it can be useful to separate these sections. To reduce the size of objects in order to improve the speed at which they can be copied across wide area networks. To support fine grained debugging of highly optimized code requires considerable debug data. In modern systems, the debugging data can easily be larger than the code it describes. The size of a 32-bit object is limited to 4 Gbytes. In very large 32-bit objects, the debug data can cause this limit to be exceeded and prevent the creation of the object. To limit the exposure of internal implementation details. Traditionally, objects have been stripped of non-allocable sections in order to address these issues. Stripping is effective, but destroys data that might be needed later. The Solaris link-editor can instead write non-allocable sections to an ancillary object. This feature is enabled with the -z ancillary command line option. $ ld ... -z ancillary[=outfile] ...By default, the ancillary file is given the same name as the primary output object, with a .anc file extension. However, a different name can be provided by providing an outfile value to the -z ancillary option. When -z ancillary is specified, the link-editor performs the following actions. All allocable sections are written to the primary object. In addition, all non-allocable sections containing one or more input sections that have the SHF_SUNW_PRIMARY section header flag set are written to the primary object. All remaining non-allocable sections are written to the ancillary object. The following non-allocable sections are written to both the primary object and ancillary object. .shstrtab The section name string table. .symtab The full non-dynamic symbol table. .symtab_shndx The symbol table extended index section associated with .symtab. .strtab The non-dynamic string table associated with .symtab. .SUNW_ancillary Contains the information required to identify the primary and ancillary objects, and to identify the object being examined. The primary object and all ancillary objects contain the same array of sections headers. Each section has the same section index in every file. Although the primary and ancillary objects all define the same section headers, the data for most sections will be written to a single file as described above. If the data for a section is not present in a given file, the SHF_SUNW_ABSENT section header flag is set, and the sh_size field is 0. This organization makes it possible to acquire a full list of section headers, a complete symbol table, and a complete list of the primary and ancillary objects from either of the primary or ancillary objects. The following example illustrates the underlying implementation of ancillary objects. An ancillary object is created by adding the -z ancillary command line option to an otherwise normal compilation. The file utility shows that the result is an executable named a.out, and an associated ancillary object named a.out.anc. $ cat hello.c #include <stdio.h> int main(int argc, char **argv) { (void) printf("hello, world\n"); return (0); } $ cc -g -zancillary hello.c $ file a.out a.out.anc a.out: ELF 32-bit LSB executable 80386 Version 1 [FPU], dynamically linked, not stripped, ancillary object a.out.anc a.out.anc: ELF 32-bit LSB ancillary 80386 Version 1, primary object a.out $ ./a.out hello worldThe resulting primary object is an ordinary executable that can be executed in the usual manner. It is no different at runtime than an executable built without the use of ancillary objects, and then stripped of non-allocable content using the strip or mcs commands. As previously described, the primary object and ancillary objects contain the same section headers. To see how this works, it is helpful to use the elfdump utility to display these section headers and compare them. The following table shows the section header information for a selection of headers from the previous link-edit example. Index Section Name Type Primary Flags Ancillary Flags Primary Size Ancillary Size 13 .text PROGBITS ALLOC EXECINSTR ALLOC EXECINSTR SUNW_ABSENT 0x131 0 20 .data PROGBITS WRITE ALLOC WRITE ALLOC SUNW_ABSENT 0x4c 0 21 .symtab SYMTAB 0 0 0x450 0x450 22 .strtab STRTAB STRINGS STRINGS 0x1ad 0x1ad 24 .debug_info PROGBITS SUNW_ABSENT 0 0 0x1a7 28 .shstrtab STRTAB STRINGS STRINGS 0x118 0x118 29 .SUNW_ancillary SUNW_ancillary 0 0 0x30 0x30 The data for most sections is only present in one of the two files, and absent from the other file. The SHF_SUNW_ABSENT section header flag is set when the data is absent. The data for allocable sections needed at runtime are found in the primary object. The data for non-allocable sections used for debugging but not needed at runtime are placed in the ancillary file. A small set of non-allocable sections are fully present in both files. These are the .SUNW_ancillary section used to relate the primary and ancillary objects together, the section name string table .shstrtab, as well as the symbol table.symtab, and its associated string table .strtab. It is possible to strip the symbol table from the primary object. A debugger that encounters an object without a symbol table can use the .SUNW_ancillary section to locate the ancillary object, and access the symbol contained within. The primary object, and all associated ancillary objects, contain a .SUNW_ancillary section that allows all the objects to be identified and related together. $ elfdump -T SUNW_ancillary a.out a.out.anc a.out: Ancillary Section: .SUNW_ancillary index tag value [0] ANC_SUNW_CHECKSUM 0x8724 [1] ANC_SUNW_MEMBER 0x1 a.out [2] ANC_SUNW_CHECKSUM 0x8724 [3] ANC_SUNW_MEMBER 0x1a3 a.out.anc [4] ANC_SUNW_CHECKSUM 0xfbe2 [5] ANC_SUNW_NULL 0 a.out.anc: Ancillary Section: .SUNW_ancillary index tag value [0] ANC_SUNW_CHECKSUM 0xfbe2 [1] ANC_SUNW_MEMBER 0x1 a.out [2] ANC_SUNW_CHECKSUM 0x8724 [3] ANC_SUNW_MEMBER 0x1a3 a.out.anc [4] ANC_SUNW_CHECKSUM 0xfbe2 [5] ANC_SUNW_NULL 0 The ancillary sections for both objects contain the same number of elements, and are identical except for the first element. Each object, starting with the primary object, is introduced with a MEMBER element that gives the file name, followed by a CHECKSUM that identifies the object. In this example, the primary object is a.out, and has a checksum of 0x8724. The ancillary object is a.out.anc, and has a checksum of 0xfbe2. The first element in a .SUNW_ancillary section, preceding the MEMBER element for the primary object, is always a CHECKSUM element, containing the checksum for the file being examined. The presence of a .SUNW_ancillary section in an object indicates that the object has associated ancillary objects. The names of the primary and all associated ancillary objects can be obtained from the ancillary section from any one of the files. It is possible to determine which file is being examined from the larger set of files by comparing the first checksum value to the checksum of each member that follows. Debugger Access and Use of Ancillary Objects Debuggers and other observability tools must merge the information found in the primary and ancillary object files in order to build a complete view of the object. This is equivalent to processing the information from a single file. This merging is simplified by the primary object and ancillary objects containing the same section headers, and a single symbol table. The following steps can be used by a debugger to assemble the information contained in these files. Starting with the primary object, or any of the ancillary objects, locate the .SUNW_ancillary section. The presence of this section identifies the object as part of an ancillary group, contains information that can be used to obtain a complete list of the files and determine which of those files is the one currently being examined. Create a section header array in memory, using the section header array from the object being examined as an initial template. Open and read each file identified by the .SUNW_ancillary section in turn. For each file, fill in the in-memory section header array with the information for each section that does not have the SHF_SUNW_ABSENT flag set. The result will be a complete in-memory copy of the section headers with pointers to the data for all sections. Once this information has been acquired, the debugger can proceed as it would in the single file case, to access and control the running program. Note - The ELF definition of ancillary objects provides for a single primary object, and an arbitrary number of ancillary objects. At this time, the Oracle Solaris link-editor only produces a single ancillary object containing all non-allocable sections. This may change in the future. Debuggers and other observability tools should be written to handle the general case of multiple ancillary objects. ELF Implementation Details (From the Solaris Linker and Libraries Guide) To implement ancillary objects, it was necessary to extend the ELF format to add a new object type (ET_SUNW_ANCILLARY), a new section type (SHT_SUNW_ANCILLARY), and 2 new section header flags (SHF_SUNW_ABSENT, SHF_SUNW_PRIMARY). In this section, I will detail these changes, in the form of diffs to the Solaris Linker and Libraries manual. Part IV ELF Application Binary Interface Chapter 13: Object File Format Object File Format Edit Note: This existing section at the beginning of the chapter describes the ELF header. There's a table of object file types, which now includes the new ET_SUNW_ANCILLARY type. e_type Identifies the object file type, as listed in the following table. NameValueMeaning ET_NONE0No file type ET_REL1Relocatable file ET_EXEC2Executable file ET_DYN3Shared object file ET_CORE4Core file ET_LOSUNW0xfefeStart operating system specific range ET_SUNW_ANCILLARY0xfefeAncillary object file ET_HISUNW0xfefdEnd operating system specific range ET_LOPROC0xff00Start processor-specific range ET_HIPROC0xffffEnd processor-specific range Sections Edit Note: This overview section defines the section header structure, and provides a high level description of known sections. It was updated to define the new SHF_SUNW_ABSENT and SHF_SUNW_PRIMARY flags and the new SHT_SUNW_ANCILLARY section. ... sh_type Categorizes the section's contents and semantics. Section types and their descriptions are listed in Table 13-5. sh_flags Sections support 1-bit flags that describe miscellaneous attributes. Flag definitions are listed in Table 13-8. ... Table 13-5 ELF Section Types, sh_type NameValue . . . SHT_LOSUNW0x6fffffee SHT_SUNW_ancillary0x6fffffee . . . ... SHT_LOSUNW - SHT_HISUNW Values in this inclusive range are reserved for Oracle Solaris OS semantics. SHT_SUNW_ANCILLARY Present when a given object is part of a group of ancillary objects. Contains information required to identify all the files that make up the group. See Ancillary Section. ... Table 13-8 ELF Section Attribute Flags NameValue . . . SHF_MASKOS0x0ff00000 SHF_SUNW_NODISCARD0x00100000 SHF_SUNW_ABSENT0x00200000 SHF_SUNW_PRIMARY0x00400000 SHF_MASKPROC0xf0000000 . . . ... SHF_SUNW_ABSENT Indicates that the data for this section is not present in this file. When ancillary objects are created, the primary object and any ancillary objects, will all have the same section header array, to facilitate merging them to form a complete view of the object, and to allow them to use the same symbol tables. Each file contains a subset of the section data. The data for allocable sections is written to the primary object while the data for non-allocable sections is written to an ancillary file. The SHF_SUNW_ABSENT flag is used to indicate that the data for the section is not present in the object being examined. When the SHF_SUNW_ABSENT flag is set, the sh_size field of the section header must be 0. An application encountering an SHF_SUNW_ABSENT section can choose to ignore the section, or to search for the section data within one of the related ancillary files. SHF_SUNW_PRIMARY The default behavior when ancillary objects are created is to write all allocable sections to the primary object and all non-allocable sections to the ancillary objects. The SHF_SUNW_PRIMARY flag overrides this behavior. Any output section containing one more input section with the SHF_SUNW_PRIMARY flag set is written to the primary object without regard for its allocable status. ... Two members in the section header, sh_link, and sh_info, hold special information, depending on section type. Table 13-9 ELF sh_link and sh_info Interpretation sh_typesh_linksh_info . . . SHT_SUNW_ANCILLARY The section header index of the associated string table. 0 . . . Special Sections Edit Note: This section describes the sections used in Solaris ELF objects, using the types defined in the previous description of section types. It was updated to define the new .SUNW_ancillary (SHT_SUNW_ANCILLARY) section. Various sections hold program and control information. Sections in the following table are used by the system and have the indicated types and attributes. Table 13-10 ELF Special Sections NameTypeAttribute . . . .SUNW_ancillarySHT_SUNW_ancillaryNone . . . ... .SUNW_ancillary Present when a given object is part of a group of ancillary objects. Contains information required to identify all the files that make up the group. See Ancillary Section for details. ... Ancillary Section Edit Note: This new section provides the format reference describing the layout of a .SUNW_ancillary section and the meaning of the various tags. Note that these sections use the same tag/value concept used for dynamic and capabilities sections, and will be familiar to anyone used to working with ELF. In addition to the primary output object, the Solaris link-editor can produce one or more ancillary objects. Ancillary objects contain non-allocable sections that would normally be written to the primary object. When ancillary objects are produced, the primary object and all of the associated ancillary objects contain a SHT_SUNW_ancillary section, containing information that identifies these related objects. Given any one object from such a group, the ancillary section provides the information needed to identify and interpret the others. This section contains an array of the following structures. See sys/elf.h. typedef struct { Elf32_Word a_tag; union { Elf32_Word a_val; Elf32_Addr a_ptr; } a_un; } Elf32_Ancillary; typedef struct { Elf64_Xword a_tag; union { Elf64_Xword a_val; Elf64_Addr a_ptr; } a_un; } Elf64_Ancillary; For each object with this type, a_tag controls the interpretation of a_un. a_val These objects represent integer values with various interpretations. a_ptr These objects represent file offsets or addresses. The following ancillary tags exist. Table 13-NEW1 ELF Ancillary Array Tags NameValuea_un ANC_SUNW_NULL0Ignored ANC_SUNW_CHECKSUM1a_val ANC_SUNW_MEMBER2a_ptr ANC_SUNW_NULL Marks the end of the ancillary section. ANC_SUNW_CHECKSUM Provides the checksum for a file in the c_val element. When ANC_SUNW_CHECKSUM precedes the first instance of ANC_SUNW_MEMBER, it provides the checksum for the object from which the ancillary section is being read. When it follows an ANC_SUNW_MEMBER tag, it provides the checksum for that member. ANC_SUNW_MEMBER Specifies an object name. The a_ptr element contains the string table offset of a null-terminated string, that provides the file name. An ancillary section must always contain an ANC_SUNW_CHECKSUM before the first instance of ANC_SUNW_MEMBER, identifying the current object. Following that, there should be an ANC_SUNW_MEMBER for each object that makes up the complete set of objects. Each ANC_SUNW_MEMBER should be followed by an ANC_SUNW_CHECKSUM for that object. A typical ancillary section will therefore be structured as: TagMeaning ANC_SUNW_CHECKSUMChecksum of this object ANC_SUNW_MEMBERName of object #1 ANC_SUNW_CHECKSUMChecksum for object #1 . . . ANC_SUNW_MEMBERName of object N ANC_SUNW_CHECKSUMChecksum for object N ANC_SUNW_NULL An object can therefore identify itself by comparing the initial ANC_SUNW_CHECKSUM to each of the ones that follow, until it finds a match. Related Other Work The GNU developers have also encountered the need/desire to support separate debug information files, and use the solution detailed at http://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html. At the current time, the separate debug file is constructed by building the standard object first, and then copying the debug data out of it in a separate post processing step, Hence, it is limited to a total of 4GB of code and debug data, just as a single object file would be. They are aware of this, and I have seen online comments indicating that they may add direct support for generating these separate files to their link-editor. It is worth noting that the GNU objcopy utility is available on Solaris, and that the Studio dbx debugger is able to use these GNU style separate debug files even on Solaris. Although this is interesting in terms giving Linux users a familiar environment on Solaris, the 4GB limit means it is not an answer to the problem of very large 32-bit objects. We have also encountered issues with objcopy not understanding Solaris-specific ELF sections, when using this approach. The GNU community also has a current effort to adapt their DWARF debug sections in order to move them to separate files before passing the relocatable objects to the linker. The details of Project Fission can be found at http://gcc.gnu.org/wiki/DebugFission. The goal of this project appears to be to reduce the amount of data seen by the link-editor. The primary effort revolves around moving DWARF data to separate .dwo files so that the link-editor never encounters them. The details of modifying the DWARF data to be usable in this form are involved — please see the above URL for details.

Read the article
Dynamically loading Assemblies to reduce Runtime Depencies

- by Rick Strahl

I've been working on a request to the West Wind Application Configuration library to add JSON support. The config library is a very easy to use code-first approach to configuration: You create a class that holds the configuration data that inherits from a base configuration class, and then assign a persistence provider at runtime that determines where and how the configuration data is store. Currently the library supports .NET Configuration stores (web.config/app.config), XML files, SQL records and string storage.About once a week somebody asks me about JSON support and I've deflected this question for the longest time because frankly I think that JSON as a configuration store doesn't really buy a heck of a lot over XML. Both formats require the user to perform some fixup of the plain configuration data - in XML into XML tags, with JSON using JSON delimiters for properties and property formatting rules. Sure JSON is a little less verbose and maybe a little easier to read if you have hierarchical data, but overall the differences are pretty minor in my opinion. And yet - the requests keep rolling in.Hard Link Issues in a Component LibraryAnother reason I've been hesitant is that I really didn't want to pull in a dependency on an external JSON library - in this case JSON.NET - into the core library. If you're not using JSON.NET elsewhere I don't want a user to have to require a hard dependency on JSON.NET unless they want to use the JSON feature. JSON.NET is also sensitive to versions and doesn't play nice with multiple versions when hard linked. For example, when you have a reference to V4.4 in your project but the host application has a reference to version 4.5 you can run into assembly load problems. NuGet's Update-Package can solve some of this *if* you can recompile, but that's not ideal for a component that's supposed to be just plug and play. This is no criticism of JSON.NET - this really applies to any dependency that might change. So hard linking the DLL can be problematic for a number reasons, but the primary reason is to not force loading of JSON.NET unless you actually need it when you use the JSON configuration features of the library.Enter Dynamic LoadingSo rather than adding an assembly reference to the project, I decided that it would be better to dynamically load the DLL at runtime and then use dynamic typing to access various classes. This allows me to run without a hard assembly reference and allows more flexibility with version number differences now and in the future.But there are also a couple of downsides:No assembly reference means only dynamic access - no compiler type checking or IntellisenseRequirement for the host application to have reference to JSON.NET or else get runtime errorsThe former is minor, but the latter can be problematic. Runtime errors are always painful, but in this case I'm willing to live with this. If you want to use JSON configuration settings JSON.NET needs to be loaded in the project. If this is a Web project, it'll likely be there already.So there are a few things that are needed to make this work:Dynamically create an instance and optionally attempt to load an Assembly (if not loaded)Load types into dynamic variablesUse Reflection for a few tasks like statics/enumsThe dynamic keyword in C# makes the formerly most difficult Reflection part - method calls and property assignments - fairly painless. But as cool as dynamic is it doesn't handle all aspects of Reflection. Specifically it doesn't deal with object activation, truly dynamic (string based) member activation or accessing of non instance members, so there's still a little bit of work left to do with Reflection.Dynamic Object InstantiationThe first step in getting the process rolling is to instantiate the type you need to work with. This might be a two step process - loading the instance from a string value, since we don't have a hard type reference and potentially having to load the assembly. Although the host project might have a reference to JSON.NET, that instance might have not been loaded yet since it hasn't been accessed yet. In ASP.NET this won't be a problem, since ASP.NET preloads all referenced assemblies on AppDomain startup, but in other executable project, assemblies are just in time loaded only when they are accessed.Instantiating a type is a two step process: Finding the type reference and then activating it. Here's the generic code out of my ReflectionUtils library I use for this:/// <summary> /// Creates an instance of a type based on a string. Assumes that the type's /// </summary> /// <param name="typeName">Common name of the type</param> /// <param name="args">Any constructor parameters</param> /// <returns></returns> public static object CreateInstanceFromString(string typeName, params object[] args) { object instance = null; Type type = null; try { type = GetTypeFromName(typeName); if (type == null) return null; instance = Activator.CreateInstance(type, args); } catch { return null; } return instance; } /// <summary> /// Helper routine that looks up a type name and tries to retrieve the /// full type reference in the actively executing assemblies. /// </summary> /// <param name="typeName"></param> /// <returns></returns> public static Type GetTypeFromName(string typeName) { Type type = null; // Let default name binding find it type = Type.GetType(typeName, false); if (type != null) return type; // look through assembly list var assemblies = AppDomain.CurrentDomain.GetAssemblies(); // try to find manually foreach (Assembly asm in assemblies) { type = asm.GetType(typeName, false); if (type != null) break; } return type; } To use this for loading JSON.NET I have a small factory function that instantiates JSON.NET and sets a bunch of configuration settings on the generated object. The startup code also looks for failure and tries loading up the assembly when it fails since that's the main reason the load would fail. Finally it also caches the loaded instance for reuse (according to James the JSON.NET instance is thread safe and quite a bit faster when cached). Here's what the factory function looks like in JsonSerializationUtils:/// <summary> /// Dynamically creates an instance of JSON.NET /// </summary> /// <param name="throwExceptions">If true throws exceptions otherwise returns null</param> /// <returns>Dynamic JsonSerializer instance</returns> public static dynamic CreateJsonNet(bool throwExceptions = true) { if (JsonNet != null) return JsonNet; lock (SyncLock) { if (JsonNet != null) return JsonNet; // Try to create instance dynamic json = ReflectionUtils.CreateInstanceFromString("Newtonsoft.Json.JsonSerializer"); if (json == null) { try { var ass = AppDomain.CurrentDomain.Load("Newtonsoft.Json"); json = ReflectionUtils.CreateInstanceFromString("Newtonsoft.Json.JsonSerializer"); } catch (Exception ex) { if (throwExceptions) throw; return null; } } if (json == null) return null; json.ReferenceLoopHandling = (dynamic) ReflectionUtils.GetStaticProperty("Newtonsoft.Json.ReferenceLoopHandling", "Ignore"); // Enums as strings in JSON dynamic enumConverter = ReflectionUtils.CreateInstanceFromString("Newtonsoft.Json.Converters.StringEnumConverter"); json.Converters.Add(enumConverter); JsonNet = json; } return JsonNet; }This code's purpose is to return a fully configured JsonSerializer instance. As you can see the code tries to create an instance and when it fails tries to load the assembly, and then re-tries loading.Once the instance is loaded some configuration occurs on it. Specifically I set the ReferenceLoopHandling option to not blow up immediately when circular references are encountered. There are a host of other small config setting that might be useful to set, but the default seem to be good enough in recent versions. Note that I'm setting ReferenceLoopHandling which requires an Enum value to be set. There's no real easy way (short of using the cardinal numeric value) to set a property or pass parameters from static values or enums. This means I still need to use Reflection to make this work. I'm using the same ReflectionUtils class I previously used to handle this for me. The function looks up the type and then uses Type.InvokeMember() to read the static property.Another feature I need is have Enum values serialized as strings rather than numeric values which is the default. To do this I can use the StringEnumConverter to convert enums to strings by adding it to the Converters collection.As you can see there's still a bit of Reflection to be done even in C# 4+ with dynamic, but with a few helpers this process is relatively painless.Doing the actual JSON ConversionFinally I need to actually do my JSON conversions. For the Utility class I need serialization that works for both strings and files so I created four methods that handle these tasks two each for serialization and deserialization for string and file.Here's what the File Serialization looks like:/// <summary> /// Serializes an object instance to a JSON file. /// </summary> /// <param name="value">the value to serialize</param> /// <param name="fileName">Full path to the file to write out with JSON.</param> /// <param name="throwExceptions">Determines whether exceptions are thrown or false is returned</param> /// <param name="formatJsonOutput">if true pretty-formats the JSON with line breaks</param> /// <returns>true or false</returns> public static bool SerializeToFile(object value, string fileName, bool throwExceptions = false, bool formatJsonOutput = false) { dynamic writer = null; FileStream fs = null; try { Type type = value.GetType(); var json = CreateJsonNet(throwExceptions); if (json == null) return false; fs = new FileStream(fileName, FileMode.Create); var sw = new StreamWriter(fs, Encoding.UTF8); writer = Activator.CreateInstance(JsonTextWriterType, sw); if (formatJsonOutput) writer.Formatting = (dynamic)Enum.Parse(FormattingType, "Indented"); writer.QuoteChar = '"'; json.Serialize(writer, value); } catch (Exception ex) { Debug.WriteLine("JsonSerializer Serialize error: " + ex.Message); if (throwExceptions) throw; return false; } finally { if (writer != null) writer.Close(); if (fs != null) fs.Close(); } return true; }You can see more of the dynamic invocation in this code. First I grab the dynamic JsonSerializer instance using the CreateJsonNet() method shown earlier which returns a dynamic. I then create a JsonTextWriter and configure a couple of enum settings on it, and then call Serialize() on the serializer instance with the JsonTextWriter that writes the output to disk. Although this code is dynamic it's still fairly short and readable.For full circle operation here's the DeserializeFromFile() version:/// <summary> /// Deserializes an object from file and returns a reference. /// </summary> /// <param name="fileName">name of the file to serialize to</param> /// <param name="objectType">The Type of the object. Use typeof(yourobject class)</param> /// <param name="binarySerialization">determines whether we use Xml or Binary serialization</param> /// <param name="throwExceptions">determines whether failure will throw rather than return null on failure</param> /// <returns>Instance of the deserialized object or null. Must be cast to your object type</returns> public static object DeserializeFromFile(string fileName, Type objectType, bool throwExceptions = false) { dynamic json = CreateJsonNet(throwExceptions); if (json == null) return null; object result = null; dynamic reader = null; FileStream fs = null; try { fs = new FileStream(fileName, FileMode.Open, FileAccess.Read); var sr = new StreamReader(fs, Encoding.UTF8); reader = Activator.CreateInstance(JsonTextReaderType, sr); result = json.Deserialize(reader, objectType); reader.Close(); } catch (Exception ex) { Debug.WriteLine("JsonNetSerialization Deserialization Error: " + ex.Message); if (throwExceptions) throw; return null; } finally { if (reader != null) reader.Close(); if (fs != null) fs.Close(); } return result; }This code is a little more compact since there are no prettifying options to set. Here JsonTextReader is created dynamically and it receives the output from the Deserialize() operation on the serializer.You can take a look at the full JsonSerializationUtils.cs file on GitHub to see the rest of the operations, but the string operations are very similar - the code is fairly repetitive.These generic serialization utilities isolate the dynamic serialization logic that has to deal with the dynamic nature of JSON.NET, and any code that uses these functions is none the wiser that JSON.NET is dynamically loaded.Using the JsonSerializationUtils WrapperThe final consumer of the SerializationUtils wrapper is an actual ConfigurationProvider, that is responsible for handling reading and writing JSON values to and from files. The provider is simple a small wrapper around the SerializationUtils component and there's very little code to make this work now:The whole provider looks like this:/// <summary> /// Reads and Writes configuration settings in .NET config files and /// sections. Allows reading and writing to default or external files /// and specification of the configuration section that settings are /// applied to. /// </summary> public class JsonFileConfigurationProvider<TAppConfiguration> : ConfigurationProviderBase<TAppConfiguration> where TAppConfiguration: AppConfiguration, new() { /// <summary> /// Optional - the Configuration file where configuration settings are /// stored in. If not specified uses the default Configuration Manager /// and its default store. /// </summary> public string JsonConfigurationFile { get { return _JsonConfigurationFile; } set { _JsonConfigurationFile = value; } } private string _JsonConfigurationFile = string.Empty; public override bool Read(AppConfiguration config) { var newConfig = JsonSerializationUtils.DeserializeFromFile(JsonConfigurationFile, typeof(TAppConfiguration)) as TAppConfiguration; if (newConfig == null) { if(Write(config)) return true; return false; } DecryptFields(newConfig); DataUtils.CopyObjectData(newConfig, config, "Provider,ErrorMessage"); return true; } /// <summary> /// Return /// </summary> /// <typeparam name="TAppConfig"></typeparam> /// <returns></returns> public override TAppConfig Read<TAppConfig>() { var result = JsonSerializationUtils.DeserializeFromFile(JsonConfigurationFile, typeof(TAppConfig)) as TAppConfig; if (result != null) DecryptFields(result); return result; } /// <summary> /// Write configuration to XmlConfigurationFile location /// </summary> /// <param name="config"></param> /// <returns></returns> public override bool Write(AppConfiguration config) { EncryptFields(config); bool result = JsonSerializationUtils.SerializeToFile(config, JsonConfigurationFile,false,true); // Have to decrypt again to make sure the properties are readable afterwards DecryptFields(config); return result; } }This incidentally demonstrates how easy it is to create a new provider for the West Wind Application Configuration component. Simply implementing 3 methods will do in most cases.Note this code doesn't have any dynamic dependencies - all that's abstracted away in the JsonSerializationUtils(). From here on, serializing JSON is just a matter of calling the static methods on the SerializationUtils class.Already, there are several other places in some other tools where I use JSON serialization this is coming in very handy. With a couple of lines of code I was able to add JSON.NET support to an older AJAX library that I use replacing quite a bit of code that was previously in use. And for any other manual JSON operations (in a couple of apps I use JSON Serialization for 'blob' like document storage) this is also going to be handy.Performance?Some of you might be thinking that using dynamic and Reflection can't be good for performance. And you'd be right… In performing some informal testing it looks like the performance of the native code is nearly twice as fast as the dynamic code. Most of the slowness is attributable to type lookups. To test I created a native class that uses an actual reference to JSON.NET and performance was consistently around 85-90% faster with the referenced code. That being said though - I serialized 10,000 objects in 80ms vs. 45ms so this isn't hardly slouchy. For the configuration component speed is not that important because both read and write operations typically happen once on first access and then every once in a while. But for other operations - say a serializer trying to handle AJAX requests on a Web Server one would be well served to create a hard dependency.Dynamic Loading - Worth it?On occasion dynamic loading makes sense. But there's a price to be paid in added code complexity and a performance hit. But for some operations that are not pivotal to a component or application and only used under certain circumstances dynamic loading can be beneficial to avoid having to ship extra files and loading down distributions. These days when you create new projects in Visual Studio with 30 assemblies before you even add your own code, trying to keep file counts under control seems a good idea. It's not the kind of thing you do on a regular basis, but when needed it can be a useful tool. Hopefully some of you find this information useful…© Rick Strahl, West Wind Technologies, 2005-2013Posted in .NET C# Tweet !function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src="//platform.twitter.com/widgets.js";fjs.parentNode.insertBefore(js,fjs);}}(document,"script","twitter-wjs"); (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
Using Stub Objects

- by user9154181

Having told the long and winding tale of where stub objects came from and how we use them to build Solaris, I'd like to focus now on the the nuts and bolts of building and using them. The following new features were added to the Solaris link-editor (ld) to support the production and use of stub objects: -z stub This new command line option informs ld that it is to build a stub object rather than a normal object. In this mode, it accepts the same command line arguments as usual, but will quietly ignore any objects and sharable object dependencies. STUB_OBJECT Mapfile Directive In order to build a stub version of an object, its mapfile must specify the STUB_OBJECT directive. When producing a non-stub object, the presence of STUB_OBJECT causes the link-editor to perform extra validation to ensure that the stub and non-stub objects will be compatible. ASSERT Mapfile Directive All data symbols exported from the object must have an ASSERT symbol directive in the mapfile that declares them as data and supplies the size, binding, bss attributes, and symbol aliasing details. When building the stub objects, the information in these ASSERT directives is used to create the data symbols. When building the real object, these ASSERT directives will ensure that the real object matches the linking interface presented by the stub. Although ASSERT was added to the link-editor in order to support stub objects, they are a general purpose feature that can be used independently of stub objects. For instance you might choose to use an ASSERT directive if you have a symbol that must have a specific address in order for the object to operate properly and you want to automatically ensure that this will always be the case. The material presented here is derived from a document I originally wrote during the development effort, which had the dual goals of providing supplemental materials for the stub object PSARC case, and as a set of edits that were eventually applied to the Oracle Solaris Linker and Libraries Manual (LLM). The Solaris 11 LLM contains this information in a more polished form. Stub Objects A stub object is a shared object, built entirely from mapfiles, that supplies the same linking interface as the real object, while containing no code or data. Stub objects cannot be used at runtime. However, an application can be built against a stub object, where the stub object provides the real object name to be used at runtime, and then use the real object at runtime. When building a stub object, the link-editor ignores any object or library files specified on the command line, and these files need not exist in order to build a stub. Since the compilation step can be omitted, and because the link-editor has relatively little work to do, stub objects can be built very quickly. Stub objects can be used to solve a variety of build problems: Speed Modern machines, using a version of make with the ability to parallelize operations, are capable of compiling and linking many objects simultaneously, and doing so offers significant speedups. However, it is typical that a given object will depend on other objects, and that there will be a core set of objects that nearly everything else depends on. It is necessary to impose an ordering that builds each object before any other object that requires it. This ordering creates bottlenecks that reduce the amount of parallelization that is possible and limits the overall speed at which the code can be built. Complexity/Correctness In a large body of code, there can be a large number of dependencies between the various objects. The makefiles or other build descriptions for these objects can become very complex and difficult to understand or maintain. The dependencies can change as the system evolves. This can cause a given set of makefiles to become slightly incorrect over time, leading to race conditions and mysterious rare build failures. Dependency Cycles It might be desirable to organize code as cooperating shared objects, each of which draw on the resources provided by the other. Such cycles cannot be supported in an environment where objects must be built before the objects that use them, even though the runtime linker is fully capable of loading and using such objects if they could be built. Stub shared objects offer an alternative method for building code that sidesteps the above issues. Stub objects can be quickly built for all the shared objects produced by the build. Then, all the real shared objects and executables can be built in parallel, in any order, using the stub objects to stand in for the real objects at link-time. Afterwards, the executables and real shared objects are kept, and the stub shared objects are discarded. Stub objects are built from a mapfile, which must satisfy the following requirements. The mapfile must specify the STUB_OBJECT directive. This directive informs the link-editor that the object can be built as a stub object, and as such causes the link-editor to perform validation and sanity checking intended to guarantee that an object and its stub will always provide identical linking interfaces. All function and data symbols that make up the external interface to the object must be explicitly listed in the mapfile. The mapfile must use symbol scope reduction ('*'), to remove any symbols not explicitly listed from the external interface. All global data exported from the object must have an ASSERT symbol attribute in the mapfile to specify the symbol type, size, and bss attributes. In the case where there are multiple symbols that reference the same data, the ASSERT for one of these symbols must specify the TYPE and SIZE attributes, while the others must use the ALIAS attribute to reference this primary symbol. Given such a mapfile, the stub and real versions of the shared object can be built using the same command line for each, adding the '-z stub' option to the link for the stub object, and omiting the option from the link for the real object. To demonstrate these ideas, the following code implements a shared object named idx5, which exports data from a 5 element array of integers, with each element initialized to contain its zero-based array index. This data is available as a global array, via an alternative alias data symbol with weak binding, and via a functional interface. % cat idx5.c int _idx5[5] = { 0, 1, 2, 3, 4 }; #pragma weak idx5 = _idx5 int idx5_func(int index) { if ((index 4)) return (-1); return (_idx5[index]); } A mapfile is required to describe the interface provided by this shared object. % cat mapfile $mapfile_version 2 STUB_OBJECT; SYMBOL_SCOPE { _idx5 { ASSERT { TYPE=data; SIZE=4[5] }; }; idx5 { ASSERT { BINDING=weak; ALIAS=_idx5 }; }; idx5_func; local: *; }; The following main program is used to print all the index values available from the idx5 shared object. % cat main.c #include <stdio.h> extern int _idx5[5], idx5[5], idx5_func(int); int main(int argc, char **argv) { int i; for (i = 0; i The following commands create a stub version of this shared object in a subdirectory named stublib. elfdump is used to verify that the resulting object is a stub. The command used to build the stub differs from that of the real object only in the addition of the -z stub option, and the use of a different output file name. This demonstrates the ease with which stub generation can be added to an existing makefile. % cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o stublib/libidx5.so.1 -zstub % ln -s libidx5.so.1 stublib/libidx5.so % elfdump -d stublib/libidx5.so | grep STUB [11] FLAGS_1 0x4000000 [ STUB ] The main program can now be built, using the stub object to stand in for the real shared object, and setting a runpath that will find the real object at runtime. However, as we have not yet built the real object, this program cannot yet be run. Attempts to cause the system to load the stub object are rejected, as the runtime linker knows that stub objects lack the actual code and data found in the real object, and cannot execute. % cc main.c -L stublib -R '$ORIGIN/lib' -lidx5 -lc % ./a.out ld.so.1: a.out: fatal: libidx5.so.1: open failed: No such file or directory Killed % LD_PRELOAD=stublib/libidx5.so.1 ./a.out ld.so.1: a.out: fatal: stublib/libidx5.so.1: stub shared object cannot be used at runtime Killed We build the real object using the same command as we used to build the stub, omitting the -z stub option, and writing the results to a different file. % cc -Kpic -G -M mapfile -h libidx5.so.1 idx5.c -o lib/libidx5.so.1 Once the real object has been built in the lib subdirectory, the program can be run. % ./a.out [0] 0 0 0 [1] 1 1 1 [2] 2 2 2 [3] 3 3 3 [4] 4 4 4 Mapfile Changes The version 2 mapfile syntax was extended in a number of places to accommodate stub objects. Conditional Input The version 2 mapfile syntax has the ability conditionalize mapfile input using the $if control directive. As you might imagine, these directives are used frequently with ASSERT directives for data, because a given data symbol will frequently have a different size in 32 or 64-bit code, or on differing hardware such as x86 versus sparc. The link-editor maintains an internal table of names that can be used in the logical expressions evaluated by $if and $elif. At startup, this table is initialized with items that describe the class of object (_ELF32 or _ELF64) and the type of the target machine (_sparc or _x86). We found that there were a small number of cases in the Solaris code base in which we needed to know what kind of object we were producing, so we added the following new predefined items in order to address that need: NameMeaning ...... _ET_DYNshared object _ET_EXECexecutable object _ET_RELrelocatable object ...... STUB_OBJECT Directive The new STUB_OBJECT directive informs the link-editor that the object described by the mapfile can be built as a stub object. STUB_OBJECT; A stub shared object is built entirely from the information in the mapfiles supplied on the command line. When the -z stub option is specified to build a stub object, the presence of the STUB_OBJECT directive in a mapfile is required, and the link-editor uses the information in symbol ASSERT attributes to create global symbols that match those of the real object. When the real object is built, the presence of STUB_OBJECT causes the link-editor to verify that the mapfiles accurately describe the real object interface, and that a stub object built from them will provide the same linking interface as the real object it represents. All function and data symbols that make up the external interface to the object must be explicitly listed in the mapfile. The mapfile must use symbol scope reduction ('*'), to remove any symbols not explicitly listed from the external interface. All global data in the object is required to have an ASSERT attribute that specifies the symbol type and size. If the ASSERT BIND attribute is not present, the link-editor provides a default assertion that the symbol must be GLOBAL. If the ASSERT SH_ATTR attribute is not present, or does not specify that the section is one of BITS or NOBITS, the link-editor provides a default assertion that the associated section is BITS. All data symbols that describe the same address and size are required to have ASSERT ALIAS attributes specified in the mapfile. If aliased symbols are discovered that do not have an ASSERT ALIAS specified, the link fails and no object is produced. These rules ensure that the mapfiles contain a description of the real shared object's linking interface that is sufficient to produce a stub object with a completely compatible linking interface. SYMBOL_SCOPE/SYMBOL_VERSION ASSERT Attribute The SYMBOL_SCOPE and SYMBOL_VERSION mapfile directives were extended with a symbol attribute named ASSERT. The syntax for the ASSERT attribute is as follows: ASSERT { ALIAS = symbol_name; BINDING = symbol_binding; TYPE = symbol_type; SH_ATTR = section_attributes; SIZE = size_value; SIZE = size_value[count]; }; The ASSERT attribute is used to specify the expected characteristics of the symbol. The link-editor compares the symbol characteristics that result from the link to those given by ASSERT attributes. If the real and asserted attributes do not agree, a fatal error is issued and the output object is not created. In normal use, the link editor evaluates the ASSERT attribute when present, but does not require them, or provide default values for them. The presence of the STUB_OBJECT directive in a mapfile alters the interpretation of ASSERT to require them under some circumstances, and to supply default assertions if explicit ones are not present. See the definition of the STUB_OBJECT Directive for the details. When the -z stub command line option is specified to build a stub object, the information provided by ASSERT attributes is used to define the attributes of the global symbols provided by the object. ASSERT accepts the following: ALIAS Name of a previously defined symbol that this symbol is an alias for. An alias symbol has the same type, value, and size as the main symbol. The ALIAS attribute is mutually exclusive to the TYPE, SIZE, and SH_ATTR attributes, and cannot be used with them. When ALIAS is specified, the type, size, and section attributes are obtained from the alias symbol. BIND Specifies an ELF symbol binding, which can be any of the STB_ constants defined in <sys/elf.h>, with the STB_ prefix removed (e.g. GLOBAL, WEAK). TYPE Specifies an ELF symbol type, which can be any of the STT_ constants defined in <sys/elf.h>, with the STT_ prefix removed (e.g. OBJECT, COMMON, FUNC). In addition, for compatibility with other mapfile usage, FUNCTION and DATA can be specified, for STT_FUNC and STT_OBJECT, respectively. TYPE is mutually exclusive to ALIAS, and cannot be used in conjunction with it. SH_ATTR Specifies attributes of the section associated with the symbol. The section_attributes that can be specified are given in the following table: Section AttributeMeaning BITSSection is not of type SHT_NOBITS NOBITSSection is of type SHT_NOBITS SH_ATTR is mutually exclusive to ALIAS, and cannot be used in conjunction with it. SIZE Specifies the expected symbol size. SIZE is mutually exclusive to ALIAS, and cannot be used in conjunction with it. The syntax for the size_value argument is as described in the discussion of the SIZE attribute below. SIZE The SIZE symbol attribute existed before support for stub objects was introduced. It is used to set the size attribute of a given symbol. This attribute results in the creation of a symbol definition. Prior to the introduction of the ASSERT SIZE attribute, the value of a SIZE attribute was always numeric. While attempting to apply ASSERT SIZE to the objects in the Solaris ON consolidation, I found that many data symbols have a size based on the natural machine wordsize for the class of object being produced. Variables declared as long, or as a pointer, will be 4 bytes in size in a 32-bit object, and 8 bytes in a 64-bit object. Initially, I employed the conditional $if directive to handle these cases as follows: $if _ELF32 foo { ASSERT { TYPE=data; SIZE=4 } }; bar { ASSERT { TYPE=data; SIZE=20 } }; $elif _ELF64 foo { ASSERT { TYPE=data; SIZE=8 } }; bar { ASSERT { TYPE=data; SIZE=40 } }; $else $error UNKNOWN ELFCLASS $endif I found that the situation occurs frequently enough that this is cumbersome. To simplify this case, I introduced the idea of the addrsize symbolic name, and of a repeat count, which together make it simple to specify machine word scalar or array symbols. Both the SIZE, and ASSERT SIZE attributes support this syntax: The size_value argument can be a numeric value, or it can be the symbolic name addrsize. addrsize represents the size of a machine word capable of holding a memory address. The link-editor substitutes the value 4 for addrsize when building 32-bit objects, and the value 8 when building 64-bit objects. addrsize is useful for representing the size of pointer variables and C variables of type long, as it automatically adjusts for 32 and 64-bit objects without requiring the use of conditional input. The size_value argument can be optionally suffixed with a count value, enclosed in square brackets. If count is present, size_value and count are multiplied together to obtain the final size value. Using this feature, the example above can be written more naturally as: foo { ASSERT { TYPE=data; SIZE=addrsize } }; bar { ASSERT { TYPE=data; SIZE=addrsize[5] } }; Exported Global Data Is Still A Bad Idea As you can see, the additional plumbing added to the Solaris link-editor to support stub objects is minimal. Furthermore, about 90% of that plumbing is dedicated to handling global data. We have long advised against global data exported from shared objects. There are many ways in which global data does not fit well with dynamic linking. Stub objects simply provide one more reason to avoid this practice. It is always better to export all data via a functional interface. You should always hide your data, and make it available to your users via a function that they can call to acquire the address of the data item. However, If you do have to support global data for a stub, perhaps because you are working with an already existing object, it is still easilily done, as shown above. Oracle does not like us to discuss hypothetical new features that don't exist in shipping product, so I'll end this section with a speculation. It might be possible to do more in this area to ease the difficulty of dealing with objects that have global data that the users of the library don't need. Perhaps someday... Conclusions It is easy to create stub objects for most objects. If your library only exports function symbols, all you have to do to build a faithful stub object is to add STUB_OBJECT; and then to use the same link command you're currently using, with the addition of the -z stub option. Happy Stubbing!

Read the article
Microsoft Introduces WebMatrix

- by Rick Strahl

originally published in CoDe Magazine Editorial Microsoft recently released the first CTP of a new development environment called WebMatrix, which along with some of its supporting technologies are squarely aimed at making the Microsoft Web Platform more approachable for first-time developers and hobbyists. But in the process, it also provides some updated technologies that can make life easier for existing .NET developers. Let’s face it: ASP.NET development isn’t exactly trivial unless you already have a fair bit of familiarity with sophisticated development practices. Stick a non-developer in front of Visual Studio .NET or even the Visual Web Developer Express edition and it’s not likely that the person in front of the screen will be very productive or feel inspired. Yet other technologies like PHP and even classic ASP did provide the ability for non-developers and hobbyists to become reasonably proficient in creating basic web content quickly and efficiently. WebMatrix appears to be Microsoft’s attempt to bring back some of that simplicity with a number of technologies and tools. The key is to provide a friendly and fully self-contained development environment that provides all the tools needed to build an application in one place, as well as tools that allow publishing of content and databases easily to the web server. WebMatrix is made up of several components and technologies: IIS Developer Express IIS Developer Express is a new, self-contained development web server that is fully compatible with IIS 7.5 and based on the same codebase that IIS 7.5 uses. This new development server replaces the much less compatible Cassini web server that’s been used in Visual Studio and the Express editions. IIS Express addresses a few shortcomings of the Cassini server such as the inability to serve custom ISAPI extensions (i.e., things like PHP or ASP classic for example), as well as not supporting advanced authentication. IIS Developer Express provides most of the IIS 7.5 feature set providing much better compatibility between development and live deployment scenarios. SQL Server Compact 4.0 Database access is a key component for most web-driven applications, but on the Microsoft stack this has mostly meant you have to use SQL Server or SQL Server Express. SQL Server Compact is not new-it’s been around for a few years, but it’s been severely hobbled in the past by terrible tool support and the inability to support more than a single connection in Microsoft’s attempt to avoid losing SQL Server licensing. The new release of SQL Server Compact 4.0 supports multiple connections and you can run it in ASP.NET web applications simply by installing an assembly into the bin folder of the web application. In effect, you don’t have to install a special system configuration to run SQL Compact as it is a drop-in database engine: Copy the small assembly into your BIN folder (or from the GAC if installed fully), create a connection string against a local file-based database file, and then start firing SQL requests. Additionally WebMatrix includes nice tools to edit the database tables and files, along with tools to easily upsize (and hopefully downsize in the future) to full SQL Server. This is a big win, pending compatibility and performance limits. In my simple testing the data engine performed well enough for small data sets. This is not only useful for web applications, but also for desktop applications for which a fully installed SQL engine like SQL Server would be overkill. Having a local data store in those applications that can potentially be accessed by multiple users is a welcome feature. ASP.NET Razor View Engine What? Yet another native ASP.NET view engine? We already have Web Forms and various different flavors of using that view engine with Web Forms and MVC. Do we really need another? Microsoft thinks so, and Razor is an implementation of a lightweight, script-only view engine. Unlike the Web Forms view engine, Razor works only with inline code, snippets, and markup; therefore, it is more in line with current thinking of what a view engine should represent. There’s no support for a “page model” or any of the other Web Forms features of the full-page framework, but just a lightweight scripting engine that works with plain markup plus embedded expressions and code. The markup syntax for Razor is geared for minimal typing, plus some progressive detection of where a script block/expression starts and ends. This results in a much leaner syntax than the typical ASP.NET Web Forms alligator (<% %>) tags. Razor uses the @ sign plus standard C# (or Visual Basic) block syntax to delineate code snippets and expressions. Here’s a very simple example of what Razor markup looks like along with some comment annotations: <!DOCTYPE html> <html> <head> <title></title> </head> <body> <h1>Razor Test</h1>  @DateTime.Now <hr />  @DateTime.Now.ToString("T")  @{ List<string> names = new List<string>(); names.Add("Rick"); names.Add("Markus"); names.Add("Claudio"); names.Add("Kevin"); }  <ul> @foreach(string name in names){ <li>@name</li> } </ul>  @if(true) {  <text> true </text>; } else {  <text> false </text>; } </body> </html> Like the Web Forms view engine, Razor parses pages into code, and then executes that run-time compiled code. Effectively a “page” becomes a code file with markup becoming literal text written into the Response stream, code snippets becoming raw code, and expressions being written out with Response.Write(). The code generated from Razor doesn’t look much different from similar Web Forms code that only uses script tags; so although the syntax may look different, the operational model is fairly similar to the Web Forms engine minus the overhead of the large Page object model. However, there are differences: -Razor pages are based on a new base class, Microsoft.WebPages.WebPage, which is hosted in the Microsoft.WebPages assembly that houses all the Razor engine parsing and processing logic. Browsing through the assembly (in the generated ASP.NET Temporary Files folder or GAC) will give you a good idea of the functionality that Razor provides. If you look closely, a lot of the feature set matches ASP.NET MVC’s view implementation as well as many of the helper classes found in MVC. It’s not hard to guess the motivation for this sort of view engine: For beginning developers the simple markup syntax is easier to work with, although you obviously still need to have some understanding of the .NET Framework in order to create dynamic content. The syntax is easier to read and grok and much shorter to type than ASP.NET alligator tags (<% %>) and also easier to understand aesthetically what’s happening in the markup code. Razor also is a better fit for Microsoft’s vision of ASP.NET MVC: It’s a new view engine without the baggage of Web Forms attached to it. The engine is more lightweight since it doesn’t carry all the features and object model of Web Forms with it and it can be instantiated directly outside of the HTTP environment, which has been rather tricky to do for the Web Forms view engine. Having a standalone script parser is a huge win for other applications as well – it makes it much easier to create script or meta driven output generators for many types of applications from code/screen generators, to simple form letters to data merging applications with user customizability. For me personally this is very useful side effect and who knows maybe Microsoft will actually standardize they’re scripting engines (die T4 die!) on this engine. Razor also better fits the “view-based” approach where the view is supposed to be mostly a visual representation that doesn’t hold much, if any, code. While you can still use code, the code you do write has to be self-contained. Overall I wouldn’t be surprised if Razor will become the new standard view engine for MVC in the future – and in fact there have been announcements recently that Razor will become the default script engine in ASP.NET MVC 3.0. Razor can also be used in existing Web Forms and MVC applications, although that’s not working currently unless you manually configure the script mappings and add the appropriate assemblies. It’s possible to do it, but it’s probably better to wait until Microsoft releases official support for Razor scripts in Visual Studio. Once that happens, you can simply drop .cshtml and .vbhtml pages into an existing ASP.NET project and they will work side by side with classic ASP.NET pages. WebMatrix Development Environment To tie all of these three technologies together, Microsoft is shipping WebMatrix with an integrated development environment. An integrated gallery manager makes it easy to download and load existing projects, and then extend them with custom functionality. It seems to be a prominent goal to provide community-oriented content that can act as a starting point, be it via a custom templates or a complete standard application. The IDE includes a project manager that works with a single project and provides an integrated IDE/editor for editing the .cshtml and .vbhtml pages. A run button allows you to quickly run pages in the project manager in a variety of browsers. There’s no debugging support for code at this time. Note that Razor pages don’t require explicit compilation, so making a change, saving, and then refreshing your page in the browser is all that’s needed to see changes while testing an application locally. It’s essentially using the auto-compiling Web Project that was introduced with .NET 2.0. All code is compiled during run time into dynamically created assemblies in the ASP.NET temp folder. WebMatrix also has PHP Editing support with syntax highlighting. You can load various PHP-based applications from the WebMatrix Web Gallery directly into the IDE. Most of the Web Gallery applications are ready to install and run without further configuration, with Wizards taking you through installation of tools, dependencies, and configuration of the database as needed. WebMatrix leverages the Web Platform installer to pull the pieces down from websites in a tight integration of tools that worked nicely for the four or five applications I tried this out on. Click a couple of check boxes and fill in a few simple configuration options and you end up with a running application that’s ready to be customized. Nice! You can easily deploy completed applications via WebDeploy (to an IIS server) or FTP directly from within the development environment. The deploy tool also can handle automatically uploading and installing the database and all related assemblies required, making deployment a simple one-click install step. Simplified Database Access The IDE contains a database editor that can edit SQL Compact and SQL Server databases. There is also a Database helper class that facilitates database access by providing easy-to-use, high-level query execution and iteration methods: @{ var db = Database.OpenFile("FirstApp.sdf"); string sql = "select * from customers where Id > @0"; } <ul> @foreach(var row in db.Query(sql,1)){ <li>@row.FirstName @row.LastName</li> } </ul> The query function takes a SQL statement plus any number of positional (@0,@1 etc.) SQL parameters by simple values. The result is returned as a collection of rows which in turn have a row object with dynamic properties for each of the columns giving easy (though untyped) access to each of the fields. Likewise Execute and ExecuteNonQuery allow execution of more complex queries using similar parameter passing schemes. Note these queries use string-based queries rather than LINQ or Entity Framework’s strongly typed LINQ queries. While this may seem like a step back, it’s also in line with the expectations of non .NET script developers who are quite used to writing and using SQL strings in code rather than using OR/M frameworks. The only question is why was something not included from the beginning in .NET and Microsoft made developers build custom implementations of these basic building blocks. The implementation looks a lot like a DataTable-style data access mechanism, but to be fair, this is a common approach in scripting languages. This type of syntax that uses simple, static, data object methods to perform simple data tasks with one line of code are common in scripting languages and are a good match for folks working in PHP/Python, etc. Seems like Microsoft has taken great advantage of .NET 4.0’s dynamic typing to provide this sort of interface for row iteration where each row has properties for each field. FWIW, all the examples demonstrate using local SQL Compact files - I was unable to get a SQL Server connection string to work with the Database class (the connection string wasn’t accepted). However, since the code in the page is still plain old .NET, you can easily use standard ADO.NET code or even LINQ or Entity Framework models that are created outside of WebMatrix in separate assemblies as required. The good the bad the obnoxious - It’s still .NET The beauty (or curse depending on how you look at it :)) of Razor and the compilation model is that, behind it all, it’s still .NET. Although the syntax may look foreign, it’s still all .NET behind the scenes. You can easily access existing tools, helpers, and utilities simply by adding them to the project as references or to the bin folder. Razor automatically recognizes any assembly reference from assemblies in the bin folder. In the default configuration, Microsoft provides a host of helper functions in a Microsoft.WebPages assembly (check it out in the ASP.NET temp folder for your application), which includes a host of HTML Helpers. If you’ve used ASP.NET MVC before, a lot of the helpers should look familiar. Documentation at the moment is sketchy-there’s a very rough API reference you can check out here: http://www.asp.net/webmatrix/tutorials/asp-net-web-pages-api-reference Who needs WebMatrix? Uhm… good Question Clearly Microsoft is trying hard to create an environment with WebMatrix that is easy to use for newbie developers. The goal seems to be simplicity in providing a minimal development environment and an easy-to-use script engine/language that makes it easy to get started with. There’s also some focus on community features that can be used as starting points, such as Web Gallery applications and templates. The community features in particular are very nice and something that would be nice to eventually see in Visual Studio as well. The question is whether this is too little too late. Developers who have been clamoring for a simpler development environment on the .NET stack have mostly left for other simpler platforms like PHP or Python which are catering to the down and dirty developer. Microsoft will be hard pressed to win those folks-and other hardcore PHP developers-back. Regardless of how much you dress up a script engine fronted by the .NET Framework, it’s still the .NET Framework and all the complexity that drives it. While .NET is a fine solution in its breadth and features once you get a basic handle on the core features, the bar of entry to being productive with the .NET Framework is still pretty high. The MVC style helpers Microsoft provides are a good step in the right direction, but I suspect it’s not enough to shield new developers from having to delve much deeper into the Framework to get even basic applications built. Razor and its helpers is trying to make .NET more accessible but the reality is that in order to do useful stuff that goes beyond the handful of simple helpers you still are going to have to write some C# or VB or other .NET code. If the target is a hobby/amateur/non-programmer the learning curve isn’t made any easier by WebMatrix it’s just been shifted a tad bit further along in your development endeavor when you run out of canned components that are supplied either by Microsoft or the community. The database helpers are interesting and actually I’ve heard a lot of discussion from various developers who’ve been resisting .NET for a really long time perking up at the prospect of easier data access in .NET than the ridiculous amount of code it takes to do even simple data access with raw ADO.NET. It seems sad that such a simple concept and implementation should trigger this sort of response (especially since it’s practically trivial to create helpers like these or pick them up from countless libraries available), but there it is. It also shows that there are plenty of developers out there who are more interested in ‘getting stuff done’ easily than necessarily following the latest and greatest practices which are overkill for many development scenarios. Sometimes it seems that all of .NET is focused on the big life changing issues of development, rather than the bread and butter scenarios that many developers are interested in to get their work accomplished. And that in the end may be WebMatrix’s main raison d'être: To bring some focus back at Microsoft that simpler and more high level solutions are actually needed to appeal to the non-high end developers as well as providing the necessary tools for the high end developers who want to follow the latest and greatest trends. The current version of WebMatrix hits many sweet spots, but it also feels like it has a long way to go before it really can be a tool that a beginning developer or an accomplished developer can feel comfortable with. Although there are some really good ideas in the environment (like the gallery for downloading apps and components) which would be a great addition for Visual Studio as well, the rest of the development environment just feels like crippleware with required functionality missing especially debugging and Intellisense, but also general editor support. It’s not clear whether these are because the product is still in an early alpha release or whether it’s simply designed that way to be a really limited development environment. While simple can be good, nobody wants to feel left out when it comes to necessary tool support and WebMatrix just has that left out feeling to it. If anything WebMatrix’s technology pieces (which are really independent of the WebMatrix product) are what are interesting to developers in general. The compact IIS implementation is a nice improvement for development scenarios and SQL Compact 4.0 seems to address a lot of concerns that people have had and have complained about for some time with previous SQL Compact implementations. By far the most interesting and useful technology though seems to be the Razor view engine for its light weight implementation and it’s decoupling from the ASP.NET/HTTP pipeline to provide a standalone scripting/view engine that is pluggable. The first winner of this is going to be ASP.NET MVC which can now have a cleaner view model that isn’t inconsistent due to the baggage of non-implemented WebForms features that don’t work in MVC. But I expect that Razor will end up in many other applications as a scripting and code generation engine eventually. Visual Studio integration for Razor is currently missing, but is promised for a later release. The ASP.NET MVC team has already mentioned that Razor will eventually become the default MVC view engine, which will guarantee continued growth and development of this tool along those lines. And the Razor engine and support tools actually inherit many of the features that MVC pioneered, so there’s some synergy flowing both ways between Razor and MVC. As an existing ASP.NET developer who’s already familiar with Visual Studio and ASP.NET development, the WebMatrix IDE doesn’t give you anything that you want. The tools provided are minimal and provide nothing that you can’t get in Visual Studio today, except the minimal Razor syntax highlighting, so there’s little need to take a step back. With Visual Studio integration coming later there’s little reason to look at WebMatrix for tooling. It’s good to see that Microsoft is giving some thought about the ease of use of .NET as a platform For so many years, we’ve been piling on more and more new features without trying to take a step back and see how complicated the development/configuration/deployment process has become. Sometimes it’s good to take a step - or several steps - back and take another look and realize just how far we’ve come. WebMatrix is one of those reminders and one that likely will result in some positive changes on the platform as a whole. © Rick Strahl, West Wind Technologies, 2005-2010Posted in ASP.NET IIS7

Read the article
Sniffing out SQL Code Smells: Inconsistent use of Symbolic names and Datatypes

- by Phil Factor

It is an awkward feeling. You’ve just delivered a database application that seems to be working fine in production, and you just run a few checks on it. You discover that there is a potential bug that, out of sheer good chance, hasn’t kicked in to produce an error; but it lurks, like a smoking bomb. Worse, maybe you find that the bug has started its evil work of corrupting the data, but in ways that nobody has, so far detected. You investigate, and find the damage. You are somehow going to have to repair it. Yes, it still very occasionally happens to me. It is not a nice feeling, and I do anything I can to prevent it happening. That’s why I’m interested in SQL code smells. SQL Code Smells aren’t necessarily bad practices, but just show you where to focus your attention when checking an application. Sometimes with databases the bugs can be subtle. SQL is rather like HTML: the language does its best to try to carry out your wishes, rather than to be picky about your bugs. Most of the time, this is a great benefit, but not always. One particular place where this can be detrimental is where you have implicit conversion between different data types. Most of the time it is completely harmless but we’re concerned about the occasional time it isn’t. Let’s give an example: String truncation. Let’s give another even more frightening one, rounding errors on assignment to a number of different precision. Each requires a blog-post to explain in detail and I’m not now going to try. Just remember that it is not always a good idea to assign data to variables, parameters or even columns when they aren’t the same datatype, especially if you are relying on implicit conversion to work its magic.For details of the problem and the consequences, see here: SR0014: Data loss might occur when casting from {Type1} to {Type2} . For any experienced Database Developer, this is a more frightening read than a Vampire Story. This is why one of the SQL Code Smells that makes me edgy, in my own or other peoples’ code, is to see parameters, variables and columns that have the same names and different datatypes. Whereas quite a lot of this is perfectly normal and natural, you need to check in case one of two things have gone wrong. Either sloppy naming, or mixed datatypes. Sure it is hard to remember whether you decided that the length of a log entry was 80 or 100 characters long, or the precision of a number. That is why a little check like this I’m going to show you is excellent for tidying up your code before you check it back into source Control! 1/ Checking Parameters only If you were just going to check parameters, you might just do this. It simply groups all the parameters, either input or output, of all the routines (e.g. stored procedures or functions) by their name and checks to see, in the HAVING clause, whether their data types are all the same. If not, it lists all the examples and their origin (the routine) Even this little check can occasionally be scarily revealing. ;WITH userParameter AS ( SELECT c.NAME AS ParameterName, OBJECT_SCHEMA_NAME(c.object_ID) + '.' + OBJECT_NAME(c.object_ID) AS ObjectName, t.name + ' ' + CASE --we may have to put in the length WHEN t.name IN ('char', 'varchar', 'nchar', 'nvarchar') THEN '(' + CASE WHEN c.max_length = -1 THEN 'MAX' ELSE CONVERT(VARCHAR(4), CASE WHEN t.name IN ('nchar', 'nvarchar') THEN c.max_length / 2 ELSE c.max_length END) END + ')' WHEN t.name IN ('decimal', 'numeric') THEN '(' + CONVERT(VARCHAR(4), c.precision) + ',' + CONVERT(VARCHAR(4), c.Scale) + ')' ELSE '' END --we've done with putting in the length + CASE WHEN XML_collection_ID <> 0 THEN --deal with object schema names '(' + CASE WHEN is_XML_Document = 1 THEN 'DOCUMENT ' ELSE 'CONTENT ' END + COALESCE( (SELECT QUOTENAME(ss.name) + '.' + QUOTENAME(sc.name) FROM sys.xml_schema_collections sc INNER JOIN Sys.Schemas ss ON sc.schema_ID = ss.schema_ID WHERE sc.xml_collection_ID = c.XML_collection_ID),'NULL') + ')' ELSE '' END AS [DataType] FROM sys.parameters c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID WHERE OBJECT_SCHEMA_NAME(c.object_ID) <> 'sys' AND parameter_id>0)SELECT CONVERT(CHAR(80),objectName+'.'+ParameterName),DataType FROM UserParameterWHERE ParameterName IN (SELECT ParameterName FROM UserParameter GROUP BY ParameterName HAVING MIN(Datatype)<>MAX(DataType))ORDER BY ParameterName so, in a very small example here, we have a @ClosingDelimiter variable that is only CHAR(1) when, by the looks of it, it should be up to ten characters long, or even worse, a function that should be a char(1) and seems to let in a string of ten characters. Worth investigating. Then we have a @Comment variable that can't decide whether it is a VARCHAR(2000) or a VARCHAR(MAX) 2/ Columns and Parameters Actually, once we’ve cleared up the mess we’ve made of our parameter-naming in the database we’re inspecting, we’re going to be more interested in listing both columns and parameters. We can do this by modifying the routine to list columns as well as parameters. Because of the slight complexity of creating the string version of the datatypes, we will create a fake table of both columns and parameters so that they can both be processed the same way. After all, we want the datatypes to match Unfortunately, parameters do not expose all the attributes we are interested in, such as whether they are nullable (oh yes, subtle bugs happen if this isn’t consistent for a datatype). We’ll have to leave them out for this check. Voila! A slight modification of the first routine ;WITH userObject AS ( SELECT Name AS DataName,--the actual name of the parameter or column ('@' removed) --and the qualified object name of the routine OBJECT_SCHEMA_NAME(ObjectID) + '.' + OBJECT_NAME(ObjectID) AS ObjectName, --now the harder bit: the definition of the datatype. TypeName + ' ' + CASE --we may have to put in the length. e.g. CHAR (10) WHEN TypeName IN ('char', 'varchar', 'nchar', 'nvarchar') THEN '(' + CASE WHEN MaxLength = -1 THEN 'MAX' ELSE CONVERT(VARCHAR(4), CASE WHEN TypeName IN ('nchar', 'nvarchar') THEN MaxLength / 2 ELSE MaxLength END) END + ')' WHEN TypeName IN ('decimal', 'numeric')--a BCD number! THEN '(' + CONVERT(VARCHAR(4), Precision) + ',' + CONVERT(VARCHAR(4), Scale) + ')' ELSE '' END --we've done with putting in the length + CASE WHEN XML_collection_ID <> 0 --tush tush. XML THEN --deal with object schema names '(' + CASE WHEN is_XML_Document = 1 THEN 'DOCUMENT ' ELSE 'CONTENT ' END + COALESCE( (SELECT TOP 1 QUOTENAME(ss.name) + '.' + QUOTENAME(sc.Name) FROM sys.xml_schema_collections sc INNER JOIN Sys.Schemas ss ON sc.schema_ID = ss.schema_ID WHERE sc.xml_collection_ID = XML_collection_ID),'NULL') + ')' ELSE '' END AS [DataType], DataObjectType FROM (Select t.name AS TypeName, REPLACE(c.name,'@','') AS Name, c.max_length AS MaxLength, c.precision AS [Precision], c.scale AS [Scale], c.[Object_id] AS ObjectID, XML_collection_ID, is_XML_Document,'P' AS DataobjectType FROM sys.parameters c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID AND parameter_id>0 UNION all Select t.name AS TypeName, c.name AS Name, c.max_length AS MaxLength, c.precision AS [Precision], c.scale AS [Scale], c.[Object_id] AS ObjectID, XML_collection_ID,is_XML_Document, 'C' AS DataobjectType FROM sys.columns c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID WHERE OBJECT_SCHEMA_NAME(c.object_ID) <> 'sys' )f)SELECT CONVERT(CHAR(80),objectName+'.' + CASE WHEN DataobjectType ='P' THEN '@' ELSE '' END + DataName),DataType FROM UserObjectWHERE DataName IN (SELECT DataName FROM UserObject GROUP BY DataName HAVING MIN(Datatype)<>MAX(DataType))ORDER BY DataName Hmm. I can tell you I found quite a few minor issues with the various tabases I tested this on, and found some potential bugs that really leap out at you from the results. Here is the start of the result for AdventureWorks. Yes, AccountNumber is, for some reason, a Varchar(10) in the Customer table. Hmm. odd. Why is a city fifty characters long in that view? The idea of the description of a colour being 256 characters long seems over-ambitious. Go down the list and you'll spot other mistakes. There are no bugs, but just mess. We started out with a listing to examine parameters, then we mixed parameters and columns. Our last listing is for a slightly more in-depth look at table columns. You’ll notice that we’ve delibarately removed the indication of whether a column is persisted, or is an identity column because that gives us false positives for our code smells. If you just want to browse your metadata for other reasons (and it can quite help in some circumstances) then uncomment them! ;WITH userColumns AS ( SELECT c.NAME AS columnName, OBJECT_SCHEMA_NAME(c.object_ID) + '.' + OBJECT_NAME(c.object_ID) AS ObjectName, REPLACE(t.name + ' ' + CASE WHEN is_computed = 1 THEN ' AS ' + --do DDL for a computed column (SELECT definition FROM sys.computed_columns cc WHERE cc.object_id = c.object_id AND cc.column_ID = c.column_ID) --we may have to put in the length WHEN t.Name IN ('char', 'varchar', 'nchar', 'nvarchar') THEN '(' + CASE WHEN c.Max_Length = -1 THEN 'MAX' ELSE CONVERT(VARCHAR(4), CASE WHEN t.Name IN ('nchar', 'nvarchar') THEN c.Max_Length / 2 ELSE c.Max_Length END) END + ')' WHEN t.name IN ('decimal', 'numeric') THEN '(' + CONVERT(VARCHAR(4), c.precision) + ',' + CONVERT(VARCHAR(4), c.Scale) + ')' ELSE '' END + CASE WHEN c.is_rowguidcol = 1 THEN ' ROWGUIDCOL' ELSE '' END + CASE WHEN XML_collection_ID <> 0 THEN --deal with object schema names '(' + CASE WHEN is_XML_Document = 1 THEN 'DOCUMENT ' ELSE 'CONTENT ' END + COALESCE((SELECT QUOTENAME(ss.name) + '.' + QUOTENAME(sc.name) FROM sys.xml_schema_collections sc INNER JOIN Sys.Schemas ss ON sc.schema_ID = ss.schema_ID WHERE sc.xml_collection_ID = c.XML_collection_ID), 'NULL') + ')' ELSE '' END + CASE WHEN is_identity = 1 THEN CASE WHEN OBJECTPROPERTY(object_id, 'IsUserTable') = 1 AND COLUMNPROPERTY(object_id, c.name, 'IsIDNotForRepl') = 0 AND OBJECTPROPERTY(object_id, 'IsMSShipped') = 0 THEN '' ELSE ' NOT FOR REPLICATION ' END ELSE '' END + CASE WHEN c.is_nullable = 0 THEN ' NOT NULL' ELSE ' NULL' END + CASE WHEN c.default_object_id <> 0 THEN ' DEFAULT ' + object_Definition(c.default_object_id) ELSE '' END + CASE WHEN c.collation_name IS NULL THEN '' WHEN c.collation_name <> (SELECT collation_name FROM sys.databases WHERE name = DB_NAME()) COLLATE Latin1_General_CI_AS THEN COALESCE(' COLLATE ' + c.collation_name, '') ELSE '' END,' ',' ') AS [DataType]FROM sys.columns c INNER JOIN sys.types t ON c.user_Type_ID = t.user_Type_ID WHERE OBJECT_SCHEMA_NAME(c.object_ID) <> 'sys')SELECT CONVERT(CHAR(80),objectName+'.'+columnName),DataType FROM UserColumnsWHERE columnName IN (SELECT columnName FROM UserColumns GROUP BY columnName HAVING MIN(Datatype)<>MAX(DataType))ORDER BY columnName If you take a look down the results against Adventureworks, you'll see once again that there are things to investigate, mostly, in the illustration, discrepancies between null and non-null datatypes So I here you ask, what about temporary variables within routines? If ever there was a source of elusive bugs, you'll find it there. Sadly, these temporary variables are not stored in the metadata so we'll have to find a more subtle way of flushing these out, and that will, I'm afraid, have to wait!

Read the article
Implementing an async "read all currently available data from stream" operation

- by Jon

I recently provided an answer to this question: C# - Realtime console output redirection. As often happens, explaining stuff (here "stuff" was how I tackled a similar problem) leads you to greater understanding and/or, as is the case here, "oops" moments. I realized that my solution, as implemented, has a bug. The bug has little practical importance, but it has an extremely large importance to me as a developer: I can't rest easy knowing that my code has the potential to blow up. Squashing the bug is the purpose of this question. I apologize for the long intro, so let's get dirty. I wanted to build a class that allows me to receive input from a console's standard output Stream. Console output streams are of type FileStream; the implementation can cast to that, if needed. There is also an associated StreamReader already present to leverage. There is only one thing I need to implement in this class to achieve my desired functionality: an async "read all the data available this moment" operation. Reading to the end of the stream is not viable because the stream will not end unless the process closes the console output handle, and it will not do that because it is interactive and expecting input before continuing. I will be using that hypothetical async operation to implement event-based notification, which will be more convenient for my callers. The public interface of the class is this: public class ConsoleAutomator { public event EventHandler<ConsoleOutputReadEventArgs> StandardOutputRead; public void StartSendingEvents(); public void StopSendingEvents(); } StartSendingEvents and StopSendingEvents do what they advertise; for the purposes of this discussion, we can assume that events are always being sent without loss of generality. The class uses these two fields internally: protected readonly StringBuilder inputAccumulator = new StringBuilder(); protected readonly byte[] buffer = new byte[256]; The functionality of the class is implemented in the methods below. To get the ball rolling: public void StartSendingEvents(); { this.stopAutomation = false; this.BeginReadAsync(); } To read data out of the Stream without blocking, and also without requiring a carriage return char, BeginRead is called: protected void BeginReadAsync() { if (!this.stopAutomation) { this.StandardOutput.BaseStream.BeginRead( this.buffer, 0, this.buffer.Length, this.ReadHappened, null); } } The challenging part: BeginRead requires using a buffer. This means that when reading from the stream, it is possible that the bytes available to read ("incoming chunk") are larger than the buffer. Remember that the goal here is to read all of the chunk and call event subscribers exactly once for each chunk. To this end, if the buffer is full after EndRead, we don't send its contents to subscribers immediately but instead append them to a StringBuilder. The contents of the StringBuilder are only sent back whenever there is no more to read from the stream. private void ReadHappened(IAsyncResult asyncResult) { var bytesRead = this.StandardOutput.BaseStream.EndRead(asyncResult); if (bytesRead == 0) { this.OnAutomationStopped(); return; } var input = this.StandardOutput.CurrentEncoding.GetString( this.buffer, 0, bytesRead); this.inputAccumulator.Append(input); if (bytesRead < this.buffer.Length) { this.OnInputRead(); // only send back if we 're sure we got it all } this.BeginReadAsync(); // continue "looping" with BeginRead } After any read which is not enough to fill the buffer (in which case we know that there was no more data to be read during the last read operation), all accumulated data is sent to the subscribers: private void OnInputRead() { var handler = this.StandardOutputRead; if (handler == null) { return; } handler(this, new ConsoleOutputReadEventArgs(this.inputAccumulator.ToString())); this.inputAccumulator.Clear(); } (I know that as long as there are no subscribers the data gets accumulated forever. This is a deliberate decision). The good This scheme works almost perfectly: Async functionality without spawning any threads Very convenient to the calling code (just subscribe to an event) Never more than one event for each time data is available to be read Is almost agnostic to the buffer size The bad That last almost is a very big one. Consider what happens when there is an incoming chunk with length exactly equal to the size of the buffer. The chunk will be read and buffered, but the event will not be triggered. This will be followed up by a BeginRead that expects to find more data belonging to the current chunk in order to send it back all in one piece, but... there will be no more data in the stream. In fact, as long as data is put into the stream in chunks with length exactly equal to the buffer size, the data will be buffered and the event will never be triggered. This scenario may be highly unlikely to occur in practice, especially since we can pick any number for the buffer size, but the problem is there. Solution? Unfortunately, after checking the available methods on FileStream and StreamReader, I can't find anything which lets me peek into the stream while also allowing async methods to be used on it. One "solution" would be to have a thread wait on a ManualResetEvent after the "buffer filled" condition is detected. If the event is not signaled (by the async callback) in a small amount of time, then more data from the stream will not be forthcoming and the data accumulated so far should be sent to subscribers. However, this introduces the need for another thread, requires thread synchronization, and is plain inelegant. Specifying a timeout for BeginRead would also suffice (call back into my code every now and then so I can check if there's data to be sent back; most of the time there will not be anything to do, so I expect the performance hit to be negligible). But it looks like timeouts are not supported in FileStream. Since I imagine that async calls with timeouts are an option in bare Win32, another approach might be to PInvoke the hell out of the problem. But this is also undesirable as it will introduce complexity and simply be a pain to code. Is there an elegant way to get around the problem? Thanks for being patient enough to read all of this. Update: I definitely did not communicate the scenario well in my initial writeup. I have since revised the writeup quite a bit, but to be extra sure: The question is about how to implement an async "read all the data available this moment" operation. My apologies to the people who took the time to read and answer without me making my intent clear enough.

Read the article
Implementing a robust async stream reader

- by Jon

I recently provided an answer to this question: C# - Realtime console output redirection. As often happens, explaining stuff (here "stuff" was how I tackled a similar problem) leads you to greater understanding and/or, as is the case here, "oops" moments. I realized that my solution, as implemented, has a bug. The bug has little practical importance, but it has an extremely large importance to me as a developer: I can't rest easy knowing that my code has the potential to blow up. Squashing the bug is the purpose of this question. I apologize for the long intro, so let's get dirty. I wanted to build a class that allows me to receive input from a Stream in an event-based manner. The stream, in my scenario, is guaranteed to be a FileStream and there is also an associated StreamReader already present to leverage. The public interface of the class is this: public class MyStreamManager { public event EventHandler<ConsoleOutputReadEventArgs> StandardOutputRead; public void StartSendingEvents(); public void StopSendingEvents(); } Obviously this specific scenario has to do with a console's standard output, but that is a detail and does not play an important role. StartSendingEvents and StopSendingEvents do what they advertise; for the purposes of this discussion, we can assume that events are always being sent without loss of generality. The class uses these two fields internally: protected readonly StringBuilder inputAccumulator = new StringBuilder(); protected readonly byte[] buffer = new byte[256]; The functionality of the class is implemented in the methods below. To get the ball rolling: public void StartSendingEvents(); { this.stopAutomation = false; this.BeginReadAsync(); } To read data out of the Stream without blocking, and also without requiring a carriage return char, BeginRead is called: protected void BeginReadAsync() { if (!this.stopAutomation) { this.StandardOutput.BaseStream.BeginRead( this.buffer, 0, this.buffer.Length, this.ReadHappened, null); } } The challenging part: BeginRead requires using a buffer. This means that when reading from the stream, it is possible that the bytes available to read ("incoming chunk") are larger than the buffer. Since we are only handing off data from the stream to a consumer, and that consumer may well have inside knowledge about the size and/or format of these chunks, I want to call event subscribers exactly once for each chunk. Otherwise the abstraction breaks down and the subscribers have to buffer the incoming data and reconstruct the chunks themselves using said knowledge. This is much less convenient to the calling code, and detracts from the usefulness of my class. To this end, if the buffer is full after EndRead, we don't send its contents to subscribers immediately but instead append them to a StringBuilder. The contents of the StringBuilder are only sent back whenever there is no more to read from the stream (thus preserving the chunks). private void ReadHappened(IAsyncResult asyncResult) { var bytesRead = this.StandardOutput.BaseStream.EndRead(asyncResult); if (bytesRead == 0) { this.OnAutomationStopped(); return; } var input = this.StandardOutput.CurrentEncoding.GetString( this.buffer, 0, bytesRead); this.inputAccumulator.Append(input); if (bytesRead < this.buffer.Length) { this.OnInputRead(); // only send back if we 're sure we got it all } this.BeginReadAsync(); // continue "looping" with BeginRead } After any read which is not enough to fill the buffer, all accumulated data is sent to the subscribers: private void OnInputRead() { var handler = this.StandardOutputRead; if (handler == null) { return; } handler(this, new ConsoleOutputReadEventArgs(this.inputAccumulator.ToString())); this.inputAccumulator.Clear(); } (I know that as long as there are no subscribers the data gets accumulated forever. This is a deliberate decision). The good This scheme works almost perfectly: Async functionality without spawning any threads Very convenient to the calling code (just subscribe to an event) Maintains the "chunkiness" of the data; this allows the calling code to use inside knowledge of the data without doing any extra work Is almost agnostic to the buffer size (it will work correctly with any size buffer irrespective of the data being read) The bad That last almost is a very big one. Consider what happens when there is an incoming chunk with length exactly equal to the size of the buffer. The chunk will be read and buffered, but the event will not be triggered. This will be followed up by a BeginRead that expects to find more data belonging to the current chunk in order to send it back all in one piece, but... there will be no more data in the stream. In fact, as long as data is put into the stream in chunks with length exactly equal to the buffer size, the data will be buffered and the event will never be triggered. This scenario may be highly unlikely to occur in practice, especially since we can pick any number for the buffer size, but the problem is there. Solution? Unfortunately, after checking the available methods on FileStream and StreamReader, I can't find anything which lets me peek into the stream while also allowing async methods to be used on it. One "solution" would be to have a thread wait on a ManualResetEvent after the "buffer filled" condition is detected. If the event is not signaled (by the async callback) in a small amount of time, then more data from the stream will not be forthcoming and the data accumulated so far should be sent to subscribers. However, this introduces the need for another thread, requires thread synchronization, and is plain inelegant. Specifying a timeout for BeginRead would also suffice (call back into my code every now and then so I can check if there's data to be sent back; most of the time there will not be anything to do, so I expect the performance hit to be negligible). But it looks like timeouts are not supported in FileStream. Since I imagine that async calls with timeouts are an option in bare Win32, another approach might be to PInvoke the hell out of the problem. But this is also undesirable as it will introduce complexity and simply be a pain to code. Is there an elegant way to get around the problem? Thanks for being patient enough to read all of this.

Read the article
Implementing a robust async stream reader for a console

- by Jon

I recently provided an answer to this question: C# - Realtime console output redirection. As often happens, explaining stuff (here "stuff" was how I tackled a similar problem) leads you to greater understanding and/or, as is the case here, "oops" moments. I realized that my solution, as implemented, has a bug. The bug has little practical importance, but it has an extremely large importance to me as a developer: I can't rest easy knowing that my code has the potential to blow up. Squashing the bug is the purpose of this question. I apologize for the long intro, so let's get dirty. I wanted to build a class that allows me to receive input from a Stream in an event-based manner. The stream, in my scenario, is guaranteed to be a FileStream and there is also an associated StreamReader already present to leverage. The public interface of the class is this: public class MyStreamManager { public event EventHandler<ConsoleOutputReadEventArgs> StandardOutputRead; public void StartSendingEvents(); public void StopSendingEvents(); } Obviously this specific scenario has to do with a console's standard output. StartSendingEvents and StopSendingEvents do what they advertise; for the purposes of this discussion, we can assume that events are always being sent without loss of generality. The class uses these two fields internally: protected readonly StringBuilder inputAccumulator = new StringBuilder(); protected readonly byte[] buffer = new byte[256]; The functionality of the class is implemented in the methods below. To get the ball rolling: public void StartSendingEvents(); { this.stopAutomation = false; this.BeginReadAsync(); } To read data out of the Stream without blocking, and also without requiring a carriage return char, BeginRead is called: protected void BeginReadAsync() { if (!this.stopAutomation) { this.StandardOutput.BaseStream.BeginRead( this.buffer, 0, this.buffer.Length, this.ReadHappened, null); } } The challenging part: BeginRead requires using a buffer. This means that when reading from the stream, it is possible that the bytes available to read ("incoming chunk") are larger than the buffer. Since we are only handing off data from the stream to a consumer, and that consumer may well have inside knowledge about the size and/or format of these chunks, I want to call event subscribers exactly once for each chunk. Otherwise the abstraction breaks down and the subscribers have to buffer the incoming data and reconstruct the chunks themselves using said knowledge. This is much less convenient to the calling code, and detracts from the usefulness of my class. Edit: There are comments below correctly stating that since the data is coming from a stream, there is absolutely nothing that the receiver can infer about the structure of the data unless it is fully prepared to parse it. What I am trying to do here is leverage the "flush the output" "structure" that the owner of the console imparts while writing on it. I am prepared to assume (better: allow my caller to have the option to assume) that the OS will pass me the data written between two flushes of the stream in exactly one piece. To this end, if the buffer is full after EndRead, we don't send its contents to subscribers immediately but instead append them to a StringBuilder. The contents of the StringBuilder are only sent back whenever there is no more to read from the stream (thus preserving the chunks). private void ReadHappened(IAsyncResult asyncResult) { var bytesRead = this.StandardOutput.BaseStream.EndRead(asyncResult); if (bytesRead == 0) { this.OnAutomationStopped(); return; } var input = this.StandardOutput.CurrentEncoding.GetString( this.buffer, 0, bytesRead); this.inputAccumulator.Append(input); if (bytesRead < this.buffer.Length) { this.OnInputRead(); // only send back if we 're sure we got it all } this.BeginReadAsync(); // continue "looping" with BeginRead } After any read which is not enough to fill the buffer, all accumulated data is sent to the subscribers: private void OnInputRead() { var handler = this.StandardOutputRead; if (handler == null) { return; } handler(this, new ConsoleOutputReadEventArgs(this.inputAccumulator.ToString())); this.inputAccumulator.Clear(); } (I know that as long as there are no subscribers the data gets accumulated forever. This is a deliberate decision). The good This scheme works almost perfectly: Async functionality without spawning any threads Very convenient to the calling code (just subscribe to an event) Maintains the "chunkiness" of the data; this allows the calling code to use inside knowledge of the data without doing any extra work Is almost agnostic to the buffer size (it will work correctly with any size buffer irrespective of the data being read) The bad That last almost is a very big one. Consider what happens when there is an incoming chunk with length exactly equal to the size of the buffer. The chunk will be read and buffered, but the event will not be triggered. This will be followed up by a BeginRead that expects to find more data belonging to the current chunk in order to send it back all in one piece, but... there will be no more data in the stream. In fact, as long as data is put into the stream in chunks with length exactly equal to the buffer size, the data will be buffered and the event will never be triggered. This scenario may be highly unlikely to occur in practice, especially since we can pick any number for the buffer size, but the problem is there. Solution? Unfortunately, after checking the available methods on FileStream and StreamReader, I can't find anything which lets me peek into the stream while also allowing async methods to be used on it. One "solution" would be to have a thread wait on a ManualResetEvent after the "buffer filled" condition is detected. If the event is not signaled (by the async callback) in a small amount of time, then more data from the stream will not be forthcoming and the data accumulated so far should be sent to subscribers. However, this introduces the need for another thread, requires thread synchronization, and is plain inelegant. Specifying a timeout for BeginRead would also suffice (call back into my code every now and then so I can check if there's data to be sent back; most of the time there will not be anything to do, so I expect the performance hit to be negligible). But it looks like timeouts are not supported in FileStream. Since I imagine that async calls with timeouts are an option in bare Win32, another approach might be to PInvoke the hell out of the problem. But this is also undesirable as it will introduce complexity and simply be a pain to code. Is there an elegant way to get around the problem? Thanks for being patient enough to read all of this.

Read the article
Using the West Wind Web Toolkit to set up AJAX and REST Services

- by Rick Strahl

I frequently get questions about which option to use for creating AJAX and REST backends for ASP.NET applications. There are many solutions out there to do this actually, but when I have a choice - not surprisingly - I fall back to my own tools in the West Wind West Wind Web Toolkit. I've talked a bunch about the 'in-the-box' solutions in the past so for a change in this post I'll talk about the tools that I use in my own and customer applications to handle AJAX and REST based access to service resources using the West Wind West Wind Web Toolkit. Let me preface this by saying that I like things to be easy. Yes flexible is very important as well but not at the expense of over-complexity. The goal I've had with my tools is make it drop dead easy, with good performance while providing the core features that I'm after, which are: Easy AJAX/JSON Callbacks Ability to return any kind of non JSON content (string, stream, byte[], images) Ability to work with both XML and JSON interchangeably for input/output Access endpoints via POST data, RPC JSON calls, GET QueryString values or Routing interface Easy to use generic JavaScript client to make RPC calls (same syntax, just what you need) Ability to create clean URLS with Routing Ability to use standard ASP.NET HTTP Stack for HTTP semantics It's all about options! In this post I'll demonstrate most of these features (except XML) in a few simple and short samples which you can download. So let's take a look and see how you can build an AJAX callback solution with the West Wind Web Toolkit. Installing the Toolkit Assemblies The easiest and leanest way of using the Toolkit in your Web project is to grab it via NuGet: West Wind Web and AJAX Utilities (Westwind.Web) and drop it into the project by right clicking in your Project and choosing Manage NuGet Packages from anywhere in the Project. When done you end up with your project looking like this: What just happened? Nuget added two assemblies - Westwind.Web and Westwind.Utilities and the client ww.jquery.js library. It also added a couple of references into web.config: The default namespaces so they can be accessed in pages/views and a ScriptCompressionModule that the toolkit optionally uses to compress script resources served from within the assembly (namely ww.jquery.js and optionally jquery.js). Creating a new Service The West Wind Web Toolkit supports several ways of creating and accessing AJAX services, but for this post I'll stick to the lower level approach that works from any plain HTML page or of course MVC, WebForms, WebPages. There's also a WebForms specific control that makes this even easier but I'll leave that for another post. So, to create a new standalone AJAX/REST service we can create a new HttpHandler in the new project either as a pure class based handler or as a generic .ASHX handler. Both work equally well, but generic handlers don't require any web.config configuration so I'll use that here. In the root of the project add a Generic Handler. I'm going to call this one StockService.ashx. Once the handler has been created, edit the code and remove all of the handler body code. Then change the base class to CallbackHandler and add methods that have a [CallbackMethod] attribute. Here's the modified base handler implementation now looks like with an added HelloWorld method: using System; using Westwind.Web; namespace WestWindWebAjax { /// <summary> /// Handler implements CallbackHandler to provide REST/AJAX services /// </summary> public class SampleService : CallbackHandler { [CallbackMethod] public string HelloWorld(string name) { return "Hello " + name + ". Time is: " + DateTime.Now.ToString(); } } } Notice that the class inherits from CallbackHandler and that the HelloWorld service method is marked up with [CallbackMethod]. We're done here. Services Urlbased Syntax Once you compile, the 'service' is live can respond to requests. All CallbackHandlers support input in GET and POST formats, and can return results as JSON or XML. To check our fancy HelloWorld method we can now access the service like this: http://localhost/WestWindWebAjax/StockService.ashx?Method=HelloWorld&name=Rick which produces a default JSON response - in this case a string (wrapped in quotes as it's JSON): (note by default JSON will be downloaded by most browsers not displayed - various options are available to view JSON right in the browser) If I want to return the same data as XML I can tack on a &format=xml at the end of the querystring which produces: <string>Hello Rick. Time is: 11/1/2011 12:11:13 PM</string> Cleaner URLs with Routing Syntax If you want cleaner URLs for each operation you can also configure custom routes on a per URL basis similar to the way that WCF REST does. To do this you need to add a new RouteHandler to your application's startup code in global.asax.cs one for each CallbackHandler based service you create: protected void Application_Start(object sender, EventArgs e) { CallbackHandlerRouteHandler.RegisterRoutes<StockService>(RouteTable.Routes); } With this code in place you can now add RouteUrl properties to any of your service methods. For the HelloWorld method that doesn't make a ton of sense but here is what a routed clean URL might look like in definition: [CallbackMethod(RouteUrl="stocks/HelloWorld/{name}")] public string HelloWorld(string name) { return "Hello " + name + ". Time is: " + DateTime.Now.ToString(); } The same URL I previously used now becomes a bit shorter and more readable with: http://localhost/WestWindWebAjax/HelloWorld/Rick It's an easy way to create cleaner URLs and still get the same functionality. Calling the Service with $.getJSON() Since the result produced is JSON you can now easily consume this data using jQuery's getJSON method. First we need a couple of scripts - jquery.js and ww.jquery.js in the page: <!DOCTYPE html> <html> <head> <link href="Css/Westwind.css" rel="stylesheet" type="text/css" /> <script src="scripts/jquery.min.js" type="text/javascript"></script> <script src="scripts/ww.jquery.min.js" type="text/javascript"></script> </head> <body> Next let's add a small HelloWorld example form (what else) that has a single textbox to type a name, a button and a div tag to receive the result: <fieldset> <legend>Hello World</legend> Please enter a name: <input type="text" name="txtHello" id="txtHello" value="" /> <input type="button" id="btnSayHello" value="Say Hello (POST)" /> <input type="button" id="btnSayHelloGet" value="Say Hello (GET)" /> <div id="divHelloMessage" class="errordisplay" style="display:none;width: 450px;" > </div> </fieldset> Then to call the HelloWorld method a little jQuery is used to hook the document startup and the button click followed by the $.getJSON call to retrieve the data from the server. <script type="text/javascript"> $(document).ready(function () { $("#btnSayHelloGet").click(function () { $.getJSON("SampleService.ashx", { Method: "HelloWorld", name: $("#txtHello").val() }, function (result) { $("#divHelloMessage") .text(result) .fadeIn(1000); }); });</script> .getJSON() expects a full URL to the endpoint of our service, which is the ASHX file. We can either provide a full URL (SampleService.ashx?Method=HelloWorld&name=Rick) or we can just provide the base URL and an object that encodes the query string parameters for us using an object map that has a property that matches each parameter for the server method. We can also use the clean URL routing syntax, but using the object parameter encoding actually is safer as the parameters will get properly encoded by jQuery. The result returned is whatever the result on the server method is - in this case a string. The string is applied to the divHelloMessage element and we're done. Obviously this is a trivial example, but it demonstrates the basics of getting a JSON response back to the browser. AJAX Post Syntax - using ajaxCallMethod() The previous example allows you basic control over the data that you send to the server via querystring parameters. This works OK for simple values like short strings, numbers and boolean values, but doesn't really work if you need to pass something more complex like an object or an array back up to the server. To handle traditional RPC type messaging where the idea is to map server side functions and results to a client side invokation, POST operations can be used. The easiest way to use this functionality is to use ww.jquery.js and the ajaxCallMethod() function. ww.jquery wraps jQuery's AJAX functions and knows implicitly how to call a CallbackServer method with parameters and parse the result. Let's look at another simple example that posts a simple value but returns something more interesting. Let's start with the service method: [CallbackMethod(RouteUrl="stocks/{symbol}")] public StockQuote GetStockQuote(string symbol) { Response.Cache.SetExpires(DateTime.UtcNow.Add(new TimeSpan(0, 2, 0))); StockServer server = new StockServer(); var quote = server.GetStockQuote(symbol); if (quote == null) throw new ApplicationException("Invalid Symbol passed."); return quote; } This sample utilizes a small StockServer helper class (included in the sample) that downloads a stock quote from Yahoo's financial site via plain HTTP GET requests and formats it into a StockQuote object. Lets create a small HTML block that lets us query for the quote and display it: <fieldset> <legend>Single Stock Quote</legend> Please enter a stock symbol: <input type="text" name="txtSymbol" id="txtSymbol" value="msft" /> <input type="button" id="btnStockQuote" value="Get Quote" /> <div id="divStockDisplay" class="errordisplay" style="display:none; width: 450px;"> <div class="label-left">Company:</div> <div id="stockCompany"></div> <div class="label-left">Last Price:</div> <div id="stockLastPrice"></div> <div class="label-left">Quote Time:</div> <div id="stockQuoteTime"></div> </div> </fieldset> The final result looks something like this: Let's hook up the button handler to fire the request and fill in the data as shown: $("#btnStockQuote").click(function () { ajaxCallMethod("SampleService.ashx", "GetStockQuote", [$("#txtSymbol").val()], function (quote) { $("#divStockDisplay").show().fadeIn(1000); $("#stockCompany").text(quote.Company + " (" + quote.Symbol + ")"); $("#stockLastPrice").text(quote.LastPrice); $("#stockQuoteTime").text(quote.LastQuoteTime.formatDate("MMM dd, HH:mm EST")); }, onPageError); }); So we point at SampleService.ashx and the GetStockQuote method, passing a single parameter of the input symbol value. Then there are two handlers for success and failure callbacks. The success handler is the interesting part - it receives the stock quote as a result and assigns its values to various 'holes' in the stock display elements. The data that comes back over the wire is JSON and it looks like this: { "Symbol":"MSFT", "Company":"Microsoft Corpora", "OpenPrice":26.11, "LastPrice":26.01, "NetChange":0.02, "LastQuoteTime":"2011-11-03T02:00:00Z", "LastQuoteTimeString":"Nov. 11, 2011 4:20pm" } which is an object representation of the data. JavaScript can evaluate this JSON string back into an object easily and that's the reslut that gets passed to the success function. The quote data is then applied to existing page content by manually selecting items and applying them. There are other ways to do this more elegantly like using templates, but here we're only interested in seeing how the data is returned. The data in the object is typed - LastPrice is a number and QuoteTime is a date. Note about the date value: JavaScript doesn't have a date literal although the JSON embedded ISO string format used above ("2011-11-03T02:00:00Z") is becoming fairly standard for JSON serializers. However, JSON parsers don't deserialize dates by default and return them by string. This is why the StockQuote actually returns a string value of LastQuoteTimeString for the same date. ajaxMethodCallback always converts dates properly into 'real' dates and the example above uses the real date value along with a .formatDate() data extension (also in ww.jquery.js) to display the raw date properly. Errors and Exceptions So what happens if your code fails? For example if I pass an invalid stock symbol to the GetStockQuote() method you notice that the code does this: if (quote == null) throw new ApplicationException("Invalid Symbol passed."); CallbackHandler automatically pushes the exception message back to the client so it's easy to pick up the error message. Regardless of what kind of error occurs: Server side, client side, protocol errors - any error will fire the failure handler with an error object parameter. The error is returned to the client via a JSON response in the error callback. In the previous examples I called onPageError which is a generic routine in ww.jquery that displays a status message on the bottom of the screen. But of course you can also take over the error handling yourself: $("#btnStockQuote").click(function () { ajaxCallMethod("SampleService.ashx", "GetStockQuote", [$("#txtSymbol").val()], function (quote) { $("#divStockDisplay").fadeIn(1000); $("#stockCompany").text(quote.Company + " (" + quote.Symbol + ")"); $("#stockLastPrice").text(quote.LastPrice); $("#stockQuoteTime").text(quote.LastQuoteTime.formatDate("MMM dd, hh:mmt")); }, function (error, xhr) { $("#divErrorDisplay").text(error.message).fadeIn(1000); }); }); The error object has a isCallbackError, message and stackTrace properties, the latter of which is only populated when running in Debug mode, and this object is returned for all errors: Client side, transport and server side errors. Regardless of which type of error you get the same object passed (as well as the XHR instance optionally) which makes for a consistent error retrieval mechanism. Specifying HttpVerbs You can also specify HTTP Verbs that are allowed using the AllowedHttpVerbs option on the CallbackMethod attribute: [CallbackMethod(AllowedHttpVerbs=HttpVerbs.GET | HttpVerbs.POST)] public string HelloWorld(string name) { … } If you're building REST style API's this might be useful to force certain request semantics onto the client calling. For the above if call with a non-allowed HttpVerb the request returns a 405 error response along with a JSON (or XML) error object result. The default behavior is to allow all verbs access (HttpVerbs.All). Passing in object Parameters Up to now the parameters I passed were very simple. But what if you need to send something more complex like an object or an array? Let's look at another example now that passes an object from the client to the server. Keeping with the Stock theme here lets add a method called BuyOrder that lets us buy some shares for a stock. Consider the following service method that receives an StockBuyOrder object as a parameter: [CallbackMethod] public string BuyStock(StockBuyOrder buyOrder) { var server = new StockServer(); var quote = server.GetStockQuote(buyOrder.Symbol); if (quote == null) throw new ApplicationException("Invalid or missing stock symbol."); return string.Format("You're buying {0} shares of {1} ({2}) stock at {3} for a total of {4} on {5}.", buyOrder.Quantity, quote.Company, quote.Symbol, quote.LastPrice.ToString("c"), (quote.LastPrice * buyOrder.Quantity).ToString("c"), buyOrder.BuyOn.ToString("MMM d")); } public class StockBuyOrder { public string Symbol { get; set; } public int Quantity { get; set; } public DateTime BuyOn { get; set; } public StockBuyOrder() { BuyOn = DateTime.Now; } } This is a contrived do-nothing example that simply echoes back what was passed in, but it demonstrates how you can pass complex data to a callback method. On the client side we now have a very simple form that captures the three values on a form: <fieldset> <legend>Post a Stock Buy Order</legend> Enter a symbol: <input type="text" name="txtBuySymbol" id="txtBuySymbol" value="GLD" />   Qty: <input type="text" name="txtBuyQty" id="txtBuyQty" value="10" style="width: 50px" />   Buy on: <input type="text" name="txtBuyOn" id="txtBuyOn" value="<%= DateTime.Now.ToString("d") %>" style="width: 70px;" /> <input type="button" id="btnBuyStock" value="Buy Stock" /> <div id="divStockBuyMessage" class="errordisplay" style="display:none"></div> </fieldset> The completed form and demo then looks something like this: The client side code that picks up the input values and assigns them to object properties and sends the AJAX request looks like this: $("#btnBuyStock").click(function () { // create an object map that matches StockBuyOrder signature var buyOrder = { Symbol: $("#txtBuySymbol").val(), Quantity: $("#txtBuyQty").val() * 1, // number Entered: new Date() } ajaxCallMethod("SampleService.ashx", "BuyStock", [buyOrder], function (result) { $("#divStockBuyMessage").text(result).fadeIn(1000); }, onPageError); }); The code creates an object and attaches the properties that match the server side object passed to the BuyStock method. Each property that you want to update needs to be included and the type must match (ie. string, number, date in this case). Any missing properties will not be set but also not cause any errors. Pass POST data instead of Objects In the last example I collected a bunch of values from form variables and stuffed them into object variables in JavaScript code. While that works, often times this isn't really helping - I end up converting my types on the client and then doing another conversion on the server. If lots of input controls are on a page and you just want to pick up the values on the server via plain POST variables - that can be done too - and it makes sense especially if you're creating and filling the client side object only to push data to the server. Let's add another method to the server that once again lets us buy a stock. But this time let's not accept a parameter but rather send POST data to the server. Here's the server method receiving POST data: [CallbackMethod] public string BuyStockPost() { StockBuyOrder buyOrder = new StockBuyOrder(); buyOrder.Symbol = Request.Form["txtBuySymbol"]; ; int qty; int.TryParse(Request.Form["txtBuyQuantity"], out qty); buyOrder.Quantity = qty; DateTime time; DateTime.TryParse(Request.Form["txtBuyBuyOn"], out time); buyOrder.BuyOn = time; // Or easier way yet //FormVariableBinder.Unbind(buyOrder,null,"txtBuy"); var server = new StockServer(); var quote = server.GetStockQuote(buyOrder.Symbol); if (quote == null) throw new ApplicationException("Invalid or missing stock symbol."); return string.Format("You're buying {0} shares of {1} ({2}) stock at {3} for a total of {4} on {5}.", buyOrder.Quantity, quote.Company, quote.Symbol, quote.LastPrice.ToString("c"), (quote.LastPrice * buyOrder.Quantity).ToString("c"), buyOrder.BuyOn.ToString("MMM d")); } Clearly we've made this server method take more code than it did with the object parameter. We've basically moved the parameter assignment logic from the client to the server. As a result the client code to call this method is now a bit shorter since there's no client side shuffling of values from the controls to an object. $("#btnBuyStockPost").click(function () { ajaxCallMethod("SampleService.ashx", "BuyStockPost", [], // Note: No parameters - function (result) { $("#divStockBuyMessage").text(result).fadeIn(1000); }, onPageError, // Force all page Form Variables to be posted { postbackMode: "Post" }); }); The client simply calls the BuyStockQuote method and pushes all the form variables from the page up to the server which parses them instead. The feature that makes this work is one of the options you can pass to the ajaxCallMethod() function: { postbackMode: "Post" }); which directs the function to include form variable POST data when making the service call. Other options include PostNoViewState (for WebForms to strip out WebForms crap vars), PostParametersOnly (default), None. If you pass parameters those are always posted to the server except when None is set. The above code can be simplified a bit by using the FormVariableBinder helper, which can unbind form variables directly into an object: FormVariableBinder.Unbind(buyOrder,null,"txtBuy"); which replaces the manual Request.Form[] reading code. It receives the object to unbind into, a string of properties to skip, and an optional prefix which is stripped off form variables to match property names. The component is similar to the MVC model binder but it's independent of MVC. Returning non-JSON Data CallbackHandler also supports returning non-JSON/XML data via special return types. You can return raw non-JSON encoded strings like this: [CallbackMethod(ReturnAsRawString=true,ContentType="text/plain")] public string HelloWorldNoJSON(string name) { return "Hello " + name + ". Time is: " + DateTime.Now.ToString(); } Calling this method results in just a plain string - no JSON encoding with quotes around the result. This can be useful if your server handling code needs to return a string or HTML result that doesn't fit well for a page or other UI component. Any string output can be returned. You can also return binary data. Stream, byte[] and Bitmap/Image results are automatically streamed back to the client. Notice that you should set the ContentType of the request either on the CallbackMethod attribute or using Response.ContentType. This ensures the Web Server knows how to display your binary response. Using a stream response makes it possible to return any of data. Streamed data can be pretty handy to return bitmap data from a method. The following is a method that returns a stock history graph for a particular stock over a provided number of years: [CallbackMethod(ContentType="image/png",RouteUrl="stocks/history/graph/{symbol}/{years}")] public Stream GetStockHistoryGraph(string symbol, int years = 2,int width = 500, int height=350) { if (width == 0) width = 500; if (height == 0) height = 350; StockServer server = new StockServer(); return server.GetStockHistoryGraph(symbol,"Stock History for " + symbol,width,height,years); } I can now hook this up into the JavaScript code when I get a stock quote. At the end of the process I can assign the URL to the service that returns the image into the src property and so force the image to display. Here's the changed code: $("#btnStockQuote").click(function () { var symbol = $("#txtSymbol").val(); ajaxCallMethod("SampleService.ashx", "GetStockQuote", [symbol], function (quote) { $("#divStockDisplay").fadeIn(1000); $("#stockCompany").text(quote.Company + " (" + quote.Symbol + ")"); $("#stockLastPrice").text(quote.LastPrice); $("#stockQuoteTime").text(quote.LastQuoteTime.formatDate("MMM dd, hh:mmt")); // display a stock chart $("#imgStockHistory").attr("src", "stocks/history/graph/" + symbol + "/2"); },onPageError); }); The resulting output then looks like this: The charting code uses the new ASP.NET 4.0 Chart components via code to display a bar chart of the 2 year stock data as part of the StockServer class which you can find in the sample download. The ability to return arbitrary data from a service is useful as you can see - in this case the chart is clearly associated with the service and it's nice that the graph generation can happen off a handler rather than through a page. Images are common resources, but output can also be PDF reports, zip files for downloads etc. which is becoming increasingly more common to be returned from REST endpoints and other applications. Why reinvent? Obviously the examples I've shown here are pretty basic in terms of functionality. But I hope they demonstrate the core features of AJAX callbacks that you need to work through in most applications which is simple: return data, send back data and potentially retrieve data in various formats. While there are other solutions when it comes down to making AJAX callbacks and servicing REST like requests, I like the flexibility my home grown solution provides. Simply put it's still the easiest solution that I've found that addresses my common use cases: AJAX JSON RPC style callbacks Url based access XML and JSON Output from single method endpoint XML and JSON POST support, querystring input, routing parameter mapping UrlEncoded POST data support on callbacks Ability to return stream/raw string data Essentially ability to return ANYTHING from Service and pass anything All these features are available in various solutions but not together in one place. I've been using this code base for over 4 years now in a number of projects both for myself and commercial work and it's served me extremely well. Besides the AJAX functionality CallbackHandler provides, it's also an easy way to create any kind of output endpoint I need to create. Need to create a few simple routines that spit back some data, but don't want to create a Page or View or full blown handler for it? Create a CallbackHandler and add a method or multiple methods and you have your generic endpoints. It's a quick and easy way to add small code pieces that are pretty efficient as they're running through a pretty small handler implementation. I can have this up and running in a couple of minutes literally without any setup and returning just about any kind of data. Resources Download the Sample NuGet: Westwind Web and AJAX Utilities (Westwind.Web) ajaxCallMethod() Documentation Using the AjaxMethodCallback WebForms Control West Wind Web Toolkit Home Page West Wind Web Toolkit Source Code © Rick Strahl, West Wind Technologies, 2005-2011Posted in ASP.NET jQuery AJAX Tweet (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
value types in the vm

- by john.rose

value types in the vm p.p1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} p.p2 {margin: 0.0px 0.0px 14.0px 0.0px; font: 14.0px Times} p.p3 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times} p.p4 {margin: 0.0px 0.0px 15.0px 0.0px; font: 14.0px Times} p.p5 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier} p.p6 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Courier; min-height: 17.0px} p.p7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p8 {margin: 0.0px 0.0px 0.0px 36.0px; text-indent: -36.0px; font: 14.0px Times; min-height: 18.0px} p.p9 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; min-height: 18.0px} p.p10 {margin: 0.0px 0.0px 12.0px 0.0px; font: 14.0px Times; color: #000000} li.li1 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times} li.li7 {margin: 0.0px 0.0px 0.0px 0.0px; font: 14.0px Times; min-height: 18.0px} span.s1 {font: 14.0px Courier} span.s2 {color: #000000} span.s3 {font: 14.0px Courier; color: #000000} ol.ol1 {list-style-type: decimal} Or, enduring values for a changing world. Introduction A value type is a data type which, generally speaking, is designed for being passed by value in and out of methods, and stored by value in data structures. The only value types which the Java language directly supports are the eight primitive types. Java indirectly and approximately supports value types, if they are implemented in terms of classes. For example, both Integer and String may be viewed as value types, especially if their usage is restricted to avoid operations appropriate to Object. In this note, we propose a definition of value types in terms of a design pattern for Java classes, accompanied by a set of usage restrictions. We also sketch the relation of such value types to tuple types (which are a JVM-level notion), and point out JVM optimizations that can apply to value types. This note is a thought experiment to extend the JVM’s performance model in support of value types. The demonstration has two phases. Initially the extension can simply use design patterns, within the current bytecode architecture, and in today’s Java language. But if the performance model is to be realized in practice, it will probably require new JVM bytecode features, changes to the Java language, or both. We will look at a few possibilities for these new features. An Axiom of Value In the context of the JVM, a value type is a data type equipped with construction, assignment, and equality operations, and a set of typed components, such that, whenever two variables of the value type produce equal corresponding values for their components, the values of the two variables cannot be distinguished by any JVM operation. Here are some corollaries: A value type is immutable, since otherwise a copy could be constructed and the original could be modified in one of its components, allowing the copies to be distinguished. Changing the component of a value type requires construction of a new value. The equals and hashCode operations are strictly component-wise. If a value type is represented by a JVM reference, that reference cannot be successfully synchronized on, and cannot be usefully compared for reference equality. A value type can be viewed in terms of what it doesn’t do. We can say that a value type omits all value-unsafe operations, which could violate the constraints on value types. These operations, which are ordinarily allowed for Java object types, are pointer equality comparison (the acmp instruction), synchronization (the monitor instructions), all the wait and notify methods of class Object, and non-trivial finalize methods. The clone method is also value-unsafe, although for value types it could be treated as the identity function. Finally, and most importantly, any side effect on an object (however visible) also counts as an value-unsafe operation. A value type may have methods, but such methods must not change the components of the value. It is reasonable and useful to define methods like toString, equals, and hashCode on value types, and also methods which are specifically valuable to users of the value type. Representations of Value Value types have two natural representations in the JVM, unboxed and boxed. An unboxed value consists of the components, as simple variables. For example, the complex number x=(1+2i), in rectangular coordinate form, may be represented in unboxed form by the following pair of variables: /*Complex x = Complex.valueOf(1.0, 2.0):*/ double x_re = 1.0, x_im = 2.0; These variables might be locals, parameters, or fields. Their association as components of a single value is not defined to the JVM. Here is a sample computation which computes the norm of the difference between two complex numbers: double distance(/*Complex x:*/ double x_re, double x_im, /*Complex y:*/ double y_re, double y_im) { /*Complex z = x.minus(y):*/ double z_re = x_re - y_re, z_im = x_im - y_im; /*return z.abs():*/ return Math.sqrt(z_re*z_re + z_im*z_im); } A boxed representation groups component values under a single object reference. The reference is to a ‘wrapper class’ that carries the component values in its fields. (A primitive type can naturally be equated with a trivial value type with just one component of that type. In that view, the wrapper class Integer can serve as a boxed representation of value type int.) The unboxed representation of complex numbers is practical for many uses, but it fails to cover several major use cases: return values, array elements, and generic APIs. The two components of a complex number cannot be directly returned from a Java function, since Java does not support multiple return values. The same story applies to array elements: Java has no ’array of structs’ feature. (Double-length arrays are a possible workaround for complex numbers, but not for value types with heterogeneous components.) By generic APIs I mean both those which use generic types, like Arrays.asList and those which have special case support for primitive types, like String.valueOf and PrintStream.println. Those APIs do not support unboxed values, and offer some problems to boxed values. Any ’real’ JVM type should have a story for returns, arrays, and API interoperability. The basic problem here is that value types fall between primitive types and object types. Value types are clearly more complex than primitive types, and object types are slightly too complicated. Objects are a little bit dangerous to use as value carriers, since object references can be compared for pointer equality, and can be synchronized on. Also, as many Java programmers have observed, there is often a performance cost to using wrapper objects, even on modern JVMs. Even so, wrapper classes are a good starting point for talking about value types. If there were a set of structural rules and restrictions which would prevent value-unsafe operations on value types, wrapper classes would provide a good notation for defining value types. This note attempts to define such rules and restrictions. Let’s Start Coding Now it is time to look at some real code. Here is a definition, written in Java, of a complex number value type. @ValueSafe public final class Complex implements java.io.Serializable { // immutable component structure: public final double re, im; private Complex(double re, double im) { this.re = re; this.im = im; } // interoperability methods: public String toString() { return "Complex("+re+","+im+")"; } public List<Double> asList() { return Arrays.asList(re, im); } public boolean equals(Complex c) { return re == c.re && im == c.im; } public boolean equals(@ValueSafe Object x) { return x instanceof Complex && equals((Complex) x); } public int hashCode() { return 31*Double.valueOf(re).hashCode() + Double.valueOf(im).hashCode(); } // factory methods: public static Complex valueOf(double re, double im) { return new Complex(re, im); } public Complex changeRe(double re2) { return valueOf(re2, im); } public Complex changeIm(double im2) { return valueOf(re, im2); } public static Complex cast(@ValueSafe Object x) { return x == null ? ZERO : (Complex) x; } // utility methods and constants: public Complex plus(Complex c) { return new Complex(re+c.re, im+c.im); } public Complex minus(Complex c) { return new Complex(re-c.re, im-c.im); } public double abs() { return Math.sqrt(re*re + im*im); } public static final Complex PI = valueOf(Math.PI, 0.0); public static final Complex ZERO = valueOf(0.0, 0.0); } This is not a minimal definition, because it includes some utility methods and other optional parts. The essential elements are as follows: The class is marked as a value type with an annotation. The class is final, because it does not make sense to create subclasses of value types. The fields of the class are all non-private and final. (I.e., the type is immutable and structurally transparent.) From the supertype Object, all public non-final methods are overridden. The constructor is private. Beyond these bare essentials, we can observe the following features in this example, which are likely to be typical of all value types: One or more factory methods are responsible for value creation, including a component-wise valueOf method. There are utility methods for complex arithmetic and instance creation, such as plus and changeIm. There are static utility constants, such as PI. The type is serializable, using the default mechanisms. There are methods for converting to and from dynamically typed references, such as asList and cast. The Rules In order to use value types properly, the programmer must avoid value-unsafe operations. A helpful Java compiler should issue errors (or at least warnings) for code which provably applies value-unsafe operations, and should issue warnings for code which might be correct but does not provably avoid value-unsafe operations. No such compilers exist today, but to simplify our account here, we will pretend that they do exist. A value-safe type is any class, interface, or type parameter marked with the @ValueSafe annotation, or any subtype of a value-safe type. If a value-safe class is marked final, it is in fact a value type. All other value-safe classes must be abstract. The non-static fields of a value class must be non-public and final, and all its constructors must be private. Under the above rules, a standard interface could be helpful to define value types like Complex. Here is an example: @ValueSafe public interface ValueType extends java.io.Serializable { // All methods listed here must get redefined. // Definitions must be value-safe, which means // they may depend on component values only. List<? extends Object> asList(); int hashCode(); boolean equals(@ValueSafe Object c); String toString(); } //@ValueSafe inherited from supertype: public final class Complex implements ValueType { … The main advantage of such a conventional interface is that (unlike an annotation) it is reified in the runtime type system. It could appear as an element type or parameter bound, for facilities which are designed to work on value types only. More broadly, it might assist the JVM to perform dynamic enforcement of the rules for value types. Besides types, the annotation @ValueSafe can mark fields, parameters, local variables, and methods. (This is redundant when the type is also value-safe, but may be useful when the type is Object or another supertype of a value type.) Working forward from these annotations, an expression E is defined as value-safe if it satisfies one or more of the following: The type of E is a value-safe type. E names a field, parameter, or local variable whose declaration is marked @ValueSafe. E is a call to a method whose declaration is marked @ValueSafe. E is an assignment to a value-safe variable, field reference, or array reference. E is a cast to a value-safe type from a value-safe expression. E is a conditional expression E0 ? E1 : E2, and both E1 and E2 are value-safe. Assignments to value-safe expressions and initializations of value-safe names must take their values from value-safe expressions. A value-safe expression may not be the subject of a value-unsafe operation. In particular, it cannot be synchronized on, nor can it be compared with the “==” operator, not even with a null or with another value-safe type. In a program where all of these rules are followed, no value-type value will be subject to a value-unsafe operation. Thus, the prime axiom of value types will be satisfied, that no two value type will be distinguishable as long as their component values are equal. More Code To illustrate these rules, here are some usage examples for Complex: Complex pi = Complex.valueOf(Math.PI, 0); Complex zero = pi.changeRe(0); //zero = pi; zero.re = 0; ValueType vtype = pi; @SuppressWarnings("value-unsafe") Object obj = pi; @ValueSafe Object obj2 = pi; obj2 = new Object(); // ok List<Complex> clist = new ArrayList<Complex>(); clist.add(pi); // (ok assuming List.add param is @ValueSafe) List<ValueType> vlist = new ArrayList<ValueType>(); vlist.add(pi); // (ok) List<Object> olist = new ArrayList<Object>(); olist.add(pi); // warning: "value-unsafe" boolean z = pi.equals(zero); boolean z1 = (pi == zero); // error: reference comparison on value type boolean z2 = (pi == null); // error: reference comparison on value type boolean z3 = (pi == obj2); // error: reference comparison on value type synchronized (pi) { } // error: synch of value, unpredictable result synchronized (obj2) { } // unpredictable result Complex qq = pi; qq = null; // possible NPE; warning: “null-unsafe" qq = (Complex) obj; // warning: “null-unsafe" qq = Complex.cast(obj); // OK @SuppressWarnings("null-unsafe") Complex empty = null; // possible NPE qq = empty; // possible NPE (null pollution) The Payoffs It follows from this that either the JVM or the java compiler can replace boxed value-type values with unboxed ones, without affecting normal computations. Fields and variables of value types can be split into their unboxed components. Non-static methods on value types can be transformed into static methods which take the components as value parameters. Some common questions arise around this point in any discussion of value types. Why burden the programmer with all these extra rules? Why not detect programs automagically and perform unboxing transparently? The answer is that it is easy to break the rules accidently unless they are agreed to by the programmer and enforced. Automatic unboxing optimizations are tantalizing but (so far) unreachable ideal. In the current state of the art, it is possible exhibit benchmarks in which automatic unboxing provides the desired effects, but it is not possible to provide a JVM with a performance model that assures the programmer when unboxing will occur. This is why I’m writing this note, to enlist help from, and provide assurances to, the programmer. Basically, I’m shooting for a good set of user-supplied “pragmas” to frame the desired optimization. Again, the important thing is that the unboxing must be done reliably, or else programmers will have no reason to work with the extra complexity of the value-safety rules. There must be a reasonably stable performance model, wherein using a value type has approximately the same performance characteristics as writing the unboxed components as separate Java variables. There are some rough corners to the present scheme. Since Java fields and array elements are initialized to null, value-type computations which incorporate uninitialized variables can produce null pointer exceptions. One workaround for this is to require such variables to be null-tested, and the result replaced with a suitable all-zero value of the value type. That is what the “cast” method does above. Generically typed APIs like List<T> will continue to manipulate boxed values always, at least until we figure out how to do reification of generic type instances. Use of such APIs will elicit warnings until their type parameters (and/or relevant members) are annotated or typed as value-safe. Retrofitting List<T> is likely to expose flaws in the present scheme, which we will need to engineer around. Here are a couple of first approaches: public interface java.util.List<@ValueSafe T> extends Collection<T> { … public interface java.util.List<T extends Object|ValueType> extends Collection<T> { … (The second approach would require disjunctive types, in which value-safety is “contagious” from the constituent types.) With more transformations, the return value types of methods can also be unboxed. This may require significant bytecode-level transformations, and would work best in the presence of a bytecode representation for multiple value groups, which I have proposed elsewhere under the title “Tuples in the VM”. But for starters, the JVM can apply this transformation under the covers, to internally compiled methods. This would give a way to express multiple return values and structured return values, which is a significant pain-point for Java programmers, especially those who work with low-level structure types favored by modern vector and graphics processors. The lack of multiple return values has a strong distorting effect on many Java APIs. Even if the JVM fails to unbox a value, there is still potential benefit to the value type. Clustered computing systems something have copy operations (serialization or something similar) which apply implicitly to command operands. When copying JVM objects, it is extremely helpful to know when an object’s identity is important or not. If an object reference is a copied operand, the system may have to create a proxy handle which points back to the original object, so that side effects are visible. Proxies must be managed carefully, and this can be expensive. On the other hand, value types are exactly those types which a JVM can “copy and forget” with no downside. Array types are crucial to bulk data interfaces. (As data sizes and rates increase, bulk data becomes more important than scalar data, so arrays are definitely accompanying us into the future of computing.) Value types are very helpful for adding structure to bulk data, so a successful value type mechanism will make it easier for us to express richer forms of bulk data. Unboxing arrays (i.e., arrays containing unboxed values) will provide better cache and memory density, and more direct data movement within clustered or heterogeneous computing systems. They require the deepest transformations, relative to today’s JVM. There is an impedance mismatch between value-type arrays and Java’s covariant array typing, so compromises will need to be struck with existing Java semantics. It is probably worth the effort, since arrays of unboxed value types are inherently more memory-efficient than standard Java arrays, which rely on dependent pointer chains. It may be sufficient to extend the “value-safe” concept to array declarations, and allow low-level transformations to change value-safe array declarations from the standard boxed form into an unboxed tuple-based form. Such value-safe arrays would not be convertible to Object[] arrays. Certain connection points, such as Arrays.copyOf and System.arraycopy might need additional input/output combinations, to allow smooth conversion between arrays with boxed and unboxed elements. Alternatively, the correct solution may have to wait until we have enough reification of generic types, and enough operator overloading, to enable an overhaul of Java arrays. Implicit Method Definitions The example of class Complex above may be unattractively complex. I believe most or all of the elements of the example class are required by the logic of value types. If this is true, a programmer who writes a value type will have to write lots of error-prone boilerplate code. On the other hand, I think nearly all of the code (except for the domain-specific parts like plus and minus) can be implicitly generated. Java has a rule for implicitly defining a class’s constructor, if no it defines no constructors explicitly. Likewise, there are rules for providing default access modifiers for interface members. Because of the highly regular structure of value types, it might be reasonable to perform similar implicit transformations on value types. Here’s an example of a “highly implicit” definition of a complex number type: public class Complex implements ValueType { // implicitly final public double re, im; // implicitly public final //implicit methods are defined elementwise from te fields: // toString, asList, equals(2), hashCode, valueOf, cast //optionally, explicit methods (plus, abs, etc.) would go here } In other words, with the right defaults, a simple value type definition can be a one-liner. The observant reader will have noticed the similarities (and suitable differences) between the explicit methods above and the corresponding methods for List<T>. Another way to abbreviate such a class would be to make an annotation the primary trigger of the functionality, and to add the interface(s) implicitly: public @ValueType class Complex { … // implicitly final, implements ValueType (But to me it seems better to communicate the “magic” via an interface, even if it is rooted in an annotation.) Implicitly Defined Value Types So far we have been working with nominal value types, which is to say that the sequence of typed components is associated with a name and additional methods that convey the intention of the programmer. A simple ordered pair of floating point numbers can be variously interpreted as (to name a few possibilities) a rectangular or polar complex number or Cartesian point. The name and the methods convey the intended meaning. But what if we need a truly simple ordered pair of floating point numbers, without any further conceptual baggage? Perhaps we are writing a method (like “divideAndRemainder”) which naturally returns a pair of numbers instead of a single number. Wrapping the pair of numbers in a nominal type (like “QuotientAndRemainder”) makes as little sense as wrapping a single return value in a nominal type (like “Quotient”). What we need here are structural value types commonly known as tuples. For the present discussion, let us assign a conventional, JVM-friendly name to tuples, roughly as follows: public class java.lang.tuple.$DD extends java.lang.tuple.Tuple { double $1, $2; } Here the component names are fixed and all the required methods are defined implicitly. The supertype is an abstract class which has suitable shared declarations. The name itself mentions a JVM-style method parameter descriptor, which may be “cracked” to determine the number and types of the component fields. The odd thing about such a tuple type (and structural types in general) is it must be instantiated lazily, in response to linkage requests from one or more classes that need it. The JVM and/or its class loaders must be prepared to spin a tuple type on demand, given a simple name reference, $xyz, where the xyz is cracked into a series of component types. (Specifics of naming and name mangling need some tasteful engineering.) Tuples also seem to demand, even more than nominal types, some support from the language. (This is probably because notations for non-nominal types work best as combinations of punctuation and type names, rather than named constructors like Function3 or Tuple2.) At a minimum, languages with tuples usually (I think) have some sort of simple bracket notation for creating tuples, and a corresponding pattern-matching syntax (or “destructuring bind”) for taking tuples apart, at least when they are parameter lists. Designing such a syntax is no simple thing, because it ought to play well with nominal value types, and also with pre-existing Java features, such as method parameter lists, implicit conversions, generic types, and reflection. That is a task for another day. Other Use Cases Besides complex numbers and simple tuples there are many use cases for value types. Many tuple-like types have natural value-type representations. These include rational numbers, point locations and pixel colors, and various kinds of dates and addresses. Other types have a variable-length ‘tail’ of internal values. The most common example of this is String, which is (mathematically) a sequence of UTF-16 character values. Similarly, bit vectors, multiple-precision numbers, and polynomials are composed of sequences of values. Such types include, in their representation, a reference to a variable-sized data structure (often an array) which (somehow) represents the sequence of values. The value type may also include ’header’ information. Variable-sized values often have a length distribution which favors short lengths. In that case, the design of the value type can make the first few values in the sequence be direct ’header’ fields of the value type. In the common case where the header is enough to represent the whole value, the tail can be a shared null value, or even just a null reference. Note that the tail need not be an immutable object, as long as the header type encapsulates it well enough. This is the case with String, where the tail is a mutable (but never mutated) character array. Field types and their order must be a globally visible part of the API. The structure of the value type must be transparent enough to have a globally consistent unboxed representation, so that all callers and callees agree about the type and order of components that appear as parameters, return types, and array elements. This is a trade-off between efficiency and encapsulation, which is forced on us when we remove an indirection enjoyed by boxed representations. A JVM-only transformation would not care about such visibility, but a bytecode transformation would need to take care that (say) the components of complex numbers would not get swapped after a redefinition of Complex and a partial recompile. Perhaps constant pool references to value types need to declare the field order as assumed by each API user. This brings up the delicate status of private fields in a value type. It must always be possible to load, store, and copy value types as coordinated groups, and the JVM performs those movements by moving individual scalar values between locals and stack. If a component field is not public, what is to prevent hostile code from plucking it out of the tuple using a rogue aload or astore instruction? Nothing but the verifier, so we may need to give it more smarts, so that it treats value types as inseparable groups of stack slots or locals (something like long or double). My initial thought was to make the fields always public, which would make the security problem moot. But public is not always the right answer; consider the case of String, where the underlying mutable character array must be encapsulated to prevent security holes. I believe we can win back both sides of the tradeoff, by training the verifier never to split up the components in an unboxed value. Just as the verifier encapsulates the two halves of a 64-bit primitive, it can encapsulate the the header and body of an unboxed String, so that no code other than that of class String itself can take apart the values. Similar to String, we could build an efficient multi-precision decimal type along these lines: public final class DecimalValue extends ValueType { protected final long header; protected private final BigInteger digits; public DecimalValue valueOf(int value, int scale) { assert(scale >= 0); return new DecimalValue(((long)value << 32) + scale, null); } public DecimalValue valueOf(long value, int scale) { if (value == (int) value) return valueOf((int)value, scale); return new DecimalValue(-scale, new BigInteger(value)); } } Values of this type would be passed between methods as two machine words. Small values (those with a significand which fits into 32 bits) would be represented without any heap data at all, unless the DecimalValue itself were boxed. (Note the tension between encapsulation and unboxing in this case. It would be better if the header and digits fields were private, but depending on where the unboxing information must “leak”, it is probably safer to make a public revelation of the internal structure.) Note that, although an array of Complex can be faked with a double-length array of double, there is no easy way to fake an array of unboxed DecimalValues. (Either an array of boxed values or a transposed pair of homogeneous arrays would be reasonable fallbacks, in a current JVM.) Getting the full benefit of unboxing and arrays will require some new JVM magic. Although the JVM emphasizes portability, system dependent code will benefit from using machine-level types larger than 64 bits. For example, the back end of a linear algebra package might benefit from value types like Float4 which map to stock vector types. This is probably only worthwhile if the unboxing arrays can be packed with such values. More Daydreams A more finely-divided design for dynamic enforcement of value safety could feature separate marker interfaces for each invariant. An empty marker interface Unsynchronizable could cause suitable exceptions for monitor instructions on objects in marked classes. More radically, a Interchangeable marker interface could cause JVM primitives that are sensitive to object identity to raise exceptions; the strangest result would be that the acmp instruction would have to be specified as raising an exception. @ValueSafe public interface ValueType extends java.io.Serializable, Unsynchronizable, Interchangeable { … public class Complex implements ValueType { // inherits Serializable, Unsynchronizable, Interchangeable, @ValueSafe … It seems possible that Integer and the other wrapper types could be retro-fitted as value-safe types. This is a major change, since wrapper objects would be unsynchronizable and their references interchangeable. It is likely that code which violates value-safety for wrapper types exists but is uncommon. It is less plausible to retro-fit String, since the prominent operation String.intern is often used with value-unsafe code. We should also reconsider the distinction between boxed and unboxed values in code. The design presented above obscures that distinction. As another thought experiment, we could imagine making a first class distinction in the type system between boxed and unboxed representations. Since only primitive types are named with a lower-case initial letter, we could define that the capitalized version of a value type name always refers to the boxed representation, while the initial lower-case variant always refers to boxed. For example: complex pi = complex.valueOf(Math.PI, 0); Complex boxPi = pi; // convert to boxed myList.add(boxPi); complex z = myList.get(0); // unbox Such a convention could perhaps absorb the current difference between int and Integer, double and Double. It might also allow the programmer to express a helpful distinction among array types. As said above, array types are crucial to bulk data interfaces, but are limited in the JVM. Extending arrays beyond the present limitations is worth thinking about; for example, the Maxine JVM implementation has a hybrid object/array type. Something like this which can also accommodate value type components seems worthwhile. On the other hand, does it make sense for value types to contain short arrays? And why should random-access arrays be the end of our design process, when bulk data is often sequentially accessed, and it might make sense to have heterogeneous streams of data as the natural “jumbo” data structure. These considerations must wait for another day and another note. More Work It seems to me that a good sequence for introducing such value types would be as follows: Add the value-safety restrictions to an experimental version of javac. Code some sample applications with value types, including Complex and DecimalValue. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. A staggered roll-out like this would decouple language changes from bytecode changes, which is always a convenient thing. A similar investigation should be applied (concurrently) to array types. In this case, it seems to me that the starting point is in the JVM: Add an experimental unboxing array data structure to a production JVM, perhaps along the lines of Maxine hybrids. No bytecode or language support is required at first; everything can be done with encapsulated unsafe operations and/or method handles. Create an experimental JVM which internally unboxes value types but does not require new bytecodes to do so. Ensure the feasibility of the performance model for the sample applications. Add tuple-like bytecodes (with or without generic type reification) to a major revision of the JVM, and teach the Java compiler to switch in the new bytecodes without code changes. That’s enough musing me for now. Back to work!

Read the article
iPhone SDK vs Windows Phone 7 Series SDK Challenge, Part 1: Hello World!

In this series, I will be taking sample applications from the iPhone SDK and implementing them on Windows Phone 7 Series. My goal is to do as much of an apples-to-apples comparison as I can. This series will be written to not only compare and contrast how easy or difficult it is to complete tasks on either platform, how many lines of code, etc., but Id also like it to be a way for iPhone developers to either get started on Windows Phone 7 Series development, or for developers in general to learn the platform. Heres my methodology: Run the iPhone SDK app in the iPhone Simulator to get a feel for what it does and how it works, without looking at the implementation Implement the equivalent functionality on Windows Phone 7 Series using Silverlight. Compare the two implementations based on complexity, functionality, lines of code, number of files, etc. Add some functionality to the Windows Phone 7 Series app that shows off a way to make the scenario more interesting or leverages an aspect of the platform, or uses a better design pattern to implement the functionality. You can download Microsoft Visual Studio 2010 Express for Windows Phone CTP here, and the Expression Blend 4 Beta here. Hello World! Of course no first post would be allowed if it didnt focus on the hello world scenario. The iPhone SDK follows that tradition with the Your First iPhone Application walkthrough. I will say that the developer documentation for iPhone is pretty good. There are plenty of walkthoughs and they break things down into nicely sized steps and do a good job of bringing the user along. As expected, this application is quite simple. It comprises of a text box, a label, and a button. When you push the button, the label changes to Hello plus the word you typed into the text box. Makes perfect sense for a starter application. Theres not much to this but it covers a few basic elements: Laying out basic UI Handling user input Hooking up events Formatting text So, lets get started building a similar app for Windows Phone 7 Series! Implementing the UI: UI in Silverlight (and therefore Windows Phone 7) is defined in XAML, which is a declarative XML language also used by WPF on the desktop. For anyone thats familiar with similar types of markup, its relatively straightforward to learn, but has a lot of power in it once you get it figured out. Well talk more about that. This UI is very simple. When I look at this, I note a couple of things: Elements are arranged vertically They are all centered So, lets create our Application and then start with the UI. Once you have the the VS 2010 Express for Windows Phone tool running, create a new Windows Phone Project, and call it Hello World: Once created, youll see the designer on one side and your XAML on the other: Now, we can create our UI in one of three ways: Use the designer in Visual Studio to drag and drop the components Use the designer in Expression Blend 4 to drag and drop the components Enter the XAML by hand in either of the above Well start with (1), then kind of move to (3) just for instructional value. To develop this UI in the designer: First, delete all of the markup between inside of the Grid element (LayoutRoot). You should be left with just this XAML for your MainPage.xaml (i shortened all the xmlns declarations below for brevity): 1: <phoneNavigation:PhoneApplicationPage 2: x:Class="HelloWorld.MainPage" 3: xmlns="...[snip]" 4: FontFamily="{StaticResource PhoneFontFamilyNormal}" 5: FontSize="{StaticResource PhoneFontSizeNormal}" 6: Foreground="{StaticResource PhoneForegroundBrush}"> 7: 8: <Grid x:Name="LayoutRoot" Background="{StaticResource PhoneBackgroundBrush}"> 9: 10: </Grid> 11: 12: </phoneNavigation:PhoneApplicationPage> .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Well be adding XAML at line 9, so thats the important part. Now, Click on the center area of the phone surface Open the Toolbox and double click StackPanel Double click TextBox Double click TextBlock Double click Button That will create the necessary UI elements but they wont be arranged quite right. Well fix it in a second. Heres the XAML that we end up with: 1: <StackPanel Height="100" HorizontalAlignment="Left" Margin="10,10,0,0" Name="stackPanel1" VerticalAlignment="Top" Width="200"> 2: <TextBox Height="32" Name="textBox1" Text="TextBox" Width="100" /> 3: <TextBlock Height="23" Name="textBlock1" Text="TextBlock" /> 4: <Button Content="Button" Height="70" Name="button1" Width="160" /> 5: </StackPanel> .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The designer does its best at guessing what we want, but in this case we want things to be a bit simpler. So well just clean it up a bit. We want the items to be centered and we want them to have a little bit of a margin on either side, so heres what we end up with. Ive also made it match the values and style from the iPhone app: 1: <StackPanel Margin="10"> 2: <TextBox Name="textBox1" HorizontalAlignment="Stretch" Text="You" TextAlignment="Center"/> 3: <TextBlock Name="textBlock1" HorizontalAlignment="Center" Margin="0,100,0,0" Text="Hello You!" /> 4: <Button Name="button1" HorizontalAlignment="Center" Margin="0,150,0,0" Content="Hello"/> 5: </StackPanel> .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Now lets take a look at what weve done there. Line 1: We removed all of the formatting from the StackPanel, except for Margin, as thats all we need. Since our parent element is a Grid, by default the StackPanel will be sized to fit in that space. The Margin says that we want to reserve 10 pixels on each side of the StackPanel. Line 2: Weve set the HorizontalAlignment of the TextBox to Stretch, which says that it should fill its parents size horizontally. We want to do this so the TextBox is always full-width. We also set TextAlignment to Center, to center the text. Line 3: In contrast to the TextBox above, we dont care how wide the TextBlock is, just so long as it is big enough for its text. Thatll happen automatically, so we just set its Horizontal alignment to Center. We also set a Margin above the TextBlock of 100 pixels to bump it down a bit, per the iPhone UI. Line 4: We do the same things here as in Line 3. Heres how the UI looks in the designer: Believe it or not, were almost done! Implementing the App Logic Now, we want the TextBlock to change its text when the Button is clicked. In the designer, double click the Button to be taken to the Event Handler for the Buttons Click event. In that event handler, we take the Text property from the TextBox, and format it into a string, then set it into the TextBlock. Thats it! 1: private void button1_Click(object sender, RoutedEventArgs e) 2: { 3: string name = textBox1.Text; 4: 5: // if there isn't a name set, just use "World" 6: if (String.IsNullOrEmpty(name)) 7: { 8: name = "World"; 9: } 10: 11: // set the value into the TextBlock 12: textBlock1.Text = String.Format("Hello {0}!", name); 13: 14: } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } We use the String.Format() method to handle the formatting for us. Now all thats left is to test the app in the Windows Phone Emulator and verify it does what we think it does! And it does! Comparing against the iPhone Looking at the iPhone example, there are basically three things that you have to touch as the developer: 1) The UI in the Nib file 2) The app delegate 3) The view controller Counting lines is a bit tricky here, but to try to keep this even, Im going to only count lines of code that I could not have (or would not have) generated with the tooling. Meaning, Im not counting XAML and Im not counting operations that happen in the Nib file with the XCode designer tool. So in the case of the above, even though I modified the XAML, I could have done all of those operations using the visual designer tool. And normally I would have, but the XAML is more instructive (and less steps!). Im interested in things that I, as the developer have to figure out in code. Im also not counting lines that just have a curly brace on them, or lines that are generated for me (e.g. method names that are generated for me when I make a connection, etc.) So, by that count, heres what I get from the code listing for the iPhone app found here: HelloWorldAppDelegate.h: 6 HelloWorldAppDelegate.m: 12 MyViewController.h: 8 MyViewController.m: 18 Which gives me a grand total of about 44 lines of code on iPhone. I really do recommend looking at the iPhone code for a comparison to the above. Now, for the Windows Phone 7 Series application, the only code I typed was in the event handler above Main.Xaml.cs: 4 So a total of 4 lines of code on Windows Phone 7. And more importantly, the process is just A LOT simpler. For example, I was surprised that the User Interface Designer in XCode doesnt automatically create instance variables for me and wire them up to the corresponding elements. I assumed I wouldnt have to write this code myself (and risk getting it wrong!). I dont need to worry about view controllers or anything. I just write my code. This blog post up to this point has covered almost every aspect of this apps development in a few pages. The iPhone tutorial has 5 top level steps with 2-3 sub sections of each. Now, its worth pointing out that the iPhone development model uses the Model View Controller (MVC) pattern, which is a very flexible and powerful pattern that enforces proper separation of concerns. But its fairly complex and difficult to understand when you first walk up to it. Here at Microsoft weve dabbled in MVC a bit, with frameworks like MFC on Visual C++ and with the ASP.NET MVC framework now. Both are very powerful frameworks. But one of the reasons weve stayed away from MVC with client UI frameworks is that its difficult to tool. We havent seen the type of value that beats double click, write code! for the broad set of scenarios. Another thing to think about is how many of those lines of code were focused on my apps functionality?. Or, the converse of How many lines of code were boilerplate plumbing? In both examples, the actual number of functional code lines is similar. I count most of them in MyViewController.m, in the changeGreeting method. Its about 7 lines of code that do the work of taking the value from the TextBox and putting it into the label. Versus 4 on the Windows Phone 7 side. But, unfortunately, on iPhone I still have to write that other 37 lines of code, just to get there. 10% of the code, 1 file instead of 4, its just much simpler. Making Some Tweaks It turns out, I can actually do this application with ZERO lines of code, if Im willing to change the spec a bit. The data binding functionality in Silverlight is incredibly powerful. And what I can do is databind the TextBoxs value directly to the TextBlock. Take some time looking at this XAML below. Youll see that I have added another nested StackPanel and two more TextBlocks. Why? Because thats how I build that string, and the nested StackPanel will lay things out Horizontally for me, as specified by the Orientation property. 1: <StackPanel Margin="10"> 2: <TextBox Name="textBox1" HorizontalAlignment="Stretch" Text="You" TextAlignment="Center"/> 3: <StackPanel Orientation="Horizontal" HorizontalAlignment="Center" Margin="0,100,0,0" > 4: <TextBlock Text="Hello " /> 5: <TextBlock Name="textBlock1" Text="{Binding ElementName=textBox1, Path=Text}" /> 6: <TextBlock Text="!" /> 7: </StackPanel> 8: <Button Name="button1" HorizontalAlignment="Center" Margin="0,150,0,0" Content="Hello" Click="button1_Click" /> 9: </StackPanel> .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } Now, the real action is there in the bolded TextBlock.Text property: Text="{Binding ElementName=textBox1, Path=Text}" .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } That does all the heavy lifting. It sets up a databinding between the TextBox.Text property on textBox1 and the TextBlock.Text property on textBlock1. As I change the text of the TextBox, the label updates automatically. In fact, I dont even need the button any more, so I could get rid of that altogether. And no button means no event handler. No event handler means no C# code at all. Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
A first look at ConfORM - Part 1

- by thangchung

All source codes for this post can be found at here.Have you ever heard of ConfORM is not? I have read it three months ago when I wrote an post about NHibernate and Autofac. At that time, this project really has just started and still in beta version, so I still do not really care much. But recently when reading a book by Jason Dentler NHibernate 3.0 Cookbook, I started to pay attention to it. Author have mentioned quite a lot of OSS in his book. And now again I have reviewed ConfORM once again. I have been involved in ConfORM development group on google and read some articles about it. Fabio Maulo spent a lot of work for the OSS, and I hope it will adapt a great way for NHibernate (because he contributed to NHibernate that). So what is ConfORM? It is stand for Configuration ORM, and it was trying to use a lot of heuristic model for identifying entities from C# code. Today, it's mostly Model First Driven development, so the first thing is to build the entity model. This is really important and we can see it is the heart of business software. Then we have to tell DB about the entity of this model. We often will use Inversion Engineering here, Database Schema is will create based on recently Entity Model. From now we will absolutely not interested in the DB again, only focus on the Entity Model.Fluent NHibenate really good, I liked this OSS. Sharp Architecture and has done so well in Fluent NHibernate integration with applications. A Multiple Database technical in Sharp Architecture is truly awesome. It can receive configuration, a connection string and a dll containing entity model, which would then create a SessionFactory, finally caching inside the computer memory. As the number of SessionFactory can be very large and will full of the memory, it has also devised a way of caching SessionFactory in the file. This post I hope this will not completely explain about and building a model of multiple databases. I just tried to mount a number of posts from the community and apply some of my knowledge to build a management model Session for ConfORM.As well as Fluent NHibernate, ConfORM also supported on the interface mapping, see this to understand it. So the first thing we will build the Entity Model for it, and here is what I will use the model for this article. A simple model for managing news and polls, it will be too easy for a number of people, but I hope not to bring complexity to this post.I will then have some code to build super type for the Entity Model. public interface IEntity<TId> { TId Id { get; set; } } public abstract class EntityBase<TId> : IEntity<TId> { public virtual TId Id { get; set; } public override bool Equals(object obj) { return Equals(obj as EntityBase<TId>); } private static bool IsTransient(EntityBase<TId> obj) { return obj != null && Equals(obj.Id, default(TId)); } private Type GetUnproxiedType() { return GetType(); } public virtual bool Equals(EntityBase<TId> other) { if (other == null) return false; if (ReferenceEquals(this, other)) return true; if (!IsTransient(this) && !IsTransient(other) && Equals(Id, other.Id)) { var otherType = other.GetUnproxiedType(); var thisType = GetUnproxiedType(); return thisType.IsAssignableFrom(otherType) || otherType.IsAssignableFrom(thisType); } return false; } public override int GetHashCode() { if (Equals(Id, default(TId))) return base.GetHashCode(); return Id.GetHashCode(); } } Database schema will be created as:The next step is to build the ConORM builder to create a NHibernate Configuration. Patrick have a excellent article about it at here. Contract of it below: public interface IConfigBuilder { Configuration BuildConfiguration(string connectionString, string sessionFactoryName); } The idea here is that I will pass in a connection string and a set of the DLL containing the Entity Model and it makes me a NHibernate Configuration (shame that I stole this ideas of Sharp Architecture). And here is its code: public abstract class ConfORMConfigBuilder : RootObject, IConfigBuilder { private static IConfigurator _configurator; protected IEnumerable<Type> DomainTypes; private readonly IEnumerable<string> _assemblies; protected ConfORMConfigBuilder(IEnumerable<string> assemblies) : this(new Configurator(), assemblies) { _assemblies = assemblies; } protected ConfORMConfigBuilder(IConfigurator configurator, IEnumerable<string> assemblies) { _configurator = configurator; _assemblies = assemblies; } public abstract void GetDatabaseIntegration(IDbIntegrationConfigurationProperties dBIntegration, string connectionString); protected abstract HbmMapping GetMapping(); public Configuration BuildConfiguration(string connectionString, string sessionFactoryName) { Contract.Requires(!string.IsNullOrEmpty(connectionString), "ConnectionString is null or empty"); Contract.Requires(!string.IsNullOrEmpty(sessionFactoryName), "SessionFactory name is null or empty"); Contract.Requires(_configurator != null, "Configurator is null"); return CatchExceptionHelper.TryCatchFunction( () => { DomainTypes = GetTypeOfEntities(_assemblies); if (DomainTypes == null) throw new Exception("Type of domains is null"); var configure = new Configuration(); configure.SessionFactoryName(sessionFactoryName); configure.Proxy(p => p.ProxyFactoryFactory<ProxyFactoryFactory>()); configure.DataBaseIntegration(db => GetDatabaseIntegration(db, connectionString)); if (_configurator.GetAppSettingString("IsCreateNewDatabase").ConvertToBoolean()) { configure.SetProperty("hbm2ddl.auto", "create-drop"); } configure.Properties.Add("default_schema", _configurator.GetAppSettingString("DefaultSchema")); configure.AddDeserializedMapping(GetMapping(), _configurator.GetAppSettingString("DocumentFileName")); SchemaMetadataUpdater.QuoteTableAndColumns(configure); return configure; }, Logger); } protected IEnumerable<Type> GetTypeOfEntities(IEnumerable<string> assemblies) { var type = typeof(EntityBase<Guid>); var domainTypes = new List<Type>(); foreach (var assembly in assemblies) { var realAssembly = Assembly.LoadFrom(assembly); if (realAssembly == null) throw new NullReferenceException(); domainTypes.AddRange(realAssembly.GetTypes().Where( t => { if (t.BaseType != null) return string.Compare(t.BaseType.FullName, type.FullName) == 0; return false; })); } return domainTypes; } } I do not want to dependency on any RDBMS, so I made a builder as an abstract class, and so I will create a concrete instance for SQL Server 2008 as follows: public class SqlServerConfORMConfigBuilder : ConfORMConfigBuilder { public SqlServerConfORMConfigBuilder(IEnumerable<string> assemblies) : base(assemblies) { } public override void GetDatabaseIntegration(IDbIntegrationConfigurationProperties dBIntegration, string connectionString) { dBIntegration.Dialect<MsSql2008Dialect>(); dBIntegration.Driver<SqlClientDriver>(); dBIntegration.KeywordsAutoImport = Hbm2DDLKeyWords.AutoQuote; dBIntegration.IsolationLevel = IsolationLevel.ReadCommitted; dBIntegration.ConnectionString = connectionString; dBIntegration.LogSqlInConsole = true; dBIntegration.Timeout = 10; dBIntegration.LogFormatedSql = true; dBIntegration.HqlToSqlSubstitutions = "true 1, false 0, yes 'Y', no 'N'"; } protected override HbmMapping GetMapping() { var orm = new ObjectRelationalMapper(); orm.Patterns.PoidStrategies.Add(new GuidPoidPattern()); var patternsAppliers = new CoolPatternsAppliersHolder(orm); //patternsAppliers.Merge(new DatePropertyByNameApplier()).Merge(new MsSQL2008DateTimeApplier()); patternsAppliers.Merge(new ManyToOneColumnNamingApplier()); patternsAppliers.Merge(new OneToManyKeyColumnNamingApplier(orm)); var mapper = new Mapper(orm, patternsAppliers); var entities = new List<Type>(); DomainDefinition(orm); Customize(mapper); entities.AddRange(DomainTypes); return mapper.CompileMappingFor(entities); } private void DomainDefinition(IObjectRelationalMapper orm) { orm.TablePerClassHierarchy(new[] { typeof(EntityBase<Guid>) }); orm.TablePerClass(DomainTypes); orm.OneToOne<News, Poll>(); orm.ManyToOne<Category, News>(); orm.Cascade<Category, News>(Cascade.All); orm.Cascade<News, Poll>(Cascade.All); orm.Cascade<User, Poll>(Cascade.All); } private static void Customize(Mapper mapper) { CustomizeRelations(mapper); CustomizeTables(mapper); CustomizeColumns(mapper); } private static void CustomizeRelations(Mapper mapper) { } private static void CustomizeTables(Mapper mapper) { } private static void CustomizeColumns(Mapper mapper) { mapper.Class<Category>( cm => { cm.Property(x => x.Name, m => m.NotNullable(true)); cm.Property(x => x.CreatedDate, m => m.NotNullable(true)); }); mapper.Class<News>( cm => { cm.Property(x => x.Title, m => m.NotNullable(true)); cm.Property(x => x.ShortDescription, m => m.NotNullable(true)); cm.Property(x => x.Content, m => m.NotNullable(true)); }); mapper.Class<Poll>( cm => { cm.Property(x => x.Value, m => m.NotNullable(true)); cm.Property(x => x.VoteDate, m => m.NotNullable(true)); cm.Property(x => x.WhoVote, m => m.NotNullable(true)); }); mapper.Class<User>( cm => { cm.Property(x => x.UserName, m => m.NotNullable(true)); cm.Property(x => x.Password, m => m.NotNullable(true)); }); } } As you can see that we can do so many things in this class, such as custom entity relationships, custom binding on the columns, custom table name, ... Here I only made two so-Appliers for OneToMany and ManyToOne relationships, you can refer to it here public class ManyToOneColumnNamingApplier : IPatternApplier<PropertyPath, IManyToOneMapper> { #region IPatternApplier<PropertyPath,IManyToOneMapper> Members public void Apply(PropertyPath subject, IManyToOneMapper applyTo) { applyTo.Column(subject.ToColumnName() + "Id"); } #endregion #region IPattern<PropertyPath> Members public bool Match(PropertyPath subject) { return subject != null; } #endregion } public class OneToManyKeyColumnNamingApplier : OneToManyPattern, IPatternApplier<PropertyPath, ICollectionPropertiesMapper> { public OneToManyKeyColumnNamingApplier(IDomainInspector domainInspector) : base(domainInspector) { } #region Implementation of IPattern<PropertyPath> public bool Match(PropertyPath subject) { return Match(subject.LocalMember); } #endregion Implementation of IPattern<PropertyPath> #region Implementation of IPatternApplier<PropertyPath,ICollectionPropertiesMapper> public void Apply(PropertyPath subject, ICollectionPropertiesMapper applyTo) { applyTo.Key(km => km.Column(GetKeyColumnName(subject))); } #endregion Implementation of IPatternApplier<PropertyPath,ICollectionPropertiesMapper> protected virtual string GetKeyColumnName(PropertyPath subject) { Type propertyType = subject.LocalMember.GetPropertyOrFieldType(); Type childType = propertyType.DetermineCollectionElementType(); var entity = subject.GetContainerEntity(DomainInspector); var parentPropertyInChild = childType.GetFirstPropertyOfType(entity); var baseName = parentPropertyInChild == null ? subject.PreviousPath == null ? entity.Name : entity.Name + subject.PreviousPath : parentPropertyInChild.Name; return GetKeyColumnName(baseName); } protected virtual string GetKeyColumnName(string baseName) { return string.Format("{0}Id", baseName); } } Everyone also can download the ConfORM source at google code and see example inside it. Next part I will write about multiple database factory. Hope you enjoy about it. happy coding and see you next part.

Read the article
The Incremental Architect’s Napkin - #5 - Design functions for extensibility and readability

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/24/the-incremental-architectrsquos-napkin---5---design-functions-for.aspx The functionality of programs is entered via Entry Points. So what we´re talking about when designing software is a bunch of functions handling the requests represented by and flowing in through those Entry Points. Designing software thus consists of at least three phases: Analyzing the requirements to find the Entry Points and their signatures Designing the functionality to be executed when those Entry Points get triggered Implementing the functionality according to the design aka coding I presume, you´re familiar with phase 1 in some way. And I guess you´re proficient in implementing functionality in some programming language. But in my experience developers in general are not experienced in going through an explicit phase 2. “Designing functionality? What´s that supposed to mean?” you might already have thought. Here´s my definition: To design functionality (or functional design for short) means thinking about… well, functions. You find a solution for what´s supposed to happen when an Entry Point gets triggered in terms of functions. A conceptual solution that is, because those functions only exist in your head (or on paper) during this phase. But you may have guess that, because it´s “design” not “coding”. And here is, what functional design is not: It´s not about logic. Logic is expressions (e.g. +, -, && etc.) and control statements (e.g. if, switch, for, while etc.). Also I consider calling external APIs as logic. It´s equally basic. It´s what code needs to do in order to deliver some functionality or quality. Logic is what´s doing that needs to be done by software. Transformations are either done through expressions or API-calls. And then there is alternative control flow depending on the result of some expression. Basically it´s just jumps in Assembler, sometimes to go forward (if, switch), sometimes to go backward (for, while, do). But calling your own function is not logic. It´s not necessary to produce any outcome. Functionality is not enhanced by adding functions (subroutine calls) to your code. Nor is quality increased by adding functions. No performance gain, no higher scalability etc. through functions. Functions are not relevant to functionality. Strange, isn´t it. What they are important for is security of investment. By introducing functions into our code we can become more productive (re-use) and can increase evolvability (higher unterstandability, easier to keep code consistent). That´s no small feat, however. Evolvable code can hardly be overestimated. That´s why to me functional design is so important. It´s at the core of software development. To sum this up: Functional design is on a level of abstraction above (!) logical design or algorithmic design. Functional design is only done until you get to a point where each function is so simple you are very confident you can easily code it. Functional design an logical design (which mostly is coding, but can also be done using pseudo code or flow charts) are complementary. Software needs both. If you start coding right away you end up in a tangled mess very quickly. Then you need back out through refactoring. Functional design on the other hand is bloodless without actual code. It´s just a theory with no experiments to prove it. But how to do functional design? An example of functional design Let´s assume a program to de-duplicate strings. The user enters a number of strings separated by commas, e.g. a, b, a, c, d, b, e, c, a. And the program is supposed to clear this list of all doubles, e.g. a, b, c, d, e. There is only one Entry Point to this program: the user triggers the de-duplication by starting the program with the string list on the command line C:\>deduplicate "a, b, a, c, d, b, e, c, a" a, b, c, d, e …or by clicking on a GUI button. This leads to the Entry Point function to get called. It´s the program´s main function in case of the batch version or a button click event handler in the GUI version. That´s the physical Entry Point so to speak. It´s inevitable. What then happens is a three step process: Transform the input data from the user into a request. Call the request handler. Transform the output of the request handler into a tangible result for the user. Or to phrase it a bit more generally: Accept input. Transform input into output. Present output. This does not mean any of these steps requires a lot of effort. Maybe it´s just one line of code to accomplish it. Nevertheless it´s a distinct step in doing the processing behind an Entry Point. Call it an aspect or a responsibility - and you will realize it most likely deserves a function of its own to satisfy the Single Responsibility Principle (SRP). Interestingly the above list of steps is already functional design. There is no logic, but nevertheless the solution is described - albeit on a higher level of abstraction than you might have done yourself. But it´s still on a meta-level. The application to the domain at hand is easy, though: Accept string list from command line De-duplicate Present de-duplicated strings on standard output And this concrete list of processing steps can easily be transformed into code:static void Main(string[] args) { var input = Accept_string_list(args); var output = Deduplicate(input); Present_deduplicated_string_list(output); } Instead of a big problem there are three much smaller problems now. If you think each of those is trivial to implement, then go for it. You can stop the functional design at this point. But maybe, just maybe, you´re not so sure how to go about with the de-duplication for example. Then just implement what´s easy right now, e.g.private static string Accept_string_list(string[] args) { return args[0]; } private static void Present_deduplicated_string_list( string[] output) { var line = string.Join(", ", output); Console.WriteLine(line); } Accept_string_list() contains logic in the form of an API-call. Present_deduplicated_string_list() contains logic in the form of an expression and an API-call. And then repeat the functional design for the remaining processing step. What´s left is the domain logic: de-duplicating a list of strings. How should that be done? Without any logic at our disposal during functional design you´re left with just functions. So which functions could make up the de-duplication? Here´s a suggestion: De-duplicate Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Processing step 2 obviously was the core of the solution. That´s where real creativity was needed. That´s the core of the domain. But now after this refinement the implementation of each step is easy again:private static string[] Parse_string_list(string input) { return input.Split(',') .Select(s => s.Trim()) .ToArray(); } private static Dictionary<string,object> Compile_unique_strings(string[] strings) { return strings.Aggregate( new Dictionary<string, object>(), (agg, s) => { agg[s] = null; return agg; }); } private static string[] Serialize_unique_strings( Dictionary<string,object> dict) { return dict.Keys.ToArray(); } With these three additional functions Main() now looks like this:static void Main(string[] args) { var input = Accept_string_list(args); var strings = Parse_string_list(input); var dict = Compile_unique_strings(strings); var output = Serialize_unique_strings(dict); Present_deduplicated_string_list(output); } I think that´s very understandable code: just read it from top to bottom and you know how the solution to the problem works. It´s a mirror image of the initial design: Accept string list from command line Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Present de-duplicated strings on standard output You can even re-generate the design by just looking at the code. Code and functional design thus are always in sync - if you follow some simple rules. But about that later. And as a bonus: all the functions making up the process are small - which means easy to understand, too. So much for an initial concrete example. Now it´s time for some theory. Because there is method to this madness ;-) The above has only scratched the surface. Introducing Flow Design Functional design starts with a given function, the Entry Point. Its goal is to describe the behavior of the program when the Entry Point is triggered using a process, not an algorithm. An algorithm consists of logic, a process on the other hand consists just of steps or stages. Each processing step transforms input into output or a side effect. Also it might access resources, e.g. a printer, a database, or just memory. Processing steps thus can rely on state of some sort. This is different from Functional Programming, where functions are supposed to not be stateful and not cause side effects.[1] In its simplest form a process can be written as a bullet point list of steps, e.g. Get data from user Output result to user Transform data Parse data Map result for output Such a compilation of steps - possibly on different levels of abstraction - often is the first artifact of functional design. It can be generated by a team in an initial design brainstorming. Next comes ordering the steps. What should happen first, what next etc.? Get data from user Parse data Transform data Map result for output Output result to user That´s great for a start into functional design. It´s better than starting to code right away on a given function using TDD. Please get me right: TDD is a valuable practice. But it can be unnecessarily hard if the scope of a functionn is too large. But how do you know beforehand without investing some thinking? And how to do this thinking in a systematic fashion? My recommendation: For any given function you´re supposed to implement first do a functional design. Then, once you´re confident you know the processing steps - which are pretty small - refine and code them using TDD. You´ll see that´s much, much easier - and leads to cleaner code right away. For more information on this approach I call “Informed TDD” read my book of the same title. Thinking before coding is smart. And writing down the solution as a bunch of functions possibly is the simplest thing you can do, I´d say. It´s more according to the KISS (Keep It Simple, Stupid) principle than returning constants or other trivial stuff TDD development often is started with. So far so good. A simple ordered list of processing steps will do to start with functional design. As shown in the above example such steps can easily be translated into functions. Moving from design to coding thus is simple. However, such a list does not scale. Processing is not always that simple to be captured in a list. And then the list is just text. Again. Like code. That means the design is lacking visuality. Textual representations need more parsing by your brain than visual representations. Plus they are limited in their “dimensionality”: text just has one dimension, it´s sequential. Alternatives and parallelism are hard to encode in text. In addition the functional design using numbered lists lacks data. It´s not visible what´s the input, output, and state of the processing steps. That´s why functional design should be done using a lightweight visual notation. No tool is necessary to draw such designs. Use pen and paper; a flipchart, a whiteboard, or even a napkin is sufficient. Visualizing processes The building block of the functional design notation is a functional unit. I mostly draw it like this: Something is done, it´s clear what goes in, it´s clear what comes out, and it´s clear what the processing step requires in terms of state or hardware. Whenever input flows into a functional unit it gets processed and output is produced and/or a side effect occurs. Flowing data is the driver of something happening. That´s why I call this approach to functional design Flow Design. It´s about data flow instead of control flow. Control flow like in algorithms is of no concern to functional design. Thinking about control flow simply is too low level. Once you start with control flow you easily get bogged down by tons of details. That´s what you want to avoid during design. Design is supposed to be quick, broad brush, abstract. It should give overview. But what about all the details? As Robert C. Martin rightly said: “Programming is abot detail”. Detail is a matter of code. Once you start coding the processing steps you designed you can worry about all the detail you want. Functional design does not eliminate all the nitty gritty. It just postpones tackling them. To me that´s also an example of the SRP. Function design has the responsibility to come up with a solution to a problem posed by a single function (Entry Point). And later coding has the responsibility to implement the solution down to the last detail (i.e. statement, API-call). TDD unfortunately mixes both responsibilities. It´s just coding - and thereby trying to find detailed implementations (green phase) plus getting the design right (refactoring). To me that´s one reason why TDD has failed to deliver on its promise for many developers. Using functional units as building blocks of functional design processes can be depicted very easily. Here´s the initial process for the example problem: For each processing step draw a functional unit and label it. Choose a verb or an “action phrase” as a label, not a noun. Functional design is about activities, not state or structure. Then make the output of an upstream step the input of a downstream step. Finally think about the data that should flow between the functional units. Write the data above the arrows connecting the functional units in the direction of the data flow. Enclose the data description in brackets. That way you can clearly see if all flows have already been specified. Empty brackets mean “no data is flowing”, but nevertheless a signal is sent. A name like “list” or “strings” in brackets describes the data content. Use lower case labels for that purpose. A name starting with an upper case letter like “String” or “Customer” on the other hand signifies a data type. If you like, you also can combine descriptions with data types by separating them with a colon, e.g. (list:string) or (strings:string[]). But these are just suggestions from my practice with Flow Design. You can do it differently, if you like. Just be sure to be consistent. Flows wired-up in this manner I call one-dimensional (1D). Each functional unit just has one input and/or one output. A functional unit without an output is possible. It´s like a black hole sucking up input without producing any output. Instead it produces side effects. A functional unit without an input, though, does make much sense. When should it start to work? What´s the trigger? That´s why in the above process even the first processing step has an input. If you like, view such 1D-flows as pipelines. Data is flowing through them from left to right. But as you can see, it´s not always the same data. It get´s transformed along its passage: (args) becomes a (list) which is turned into (strings). The Principle of Mutual Oblivion A very characteristic trait of flows put together from function units is: no functional units knows another one. They are all completely independent of each other. Functional units don´t know where their input is coming from (or even when it´s gonna arrive). They just specify a range of values they can process. And they promise a certain behavior upon input arriving. Also they don´t know where their output is going. They just produce it in their own time independent of other functional units. That means at least conceptually all functional units work in parallel. Functional units don´t know their “deployment context”. They now nothing about the overall flow they are place in. They are just consuming input from some upstream, and producing output for some downstream. That makes functional units very easy to test. At least as long as they don´t depend on state or resources. I call this the Principle of Mutual Oblivion (PoMO). Functional units are oblivious of others as well as an overall context/purpose. They are just parts of a whole focused on a single responsibility. How the whole is built, how a larger goal is achieved, is of no concern to the single functional units. By building software in such a manner, functional design interestingly follows nature. Nature´s building blocks for organisms also follow the PoMO. The cells forming your body do not know each other. Take a nerve cell “controlling” a muscle cell for example:[2] The nerve cell does not know anything about muscle cells, let alone the specific muscel cell it is “attached to”. Likewise the muscle cell does not know anything about nerve cells, let a lone a specific nerve cell “attached to” it. Saying “the nerve cell is controlling the muscle cell” thus only makes sense when viewing both from the outside. “Control” is a concept of the whole, not of its parts. Control is created by wiring-up parts in a certain way. Both cells are mutually oblivious. Both just follow a contract. One produces Acetylcholine (ACh) as output, the other consumes ACh as input. Where the ACh is going, where it´s coming from neither cell cares about. Million years of evolution have led to this kind of division of labor. And million years of evolution have produced organism designs (DNA) which lead to the production of these different cell types (and many others) and also to their co-location. The result: the overall behavior of an organism. How and why this happened in nature is a mystery. For our software, though, it´s clear: functional and quality requirements needs to be fulfilled. So we as developers have to become “intelligent designers” of “software cells” which we put together to form a “software organism” which responds in satisfying ways to triggers from it´s environment. My bet is: If nature gets complex organisms working by following the PoMO, who are we to not apply this recipe for success to our much simpler “machines”? So my rule is: Wherever there is functionality to be delivered, because there is a clear Entry Point into software, design the functionality like nature would do it. Build it from mutually oblivious functional units. That´s what Flow Design is about. In that way it´s even universal, I´d say. Its notation can also be applied to biology: Never mind labeling the functional units with nouns. That´s ok in Flow Design. You´ll do that occassionally for functional units on a higher level of abstraction or when their purpose is close to hardware. Getting a cockroach to roam your bedroom takes 1,000,000 nerve cells (neurons). Getting the de-duplication program to do its job just takes 5 “software cells” (functional units). Both, though, follow the same basic principle. Translating functional units into code Moving from functional design to code is no rocket science. In fact it´s straightforward. There are two simple rules: Translate an input port to a function. Translate an output port either to a return statement in that function or to a function pointer visible to that function. The simplest translation of a functional unit is a function. That´s what you saw in the above example. Functions are mutually oblivious. That why Functional Programming likes them so much. It makes them composable. Which is the reason, nature works according to the PoMO. Let´s be clear about one thing: There is no dependency injection in nature. For all of an organism´s complexity no DI container is used. Behavior is the result of smooth cooperation between mutually oblivious building blocks. Functions will often be the adequate translation for the functional units in your designs. But not always. Take for example the case, where a processing step should not always produce an output. Maybe the purpose is to filter input. Here the functional unit consumes words and produces words. But it does not pass along every word flowing in. Some words are swallowed. Think of a spell checker. It probably should not check acronyms for correctness. There are too many of them. Or words with no more than two letters. Such words are called “stop words”. In the above picture the optionality of the output is signified by the astrisk outside the brackets. It means: Any number of (word) data items can flow from the functional unit for each input data item. It might be none or one or even more. This I call a stream of data. Such behavior cannot be translated into a function where output is generated with return. Because a function always needs to return a value. So the output port is translated into a function pointer or continuation which gets passed to the subroutine when called:[3]void filter_stop_words( string word, Action<string> onNoStopWord) { if (...check if not a stop word...) onNoStopWord(word); } If you want to be nitpicky you might call such a function pointer parameter an injection. And technically you´re right. Conceptually, though, it´s not an injection. Because the subroutine is not functionally dependent on the continuation. Firstly continuations are procedures, i.e. subroutines without a return type. Remember: Flow Design is about unidirectional data flow. Secondly the name of the formal parameter is chosen in a way as to not assume anything about downstream processing steps. onNoStopWord describes a situation (or event) within the functional unit only. Translating output ports into function pointers helps keeping functional units mutually oblivious in cases where output is optional or produced asynchronically. Either pass the function pointer to the function upon call. Or make it global by putting it on the encompassing class. Then it´s called an event. In C# that´s even an explicit feature.class Filter { public void filter_stop_words( string word) { if (...check if not a stop word...) onNoStopWord(word); } public event Action<string> onNoStopWord; } When to use a continuation and when to use an event dependens on how a functional unit is used in flows and how it´s packed together with others into classes. You´ll see examples further down the Flow Design road. Another example of 1D functional design Let´s see Flow Design once more in action using the visual notation. How about the famous word wrap kata? Robert C. Martin has posted a much cited solution including an extensive reasoning behind his TDD approach. So maybe you want to compare it to Flow Design. The function signature given is:string WordWrap(string text, int maxLineLength) {...} That´s not an Entry Point since we don´t see an application with an environment and users. Nevertheless it´s a function which is supposed to provide a certain functionality. The text passed in has to be reformatted. The input is a single line of arbitrary length consisting of words separated by spaces. The output should consist of one or more lines of a maximum length specified. If a word is longer than a the maximum line length it can be split in multiple parts each fitting in a line. Flow Design Let´s start by brainstorming the process to accomplish the feat of reformatting the text. What´s needed? Words need to be assembled into lines Words need to be extracted from the input text The resulting lines need to be assembled into the output text Words too long to fit in a line need to be split Does sound about right? I guess so. And it shows a kind of priority. Long words are a special case. So maybe there is a hint for an incremental design here. First let´s tackle “average words” (words not longer than a line). Here´s the Flow Design for this increment: The the first three bullet points turned into functional units with explicit data added. As the signature requires a text is transformed into another text. See the input of the first functional unit and the output of the last functional unit. In between no text flows, but words and lines. That´s good to see because thereby the domain is clearly represented in the design. The requirements are talking about words and lines and here they are. But note the asterisk! It´s not outside the brackets but inside. That means it´s not a stream of words or lines, but lists or sequences. For each text a sequence of words is output. For each sequence of words a sequence of lines is produced. The asterisk is used to abstract from the concrete implementation. Like with streams. Whether the list of words gets implemented as an array or an IEnumerable is not important during design. It´s an implementation detail. Does any processing step require further refinement? I don´t think so. They all look pretty “atomic” to me. And if not… I can always backtrack and refine a process step using functional design later once I´ve gained more insight into a sub-problem. Implementation The implementation is straightforward as you can imagine. The processing steps can all be translated into functions. Each can be tested easily and separately. Each has a focused responsibility. And the process flow becomes just a sequence of function calls: Easy to understand. It clearly states how word wrapping works - on a high level of abstraction. And it´s easy to evolve as you´ll see. Flow Design - Increment 2 So far only texts consisting of “average words” are wrapped correctly. Words not fitting in a line will result in lines too long. Wrapping long words is a feature of the requested functionality. Whether it´s there or not makes a difference to the user. To quickly get feedback I decided to first implement a solution without this feature. But now it´s time to add it to deliver the full scope. Fortunately Flow Design automatically leads to code following the Open Closed Principle (OCP). It´s easy to extend it - instead of changing well tested code. How´s that possible? Flow Design allows for extension of functionality by inserting functional units into the flow. That way existing functional units need not be changed. The data flow arrow between functional units is a natural extension point. No need to resort to the Strategy Pattern. No need to think ahead where extions might need to be made in the future. I just “phase in” the remaining processing step: Since neither Extract words nor Reformat know of their environment neither needs to be touched due to the “detour”. The new processing step accepts the output of the existing upstream step and produces data compatible with the existing downstream step. Implementation - Increment 2 A trivial implementation checking the assumption if this works does not do anything to split long words. The input is just passed on: Note how clean WordWrap() stays. The solution is easy to understand. A developer looking at this code sometime in the future, when a new feature needs to be build in, quickly sees how long words are dealt with. Compare this to Robert C. Martin´s solution:[4] How does this solution handle long words? Long words are not even part of the domain language present in the code. At least I need considerable time to understand the approach. Admittedly the Flow Design solution with the full implementation of long word splitting is longer than Robert C. Martin´s. At least it seems. Because his solution does not cover all the “word wrap situations” the Flow Design solution handles. Some lines would need to be added to be on par, I guess. But even then… Is a difference in LOC that important as long as it´s in the same ball park? I value understandability and openness for extension higher than saving on the last line of code. Simplicity is not just less code, it´s also clarity in design. But don´t take my word for it. Try Flow Design on larger problems and compare for yourself. What´s the easier, more straightforward way to clean code? And keep in mind: You ain´t seen all yet ;-) There´s more to Flow Design than described in this chapter. In closing I hope I was able to give you a impression of functional design that makes you hungry for more. To me it´s an inevitable step in software development. Jumping from requirements to code does not scale. And it leads to dirty code all to quickly. Some thought should be invested first. Where there is a clear Entry Point visible, it´s functionality should be designed using data flows. Because with data flows abstraction is possible. For more background on why that´s necessary read my blog article here. For now let me point out to you - if you haven´t already noticed - that Flow Design is a general purpose declarative language. It´s “programming by intention” (Shalloway et al.). Just write down how you think the solution should work on a high level of abstraction. This breaks down a large problem in smaller problems. And by following the PoMO the solutions to those smaller problems are independent of each other. So they are easy to test. Or you could even think about getting them implemented in parallel by different team members. Flow Design not only increases evolvability, but also helps becoming more productive. All team members can participate in functional design. This goes beyon collective code ownership. We´re talking collective design/architecture ownership. Because with Flow Design there is a common visual language to talk about functional design - which is the foundation for all other design activities. PS: If you like what you read, consider getting my ebook “The Incremental Architekt´s Napkin”. It´s where I compile all the articles in this series for easier reading. I like the strictness of Function Programming - but I also find it quite hard to live by. And it certainly is not what millions of programmers are used to. Also to me it seems, the real world is full of state and side effects. So why give them such a bad image? That´s why functional design takes a more pragmatic approach. State and side effects are ok for processing steps - but be sure to follow the SRP. Don´t put too much of it into a single processing step. ? Image taken from www.physioweb.org ? My code samples are written in C#. C# sports typed function pointers called delegates. Action is such a function pointer type matching functions with signature void someName(T t). Other languages provide similar ways to work with functions as first class citizens - even Java now in version 8. I trust you find a way to map this detail of my translation to your favorite programming language. I know it works for Java, C++, Ruby, JavaScript, Python, Go. And if you´re using a Functional Programming language it´s of course a no brainer. ? Taken from his blog post “The Craftsman 62, The Dark Path”. ?

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article

Search Results

Search found 1375 results on 55 pages for 'asymptotic complexity'.

Page 55/55 | < Previous Page | 51 52 53 54 55

- by Eric

- by Aaronaught

- by TimeSpace Traveller

- by gargantaun

- by Simon G

- by Anders Svensson

- by Eric Moyer

- by solublefish

- by Rick Strahl

- by James Michael Hare

- by danx

- by Ali Bahrami

- by Rick Strahl

- by user9154181

- by Rick Strahl

- by Phil Factor

- by Jon

- by Jon

- by Jon

- by Rick Strahl

- by john.rose

- by thangchung

- by Ralf Westphal

- by rchrd

< Previous Page | 51 52 53 54 55