Search Results

Search found 4513 results on 181 pages for 'james michael hare'.

Page 7/181 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >

  • C#/.NET Little Wonders: The Timeout static class

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. When I started the “Little Wonders” series, I really wanted to pay homage to parts of the .NET Framework that are often small but can help in big ways.  The item I have to discuss today really is a very small item in the .NET BCL, but once again I feel it can help make the intention of code much clearer and thus is worthy of note. The Problem - Magic numbers aren’t very readable or maintainable In my first Little Wonders Post (Five Little Wonders That Make Code Better) I mention the TimeSpan factory methods which, I feel, really help the readability of constructed TimeSpan instances. Just to quickly recap that discussion, ask yourself what the TimeSpan specified in each case below is 1: // Five minutes? Five Seconds? 2: var fiveWhat1 = new TimeSpan(0, 0, 5); 3: var fiveWhat2 = new TimeSpan(0, 0, 5, 0); 4: var fiveWhat3 = new TimeSpan(0, 0, 5, 0, 0); You’d think they’d all be the same unit of time, right?  After all, most overloads tend to tack additional arguments on the end.  But this is not the case with TimeSpan, where the constructor forms are:     TimeSpan(int hours, int minutes, int seconds);     TimeSpan(int days, int hours, int minutes, int seconds);     TimeSpan(int days, int hours, int minutes, int seconds, int milliseconds); Notice how in the 4 and 5 parameter version we suddenly have the parameter days slipping in front of hours?  This can make reading constructors like those above much harder.  Fortunately, there are TimeSpan factory methods to help make your intention crystal clear: 1: // Ah! Much clearer! 2: var fiveSeconds = TimeSpan.FromSeconds(5); These are great because they remove all ambiguity from the reader!  So in short, magic numbers in constructors and methods can be ambiguous, and anything we can do to clean up the intention of the developer will make the code much easier to read and maintain. Timeout – Readable identifiers for infinite timeout values In a similar way to TimeSpan, let’s consider specifying timeouts for some of .NET’s (or our own) many methods that allow you to specify timeout periods. For example, in the TPL Task class, there is a family of Wait() methods that can take TimeSpan or int for timeouts.  Typically, if you want to specify an infinite timeout, you’d just call the version that doesn’t take a timeout parameter at all: 1: myTask.Wait(); // infinite wait But there are versions that take the int or TimeSpan for timeout as well: 1: // Wait for 100 ms 2: myTask.Wait(100); 3:  4: // Wait for 5 seconds 5: myTask.Wait(TimeSpan.FromSeconds(5); Now, if we want to specify an infinite timeout to wait on the Task, we could pass –1 (or a TimeSpan set to –1 ms), which what the .NET BCL methods with timeouts use to represent an infinite timeout: 1: // Also infinite timeouts, but harder to read/maintain 2: myTask.Wait(-1); 3: myTask.Wait(TimeSpan.FromMilliseconds(-1)); However, these are not as readable or maintainable.  If you were writing this code, you might make the mistake of thinking 0 or int.MaxValue was an infinite timeout, and you’d be incorrect.  Also, reading the code above it isn’t as clear that –1 is infinite unless you happen to know that is the specified behavior. To make the code like this easier to read and maintain, there is a static class called Timeout in the System.Threading namespace which contains definition for infinite timeouts specified as both int and TimeSpan forms: Timeout.Infinite An integer constant with a value of –1 Timeout.InfiniteTimeSpan A static readonly TimeSpan which represents –1 ms (only available in .NET 4.5+) This makes our calls to Task.Wait() (or any other calls with timeouts) much more clear: 1: // intention to wait indefinitely is quite clear now 2: myTask.Wait(Timeout.Infinite); 3: myTask.Wait(Timeout.InfiniteTimeSpan); But wait, you may say, why would we care at all?  Why not use the version of Wait() that takes no arguments?  Good question!  When you’re directly calling the method with an infinite timeout that’s what you’d most likely do, but what if you are just passing along a timeout specified by a caller from higher up?  Or perhaps storing a timeout value from a configuration file, and want to default it to infinite? For example, perhaps you are designing a communications module and want to be able to shutdown gracefully, but if you can’t gracefully finish in a specified amount of time you want to force the connection closed.  You could create a Shutdown() method in your class, and take a TimeSpan or an int for the amount of time to wait for a clean shutdown – perhaps waiting for client to acknowledge – before terminating the connection.  So, assume we had a pub/sub system with a class to broadcast messages: 1: // Some class to broadcast messages to connected clients 2: public class Broadcaster 3: { 4: // ... 5:  6: // Shutdown connection to clients, wait for ack back from clients 7: // until all acks received or timeout, whichever happens first 8: public void Shutdown(int timeout) 9: { 10: // Kick off a task here to send shutdown request to clients and wait 11: // for the task to finish below for the specified time... 12:  13: if (!shutdownTask.Wait(timeout)) 14: { 15: // If Wait() returns false, we timed out and task 16: // did not join in time. 17: } 18: } 19: } We could even add an overload to allow us to use TimeSpan instead of int, to give our callers the flexibility to specify timeouts either way: 1: // overload to allow them to specify Timeout in TimeSpan, would 2: // just call the int version passing in the TotalMilliseconds... 3: public void Shutdown(TimeSpan timeout) 4: { 5: Shutdown(timeout.TotalMilliseconds); 6: } Notice in case of this class, we don’t assume the caller wants infinite timeouts, we choose to rely on them to tell us how long to wait.  So now, if they choose an infinite timeout, they could use the –1, which is more cryptic, or use Timeout class to make the intention clear: 1: // shutdown the broadcaster, waiting until all clients ack back 2: // without timing out. 3: myBroadcaster.Shutdown(Timeout.Infinite); We could even add a default argument using the int parameter version so that specifying no arguments to Shutdown() assumes an infinite timeout: 1: // Modified original Shutdown() method to add a default of 2: // Timeout.Infinite, works because Timeout.Infinite is a compile 3: // time constant. 4: public void Shutdown(int timeout = Timeout.Infinite) 5: { 6: // same code as before 7: } Note that you can’t default the ShutDown(TimeSpan) overload with Timeout.InfiniteTimeSpan since it is not a compile-time constant.  The only acceptable default for a TimeSpan parameter would be default(TimeSpan) which is zero milliseconds, which specified no wait, not infinite wait. Summary While Timeout.Infinite and Timeout.InfiniteTimeSpan are not earth-shattering classes in terms of functionality, they do give you very handy and readable constant values that you can use in your programs to help increase readability and maintainability when specifying infinite timeouts for various timeouts in the BCL and your own applications. Technorati Tags: C#,CSharp,.NET,Little Wonders,Timeout,Task

    Read the article

  • Of C# Iterators and Performance

    - by James Michael Hare
    Some of you reading this will be wondering, "what is an iterator" and think I'm locked in the world of C++.  Nope, I'm talking C# iterators.  No, not enumerators, iterators.   So, for those of you who do not know what iterators are in C#, I will explain it in summary, and for those of you who know what iterators are but are curious of the performance impacts, I will explore that as well.   Iterators have been around for a bit now, and there are still a bunch of people who don't know what they are or what they do.  I don't know how many times at work I've had a code review on my code and have someone ask me, "what's that yield word do?"   Basically, this post came to me as I was writing some extension methods to extend IEnumerable<T> -- I'll post some of the fun ones in a later post.  Since I was filtering the resulting list down, I was using the standard C# iterator concept; but that got me wondering: what are the performance implications of using an iterator versus returning a new enumeration?   So, to begin, let's look at a couple of methods.  This is a new (albeit contrived) method called Every(...).  The goal of this method is to access and enumeration and return every nth item in the enumeration (including the first).  So Every(2) would return items 0, 2, 4, 6, etc.   Now, if you wanted to write this in the traditional way, you may come up with something like this:       public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)     {         List<T> newList = new List<T>();         int count = 0;           foreach (var i in list)         {             if ((count++ % interval) == 0)             {                 newList.Add(i);             }         }           return newList;     }     So basically this method takes any IEnumerable<T> and returns a new IEnumerable<T> that contains every nth item.  Pretty straight forward.   The problem?  Well, Every<T>(...) will construct a list containing every nth item whether or not you care.  What happens if you were searching this result for a certain item and find that item after five tries?  You would have generated the rest of the list for nothing.   Enter iterators.  This C# construct uses the yield keyword to effectively defer evaluation of the next item until it is asked for.  This can be very handy if the evaluation itself is expensive or if there's a fair chance you'll never want to fully evaluate a list.   We see this all the time in Linq, where many expressions are chained together to do complex processing on a list.  This would be very expensive if each of these expressions evaluated their entire possible result set on call.    Let's look at the same example function, this time using an iterator:       public static IEnumerable<T> Every<T>(this IEnumerable<T> list, int interval)     {         int count = 0;         foreach (var i in list)         {             if ((count++ % interval) == 0)             {                 yield return i;             }         }     }   Notice it does not create a new return value explicitly, the only evidence of a return is the "yield return" statement.  What this means is that when an item is requested from the enumeration, it will enter this method and evaluate until it either hits a yield return (in which case that item is returned) or until it exits the method or hits a yield break (in which case the iteration ends.   Behind the scenes, this is all done with a class that the CLR creates behind the scenes that keeps track of the state of the iteration, so that every time the next item is asked for, it finds that item and then updates the current position so it knows where to start at next time.   It doesn't seem like a big deal, does it?  But keep in mind the key point here: it only returns items as they are requested. Thus if there's a good chance you will only process a portion of the return list and/or if the evaluation of each item is expensive, an iterator may be of benefit.   This is especially true if you intend your methods to be chainable similar to the way Linq methods can be chained.    For example, perhaps you have a List<int> and you want to take every tenth one until you find one greater than 10.  We could write that as:       List<int> someList = new List<int>();         // fill list here         someList.Every(10).TakeWhile(i => i <= 10);     Now is the difference more apparent?  If we use the first form of Every that makes a copy of the list.  It's going to copy the entire list whether we will need those items or not, that can be costly!    With the iterator version, however, it will only take items from the list until it finds one that is > 10, at which point no further items in the list are evaluated.   So, sounds neat eh?  But what's the cost is what you're probably wondering.  So I ran some tests using the two forms of Every above on lists varying from 5 to 500,000 integers and tried various things.    Now, iteration isn't free.  If you are more likely than not to iterate the entire collection every time, iterator has some very slight overhead:   Copy vs Iterator on 100% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 5 Copy 5 5 5 Iterator 5 50 50 Copy 28 50 50 Iterator 27 500 500 Copy 227 500 500 Iterator 247 5000 5000 Copy 2266 5000 5000 Iterator 2444 50,000 50,000 Copy 24,443 50,000 50,000 Iterator 24,719 500,000 500,000 Copy 250,024 500,000 500,000 Iterator 251,521   Notice that when iterating over the entire produced list, the times for the iterator are a little better for smaller lists, then getting just a slight bit worse for larger lists.  In reality, given the number of items and iterations, the result is near negligible, but just to show that iterators come at a price.  However, it should also be noted that the form of Every that returns a copy will have a left-over collection to garbage collect.   However, if we only partially evaluate less and less through the list, the savings start to show and make it well worth the overhead.  Let's look at what happens if you stop looking after 80% of the list:   Copy vs Iterator on 80% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 4 Copy 5 5 4 Iterator 5 50 40 Copy 27 50 40 Iterator 23 500 400 Copy 215 500 400 Iterator 200 5000 4000 Copy 2099 5000 4000 Iterator 1962 50,000 40,000 Copy 22,385 50,000 40,000 Iterator 19,599 500,000 400,000 Copy 236,427 500,000 400,000 Iterator 196,010       Notice that the iterator form is now operating quite a bit faster.  But the savings really add up if you stop on average at 50% (which most searches would typically do):     Copy vs Iterator on 50% of Collection (10,000 iterations) Collection Size Num Iterated Type Total ms 5 2 Copy 5 5 2 Iterator 4 50 25 Copy 25 50 25 Iterator 16 500 250 Copy 188 500 250 Iterator 126 5000 2500 Copy 1854 5000 2500 Iterator 1226 50,000 25,000 Copy 19,839 50,000 25,000 Iterator 12,233 500,000 250,000 Copy 208,667 500,000 250,000 Iterator 122,336   Now we see that if we only expect to go on average 50% into the results, we tend to shave off around 40% of the time.  And this is only for one level deep.  If we are using this in a chain of query expressions it only adds to the savings.   So my recommendation?  If you have a resonable expectation that someone may only want to partially consume your enumerable result, I would always tend to favor an iterator.  The cost if they iterate the whole thing does not add much at all -- and if they consume only partially, you reap some really good performance gains.   Next time I'll discuss some of my favorite extensions I've created to make development life a little easier and maintainability a little better.

    Read the article

  • C#: Why Decorate When You Can Intercept

    - by James Michael Hare
    We've all heard of the old Decorator Design Pattern (here) or used it at one time or another either directly or indirectly.  A decorator is a class that wraps a given abstract class or interface and presents the same (or a superset) public interface but "decorated" with additional functionality.   As a really simplistic example, consider the System.IO.BufferedStream, it itself is a descendent of System.IO.Stream and wraps the given stream with buffering logic while still presenting System.IO.Stream's public interface:   1: Stream buffStream = new BufferedStream(rawStream); Now, let's take a look at a custom-code example.  Let's say that we have a class in our data access layer that retrieves a list of products from a database:  1: // a class that handles our CRUD operations for products 2: public class ProductDao 3: { 4: ... 5:  6: // a method that would retrieve all available products 7: public IEnumerable<Product> GetAvailableProducts() 8: { 9: var results = new List<Product>(); 10:  11: // must create the connection 12: using (var con = _factory.CreateConnection()) 13: { 14: con.ConnectionString = _productsConnectionString; 15: con.Open(); 16:  17: // create the command 18: using (var cmd = _factory.CreateCommand()) 19: { 20: cmd.Connection = con; 21: cmd.CommandText = _getAllProductsStoredProc; 22: cmd.CommandType = CommandType.StoredProcedure; 23:  24: // get a reader and pass back all results 25: using (var reader = cmd.ExecuteReader()) 26: { 27: while(reader.Read()) 28: { 29: results.Add(new Product 30: { 31: Name = reader["product_name"].ToString(), 32: ... 33: }); 34: } 35: } 36: } 37: }            38:  39: return results; 40: } 41: } Yes, you could use EF or any myriad other choices for this sort of thing, but the germaine point is that you have some operation that takes a non-trivial amount of time.  What if, during the production day I notice that my application is performing slowly and I want to see how much of that slowness is in the query versus my code.  Well, I could easily wrap the logic block in a System.Diagnostics.Stopwatch and log the results to log4net or other logging flavor of choice: 1:     // a class that handles our CRUD operations for products 2:     public class ProductDao 3:     { 4:         private static readonly ILog _log = LogManager.GetLogger(typeof(ProductDao)); 5:         ... 6:         7:         // a method that would retrieve all available products 8:         public IEnumerable<Product> GetAvailableProducts() 9:         { 10:             var results = new List<Product>(); 11:             var timer = Stopwatch.StartNew(); 12:             13:             // must create the connection 14:             using (var con = _factory.CreateConnection()) 15:             { 16:                 con.ConnectionString = _productsConnectionString; 17:                 18:                 // and all that other DB code... 19:                 ... 20:             } 21:             22:             timer.Stop(); 23:             24:             if (timer.ElapsedMilliseconds > 5000) 25:             { 26:                 _log.WarnFormat("Long query in GetAvailableProducts() took {0} ms", 27:                     timer.ElapsedMillseconds); 28:             } 29:             30:             return results; 31:         } 32:     } In my eye, this is very ugly.  It violates Single Responsibility Principle (SRP), which says that a class should only ever have one responsibility, where responsibility is often defined as a reason to change.  This class (and in particular this method) has two reasons to change: If the method of retrieving products changes. If the method of logging changes. Well, we could “simplify” this using the Decorator Design Pattern (here).  If we followed the pattern to the letter, we'd need to create a base decorator that implements the DAOs public interface and forwards to the wrapped instance.  So let's assume we break out the ProductDAO interface into IProductDAO using your refactoring tool of choice (Resharper is great for this). Now, ProductDao will implement IProductDao and get rid of all logging logic: 1:     public class ProductDao : IProductDao 2:     { 3:         // this reverts back to original version except for the interface added 4:     } 5:  And we create the base Decorator that also implements the interface and forwards all calls: 1:     public class ProductDaoDecorator : IProductDao 2:     { 3:         private readonly IProductDao _wrappedDao; 4:         5:         // constructor takes the dao to wrap 6:         public ProductDaoDecorator(IProductDao wrappedDao) 7:         { 8:             _wrappedDao = wrappedDao; 9:         } 10:         11:         ... 12:         13:         // and then all methods just forward their calls 14:         public IEnumerable<Product> GetAvailableProducts() 15:         { 16:             return _wrappedDao.GetAvailableProducts(); 17:         } 18:     } This defines our base decorator, then we can create decorators that add items of interest, and for any methods we don't decorate, we'll get the default behavior which just forwards the call to the wrapper in the base decorator: 1:     public class TimedThresholdProductDaoDecorator : ProductDaoDecorator 2:     { 3:         private static readonly ILog _log = LogManager.GetLogger(typeof(TimedThresholdProductDaoDecorator)); 4:         5:         public TimedThresholdProductDaoDecorator(IProductDao wrappedDao) : 6:             base(wrappedDao) 7:         { 8:         } 9:         10:         ... 11:         12:         public IEnumerable<Product> GetAvailableProducts() 13:         { 14:             var timer = Stopwatch.StartNew(); 15:             16:             var results = _wrapped.GetAvailableProducts(); 17:             18:             timer.Stop(); 19:             20:             if (timer.ElapsedMilliseconds > 5000) 21:             { 22:                 _log.WarnFormat("Long query in GetAvailableProducts() took {0} ms", 23:                     timer.ElapsedMillseconds); 24:             } 25:             26:             return results; 27:         } 28:     } Well, it's a bit better.  Now the logging is in its own class, and the database logic is in its own class.  But we've essentially multiplied the number of classes.  We now have 3 classes and one interface!  Now if you want to do that same logging decorating on all your DAOs, imagine the code bloat!  Sure, you can simplify and avoid creating the base decorator, or chuck it all and just inherit directly.  But regardless all of these have the problem of tying the logging logic into the code itself. Enter the Interceptors.  Things like this to me are a perfect example of when it's good to write an Interceptor using your class library of choice.  Sure, you could design your own perfectly generic decorator with delegates and all that, but personally I'm a big fan of Castle's Dynamic Proxy (here) which is actually used by many projects including Moq. What DynamicProxy allows you to do is intercept calls into any object by wrapping it with a proxy on the fly that intercepts the method and allows you to add functionality.  Essentially, the code would now look like this using DynamicProxy: 1: // Note: I like hiding DynamicProxy behind the scenes so users 2: // don't have to explicitly add reference to Castle's libraries. 3: public static class TimeThresholdInterceptor 4: { 5: // Our logging handle 6: private static readonly ILog _log = LogManager.GetLogger(typeof(TimeThresholdInterceptor)); 7:  8: // Handle to Castle's proxy generator 9: private static readonly ProxyGenerator _generator = new ProxyGenerator(); 10:  11: // generic form for those who prefer it 12: public static object Create<TInterface>(object target, TimeSpan threshold) 13: { 14: return Create(typeof(TInterface), target, threshold); 15: } 16:  17: // Form that uses type instead 18: public static object Create(Type interfaceType, object target, TimeSpan threshold) 19: { 20: return _generator.CreateInterfaceProxyWithTarget(interfaceType, target, 21: new TimedThreshold(threshold, level)); 22: } 23:  24: // The interceptor that is created to intercept the interface calls. 25: // Hidden as a private inner class so not exposing Castle libraries. 26: private class TimedThreshold : IInterceptor 27: { 28: // The threshold as a positive timespan that triggers a log message. 29: private readonly TimeSpan _threshold; 30:  31: // interceptor constructor 32: public TimedThreshold(TimeSpan threshold) 33: { 34: _threshold = threshold; 35: } 36:  37: // Intercept functor for each method invokation 38: public void Intercept(IInvocation invocation) 39: { 40: // time the method invocation 41: var timer = Stopwatch.StartNew(); 42:  43: // the Castle magic that tells the method to go ahead 44: invocation.Proceed(); 45:  46: timer.Stop(); 47:  48: // check if threshold is exceeded 49: if (timer.Elapsed > _threshold) 50: { 51: _log.WarnFormat("Long execution in {0} took {1} ms", 52: invocation.Method.Name, 53: timer.ElapsedMillseconds); 54: } 55: } 56: } 57: } Yes, it's a bit longer, but notice that: This class ONLY deals with logging long method calls, no DAO interface leftovers. This class can be used to time ANY class that has an interface or virtual methods. Personally, I like to wrap and hide the usage of DynamicProxy and IInterceptor so that anyone who uses this class doesn't need to know to add a Castle library reference.  As far as they are concerned, they're using my interceptor.  If I change to a new library if a better one comes along, they're insulated. Now, all we have to do to use this is to tell it to wrap our ProductDao and it does the rest: 1: // wraps a new ProductDao with a timing interceptor with a threshold of 5 seconds 2: IProductDao dao = TimeThresholdInterceptor.Create<IProductDao>(new ProductDao(), 5000); Automatic decoration of all methods!  You can even refine the proxy so that it only intercepts certain methods. This is ideal for so many things.  These are just some of the interceptors we've dreamed up and use: Log parameters and returns of methods to XML for auditing. Block invocations to methods and return default value (stubbing). Throw exception if certain methods are called (good for blocking access to deprecated methods). Log entrance and exit of a method and the duration. Log a message if a method takes more than a given time threshold to execute. Whether you use DynamicProxy or some other technology, I hope you see the benefits this adds.  Does it completely eliminate all need for the Decorator pattern?  No, there may still be cases where you want to decorate a particular class with functionality that doesn't apply to the world at large. But for all those cases where you are using Decorator to add functionality that's truly generic.  I strongly suggest you give this a try!

    Read the article

  • C#/.NET Little Wonders: Using &lsquo;default&rsquo; to Get Default Values

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. Today’s little wonder is another of those small items that can help a lot in certain situations, especially when writing generics.  In particular, it is useful in determining what the default value of a given type would be. The Problem: what’s the default value for a generic type? There comes a time when you’re writing generic code where you may want to set an item of a given generic type.  Seems simple enough, right?  We’ll let’s see! Let’s say we want to query a Dictionary<TKey, TValue> for a given key and get back the value, but if the key doesn’t exist, we’d like a default value instead of throwing an exception. So, for example, we might have a the following dictionary defined: 1: var lookup = new Dictionary<int, string> 2: { 3: { 1, "Apple" }, 4: { 2, "Orange" }, 5: { 3, "Banana" }, 6: { 4, "Pear" }, 7: { 9, "Peach" } 8: }; And using those definitions, perhaps we want to do something like this: 1: // assume a default 2: string value = "Unknown"; 3:  4: // if the item exists in dictionary, get its value 5: if (lookup.ContainsKey(5)) 6: { 7: value = lookup[5]; 8: } But that’s inefficient, because then we’re double-hashing (once for ContainsKey() and once for the indexer).  Well, to avoid the double-hashing, we could use TryGetValue() instead: 1: string value; 2:  3: // if key exists, value will be put in value, if not default it 4: if (!lookup.TryGetValue(5, out value)) 5: { 6: value = "Unknown"; 7: } But the “flow” of using of TryGetValue() can get clunky at times when you just want to assign either the value or a default to a variable.  Essentially it’s 3-ish lines (depending on formatting) for 1 assignment.  So perhaps instead we’d like to write an extension method to support a cleaner interface that will return a default if the item isn’t found: 1: public static class DictionaryExtensions 2: { 3: public static TValue GetValueOrDefault<TKey, TValue>(this Dictionary<TKey, TValue> dict, 4: TKey key, TValue defaultIfNotFound) 5: { 6: TValue value; 7:  8: // value will be the result or the default for TValue 9: if (!dict.TryGetValue(key, out value)) 10: { 11: value = defaultIfNotFound; 12: } 13:  14: return value; 15: } 16: } 17:  So this creates an extension method on Dictionary<TKey, TValue> that will attempt to get a value using the given key, and will return the defaultIfNotFound as a stand-in if the key does not exist. This code compiles, fine, but what if we would like to go one step further and allow them to specify a default if not found, or accept the default for the type?  Obviously, we could overload the method to take the default or not, but that would be duplicated code and a bit heavy for just specifying a default.  It seems reasonable that we could set the not found value to be either the default for the type, or the specified value. So what if we defaulted the type to null? 1: public static TValue GetValueOrDefault<TKey, TValue>(this Dictionary<TKey, TValue> dict, 2: TKey key, TValue defaultIfNotFound = null) // ... No, this won’t work, because only reference types (and Nullable<T> wrapped types due to syntactical sugar) can be assigned to null.  So what about a calling parameterless constructor? 1: public static TValue GetValueOrDefault<TKey, TValue>(this Dictionary<TKey, TValue> dict, 2: TKey key, TValue defaultIfNotFound = new TValue()) // ... No, this won’t work either for several reasons.  First, we’d expect a reference type to return null, not an “empty” instance.  Secondly, not all reference types have a parameter-less constructor (string for example does not).  And finally, a constructor cannot be determined at compile-time, while default values can. The Solution: default(T) – returns the default value for type T Many of us know the default keyword for its uses in switch statements as the default case.  But it has another use as well: it can return us the default value for a given type.  And since it generates the same defaults that default field initialization uses, it can be determined at compile-time as well. For example: 1: var x = default(int); // x is 0 2:  3: var y = default(bool); // y is false 4:  5: var z = default(string); // z is null 6:  7: var t = default(TimeSpan); // t is a TimeSpan with Ticks == 0 8:  9: var n = default(int?); // n is a Nullable<int> with HasValue == false Notice that for numeric types the default is 0, and for reference types the default is null.  In addition, for struct types, the value is a default-constructed struct – which simply means a struct where every field has their default value (hence 0 Ticks for TimeSpan, etc.). So using this, we could modify our code to this: 1: public static class DictionaryExtensions 2: { 3: public static TValue GetValueOrDefault<TKey, TValue>(this Dictionary<TKey, TValue> dict, 4: TKey key, TValue defaultIfNotFound = default(TValue)) 5: { 6: TValue value; 7:  8: // value will be the result or the default for TValue 9: if (!dict.TryGetValue(key, out value)) 10: { 11: value = defaultIfNotFound; 12: } 13:  14: return value; 15: } 16: } Now, if defaultIfNotFound is unspecified, it will use default(TValue) which will be the default value for whatever value type the dictionary holds.  So let’s consider how we could use this: 1: lookup.GetValueOrDefault(1); // returns “Apple” 2:  3: lookup.GetValueOrDefault(5); // returns null 4:  5: lookup.GetValueOrDefault(5, “Unknown”); // returns “Unknown” 6:  Again, do not confuse a parameter-less constructor with the default value for a type.  Remember that the default value for any type is the compile-time default for any instance of that type (0 for numeric, false for bool, null for reference types, and struct will all default fields for struct).  Consider the difference: 1: // both zero 2: int i1 = default(int); 3: int i2 = new int(); 4:  5: // both “zeroed” structs 6: var dt1 = default(DateTime); 7: var dt2 = new DateTime(); 8:  9: // sb1 is null, sb2 is an “empty” string builder 10: var sb1 = default(StringBuilder()); 11: var sb2 = new StringBuilder(); So in the above code, notice that the value types all resolve the same whether using default or parameter-less construction.  This is because a value type is never null (even Nullable<T> wrapped types are never “null” in a reference sense), they will just by default contain fields with all default values. However, for reference types, the default is null and not a constructed instance.  Also it should be noted that not all classes have parameter-less constructors (string, for instance, doesn’t have one – and doesn’t need one). Summary Whenever you need to get the default value for a type, especially a generic type, consider using the default keyword.  This handy word will give you the default value for the given type at compile-time, which can then be used for initialization, optional parameters, etc. Technorati Tags: C#,CSharp,.NET,Little Wonders,default

    Read the article

  • Monitoring Database disk space

    - by Michael Freidgeim
    An article Data files: To Autogrow Or Not To Autogrow? recommends NOT to rely on auto-grow, because it causing delays in unplanned times.We should mtonitor database files(both data and log), and if they close to max capacity, manually increase the size. However it doesn't give references, how to monitor the free space inside databases. I've tried to look how to do it. It can be done manually using   execute sp_spaceused for the database in question or  sp_SOS (can be downloaded from http://searchsqlserver.techtarget.com/tip/Find-size-of-SQL-Server-tables-and-other-objects-with-stored-procedure)Alternatively you can run SQL commands as suggested in Http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=82359 by Michael Valentine Jonesselect [FREE_SPACE_MB] = convert(decimal(12,2),round((a.size-fileproperty(a.name,'SpaceUsed'))/128.000,2)) from dbo.sysfiles aMore useful article Monitor database file sizes with SQL Server Jobs describes how to setup monitoring Finally I found the excellent articleManaging Database Data Usage With Custom Space Alerts, that can be followed even support personnel without much DBA experience.

    Read the article

  • C#: Does an IDisposable in a Halted Iterator Dispose?

    - by James Michael Hare
    If that sounds confusing, let me give you an example. Let's say you expose a method to read a database of products, and instead of returning a List<Product> you return an IEnumerable<Product> in iterator form (yield return). This accomplishes several good things: The IDataReader is not passed out of the Data Access Layer which prevents abstraction leak and resource leak potentials. You don't need to construct a full List<Product> in memory (which could be very big) if you just want to forward iterate once. If you only want to consume up to a certain point in the list, you won't incur the database cost of looking up the other items. This could give us an example like: 1: // a sample data access object class to do standard CRUD operations. 2: public class ProductDao 3: { 4: private DbProviderFactory _factory = SqlClientFactory.Instance 5:  6: // a method that would retrieve all available products 7: public IEnumerable<Product> GetAvailableProducts() 8: { 9: // must create the connection 10: using (var con = _factory.CreateConnection()) 11: { 12: con.ConnectionString = _productsConnectionString; 13: con.Open(); 14:  15: // create the command 16: using (var cmd = _factory.CreateCommand()) 17: { 18: cmd.Connection = con; 19: cmd.CommandText = _getAllProductsStoredProc; 20: cmd.CommandType = CommandType.StoredProcedure; 21:  22: // get a reader and pass back all results 23: using (var reader = cmd.ExecuteReader()) 24: { 25: while(reader.Read()) 26: { 27: yield return new Product 28: { 29: Name = reader["product_name"].ToString(), 30: ... 31: }; 32: } 33: } 34: } 35: } 36: } 37: } The database details themselves are irrelevant. I will say, though, that I'm a big fan of using the System.Data.Common classes instead of your provider specific counterparts directly (SqlCommand, OracleCommand, etc). This lets you mock your data sources easily in unit testing and also allows you to swap out your provider in one line of code. In fact, one of the shared components I'm most proud of implementing was our group's DatabaseUtility library that simplifies all the database access above into one line of code in a thread-safe and provider-neutral way. I went with my own flavor instead of the EL due to the fact I didn't want to force internal company consumers to use the EL if they didn't want to, and it made it easy to allow them to mock their database for unit testing by providing a MockCommand, MockConnection, etc that followed the System.Data.Common model. One of these days I'll blog on that if anyone's interested. Regardless, you often have situations like the above where you are consuming and iterating through a resource that must be closed once you are finished iterating. For the reasons stated above, I didn't want to return IDataReader (that would force them to remember to Dispose it), and I didn't want to return List<Product> (that would force them to hold all products in memory) -- but the first time I wrote this, I was worried. What if you never consume the last item and exit the loop? Are the reader, command, and connection all disposed correctly? Of course, I was 99.999999% sure the creators of C# had already thought of this and taken care of it, but inspection in Reflector was difficult due to the nature of the state machines yield return generates, so I decided to try a quick example program to verify whether or not Dispose() will be called when an iterator is broken from outside the iterator itself -- i.e. before the iterator reports there are no more items. So I wrote a quick Sequencer class with a Dispose() method and an iterator for it. Yes, it is COMPLETELY contrived: 1: // A disposable sequence of int -- yes this is completely contrived... 2: internal class Sequencer : IDisposable 3: { 4: private int _i = 0; 5: private readonly object _mutex = new object(); 6:  7: // Constructs an int sequence. 8: public Sequencer(int start) 9: { 10: _i = start; 11: } 12:  13: // Gets the next integer 14: public int GetNext() 15: { 16: lock (_mutex) 17: { 18: return _i++; 19: } 20: } 21:  22: // Dispose the sequence of integers. 23: public void Dispose() 24: { 25: // force output immediately (flush the buffer) 26: Console.WriteLine("Disposed with last sequence number of {0}!", _i); 27: Console.Out.Flush(); 28: } 29: } And then I created a generator (infinite-loop iterator) that did the using block for auto-Disposal: 1: // simply defines an extension method off of an int to start a sequence 2: public static class SequencerExtensions 3: { 4: // generates an infinite sequence starting at the specified number 5: public static IEnumerable<int> GetSequence(this int starter) 6: { 7: // note the using here, will call Dispose() when block terminated. 8: using (var seq = new Sequencer(starter)) 9: { 10: // infinite loop on this generator, means must be bounded by caller! 11: while(true) 12: { 13: yield return seq.GetNext(); 14: } 15: } 16: } 17: } This is really the same conundrum as the database problem originally posed. Here we are using iteration (yield return) over a large collection (infinite sequence of integers). If we cut the sequence short by breaking iteration, will that using block exit and hence, Dispose be called? Well, let's see: 1: // The test program class 2: public class IteratorTest 3: { 4: // The main test method. 5: public static void Main() 6: { 7: Console.WriteLine("Going to consume 10 of infinite items"); 8: Console.Out.Flush(); 9:  10: foreach(var i in 0.GetSequence()) 11: { 12: // could use TakeWhile, but wanted to output right at break... 13: if(i >= 10) 14: { 15: Console.WriteLine("Breaking now!"); 16: Console.Out.Flush(); 17: break; 18: } 19:  20: Console.WriteLine(i); 21: Console.Out.Flush(); 22: } 23:  24: Console.WriteLine("Done with loop."); 25: Console.Out.Flush(); 26: } 27: } So, what do we see? Do we see the "Disposed" message from our dispose, or did the Dispose get skipped because from an "eyeball" perspective we should be locked in that infinite generator loop? Here's the results: 1: Going to consume 10 of infinite items 2: 0 3: 1 4: 2 5: 3 6: 4 7: 5 8: 6 9: 7 10: 8 11: 9 12: Breaking now! 13: Disposed with last sequence number of 11! 14: Done with loop. Yes indeed, when we break the loop, the state machine that C# generates for yield iterate exits the iteration through the using blocks and auto-disposes the IDisposable correctly. I must admit, though, the first time I wrote one, I began to wonder and that led to this test. If you've never seen iterators before (I wrote a previous entry here) the infinite loop may throw you, but you have to keep in mind it is not a linear piece of code, that every time you hit a "yield return" it cedes control back to the state machine generated for the iterator. And this state machine, I'm happy to say, is smart enough to clean up the using blocks correctly. I suspected those wily guys and gals at Microsoft engineered it well, and I wasn't disappointed. But, I've been bitten by assumptions before, so it's good to test and see. Yes, maybe you knew it would or figured it would, but isn't it nice to know? And as those campy 80s G.I. Joe cartoon public service reminders always taught us, "Knowing is half the battle...". Technorati Tags: C#,.NET

    Read the article

  • C#: LINQ vs foreach - Round 1.

    - by James Michael Hare
    So I was reading Peter Kellner's blog entry on Resharper 5.0 and its LINQ refactoring and thought that was very cool.  But that raised a point I had always been curious about in my head -- which is a better choice: manual foreach loops or LINQ?    The answer is not really clear-cut.  There are two sides to any code cost arguments: performance and maintainability.  The first of these is obvious and quantifiable.  Given any two pieces of code that perform the same function, you can run them side-by-side and see which piece of code performs better.   Unfortunately, this is not always a good measure.  Well written assembly language outperforms well written C++ code, but you lose a lot in maintainability which creates a big techncial debt load that is hard to offset as the application ages.  In contrast, higher level constructs make the code more brief and easier to understand, hence reducing technical cost.   Now, obviously in this case we're not talking two separate languages, we're comparing doing something manually in the language versus using a higher-order set of IEnumerable extensions that are in the System.Linq library.   Well, before we discuss any further, let's look at some sample code and the numbers.  First, let's take a look at the for loop and the LINQ expression.  This is just a simple find comparison:       // find implemented via LINQ     public static bool FindViaLinq(IEnumerable<int> list, int target)     {         return list.Any(item => item == target);     }         // find implemented via standard iteration     public static bool FindViaIteration(IEnumerable<int> list, int target)     {         foreach (var i in list)         {             if (i == target)             {                 return true;             }         }           return false;     }   Okay, looking at this from a maintainability point of view, the Linq expression is definitely more concise (8 lines down to 1) and is very readable in intention.  You don't have to actually analyze the behavior of the loop to determine what it's doing.   So let's take a look at performance metrics from 100,000 iterations of these methods on a List<int> of varying sizes filled with random data.  For this test, we fill a target array with 100,000 random integers and then run the exact same pseudo-random targets through both searches.                       List<T> On 100,000 Iterations     Method      Size     Total (ms)  Per Iteration (ms)  % Slower     Any         10       26          0.00046             30.00%     Iteration   10       20          0.00023             -     Any         100      116         0.00201             18.37%     Iteration   100      98          0.00118             -     Any         1000     1058        0.01853             16.78%     Iteration   1000     906         0.01155             -     Any         10,000   10,383      0.18189             17.41%     Iteration   10,000   8843        0.11362             -     Any         100,000  104,004     1.8297              18.27%     Iteration   100,000  87,941      1.13163             -   The LINQ expression is running about 17% slower for average size collections and worse for smaller collections.  Presumably, this is due to the overhead of the state machine used to track the iterators for the yield returns in the LINQ expressions, which seems about right in a tight loop such as this.   So what about other LINQ expressions?  After all, Any() is one of the more trivial ones.  I decided to try the TakeWhile() algorithm using a Count() to get the position stopped like the sample Pete was using in his blog that Resharper refactored for him into LINQ:       // Linq form     public static int GetTargetPosition1(IEnumerable<int> list, int target)     {         return list.TakeWhile(item => item != target).Count();     }       // traditionally iterative form     public static int GetTargetPosition2(IEnumerable<int> list, int target)     {         int count = 0;           foreach (var i in list)         {             if(i == target)             {                 break;             }               ++count;         }           return count;     }   Once again, the LINQ expression is much shorter, easier to read, and should be easier to maintain over time, reducing the cost of technical debt.  So I ran these through the same test data:                       List<T> On 100,000 Iterations     Method      Size     Total (ms)  Per Iteration (ms)  % Slower     TakeWhile   10       41          0.00041             128%     Iteration   10       18          0.00018             -     TakeWhile   100      171         0.00171             88%     Iteration   100      91          0.00091             -     TakeWhile   1000     1604        0.01604             94%     Iteration   1000     825         0.00825             -     TakeWhile   10,000   15765       0.15765             92%     Iteration   10,000   8204        0.08204             -     TakeWhile   100,000  156950      1.5695              92%     Iteration   100,000  81635       0.81635             -     Wow!  I expected some overhead due to the state machines iterators produce, but 90% slower?  That seems a little heavy to me.  So then I thought, well, what if TakeWhile() is not the right tool for the job?  The problem is TakeWhile returns each item for processing using yield return, whereas our for-loop really doesn't care about the item beyond using it as a stop condition to evaluate. So what if that back and forth with the iterator state machine is the problem?  Well, we can quickly create an (albeit ugly) lambda that uses the Any() along with a count in a closure (if a LINQ guru knows a better way PLEASE let me know!), after all , this is more consistent with what we're trying to do, we're trying to find the first occurence of an item and halt once we find it, we just happen to be counting on the way.  This mostly matches Any().       // a new method that uses linq but evaluates the count in a closure.     public static int TakeWhileViaLinq2(IEnumerable<int> list, int target)     {         int count = 0;         list.Any(item =>             {                 if(item == target)                 {                     return true;                 }                   ++count;                 return false;             });         return count;     }     Now how does this one compare?                         List<T> On 100,000 Iterations     Method         Size     Total (ms)  Per Iteration (ms)  % Slower     TakeWhile      10       41          0.00041             128%     Any w/Closure  10       23          0.00023             28%     Iteration      10       18          0.00018             -     TakeWhile      100      171         0.00171             88%     Any w/Closure  100      116         0.00116             27%     Iteration      100      91          0.00091             -     TakeWhile      1000     1604        0.01604             94%     Any w/Closure  1000     1101        0.01101             33%     Iteration      1000     825         0.00825             -     TakeWhile      10,000   15765       0.15765             92%     Any w/Closure  10,000   10802       0.10802             32%     Iteration      10,000   8204        0.08204             -     TakeWhile      100,000  156950      1.5695              92%     Any w/Closure  100,000  108378      1.08378             33%     Iteration      100,000  81635       0.81635             -     Much better!  It seems that the overhead of TakeAny() returning each item and updating the state in the state machine is drastically reduced by using Any() since Any() iterates forward until it finds the value we're looking for -- for the task we're attempting to do.   So the lesson there is, make sure when you use a LINQ expression you're choosing the best expression for the job, because if you're doing more work than you really need, you'll have a slower algorithm.  But this is true of any choice of algorithm or collection in general.     Even with the Any() with the count in the closure it is still about 30% slower, but let's consider that angle carefully.  For a list of 100,000 items, it was the difference between 1.01 ms and 0.82 ms roughly in a List<T>.  That's really not that bad at all in the grand scheme of things.  Even running at 90% slower with TakeWhile(), for the vast majority of my projects, an extra millisecond to save potential errors in the long term and improve maintainability is a small price to pay.  And if your typical list is 1000 items or less we're talking only microseconds worth of difference.   It's like they say: 90% of your performance bottlenecks are in 2% of your code, so over-optimizing almost never pays off.  So personally, I'll take the LINQ expression wherever I can because they will be easier to read and maintain (thus reducing technical debt) and I can rely on Microsoft's development to have coded and unit tested those algorithm fully for me instead of relying on a developer to code the loop logic correctly.   If something's 90% slower, yes, it's worth keeping in mind, but it's really not until you start get magnitudes-of-order slower (10x, 100x, 1000x) that alarm bells should really go off.  And if I ever do need that last millisecond of performance?  Well then I'll optimize JUST THAT problem spot.  To me it's worth it for the readability, speed-to-market, and maintainability.

    Read the article

  • C# Extension Methods - To Extend or Not To Extend...

    - by James Michael Hare
    I've been thinking a lot about extension methods lately, and I must admit I both love them and hate them. They are a lot like sugar, they taste so nice and sweet, but they'll rot your teeth if you eat them too much.   I can't deny that they aren't useful and very handy. One of the major components of the Shared Component library where I work is a set of useful extension methods. But, I also can't deny that they tend to be overused and abused to willy-nilly extend every living type.   So what constitutes a good extension method? Obviously, you can write an extension method for nearly anything whether it is a good idea or not. Many times, in fact, an idea seems like a good extension method but in retrospect really doesn't fit.   So what's the litmus test? To me, an extension method should be like in the movies when a person runs into their twin, separated at birth. You just know you're related. Obviously, that's hard to quantify, so let's try to put a few rules-of-thumb around them.   A good extension method should:     Apply to any possible instance of the type it extends.     Simplify logic and improve readability/maintainability.     Apply to the most specific type or interface applicable.     Be isolated in a namespace so that it does not pollute IntelliSense.     So let's look at a few examples in relation to these rules.   The first rule, to me, is the most important of all. Once again, it bears repeating, a good extension method should apply to all possible instances of the type it extends. It should feel like the long lost relative that should have been included in the original class but somehow was missing from the family tree.    Take this nifty little int extension, I saw this once in a blog and at first I really thought it was pretty cool, but then I started noticing a code smell I couldn't quite put my finger on. So let's look:       public static class IntExtensinos     {         public static int Seconds(int num)         {             return num * 1000;         }           public static int Minutes(int num)         {             return num * 60000;         }     }     This is so you could do things like:       ...     Thread.Sleep(5.Seconds());     ...     proxy.Timeout = 1.Minutes();     ...     Awww, you say, that's cute! Well, that's the problem, it's kitschy and it doesn't always apply (and incidentally you could achieve the same thing with TimeStamp.FromSeconds(5)). It's syntactical candy that looks cool, but tends to rot and pollute the code. It would allow things like:       total += numberOfTodaysOrders.Seconds();     which makes no sense and should never be allowed. The problem is you're applying an extension method to a logical domain, not a type domain. That is, the extension method Seconds() doesn't really apply to ALL ints, it applies to ints that are representative of time that you want to convert to milliseconds.    Do you see what I mean? The two problems, in a nutshell, are that a) Seconds() called off a non-time value makes no sense and b) calling Seconds() off something to pass to something that does not take milliseconds will be off by a factor of 1000 or worse.   Thus, in my mind, you should only ever have an extension method that applies to the whole domain of that type.   For example, this is one of my personal favorites:       public static bool IsBetween<T>(this T value, T low, T high)         where T : IComparable<T>     {         return value.CompareTo(low) >= 0 && value.CompareTo(high) <= 0;     }   This allows you to check if any IComparable<T> is within an upper and lower bound. Think of how many times you type something like:       if (response.Employee.Address.YearsAt >= 2         && response.Employee.Address.YearsAt <= 10)     {     ...     }     Now, you can instead type:       if(response.Employee.Address.YearsAt.IsBetween(2, 10))     {     ...     }     Note that this applies to all IComparable<T> -- that's ints, chars, strings, DateTime, etc -- and does not depend on any logical domain. In addition, it satisfies the second point and actually makes the code more readable and maintainable.   Let's look at the third point. In it we said that an extension method should fit the most specific interface or type possible. Now, I'm not saying if you have something that applies to enumerables, you create an extension for List, Array, Dictionary, etc (though you may have reasons for doing so), but that you should beware of making things TOO general.   For example, let's say we had an extension method like this:       public static T ConvertTo<T>(this object value)     {         return (T)Convert.ChangeType(value, typeof(T));     }         This lets you do more fluent conversions like:       double d = "5.0".ConvertTo<double>();     However, if you dig into Reflector (LOVE that tool) you will see that if the type you are calling on does not implement IConvertible, what you convert to MUST be the exact type or it will throw an InvalidCastException. Now this may or may not be what you want in this situation, and I leave that up to you. Things like this would fail:       object value = new Employee();     ...     // class cast exception because typeof(IEmployee) != typeof(Employee)     IEmployee emp = value.ConvertTo<IEmployee>();       Yes, that's a downfall of working with Convertible in general, but if you wanted your fluent interface to be more type-safe so that ConvertTo were only callable on IConvertibles (and let casting be a manual task), you could easily make it:         public static T ConvertTo<T>(this IConvertible value)     {         return (T)Convert.ChangeType(value, typeof(T));     }         This is what I mean by choosing the best type to extend. Consider that if we used the previous (object) version, every time we typed a dot ('.') on an instance we'd pull up ConvertTo() whether it was applicable or not. By filtering our extension method down to only valid types (those that implement IConvertible) we greatly reduce our IntelliSense pollution and apply a good level of compile-time correctness.   Now my fourth rule is just my general rule-of-thumb. Obviously, you can make extension methods as in-your-face as you want. I included all mine in my work libraries in its own sub-namespace, something akin to:       namespace Shared.Core.Extensions { ... }     This is in a library called Shared.Core, so just referencing the Core library doesn't pollute your IntelliSense, you have to actually do a using on Shared.Core.Extensions to bring the methods in. This is very similar to the way Microsoft puts its extension methods in System.Linq. This way, if you want 'em, you use the appropriate namespace. If you don't want 'em, they won't pollute your namespace.   To really make this work, however, that namespace should only include extension methods and subordinate types those extensions themselves may use. If you plant other useful classes in those namespaces, once a user includes it, they get all the extensions too.   Also, just as a personal preference, extension methods that aren't simply syntactical shortcuts, I like to put in a static utility class and then have extension methods for syntactical candy. For instance, I think it imaginable that any object could be converted to XML:       namespace Shared.Core     {         // A collection of XML Utility classes         public static class XmlUtility         {             ...             // Serialize an object into an xml string             public static string ToXml(object input)             {                 var xs = new XmlSerializer(input.GetType());                   // use new UTF8Encoding here, not Encoding.UTF8. The later includes                 // the BOM which screws up subsequent reads, the former does not.                 using (var memoryStream = new MemoryStream())                 using (var xmlTextWriter = new XmlTextWriter(memoryStream, new UTF8Encoding()))                 {                     xs.Serialize(xmlTextWriter, input);                     return Encoding.UTF8.GetString(memoryStream.ToArray());                 }             }             ...         }     }   I also wanted to be able to call this from an object like:       value.ToXml();     But here's the problem, if i made this an extension method from the start with that one little keyword "this", it would pop into IntelliSense for all objects which could be very polluting. Instead, I put the logic into a utility class so that users have the choice of whether or not they want to use it as just a class and not pollute IntelliSense, then in my extensions namespace, I add the syntactical candy:       namespace Shared.Core.Extensions     {         public static class XmlExtensions         {             public static string ToXml(this object value)             {                 return XmlUtility.ToXml(value);             }         }     }   So now it's the best of both worlds. On one hand, they can use the utility class if they don't want to pollute IntelliSense, and on the other hand they can include the Extensions namespace and use as an extension if they want. The neat thing is it also adheres to the Single Responsibility Principle. The XmlUtility is responsible for converting objects to XML, and the XmlExtensions is responsible for extending object's interface for ToXml().

    Read the article

  • C#/.NET Little Wonders: Tuples and Tuple Factory Methods

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can really help improve your code by making it easier to write and maintain.  This week, we look at the System.Tuple class and the handy factory methods for creating a Tuple by inferring the types. What is a Tuple? The System.Tuple is a class that tends to inspire a reaction in one of two ways: love or hate.  Simply put, a Tuple is a data structure that holds a specific number of items of a specific type in a specific order.  That is, a Tuple<int, string, int> is a tuple that contains exactly three items: an int, followed by a string, followed by an int.  The sequence is important not only to distinguish between two members of the tuple with the same type, but also for comparisons between tuples.  Some people tend to love tuples because they give you a quick way to combine multiple values into one result.  This can be handy for returning more than one value from a method (without using out or ref parameters), or for creating a compound key to a Dictionary, or any other purpose you can think of.  They can be especially handy when passing a series of items into a call that only takes one object parameter, such as passing an argument to a thread's startup routine.  In these cases, you do not need to define a class, simply create a tuple containing the types you wish to return, and you are ready to go? On the other hand, there are some people who see tuples as a crutch in object-oriented design.  They may view the tuple as a very watered down class with very little inherent semantic meaning.  As an example, what if you saw this in a piece of code: 1: var x = new Tuple<int, int>(2, 5); What are the contents of this tuple?  If the tuple isn't named appropriately, and if the contents of each member are not self evident from the type this can be a confusing question.  The people who tend to be against tuples would rather you explicitly code a class to contain the values, such as: 1: public sealed class RetrySettings 2: { 3: public int TimeoutSeconds { get; set; } 4: public int MaxRetries { get; set; } 5: } Here, the meaning of each int in the class is much more clear, but it's a bit more work to create the class and can clutter a solution with extra classes. So, what's the correct way to go?  That's a tough call.  You will have people who will argue quite well for one or the other.  For me, I consider the Tuple to be a tool to make it easy to collect values together easily.  There are times when I just need to combine items for a key or a result, in which case the tuple is short lived and so the meaning isn't easily lost and I feel this is a good compromise.  If the scope of the collection of items, though, is more application-wide I tend to favor creating a full class. Finally, it should be noted that tuples are immutable.  That means they are assigned a value at construction, and that value cannot be changed.  Now, of course if the tuple contains an item of a reference type, this means that the reference is immutable and not the item referred to. Tuples from 1 to N Tuples come in all sizes, you can have as few as one element in your tuple, or as many as you like.  However, since C# generics can't have an infinite generic type parameter list, any items after 7 have to be collapsed into another tuple, as we'll show shortly. So when you declare your tuple from sizes 1 (a 1-tuple or singleton) to 7 (a 7-tuple or septuple), simply include the appropriate number of type arguments: 1: // a singleton tuple of integer 2: Tuple<int> x; 3:  4: // or more 5: Tuple<int, double> y; 6:  7: // up to seven 8: Tuple<int, double, char, double, int, string, uint> z; Anything eight and above, and we have to nest tuples inside of tuples.  The last element of the 8-tuple is the generic type parameter Rest, this is special in that the Tuple checks to make sure at runtime that the type is a Tuple.  This means that a simple 8-tuple must nest a singleton tuple (one of the good uses for a singleton tuple, by the way) for the Rest property. 1: // an 8-tuple 2: Tuple<int, int, int, int, int, double, char, Tuple<string>> t8; 3:  4: // an 9-tuple 5: Tuple<int, int, int, int, double, int, char, Tuple<string, DateTime>> t9; 6:  7: // a 16-tuple 8: Tuple<int, int, int, int, int, int, int, Tuple<int, int, int, int, int, int, int, Tuple<int,int>>> t14; Notice that on the 14-tuple we had to have a nested tuple in the nested tuple.  Since the tuple can only support up to seven items, and then a rest element, that means that if the nested tuple needs more than seven items you must nest in it as well.  Constructing tuples Constructing tuples is just as straightforward as declaring them.  That said, you have two distinct ways to do it.  The first is to construct the tuple explicitly yourself: 1: var t3 = new Tuple<int, string, double>(1, "Hello", 3.1415927); This creates a triple that has an int, string, and double and assigns the values 1, "Hello", and 3.1415927 respectively.  Make sure the order of the arguments supplied matches the order of the types!  Also notice that we can't half-assign a tuple or create a default tuple.  Tuples are immutable (you can't change the values once constructed), so thus you must provide all values at construction time. Another way to easily create tuples is to do it implicitly using the System.Tuple static class's Create() factory methods.  These methods (much like C++'s std::make_pair method) will infer the types from the method call so you don't have to type them in.  This can dramatically reduce the amount of typing required especially for complex tuples! 1: // this 4-tuple is typed Tuple<int, double, string, char> 2: var t4 = Tuple.Create(42, 3.1415927, "Love", 'X'); Notice how much easier it is to use the factory methods and infer the types?  This can cut down on typing quite a bit when constructing tuples.  The Create() factory method can construct from a 1-tuple (singleton) to an 8-tuple (octuple), which of course will be a octuple where the last item is a singleton as we described before in nested tuples. Accessing tuple members Accessing a tuple's members is simplicity itself… mostly.  The properties for accessing up to the first seven items are Item1, Item2, …, Item7.  If you have an octuple or beyond, the final property is Rest which will give you the nested tuple which you can then access in a similar matter.  Once again, keep in mind that these are read-only properties and cannot be changed. 1: // for septuples and below, use the Item properties 2: var t1 = Tuple.Create(42, 3.14); 3:  4: Console.WriteLine("First item is {0} and second is {1}", 5: t1.Item1, t1.Item2); 6:  7: // for octuples and above, use Rest to retrieve nested tuple 8: var t9 = new Tuple<int, int, int, int, int, int, int, 9: Tuple<int, int>>(1,2,3,4,5,6,7,Tuple.Create(8,9)); 10:  11: Console.WriteLine("The 8th item is {0}", t9.Rest.Item1); Tuples are IStructuralComparable and IStructuralEquatable Most of you know about IComparable and IEquatable, what you may not know is that there are two sister interfaces to these that were added in .NET 4.0 to help support tuples.  These IStructuralComparable and IStructuralEquatable make it easy to compare two tuples for equality and ordering.  This is invaluable for sorting, and makes it easy to use tuples as a compound-key to a dictionary (one of my favorite uses)! Why is this so important?  Remember when we said that some folks think tuples are too generic and you should define a custom class?  This is all well and good, but if you want to design a custom class that can automatically order itself based on its members and build a hash code for itself based on its members, it is no longer a trivial task!  Thankfully the tuple does this all for you through the explicit implementations of these interfaces. For equality, two tuples are equal if all elements are equal between the two tuples, that is if t1.Item1 == t2.Item1 and t1.Item2 == t2.Item2, and so on.  For ordering, it's a little more complex in that it compares the two tuples one at a time starting at Item1, and sees which one has a smaller Item1.  If one has a smaller Item1, it is the smaller tuple.  However if both Item1 are the same, it compares Item2 and so on. For example: 1: var t1 = Tuple.Create(1, 3.14, "Hi"); 2: var t2 = Tuple.Create(1, 3.14, "Hi"); 3: var t3 = Tuple.Create(2, 2.72, "Bye"); 4:  5: // true, t1 == t2 because all items are == 6: Console.WriteLine("t1 == t2 : " + t1.Equals(t2)); 7:  8: // false, t1 != t2 because at least one item different 9: Console.WriteLine("t2 == t2 : " + t2.Equals(t3)); The actual implementation of IComparable, IEquatable, IStructuralComparable, and IStructuralEquatable is explicit, so if you want to invoke the methods defined there you'll have to manually cast to the appropriate interface: 1: // true because t1.Item1 < t3.Item1, if had been same would check Item2 and so on 2: Console.WriteLine("t1 < t3 : " + (((IComparable)t1).CompareTo(t3) < 0)); So, as I mentioned, the fact that tuples are automatically equatable and comparable (provided the types you use define equality and comparability as needed) means that we can use tuples for compound keys in hashing and ordering containers like Dictionary and SortedList: 1: var tupleDict = new Dictionary<Tuple<int, double, string>, string>(); 2:  3: tupleDict.Add(t1, "First tuple"); 4: tupleDict.Add(t2, "Second tuple"); 5: tupleDict.Add(t3, "Third tuple"); Because IEquatable defines GetHashCode(), and Tuple's IStructuralEquatable implementation creates this hash code by combining the hash codes of the members, this makes using the tuple as a complex key quite easy!  For example, let's say you are creating account charts for a financial application, and you want to cache those charts in a Dictionary based on the account number and the number of days of chart data (for example, a 1 day chart, 1 week chart, etc): 1: // the account number (string) and number of days (int) are key to get cached chart 2: var chartCache = new Dictionary<Tuple<string, int>, IChart>(); Summary The System.Tuple, like any tool, is best used where it will achieve a greater benefit.  I wouldn't advise overusing them, on objects with a large scope or it can become difficult to maintain.  However, when used properly in a well defined scope they can make your code cleaner and easier to maintain by removing the need for extraneous POCOs and custom property hashing and ordering. They are especially useful in defining compound keys to IDictionary implementations and for returning multiple values from methods, or passing multiple values to a single object parameter. Tweet Technorati Tags: C#,.NET,Tuple,Little Wonders

    Read the article

  • A Good Developer is So Hard to Find

    - by James Michael Hare
    Let me start out by saying I want to damn the writers of the Toughest Developer Puzzle Ever – 2. It is eating every last shred of my free time! But as I've been churning through each puzzle and marvelling at the brain teasers and trivia within, I began to think about interviewing developers and why it seems to be so hard to find good ones.  The problem is, it seems like no matter how hard we try to find the perfect way to separate the chaff from the wheat, inevitably someone will get hired who falls far short of expectations or someone will get passed over for missing a piece of trivia or a tricky brain teaser that could have been an excellent team member.   In shops that are primarily software-producing businesses or other heavily IT-oriented businesses (Microsoft, Amazon, etc) there often exists a much tighter bond between HR and the hiring development staff because development is their life-blood. Unfortunately, many of us work in places where IT is viewed as a cost or just a means to an end. In these shops, too often, HR and development staff may work against each other due to differences in opinion as to what a good developer is or what one is worth.  It seems that if you ask two different people what makes a good developer, often you will get three different opinions.   With the exception of those shops that are purely development-centric (you guys have it much easier!), most other shops have management who have very little knowledge about the development process.  Their view can often be that development is simply a skill that one learns and then once aquired, that developer can produce widgets as good as the next like workers on an assembly-line floor.  On the other side, you have many developers that feel that software development is an art unto itself and that the ability to create the most pure design or know the most obscure of keywords or write the shortest-possible obfuscated piece of code is a good coder.  So is it a skill?  An Art?  Or something entirely in between?   Saying that software is merely a skill and one just needs to learn the syntax and tools would be akin to saying anyone who knows English and can use Word can write a 300 page book that is accurate, meaningful, and stays true to the point.  This just isn't so.  It takes more than mere skill to take words and form a sentence, join those sentences into paragraphs, and those paragraphs into a document.  I've interviewed candidates who could answer obscure syntax and keyword questions and once they were hired could not code effectively at all.  So development must be more than a skill.   But on the other end, we have art.  Is development an art?  Is our end result to produce art?  I can marvel at a piece of code -- see it as concise and beautiful -- and yet that code most perform some stated function with accuracy and efficiency and maintainability.  None of these three things have anything to do with art, per se.  Art is beauty for its own sake and is a wonderful thing.  But if you apply that same though to development it just doesn't hold.  I've had developers tell me that all that matters is the end result and how you code it is entirely part of the art and I couldn't disagree more.  Yes, the end result, the accuracy, is the prime criteria to be met.  But if code is not maintainable and efficient, it would be just as useless as a beautiful car that breaks down once a week or that gets 2 miles to the gallon.  Yes, it may work in that it moves you from point A to point B and is pretty as hell, but if it can't be maintained or is not efficient, it's not a good solution.  So development must be something less than art.   In the end, I think I feel like development is a matter of craftsmanship.  We use our tools and we use our skills and set about to construct something that satisfies a purpose and yet is also elegant and efficient.  There is skill involved, and there is an art, but really it boils down to being able to craft code.  Crafting code is far more than writing code.  Anyone can write code if they know the syntax, but so few people can actually craft code that solves a purpose and craft it well.  So this is what I want to find, I want to find code craftsman!  But how?   I used to ask coding-trivia questions a long time ago and many people still fall back on this.  The thought is that if you ask the candidate some piece of coding trivia and they know the answer it must follow that they can craft good code.  For example:   What C++ keyword can be applied to a class/struct field to allow it to be changed even from a const-instance of that class/struct?  (answer: mutable)   So what do we prove if a candidate can answer this?  Only that they know what mutable means.  One would hope that this would infer that they'd know how to use it, and more importantly when and if it should ever be used!  But it rarely does!  The problem with triva questions is that you will either: Approve a really good developer who knows what some obscure keyword is (good) Reject a really good developer who never needed to use that keyword or is too inexperienced to know how to use it (bad) Approve a really bad developer who googled "C++ Interview Questions" and studied like hell but can't craft (very bad) Many HR departments love these kind of tests because they are short and easy to defend if a legal issue arrises on hiring decisions.  After all it's easy to say a person wasn't hired because they scored 30 out of 100 on some trivia test.  But unfortunately, you've eliminated a large part of your potential developer pool and possibly hired a few duds.  There are times I've hired candidates who knew every trivia question I could throw out them and couldn't craft.  And then there are times I've interviewed candidates who failed all my trivia but who I took a chance on who were my best finds ever.    So if not trivia, then what?  Brain teasers?  The thought is, these type of questions measure the thinking power of a candidate.  The problem is, once again, you will either: Approve a good candidate who has never heard the problem and can solve it (good) Reject a good candidate who just happens not to see the "catch" because they're nervous or it may be really obscure (bad) Approve a candidate who has studied enough interview brain teasers (once again, you can google em) to recognize the "catch" or knows the answer already (bad). Once again, you're eliminating good candidates and possibly accepting bad candidates.  In these cases, I think testing someone with brain teasers only tests their ability to answer brain teasers, not the ability to craft code. So how do we measure someone's ability to craft code?  Here's a novel idea: have them code!  Give them a computer and a compiler, or a whiteboard and a pen, or paper and pencil and have them construct a piece of code.  It just makes sense that if we're going to hire someone to code we should actually watch them code.  When they're done, we can judge them on several criteria: Correctness - does the candidate's solution accurately solve the problem proposed? Accuracy - is the candidate's solution reasonably syntactically correct? Efficiency - did the candidate write or use the more efficient data structures or algorithms for the job? Maintainability - was the candidate's code free of obfuscation and clever tricks that diminish readability? Persona - are they eager and willing or aloof and egotistical?  Will they work well within your team? It may sound simple, or it may sound crazy, but when I'm looking to hire a developer, I want to see them actually develop well-crafted code.

    Read the article

  • Visual Studio Little Wonders: Box Selection

    - by James Michael Hare
    So this week I decided I’d do a Little Wonder of a different kind and focus on an underused IDE improvement: Visual Studio’s Box Selection capability. This is a handy feature that many people still don’t realize was made available in Visual Studio 2010 (and beyond).  True, there have been other editors in the past with this capability, but now that it’s fully part of Visual Studio we can enjoy it’s goodness from within our own IDE. So, for those of you who don’t know what box selection is and what it allows you to do, read on! Sometimes, we want to select beyond the horizontal… The problem with traditional text selection in many editors is that it is horizontally oriented.  Sure, you can select multiple rows, but if you do you will pull in the entire row (at least for the middle rows).  Under the old selection scheme, if you wanted to select a portion of text from each row (a “box” of text) you were out of luck.  Box selection rectifies this by allowing you to select a box of text that bounded by a selection rectangle that you can grow horizontally or vertically.  So let’s think a situation that could occur where this comes in handy. Let’s say, for instance, that we are defining an enum in our code that we want to be able to translate into some string values (possibly to be stored in a database, output to screen, etc.). Perhaps such an enum would look like this: 1: public enum OrderType 2: { 3: Buy, // buy shares of a commodity 4: Sell, // sell shares of a commodity 5: Exchange, // exchange one commodity for another 6: Cancel, // cancel an order for a commodity 7: } 8:  Now, let’s say we are in the process of creating a Dictionary<K,V> to translate our OrderType: 1: var translator = new Dictionary<OrderType, string> 2: { 3: // do I really want to retype all this??? 4: }; Yes the example above is contrived so that we will pull some garbage if we do a multi-line select. I could select the lines above using the traditional multi-line selection: And then paste them into the translator code, which would result in this: 1: var translator = new Dictionary<OrderType, string> 2: { 3: Buy, // buy shares of a commodity 4: Sell, // sell shares of a commodity 5: Exchange, // exchange one commodity for another 6: Cancel, // cancel an order for a commodity 7: }; But I have a lot of junk there, sure I can manually clear it out, or use some search and replace magic, but if this were hundreds of lines instead of just a few that would quickly become cumbersome. The Box Selection Now that we have the ability to create box selections, we can select the box of text to delete!  Most of us are familiar with the fact we can drag the mouse (or hold [Shift] and use the arrow keys) to create a selection that can span multiple rows: Box selection, however, actually allows us to select a box instead of the typical horizontal lines: Then we can press the [delete] key and the pesky comments are all gone! You can do this either by holding down [Alt] while you select with your mouse, or by holding down [Alt+Shift] and using the arrow keys on the keyboard to grow the box horizontally or vertically. So now we have: 1: var translator = new Dictionary<OrderType, string> 2: { 3: Buy, 4: Sell, 5: Exchange, 6: Cancel, 7: }; Which is closer, but we still need an opening curly, the string to translate to, and the closing curly and comma. Fortunately, again, this is easy with box selections due to the fact box selection can even work for a zero-width selection! That is, hold down [Alt] and either drag down with no width, or hold down [Alt+Shift] and arrow down and you will define a selection range with no width, essentially, a vertical line selection: Notice the faint selection line on the right? So why is this useful? Well, just like with any selected range, we can type and it will replace the selection. What does this mean for box selections? It means that we can insert the same text all the way down on each line! If we have the same selection above, and type a curly and a space, we’d get: Imagine doing this over hundreds of lines and think of what a time saver it could be! Now make a zero-width selection on the other side: And type a curly and a comma, and we’d get: So close! Now finally, imagine we’ve already defined these strings somewhere and want to paste them in: 1: const private string BuyText = "Buy Shares"; 2: const private string SellText = "Sell Shares"; 3: const private string ExchangeText = "Exchange"; 4: const private string CancelText = "Cancel"; We can, again, use our box selection to pull out the constant names: And clicking copy (or [CTRL+C]) and then selecting a range to paste into: And finally clicking paste (or [CTRL+V]) to get the final result: 1: var translator = new Dictionary<OrderType, string> 2: { 3: { Buy, BuyText }, 4: { Sell, SellText }, 5: { Exchange, ExchangeText }, 6: { Cancel, CancelText }, 7: };   Sure, this was a contrived example, but I’m sure you’ll agree that it adds myriad possibilities of new ways to copy and paste vertical selections, as well as inserting text across a vertical slice. Summary: While box selection has been around in other editors, we finally get to experience it in VS2010 and beyond. It is extremely handy for selecting columns of information for cutting, copying, and pasting. In addition, it allows you to create a zero-width vertical insertion point that can be used to enter the same text across multiple rows. Imagine the time you can save adding repetitive code across multiple lines!  Try it, the more you use it, the more you’ll love it! Technorati Tags: C#,CSharp,.NET,Visual Studio,Little Wonders,Box Selection

    Read the article

  • C#/.NET Little Wonders: The Useful But Overlooked Sets

    - by James Michael Hare
    Once again we consider some of the lesser known classes and keywords of C#.  Today we will be looking at two set implementations in the System.Collections.Generic namespace: HashSet<T> and SortedSet<T>.  Even though most people think of sets as mathematical constructs, they are actually very useful classes that can be used to help make your application more performant if used appropriately. A Background From Math In mathematical terms, a set is an unordered collection of unique items.  In other words, the set {2,3,5} is identical to the set {3,5,2}.  In addition, the set {2, 2, 4, 1} would be invalid because it would have a duplicate item (2).  In addition, you can perform set arithmetic on sets such as: Intersections: The intersection of two sets is the collection of elements common to both.  Example: The intersection of {1,2,5} and {2,4,9} is the set {2}. Unions: The union of two sets is the collection of unique items present in either or both set.  Example: The union of {1,2,5} and {2,4,9} is {1,2,4,5,9}. Differences: The difference of two sets is the removal of all items from the first set that are common between the sets.  Example: The difference of {1,2,5} and {2,4,9} is {1,5}. Supersets: One set is a superset of a second set if it contains all elements that are in the second set. Example: The set {1,2,5} is a superset of {1,5}. Subsets: One set is a subset of a second set if all the elements of that set are contained in the first set. Example: The set {1,5} is a subset of {1,2,5}. If We’re Not Doing Math, Why Do We Care? Now, you may be thinking: why bother with the set classes in C# if you have no need for mathematical set manipulation?  The answer is simple: they are extremely efficient ways to determine ownership in a collection. For example, let’s say you are designing an order system that tracks the price of a particular equity, and once it reaches a certain point will trigger an order.  Now, since there’s tens of thousands of equities on the markets, you don’t want to track market data for every ticker as that would be a waste of time and processing power for symbols you don’t have orders for.  Thus, we just want to subscribe to the stock symbol for an equity order only if it is a symbol we are not already subscribed to. Every time a new order comes in, we will check the list of subscriptions to see if the new order’s stock symbol is in that list.  If it is, great, we already have that market data feed!  If not, then and only then should we subscribe to the feed for that symbol. So far so good, we have a collection of symbols and we want to see if a symbol is present in that collection and if not, add it.  This really is the essence of set processing, but for the sake of comparison, let’s say you do a list instead: 1: // class that handles are order processing service 2: public sealed class OrderProcessor 3: { 4: // contains list of all symbols we are currently subscribed to 5: private readonly List<string> _subscriptions = new List<string>(); 6:  7: ... 8: } Now whenever you are adding a new order, it would look something like: 1: public PlaceOrderResponse PlaceOrder(Order newOrder) 2: { 3: // do some validation, of course... 4:  5: // check to see if already subscribed, if not add a subscription 6: if (!_subscriptions.Contains(newOrder.Symbol)) 7: { 8: // add the symbol to the list 9: _subscriptions.Add(newOrder.Symbol); 10: 11: // do whatever magic is needed to start a subscription for the symbol 12: } 13:  14: // place the order logic! 15: } What’s wrong with this?  In short: performance!  Finding an item inside a List<T> is a linear - O(n) – operation, which is not a very performant way to find if an item exists in a collection. (I used to teach algorithms and data structures in my spare time at a local university, and when you began talking about big-O notation you could immediately begin to see eyes glossing over as if it was pure, useless theory that would not apply in the real world, but I did and still do believe it is something worth understanding well to make the best choices in computer science). Let’s think about this: a linear operation means that as the number of items increases, the time that it takes to perform the operation tends to increase in a linear fashion.  Put crudely, this means if you double the collection size, you might expect the operation to take something like the order of twice as long.  Linear operations tend to be bad for performance because they mean that to perform some operation on a collection, you must potentially “visit” every item in the collection.  Consider finding an item in a List<T>: if you want to see if the list has an item, you must potentially check every item in the list before you find it or determine it’s not found. Now, we could of course sort our list and then perform a binary search on it, but sorting is typically a linear-logarithmic complexity – O(n * log n) - and could involve temporary storage.  So performing a sort after each add would probably add more time.  As an alternative, we could use a SortedList<TKey, TValue> which sorts the list on every Add(), but this has a similar level of complexity to move the items and also requires a key and value, and in our case the key is the value. This is why sets tend to be the best choice for this type of processing: they don’t rely on separate keys and values for ordering – so they save space – and they typically don’t care about ordering – so they tend to be extremely performant.  The .NET BCL (Base Class Library) has had the HashSet<T> since .NET 3.5, but at that time it did not implement the ISet<T> interface.  As of .NET 4.0, HashSet<T> implements ISet<T> and a new set, the SortedSet<T> was added that gives you a set with ordering. HashSet<T> – For Unordered Storage of Sets When used right, HashSet<T> is a beautiful collection, you can think of it as a simplified Dictionary<T,T>.  That is, a Dictionary where the TKey and TValue refer to the same object.  This is really an oversimplification, but logically it makes sense.  I’ve actually seen people code a Dictionary<T,T> where they store the same thing in the key and the value, and that’s just inefficient because of the extra storage to hold both the key and the value. As it’s name implies, the HashSet<T> uses a hashing algorithm to find the items in the set, which means it does take up some additional space, but it has lightning fast lookups!  Compare the times below between HashSet<T> and List<T>: Operation HashSet<T> List<T> Add() O(1) O(1) at end O(n) in middle Remove() O(1) O(n) Contains() O(1) O(n)   Now, these times are amortized and represent the typical case.  In the very worst case, the operations could be linear if they involve a resizing of the collection – but this is true for both the List and HashSet so that’s a less of an issue when comparing the two. The key thing to note is that in the general case, HashSet is constant time for adds, removes, and contains!  This means that no matter how large the collection is, it takes roughly the exact same amount of time to find an item or determine if it’s not in the collection.  Compare this to the List where almost any add or remove must rearrange potentially all the elements!  And to find an item in the list (if unsorted) you must search every item in the List. So as you can see, if you want to create an unordered collection and have very fast lookup and manipulation, the HashSet is a great collection. And since HashSet<T> implements ICollection<T> and IEnumerable<T>, it supports nearly all the same basic operations as the List<T> and can use the System.Linq extension methods as well. All we have to do to switch from a List<T> to a HashSet<T>  is change our declaration.  Since List and HashSet support many of the same members, chances are we won’t need to change much else. 1: public sealed class OrderProcessor 2: { 3: private readonly HashSet<string> _subscriptions = new HashSet<string>(); 4:  5: // ... 6:  7: public PlaceOrderResponse PlaceOrder(Order newOrder) 8: { 9: // do some validation, of course... 10: 11: // check to see if already subscribed, if not add a subscription 12: if (!_subscriptions.Contains(newOrder.Symbol)) 13: { 14: // add the symbol to the list 15: _subscriptions.Add(newOrder.Symbol); 16: 17: // do whatever magic is needed to start a subscription for the symbol 18: } 19: 20: // place the order logic! 21: } 22:  23: // ... 24: } 25: Notice, we didn’t change any code other than the declaration for _subscriptions to be a HashSet<T>.  Thus, we can pick up the performance improvements in this case with minimal code changes. SortedSet<T> – Ordered Storage of Sets Just like HashSet<T> is logically similar to Dictionary<T,T>, the SortedSet<T> is logically similar to the SortedDictionary<T,T>. The SortedSet can be used when you want to do set operations on a collection, but you want to maintain that collection in sorted order.  Now, this is not necessarily mathematically relevant, but if your collection needs do include order, this is the set to use. So the SortedSet seems to be implemented as a binary tree (possibly a red-black tree) internally.  Since binary trees are dynamic structures and non-contiguous (unlike List and SortedList) this means that inserts and deletes do not involve rearranging elements, or changing the linking of the nodes.  There is some overhead in keeping the nodes in order, but it is much smaller than a contiguous storage collection like a List<T>.  Let’s compare the three: Operation HashSet<T> SortedSet<T> List<T> Add() O(1) O(log n) O(1) at end O(n) in middle Remove() O(1) O(log n) O(n) Contains() O(1) O(log n) O(n)   The MSDN documentation seems to indicate that operations on SortedSet are O(1), but this seems to be inconsistent with its implementation and seems to be a documentation error.  There’s actually a separate MSDN document (here) on SortedSet that indicates that it is, in fact, logarithmic in complexity.  Let’s put it in layman’s terms: logarithmic means you can double the collection size and typically you only add a single extra “visit” to an item in the collection.  Take that in contrast to List<T>’s linear operation where if you double the size of the collection you double the “visits” to items in the collection.  This is very good performance!  It’s still not as performant as HashSet<T> where it always just visits one item (amortized), but for the addition of sorting this is a good thing. Consider the following table, now this is just illustrative data of the relative complexities, but it’s enough to get the point: Collection Size O(1) Visits O(log n) Visits O(n) Visits 1 1 1 1 10 1 4 10 100 1 7 100 1000 1 10 1000   Notice that the logarithmic – O(log n) – visit count goes up very slowly compare to the linear – O(n) – visit count.  This is because since the list is sorted, it can do one check in the middle of the list, determine which half of the collection the data is in, and discard the other half (binary search).  So, if you need your set to be sorted, you can use the SortedSet<T> just like the HashSet<T> and gain sorting for a small performance hit, but it’s still faster than a List<T>. Unique Set Operations Now, if you do want to perform more set-like operations, both implementations of ISet<T> support the following, which play back towards the mathematical set operations described before: IntersectWith() – Performs the set intersection of two sets.  Modifies the current set so that it only contains elements also in the second set. UnionWith() – Performs a set union of two sets.  Modifies the current set so it contains all elements present both in the current set and the second set. ExceptWith() – Performs a set difference of two sets.  Modifies the current set so that it removes all elements present in the second set. IsSupersetOf() – Checks if the current set is a superset of the second set. IsSubsetOf() – Checks if the current set is a subset of the second set. For more information on the set operations themselves, see the MSDN description of ISet<T> (here). What Sets Don’t Do Don’t get me wrong, sets are not silver bullets.  You don’t really want to use a set when you want separate key to value lookups, that’s what the IDictionary implementations are best for. Also sets don’t store temporal add-order.  That is, if you are adding items to the end of a list all the time, your list is ordered in terms of when items were added to it.  This is something the sets don’t do naturally (though you could use a SortedSet with an IComparer with a DateTime but that’s overkill) but List<T> can. Also, List<T> allows indexing which is a blazingly fast way to iterate through items in the collection.  Iterating over all the items in a List<T> is generally much, much faster than iterating over a set. Summary Sets are an excellent tool for maintaining a lookup table where the item is both the key and the value.  In addition, if you have need for the mathematical set operations, the C# sets support those as well.  The HashSet<T> is the set of choice if you want the fastest possible lookups but don’t care about order.  In contrast the SortedSet<T> will give you a sorted collection at a slight reduction in performance.   Technorati Tags: C#,.Net,Little Wonders,BlackRabbitCoder,ISet,HashSet,SortedSet

    Read the article

  • Visual Studio Little Wonders: Quick Launch / Quick Access

    - by James Michael Hare
    Once again, in this series of posts I look at features of Visual Studio that may seem trivial, but can help improve your efficiency as a developer. The index of all my past little wonders posts can be found here. Well, my friends, this post will be a bit short because I’m in the middle of a bit of a move at the moment.  But, that said, I didn’t want to let the blog go completely silent this week, so I decided to add another Little Wonder to the list for the Visual Studio IDE. How often have you wanted to change an option or execute a command in Visual Studio, but can’t remember where the darn thing is in the menu, settings, etc.?  If so, Quick Launch in VS2012 (or Quick Access in VS2010 with the Productivity Power Tools extension) is just for you! Quick Launch / Quick Access – find a command or option quickly For those of you using Visual Studio 2012, Quick Launch is built right into the IDE at the top of the title bar, near the minimize, maximize, and close buttons: But do not despair if you are using Visual Studio 2010, you can get Quick Access from the Productivity Power Tools extension.  To do this, you can go to the extension manager: And then go to the gallery and search for Productivity Power Tools and install it.  If you don’t have VS2012 yet, then the Productivity Power Tools is the next best thing.  This extension updates VS2010 with features such as Quick Access, the Solution Navigator, searchable Add Reference Dialog, better tab wells, etc.  I highly recommend it! But back to the topic at hand!  In VS2012 Quick Launch is built into the IDE and can be accessed by clicking in the Quick Launch area of the title bar, or by pressing CTRL+Q.  If you have VS2010 with the PPT installed, though, it is called Quick Access and is accessible through View –> Quick Access: Regardless of which IDE you are using, the feature behaves mostly the same.  It allows you to search all of Visual Studio’s commands and options for a particular topic.  For example, let’s say you want to change from tabs to tabs expanded to spaces, but don’t remember where that option is buried.  You can bring up Quick Launch / Quick Access and type in “tabs”: And it brings up a list of all options on tabs, you can then choose the one appropriate to you and click on it and it will take you right there! A lot easier than diving through the options tree to find what you are looking for!  It also works on menu commands, for example if you can’t remember how to open the Output window: It shows you the menu items that will get you to the Output window, and (if applicable) the keyboard shortcuts.  Again, clicking on one of these will perform the action for you as well. There are also some tasks you can perform directly from Quick Launch / Quick Access.  For example, perhaps you are one of those people who like to have the line numbers in your editor (I do), so let’s bring up Quick Launch / Quick Access and type “line numbers”: And let’s select Turn Line Numbers On, and now our editor looks like: And Voila!  We have line numbers in VS2010.  You can do this in VS2012 too, but it takes you to the option settings instead of directly turning them off and on.  There are bound to be differences between the way the two editors organize settings and commands, but you get the point. So, as you can see, the Quick Launch / Quick Access feature in Visual Studio makes it easy to jump right to the options, commands, or tasks you are interested in without all the digging. Summary An IDE as powerful as Visual Studio has so many options and commands that it can be confusing to remember how to find and invoke them.  Quick Launch (Quick Access in VS2010 with Productivity Power Tools extension) is a quick and handy way to jump to any of these options, commands, or tasks quickly without having to remember in what menu or screen they are buried!  Technorati Tags: C#,CSharp,.NET,Little Wonders,Visual Studio,Quick Access,Quick Launch

    Read the article

  • Premature-Optimization and Performance Anxiety

    - by James Michael Hare
    While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about.  After analyzing the differences of time between a Dictionary with locking versus the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations.  Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks.  In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly-or-wrongly) performance gain and sacrificing good design and maintainability in the process.  At best, the performance gains are usually negligible and at worst, can either negatively impact performance, or can degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above.  I'm not talking about valid performance decisions.  There are decisions one should make when designing and writing an application that are valid performance decisions.  Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performance algorithms (linear search vs. binary search).  But these in my mind are macro optimizations.  The error is not in deciding to use a better data structure or algorithm, the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking.  This would have been a mistake as I would be trading maintainability (ConcurrentDictionary requires no locking which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it.  I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer.  I began college in in the 90s when C and C++ was king and hardware speed and memory were still relatively priceless commodities and not to be squandered.  In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance.  This is why in many cases such early code-bases were very hard to maintain.  I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead.  Now back then, that may have been a valid concern, but with today's modern hardware even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway) the results of removing method calls to speed up performance are negligible for the great majority of applications.  Now, obviously, there are those coding applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made.  But I would strongly advice against such optimization because of it's cost.  Many folks that are performing an optimization think it's always a win-win.  That they're simply adding speed to the application, what could possibly be wrong with that?  What they don't realize is the cost of their choice.  For every piece of straight-forward code that you obfuscate with performance enhancements, you risk the introduction of bugs in the long term technical debt of the application.  It will become so fragile over time that maintenance will become a nightmare.  I've seen such applications in places I have worked.  There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for their application to try to squeeze out every ounce of performance.  Unfortunately, the application stability often suffers as a result and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently where I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation).  To me this was almost a joke.  Twice as slow sounds bad, but it almost never as bad as you think -- especially if you're gaining safety.  The only time twice is really that much slower is when once was too slow to begin with.  Think about it.  2 minutes is slow as a response time because 1 minute is slow.  But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial!  The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations).  To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can.  I know it can be a hard habit to break, especially if you started out your career early or in a language such as C where they are very performance conscious.  But in reality, these type of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization.  I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true.  If you're a beginner, resist the urge to optimize at all costs.  And if you are an expert, delay that decision.  As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient.  Chances are it will be network, database, or disk hits that will be your slow-down, not your code.  As they say, 98% of your code's bottleneck is in 2% of your code so premature-optimization may add maintenance and safety debt that won't have any measurable impact.  Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, then you should go back and optimize further.

    Read the article

  • More Fun with C# Iterators and Generators

    - by James Michael Hare
    In my last post, I talked quite a bit about iterators and how they can be really powerful tools for filtering a list of items down to a subset of items.  This had both pros and cons over returning a full collection, which, in summary, were:   Pros: If traversal is only partial, does not have to visit rest of collection. If evaluation method is costly, only incurs that cost on elements visited. Adds little to no garbage collection pressure.    Cons: Very slight performance impact if you know caller will always consume all items in collection. And as we saw in the last post, that con for the cost was very, very small and only really became evident on very tight loops consuming very large lists completely.    One of the key items to note, though, is the garbage!  In the traditional (return a new collection) method, if you have a 1,000,000 element collection, and wish to transform or filter it in some way, you have to allocate space for that copy of the collection.  That is, say you have a collection of 1,000,000 items and you want to double every item in the collection.  Well, that means you have to allocate a collection to hold those 1,000,000 items to return, which is a lot especially if you are just going to use it once and toss it.   Iterators, though, don't have this problem.  Each time you visit the node, it would return the doubled value of the node (in this example) and not allocate a second collection of 1,000,000 doubled items.  Do you see the distinction?  In both cases, we're consuming 1,000,000 items.  But in one case we pass back each doubled item which is just an int (for example's sake) on the stack and in the other case, we allocate a list containing 1,000,000 items which then must be garbage collected.   So iterators in C# are pretty cool, eh?  Well, here's one more thing a C# iterator can do that a traditional "return a new collection" transformation can't!   It can return **unbounded** collections!   I know, I know, that smells a lot like an infinite loop, eh?  Yes and no.  Basically, you're relying on the caller to put the bounds on the list, and as long as the caller doesn't you keep going.  Consider this example:   public static class Fibonacci {     // returns the infinite fibonacci sequence     public static IEnumerable<int> Sequence()     {         int iteration = 0;         int first = 1;         int second = 1;         int current = 0;         while (true)         {             if (iteration++ < 2)             {                 current = 1;             }             else             {                 current = first + second;                 second = first;                 first = current;             }             yield return current;         }     } }   Whoa, you say!  Yes, that's an infinite loop!  What the heck is going on there?  Yes, that was intentional.  Would it be better to have a fibonacci sequence that returns only a specific number of items?  Perhaps, but that wouldn't give you the power to defer the execution to the caller.   The beauty of this function is it is as infinite as the sequence itself!  The fibonacci sequence is unbounded, and so is this method.  It will continue to return fibonacci numbers for as long as you ask for them.  Now that's not something you can do with a traditional method that would return a collection of ints representing each number.  In that case you would eventually run out of memory as you got to higher and higher numbers.  This method, though, never runs out of memory.   Now, that said, you do have to know when you use it that it is an infinite collection and bound it appropriately.  Fortunately, Linq provides a lot of these extension methods for you!   Let's say you only want the first 10 fibonacci numbers:       foreach(var fib in Fibonacci.Sequence().Take(10))     {         Console.WriteLine(fib);     }   Or let's say you only want the fibonacci numbers that are less than 100:       foreach(var fib in Fibonacci.Sequence().TakeWhile(f => f < 100))     {         Console.WriteLine(fib);     }   So, you see, one of the nice things about iterators is their power to work with virtually any size (even infinite) collections without adding the garbage collection overhead of making new collections.    You can also do fun things like this to make a more "fluent" interface for for loops:   // A set of integer generator extension methods public static class IntExtensions {     // Begins counting to inifity, use To() to range this.     public static IEnumerable<int> Every(this int start)     {         // deliberately avoiding condition because keeps going         // to infinity for as long as values are pulled.         for (var i = start; ; ++i)         {             yield return i;         }     }     // Begins counting to infinity by the given step value, use To() to     public static IEnumerable<int> Every(this int start, int byEvery)     {         // deliberately avoiding condition because keeps going         // to infinity for as long as values are pulled.         for (var i = start; ; i += byEvery)         {             yield return i;         }     }     // Begins counting to inifity, use To() to range this.     public static IEnumerable<int> To(this int start, int end)     {         for (var i = start; i <= end; ++i)         {             yield return i;         }     }     // Ranges the count by specifying the upper range of the count.     public static IEnumerable<int> To(this IEnumerable<int> collection, int end)     {         return collection.TakeWhile(item => item <= end);     } }   Note that there are two versions of each method.  One that starts with an int and one that starts with an IEnumerable<int>.  This is to allow more power in chaining from either an existing collection or from an int.  This lets you do things like:   // count from 1 to 30 foreach(var i in 1.To(30)) {     Console.WriteLine(i); }     // count from 1 to 10 by 2s foreach(var i in 0.Every(2).To(10)) {     Console.WriteLine(i); }     // or, if you want an infinite sequence counting by 5s until something inside breaks you out... foreach(var i in 0.Every(5)) {     if (someCondition)     {         break;     }     ... }     Yes, those are kinda play functions and not particularly useful, but they show some of the power of generators and extension methods to form a fluid interface.   So what do you think?  What are some of your favorite generators and iterators?

    Read the article

  • Code Reuse is (Damn) Hard

    - by James Michael Hare
    Being a development team lead, the task of interviewing new candidates was part of my job.  Like any typical interview, we started with some easy questions to get them warmed up and help calm their nerves before hitting the hard stuff. One of those easier questions was almost always: “Name some benefits of object-oriented development.”  Nearly every time, the candidate would chime in with a plethora of canned answers which typically included: “it helps ease code reuse.”  Of course, this is a gross oversimplification.  Tools only ease reuse, its developers that ultimately can cause code to be reusable or not, regardless of the language or methodology. But it did get me thinking…  we always used to say that as part of our mantra as to why Object-Oriented Programming was so great.  With polymorphism, inheritance, encapsulation, etc. we in essence set up the concepts to help facilitate reuse as much as possible.  And yes, as a developer now of many years, I unquestionably held that belief for ages before it really struck me how my views on reuse have jaded over the years.  In fact, in many ways Agile rightly eschews reuse as taking a backseat to developing what's needed for the here and now.  It used to be I was in complete opposition to that view, but more and more I've come to see the logic in it.  Too many times I've seen developers (myself included) get lost in design paralysis trying to come up with the perfect abstraction that would stand all time.  Nearly without fail, all of these pieces of code become obsolete in a matter of months or years. It’s not that I don’t like reuse – it’s just that reuse is hard.  In fact, reuse is DAMN hard.  Many times it is just a distraction that eats up architect and developer time, and worse yet can be counter-productive and force wrong decisions.  Now don’t get me wrong, I love the idea of reusable code when it makes sense.  These are in the few cases where you are designing something that is inherently reusable.  The problem is, most business-class code is inherently unfit for reuse! Furthermore, the code that is reusable will often fail to be reused if you don’t have the proper framework in place for effective reuse that includes standardized versioning, building, releasing, and documenting the components.  That should always be standard across the board when promoting reusable code.  All of this is hard, and it should only be done when you have code that is truly reusable or you will be exerting a large amount of development effort for very little bang for your buck. But my goal here is not to get into how to reuse (that is a topic unto itself) but what should be reused.  First, let’s look at an extension method.  There’s many times where I want to kick off a thread to handle a task, then when I want to reign that thread in of course I want to do a Join on it.  But what if I only want to wait a limited amount of time and then Abort?  Well, I could of course write that logic out by hand each time, but it seemed like a great extension method: 1: public static class ThreadExtensions 2: { 3: public static bool JoinOrAbort(this Thread thread, TimeSpan timeToWait) 4: { 5: bool isJoined = false; 6:  7: if (thread != null) 8: { 9: isJoined = thread.Join(timeToWait); 10:  11: if (!isJoined) 12: { 13: thread.Abort(); 14: } 15: } 16: return isJoined; 17: } 18: } 19:  When I look at this code, I can immediately see things that jump out at me as reasons why this code is very reusable.  Some of them are standard OO principles, and some are kind-of home grown litmus tests: Single Responsibility Principle (SRP) – The only reason this extension method need change is if the Thread class itself changes (one responsibility). Stable Dependencies Principle (SDP) – This method only depends on classes that are more stable than it is (System.Threading.Thread), and in itself is very stable, hence other classes may safely depend on it. It is also not dependent on any business domain, and thus isn't subject to changes as the business itself changes. Open-Closed Principle (OCP) – This class is inherently closed to change. Small and Stable Problem Domain – This method only cares about System.Threading.Thread. All-or-None Usage – A user of a reusable class should want the functionality of that class, not parts of that functionality.  That’s not to say they most use every method, but they shouldn’t be using a method just to get half of its result. Cost of Reuse vs. Cost to Recreate – since this class is highly stable and minimally complex, we can offer it up for reuse very cheaply by promoting it as “ready-to-go” and already unit tested (important!) and available through a standard release cycle (very important!). Okay, all seems good there, now lets look at an entity and DAO.  I don’t know about you all, but there have been times I’ve been in organizations that get the grand idea that all DAOs and entities should be standardized and shared.  While this may work for small or static organizations, it’s near ludicrous for anything large or volatile. 1: namespace Shared.Entities 2: { 3: public class Account 4: { 5: public int Id { get; set; } 6:  7: public string Name { get; set; } 8:  9: public Address HomeAddress { get; set; } 10:  11: public int Age { get; set;} 12:  13: public DateTime LastUsed { get; set; } 14:  15: // etc, etc, etc... 16: } 17: } 18:  19: ... 20:  21: namespace Shared.DataAccess 22: { 23: public class AccountDao 24: { 25: public Account FindAccount(int id) 26: { 27: // dao logic to query and return account 28: } 29:  30: ... 31:  32: } 33: } Now to be fair, I’m not saying there doesn’t exist an organization where some entites may be extremely static and unchanging.  But at best such entities and DAOs will be problematic cases of reuse.  Let’s examine those same tests: Single Responsibility Principle (SRP) – The reasons to change for these classes will be strongly dependent on what the definition of the account is which can change over time and may have multiple influences depending on the number of systems an account can cover. Stable Dependencies Principle (SDP) – This method depends on the data model beneath itself which also is largely dependent on the business definition of an account which can be very inherently unstable. Open-Closed Principle (OCP) – This class is not really closed for modification.  Every time the account definition may change, you’d need to modify this class. Small and Stable Problem Domain – The definition of an account is inherently unstable and in fact may be very large.  What if you are designing a system that aggregates account information from several sources? All-or-None Usage – What if your view of the account encompasses data from 3 different sources but you only care about one of those sources or one piece of data?  Should you have to take the hit of looking up all the other data?  On the other hand, should you have ten different methods returning portions of data in chunks people tend to ask for?  Neither is really a great solution. Cost of Reuse vs. Cost to Recreate – DAOs are really trivial to rewrite, and unless your definition of an account is EXTREMELY stable, the cost to promote, support, and release a reusable account entity and DAO are usually far higher than the cost to recreate as needed. It’s no accident that my case for reuse was a utility class and my case for non-reuse was an entity/DAO.  In general, the smaller and more stable an abstraction is, the higher its level of reuse.  When I became the lead of the Shared Components Committee at my workplace, one of the original goals we looked at satisfying was to find (or create), version, release, and promote a shared library of common utility classes, frameworks, and data access objects.  Now, of course, many of you will point to nHibernate and Entity for the latter, but we were looking at larger, macro collections of data that span multiple data sources of varying types (databases, web services, etc). As we got deeper and deeper in the details of how to manage and release these items, it quickly became apparent that while the case for reuse was typically a slam dunk for utilities and frameworks, the data access objects just didn’t “smell” right.  We ended up having session after session of design meetings to try and find the right way to share these data access components. When someone asked me why it was taking so long to iron out the shared entities, my response was quite simple, “Reuse is hard...”  And that’s when I realized, that while reuse is an awesome goal and we should strive to make code maintainable, often times you end up creating far more work for yourself than necessary by trying to force code to be reusable that inherently isn’t. Think about classes the times you’ve worked in a company where in the design session people fight over the best way to implement a class to make it maximally reusable, extensible, and any other buzzwordable.  Then think about how quickly that design became obsolete.  Many times I set out to do a project and think, “yes, this is the best design, I can extend it easily!” only to find out the business requirements change COMPLETELY in such a way that the design is rendered invalid.  Code, in general, tends to rust and age over time.  As such, writing reusable code can often be difficult and many times ends up being a futile exercise and worse yet, sometimes makes the code harder to maintain because it obfuscates the design in the name of extensibility or reusability. So what do I think are reusable components? Generic Utility classes – these tend to be small classes that assist in a task and have no business context whatsoever. Implementation Abstraction Frameworks – home-grown frameworks that try to isolate changes to third party products you may be depending on (like writing a messaging abstraction layer for publishing/subscribing that is independent of whether you use JMS, MSMQ, etc). Simplification and Uniformity Frameworks – To some extent this is similar to an abstraction framework, but there may be one chosen provider but a development shop mandate to perform certain complex items in a certain way.  Or, perhaps to simplify and dumb-down a complex task for the average developer (such as implementing a particular development-shop’s method of encryption). And what are less reusable? Application and Business Layers – tend to fluctuate a lot as requirements change and new features are added, so tend to be an unstable dependency.  May be reused across applications but also very volatile. Entities and Data Access Layers – these tend to be tuned to the scope of the application, so reusing them can be hard unless the abstract is very stable. So what’s the big lesson?  Reuse is hard.  In fact it’s damn hard.  And much of the time I’m not convinced we should focus too hard on it. If you’re designing a utility or framework, then by all means design it for reuse.  But you most also really set down a good versioning, release, and documentation process to maximize your chances.  For anything else, design it to be maintainable and extendable, but don’t waste the effort on reusability for something that most likely will be obsolete in a year or two anyway.

    Read the article

  • ANTS Memory Profiler 7.0

    - by James Michael Hare
    I had always been a fan of ANTS products (Reflector is absolutely invaluable, and their performance profiler is great as well – very easy to use!), so I was curious to see what the ANTS Memory Profiler could show me. Background While a performance profiler will track how much time is typically spent in each unit of code, a memory profiler gives you much more detail on how and where your memory is being consumed and released in a program. As an example, I’d been working on a data access layer at work to call a market data web service.  This web service would take a list of symbols to quote and would return back the quote data.  To help consolidate the thousands of web requests per second we get and reduce load on the web services, we implemented a 5-second cache of quote data.  Not quite long enough to where customers will typically notice a quote go “stale”, but just long enough to be able to collapse multiple quote requests for the same symbol in a short period of time. A 5-second cache may not sound like much, but it actually pays off by saving us roughly 42% of our web service calls, while still providing relatively up-to-date information.  The question is whether or not the extra memory involved in maintaining the cache was worth it, so I decided to fire up the ANTS Memory Profiler and take a look at memory usage. First Impressions The main thing I’ve always loved about the ANTS tools is their ease of use.  Pretty much everything is right there in front of you in a way that makes it easy for you to find what you need with little digging required.  I’ve worked with other, older profilers before (that shall remain nameless other than to hint it was created by a very large chip maker) where it was a mind boggling experience to figure out how to do simple tasks. Not so with AMP.  The opening dialog is very straightforward.  You can choose from here whether to debug an executable, a web application (either in IIS or from VS’s web development server), windows services, etc. So I chose a .NET Executable and navigated to the build location of my test harness.  Then began profiling. At this point while the application is running, you can see a chart of the memory as it ebbs and wanes with allocations and collections.  At any given point in time, you can take snapshots (to compare states) zoom in, or choose to stop at any time.  Snapshots Taking a snapshot also gives you a breakdown of the managed memory heaps for each generation so you get an idea how many objects are staying around for extended periods of time (as an object lives and survives collections, it gets promoted into higher generations where collection becomes less frequent). Generating a snapshot brings up an analysis view with very handy graphs that show your generation sizes.  Almost all my memory is in Generation 1 in the managed memory component of the first graph, which is good news to me, because Gen 2 collections are much rarer.  I once3 made the mistake once of caching data for 30 minutes and found it didn’t get collected very quick after I released my reference because it had been promoted to Gen 2 – doh! Analysis It looks like (from the second pie chart) that the majority of the allocations were in the string class.  This also is expected for me because the majority of the memory allocated is in the web service responses, so it doesn’t seem the entities I’m adapting to (to prevent being too tightly coupled to the web service proxy classes, which can change easily out from under me) aren’t taking a significant portion of memory. I also appreciate that they have clear summary text in key places such as “No issues with large object heap fragmentation were detected”.  For novice users, this type of summary information can be critical to getting them to use a tool and develop a good working knowledge of it. There is also a handy link at the bottom for “What to look for on the summary” which loads a web page of help on key points to look for. Clicking over to the session overview, it’s easy to compare the samples at each snapshot to see how your memory is growing, shrinking, or staying relatively the same.  Looking at my snapshots, I’m pretty happy with the fact that memory allocation and heap size seems to be fairly stable and in control: Once again, you can check on the large object heap, generation one heap, and generation two heap across each snapshot to spot trends. Back on the analysis tab, we can go to the [Class List] button to get an idea what classes are making up the majority of our memory usage.  As was little surprise to me, System.String was the clear majority of my allocations, though I found it surprising that the System.Reflection.RuntimeMehtodInfo came in second.  I was curious about this, so I selected it and went into the [Instance Categorizer].  This view let me see where these instances to RuntimeMehtodInfo were coming from. So I scrolled back through the graph, and discovered that these were being held by the System.ServiceModel.ChannelFactoryRefCache and I was satisfied this was just an artifact of my WCF proxy. I also like that down at the bottom of the Instance Categorizer it gives you a series of filters and offers to guide you on which filter to use based on the problem you are trying to find.  For example, if I suspected a memory leak, I might try to filter for survivors in growing classes.  This means that for instances of a class that are growing in memory (more are being created than cleaned up), which ones are survivors (not collected) from garbage collection.  This might allow me to drill down and find places where I’m holding onto references by mistake and not freeing them! Finally, if you want to really see all your instances and who is holding onto them (preventing collection), you can go to the “Instance Retention Graph” which creates a graph showing what references are being held in memory and who is holding onto them. Visual Studio Integration Of course, VS has its own profiler built in – and for a free bundled profiler it is quite capable – but AMP gives a much cleaner and easier-to-use experience, and when you install it you also get the option of letting it integrate directly into VS. So once you go back into VS after installation, you’ll notice an ANTS menu which lets you launch the ANTS profiler directly from Visual Studio.   Clicking on one of these options fires up the project in the profiler immediately, allowing you to get right in.  It doesn’t integrate with the Visual Studio windows themselves (like the VS profiler does), but still the plethora of information it provides and the clear and concise manner in which it presents it makes it well worth it. Summary If you like the ANTS series of tools, you shouldn’t be disappointed with the ANTS Memory Profiler.  It was so easy to use that I was able to jump in with very little product knowledge and get the information I was looking it for. I’ve used other profilers before that came with 3-inch thick tomes that you had to read in order to get anywhere with the tool, and this one is not like that at all.  It’s built for your everyday developer to get in and find their problems quickly, and I like that! Tweet Technorati Tags: Influencers,ANTS,Memory,Profiler

    Read the article

  • C#/.NET Little Wonders: The Generic Func Delegates

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. Back in one of my three original “Little Wonders” Trilogy of posts, I had listed generic delegates as one of the Little Wonders of .NET.  Later, someone posted a comment saying said that they would love more detail on the generic delegates and their uses, since my original entry just scratched the surface of them. Last week, I began our look at some of the handy generic delegates built into .NET with a description of delegates in general, and the Action family of delegates.  For this week, I’ll launch into a look at the Func family of generic delegates and how they can be used to support generic, reusable algorithms and classes. Quick Delegate Recap Delegates are similar to function pointers in C++ in that they allow you to store a reference to a method.  They can store references to either static or instance methods, and can actually be used to chain several methods together in one delegate. Delegates are very type-safe and can be satisfied with any standard method, anonymous method, or a lambda expression.  They can also be null as well (refers to no method), so care should be taken to make sure that the delegate is not null before you invoke it. Delegates are defined using the keyword delegate, where the delegate’s type name is placed where you would typically place the method name: 1: // This delegate matches any method that takes string, returns nothing 2: public delegate void Log(string message); This delegate defines a delegate type named Log that can be used to store references to any method(s) that satisfies its signature (whether instance, static, lambda expression, etc.). Delegate instances then can be assigned zero (null) or more methods using the operator = which replaces the existing delegate chain, or by using the operator += which adds a method to the end of a delegate chain: 1: // creates a delegate instance named currentLogger defaulted to Console.WriteLine (static method) 2: Log currentLogger = Console.Out.WriteLine; 3:  4: // invokes the delegate, which writes to the console out 5: currentLogger("Hi Standard Out!"); 6:  7: // append a delegate to Console.Error.WriteLine to go to std error 8: currentLogger += Console.Error.WriteLine; 9:  10: // invokes the delegate chain and writes message to std out and std err 11: currentLogger("Hi Standard Out and Error!"); While delegates give us a lot of power, it can be cumbersome to re-create fairly standard delegate definitions repeatedly, for this purpose the generic delegates were introduced in various stages in .NET.  These support various method types with particular signatures. Note: a caveat with generic delegates is that while they can support multiple parameters, they do not match methods that contains ref or out parameters. If you want to a delegate to represent methods that takes ref or out parameters, you will need to create a custom delegate. We’ve got the Func… delegates Just like it’s cousin, the Action delegate family, the Func delegate family gives us a lot of power to use generic delegates to make classes and algorithms more generic.  Using them keeps us from having to define a new delegate type when need to make a class or algorithm generic. Remember that the point of the Action delegate family was to be able to perform an “action” on an item, with no return results.  Thus Action delegates can be used to represent most methods that take 0 to 16 arguments but return void.  You can assign a method The Func delegate family was introduced in .NET 3.5 with the advent of LINQ, and gives us the power to define a function that can be called on 0 to 16 arguments and returns a result.  Thus, the main difference between Action and Func, from a delegate perspective, is that Actions return nothing, but Funcs return a result. The Func family of delegates have signatures as follows: Func<TResult> – matches a method that takes no arguments, and returns value of type TResult. Func<T, TResult> – matches a method that takes an argument of type T, and returns value of type TResult. Func<T1, T2, TResult> – matches a method that takes arguments of type T1 and T2, and returns value of type TResult. Func<T1, T2, …, TResult> – and so on up to 16 arguments, and returns value of type TResult. These are handy because they quickly allow you to be able to specify that a method or class you design will perform a function to produce a result as long as the method you specify meets the signature. For example, let’s say you were designing a generic aggregator, and you wanted to allow the user to define how the values will be aggregated into the result (i.e. Sum, Min, Max, etc…).  To do this, we would ask the user of our class to pass in a method that would take the current total, the next value, and produce a new total.  A class like this could look like: 1: public sealed class Aggregator<TValue, TResult> 2: { 3: // holds method that takes previous result, combines with next value, creates new result 4: private Func<TResult, TValue, TResult> _aggregationMethod; 5:  6: // gets or sets the current result of aggregation 7: public TResult Result { get; private set; } 8:  9: // construct the aggregator given the method to use to aggregate values 10: public Aggregator(Func<TResult, TValue, TResult> aggregationMethod = null) 11: { 12: if (aggregationMethod == null) throw new ArgumentNullException("aggregationMethod"); 13:  14: _aggregationMethod = aggregationMethod; 15: } 16:  17: // method to add next value 18: public void Aggregate(TValue nextValue) 19: { 20: // performs the aggregation method function on the current result and next and sets to current result 21: Result = _aggregationMethod(Result, nextValue); 22: } 23: } Of course, LINQ already has an Aggregate extension method, but that works on a sequence of IEnumerable<T>, whereas this is designed to work more with aggregating single results over time (such as keeping track of a max response time for a service). We could then use this generic aggregator to find the sum of a series of values over time, or the max of a series of values over time (among other things): 1: // creates an aggregator that adds the next to the total to sum the values 2: var sumAggregator = new Aggregator<int, int>((total, next) => total + next); 3:  4: // creates an aggregator (using static method) that returns the max of previous result and next 5: var maxAggregator = new Aggregator<int, int>(Math.Max); So, if we were timing the response time of a web method every time it was called, we could pass that response time to both of these aggregators to get an idea of the total time spent in that web method, and the max time spent in any one call to the web method: 1: // total will be 13 and max 13 2: int responseTime = 13; 3: sumAggregator.Aggregate(responseTime); 4: maxAggregator.Aggregate(responseTime); 5:  6: // total will be 20 and max still 13 7: responseTime = 7; 8: sumAggregator.Aggregate(responseTime); 9: maxAggregator.Aggregate(responseTime); 10:  11: // total will be 40 and max now 20 12: responseTime = 20; 13: sumAggregator.Aggregate(responseTime); 14: maxAggregator.Aggregate(responseTime); The Func delegate family is useful for making generic algorithms and classes, and in particular allows the caller of the method or user of the class to specify a function to be performed in order to generate a result. What is the result of a Func delegate chain? If you remember, we said earlier that you can assign multiple methods to a delegate by using the += operator to chain them.  So how does this affect delegates such as Func that return a value, when applied to something like the code below? 1: Func<int, int, int> combo = null; 2:  3: // What if we wanted to aggregate the sum and max together? 4: combo += (total, next) => total + next; 5: combo += Math.Max; 6:  7: // what is the result? 8: var comboAggregator = new Aggregator<int, int>(combo); Well, in .NET if you chain multiple methods in a delegate, they will all get invoked, but the result of the delegate is the result of the last method invoked in the chain.  Thus, this aggregator would always result in the Math.Max() result.  The other chained method (the sum) gets executed first, but it’s result is thrown away: 1: // result is 13 2: int responseTime = 13; 3: comboAggregator.Aggregate(responseTime); 4:  5: // result is still 13 6: responseTime = 7; 7: comboAggregator.Aggregate(responseTime); 8:  9: // result is now 20 10: responseTime = 20; 11: comboAggregator.Aggregate(responseTime); So remember, you can chain multiple Func (or other delegates that return values) together, but if you do so you will only get the last executed result. Func delegates and co-variance/contra-variance in .NET 4.0 Just like the Action delegate, as of .NET 4.0, the Func delegate family is contra-variant on its arguments.  In addition, it is co-variant on its return type.  To support this, in .NET 4.0 the signatures of the Func delegates changed to: Func<out TResult> – matches a method that takes no arguments, and returns value of type TResult (or a more derived type). Func<in T, out TResult> – matches a method that takes an argument of type T (or a less derived type), and returns value of type TResult(or a more derived type). Func<in T1, in T2, out TResult> – matches a method that takes arguments of type T1 and T2 (or less derived types), and returns value of type TResult (or a more derived type). Func<in T1, in T2, …, out TResult> – and so on up to 16 arguments, and returns value of type TResult (or a more derived type). Notice the addition of the in and out keywords before each of the generic type placeholders.  As we saw last week, the in keyword is used to specify that a generic type can be contra-variant -- it can match the given type or a type that is less derived.  However, the out keyword, is used to specify that a generic type can be co-variant -- it can match the given type or a type that is more derived. On contra-variance, if you are saying you need an function that will accept a string, you can just as easily give it an function that accepts an object.  In other words, if you say “give me an function that will process dogs”, I could pass you a method that will process any animal, because all dogs are animals.  On the co-variance side, if you are saying you need a function that returns an object, you can just as easily pass it a function that returns a string because any string returned from the given method can be accepted by a delegate expecting an object result, since string is more derived.  Once again, in other words, if you say “give me a method that creates an animal”, I can pass you a method that will create a dog, because all dogs are animals. It really all makes sense, you can pass a more specific thing to a less specific parameter, and you can return a more specific thing as a less specific result.  In other words, pay attention to the direction the item travels (parameters go in, results come out).  Keeping that in mind, you can always pass more specific things in and return more specific things out. For example, in the code below, we have a method that takes a Func<object> to generate an object, but we can pass it a Func<string> because the return type of object can obviously accept a return value of string as well: 1: // since Func<object> is co-variant, this will access Func<string>, etc... 2: public static string Sequence(int count, Func<object> generator) 3: { 4: var builder = new StringBuilder(); 5:  6: for (int i=0; i<count; i++) 7: { 8: object value = generator(); 9: builder.Append(value); 10: } 11:  12: return builder.ToString(); 13: } Even though the method above takes a Func<object>, we can pass a Func<string> because the TResult type placeholder is co-variant and accepts types that are more derived as well: 1: // delegate that's typed to return string. 2: Func<string> stringGenerator = () => DateTime.Now.ToString(); 3:  4: // This will work in .NET 4.0, but not in previous versions 5: Sequence(100, stringGenerator); Previous versions of .NET implemented some forms of co-variance and contra-variance before, but .NET 4.0 goes one step further and allows you to pass or assign an Func<A, BResult> to a Func<Y, ZResult> as long as A is less derived (or same) as Y, and BResult is more derived (or same) as ZResult. Sidebar: The Func and the Predicate A method that takes one argument and returns a bool is generally thought of as a predicate.  Predicates are used to examine an item and determine whether that item satisfies a particular condition.  Predicates are typically unary, but you may also have binary and other predicates as well. Predicates are often used to filter results, such as in the LINQ Where() extension method: 1: var numbers = new[] { 1, 2, 4, 13, 8, 10, 27 }; 2:  3: // call Where() using a predicate which determines if the number is even 4: var evens = numbers.Where(num => num % 2 == 0); As of .NET 3.5, predicates are typically represented as Func<T, bool> where T is the type of the item to examine.  Previous to .NET 3.5, there was a Predicate<T> type that tended to be used (which we’ll discuss next week) and is still supported, but most developers recommend using Func<T, bool> now, as it prevents confusion with overloads that accept unary predicates and binary predicates, etc.: 1: // this seems more confusing as an overload set, because of Predicate vs Func 2: public static SomeMethod(Predicate<int> unaryPredicate) { } 3: public static SomeMethod(Func<int, int, bool> binaryPredicate) { } 4:  5: // this seems more consistent as an overload set, since just uses Func 6: public static SomeMethod(Func<int, bool> unaryPredicate) { } 7: public static SomeMethod(Func<int, int, bool> binaryPredicate) { } Also, even though Predicate<T> and Func<T, bool> match the same signatures, they are separate types!  Thus you cannot assign a Predicate<T> instance to a Func<T, bool> instance and vice versa: 1: // the same method, lambda expression, etc can be assigned to both 2: Predicate<int> isEven = i => (i % 2) == 0; 3: Func<int, bool> alsoIsEven = i => (i % 2) == 0; 4:  5: // but the delegate instances cannot be directly assigned, strongly typed! 6: // ERROR: cannot convert type... 7: isEven = alsoIsEven; 8:  9: // however, you can assign by wrapping in a new instance: 10: isEven = new Predicate<int>(alsoIsEven); 11: alsoIsEven = new Func<int, bool>(isEven); So, the general advice that seems to come from most developers is that Predicate<T> is still supported, but we should use Func<T, bool> for consistency in .NET 3.5 and above. Sidebar: Func as a Generator for Unit Testing One area of difficulty in unit testing can be unit testing code that is based on time of day.  We’d still want to unit test our code to make sure the logic is accurate, but we don’t want the results of our unit tests to be dependent on the time they are run. One way (of many) around this is to create an internal generator that will produce the “current” time of day.  This would default to returning result from DateTime.Now (or some other method), but we could inject specific times for our unit testing.  Generators are typically methods that return (generate) a value for use in a class/method. For example, say we are creating a CacheItem<T> class that represents an item in the cache, and we want to make sure the item shows as expired if the age is more than 30 seconds.  Such a class could look like: 1: // responsible for maintaining an item of type T in the cache 2: public sealed class CacheItem<T> 3: { 4: // helper method that returns the current time 5: private static Func<DateTime> _timeGenerator = () => DateTime.Now; 6:  7: // allows internal access to the time generator 8: internal static Func<DateTime> TimeGenerator 9: { 10: get { return _timeGenerator; } 11: set { _timeGenerator = value; } 12: } 13:  14: // time the item was cached 15: public DateTime CachedTime { get; private set; } 16:  17: // the item cached 18: public T Value { get; private set; } 19:  20: // item is expired if older than 30 seconds 21: public bool IsExpired 22: { 23: get { return _timeGenerator() - CachedTime > TimeSpan.FromSeconds(30.0); } 24: } 25:  26: // creates the new cached item, setting cached time to "current" time 27: public CacheItem(T value) 28: { 29: Value = value; 30: CachedTime = _timeGenerator(); 31: } 32: } Then, we can use this construct to unit test our CacheItem<T> without any time dependencies: 1: var baseTime = DateTime.Now; 2:  3: // start with current time stored above (so doesn't drift) 4: CacheItem<int>.TimeGenerator = () => baseTime; 5:  6: var target = new CacheItem<int>(13); 7:  8: // now add 15 seconds, should still be non-expired 9: CacheItem<int>.TimeGenerator = () => baseTime.AddSeconds(15); 10:  11: Assert.IsFalse(target.IsExpired); 12:  13: // now add 31 seconds, should now be expired 14: CacheItem<int>.TimeGenerator = () => baseTime.AddSeconds(31); 15:  16: Assert.IsTrue(target.IsExpired); Now we can unit test for 1 second before, 1 second after, 1 millisecond before, 1 day after, etc.  Func delegates can be a handy tool for this type of value generation to support more testable code.  Summary Generic delegates give us a lot of power to make truly generic algorithms and classes.  The Func family of delegates is a great way to be able to specify functions to calculate a result based on 0-16 arguments.  Stay tuned in the weeks that follow for other generic delegates in the .NET Framework!   Tweet Technorati Tags: .NET, C#, CSharp, Little Wonders, Generics, Func, Delegates

    Read the article

  • C#/.NET Little Wonders: The Nullable static class

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. Today we’re going to look at an interesting Little Wonder that can be used to mitigate what could be considered a Little Pitfall.  The Little Wonder we’ll be examining is the System.Nullable static class.  No, not the System.Nullable<T> class, but a static helper class that has one useful method in particular that we will examine… but first, let’s look at the Little Pitfall that makes this wonder so useful. Little Pitfall: Comparing nullable value types using <, >, <=, >= Examine this piece of code, without examining it too deeply, what’s your gut reaction as to the result? 1: int? x = null; 2:  3: if (x < 100) 4: { 5: Console.WriteLine("True, {0} is less than 100.", 6: x.HasValue ? x.ToString() : "null"); 7: } 8: else 9: { 10: Console.WriteLine("False, {0} is NOT less than 100.", 11: x.HasValue ? x.ToString() : "null"); 12: } Your gut would be to say true right?  It would seem to make sense that a null integer is less than the integer constant 100.  But the result is actually false!  The null value is not less than 100 according to the less-than operator. It looks even more outrageous when you consider this also evaluates to false: 1: int? x = null; 2:  3: if (x < int.MaxValue) 4: { 5: // ... 6: } So, are we saying that null is less than every valid int value?  If that were true, null should be less than int.MinValue, right?  Well… no: 1: int? x = null; 2:  3: // um... hold on here, x is NOT less than min value? 4: if (x < int.MinValue) 5: { 6: // ... 7: } So what’s going on here?  If we use greater than instead of less than, we see the same little dilemma: 1: int? x = null; 2:  3: // once again, null is not greater than anything either... 4: if (x > int.MinValue) 5: { 6: // ... 7: } It turns out that four of the comparison operators (<, <=, >, >=) are designed to return false anytime at least one of the arguments is null when comparing System.Nullable wrapped types that expose the comparison operators (short, int, float, double, DateTime, TimeSpan, etc.).  What’s even odder is that even though the two equality operators (== and !=) work correctly, >= and <= have the same issue as < and > and return false if both System.Nullable wrapped operator comparable types are null! 1: DateTime? x = null; 2: DateTime? y = null; 3:  4: if (x <= y) 5: { 6: Console.WriteLine("You'd think this is true, since both are null, but it's not."); 7: } 8: else 9: { 10: Console.WriteLine("It's false because <=, <, >, >= don't work on null."); 11: } To make matters even more confusing, take for example your usual check to see if something is less than, greater to, or equal: 1: int? x = null; 2: int? y = 100; 3:  4: if (x < y) 5: { 6: Console.WriteLine("X is less than Y"); 7: } 8: else if (x > y) 9: { 10: Console.WriteLine("X is greater than Y"); 11: } 12: else 13: { 14: // We fall into the "equals" assumption, but clearly null != 100! 15: Console.WriteLine("X is equal to Y"); 16: } Yes, this code outputs “X is equal to Y” because both the less-than and greater-than operators return false when a Nullable wrapped operator comparable type is null.  This violates a lot of our assumptions because we assume is something is not less than something, and it’s not greater than something, it must be equal.  So keep in mind, that the only two comparison operators that work on Nullable wrapped types where at least one is null are the equals (==) and not equals (!=) operators: 1: int? x = null; 2: int? y = 100; 3:  4: if (x == y) 5: { 6: Console.WriteLine("False, x is null, y is not."); 7: } 8:  9: if (x != y) 10: { 11: Console.WriteLine("True, x is null, y is not."); 12: } Solution: The Nullable static class So we’ve seen that <, <=, >, and >= have some interesting and perhaps unexpected behaviors that can trip up a novice developer who isn’t expecting the kinks that System.Nullable<T> types with comparison operators can throw.  How can we easily mitigate this? Well, obviously, you could do null checks before each check, but that starts to get ugly: 1: if (x.HasValue) 2: { 3: if (y.HasValue) 4: { 5: if (x < y) 6: { 7: Console.WriteLine("x < y"); 8: } 9: else if (x > y) 10: { 11: Console.WriteLine("x > y"); 12: } 13: else 14: { 15: Console.WriteLine("x == y"); 16: } 17: } 18: else 19: { 20: Console.WriteLine("x > y because y is null and x isn't"); 21: } 22: } 23: else if (y.HasValue) 24: { 25: Console.WriteLine("x < y because x is null and y isn't"); 26: } 27: else 28: { 29: Console.WriteLine("x == y because both are null"); 30: } Yes, we could probably simplify this logic a bit, but it’s still horrendous!  So what do we do if we want to consider null less than everything and be able to properly compare Nullable<T> wrapped value types? The key is the System.Nullable static class.  This class is a companion class to the System.Nullable<T> class and allows you to use a few helper methods for Nullable<T> wrapped types, including a static Compare<T>() method of the. What’s so big about the static Compare<T>() method?  It implements an IComparer compatible comparison on Nullable<T> types.  Why do we care?  Well, if you look at the MSDN description for how IComparer works, you’ll read: Comparing null with any type is allowed and does not generate an exception when using IComparable. When sorting, null is considered to be less than any other object. This is what we probably want!  We want null to be less than everything!  So now we can change our logic to use the Nullable.Compare<T>() static method: 1: int? x = null; 2: int? y = 100; 3:  4: if (Nullable.Compare(x, y) < 0) 5: { 6: // Yes! x is null, y is not, so x is less than y according to Compare(). 7: Console.WriteLine("x < y"); 8: } 9: else if (Nullable.Compare(x, y) > 0) 10: { 11: Console.WriteLine("x > y"); 12: } 13: else 14: { 15: Console.WriteLine("x == y"); 16: } Summary So, when doing math comparisons between two numeric values where one of them may be a null Nullable<T>, consider using the System.Nullable.Compare<T>() method instead of the comparison operators.  It will treat null less than any value, and will avoid logic consistency problems when relying on < returning false to indicate >= is true and so on. Tweet   Technorati Tags: C#,C-Sharp,.NET,Little Wonders,Little Pitfalls,Nulalble

    Read the article

  • C#/.NET Little Wonders: Use Cast() and TypeOf() to Change Sequence Type

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. We’ve seen how the Select() extension method lets you project a sequence from one type to a new type which is handy for getting just parts of items, or building new items.  But what happens when the items in the sequence are already the type you want, but the sequence itself is typed to an interface or super-type instead of the sub-type you need? For example, you may have a sequence of Rectangle stored in an IEnumerable<Shape> and want to consider it an IEnumerable<Rectangle> sequence instead.  Today we’ll look at two handy extension methods, Cast<TResult>() and OfType<TResult>() which help you with this task. Cast<TResult>() – Attempt to cast all items to type TResult So, the first thing we can do would be to attempt to create a sequence of TResult from every item in the source sequence.  Typically we’d do this if we had an IEnumerable<T> where we knew that every item was actually a TResult where TResult inherits/implements T. For example, assume the typical Shape example classes: 1: // abstract base class 2: public abstract class Shape { } 3:  4: // a basic rectangle 5: public class Rectangle : Shape 6: { 7: public int Widtgh { get; set; } 8: public int Height { get; set; } 9: } And let’s assume we have a sequence of Shape where every Shape is a Rectangle… 1: var shapes = new List<Shape> 2: { 3: new Rectangle { Width = 3, Height = 5 }, 4: new Rectangle { Width = 10, Height = 13 }, 5: // ... 6: }; To get the sequence of Shape as a sequence of Rectangle, of course, we could use a Select() clause, such as: 1: // select each Shape, cast it to Rectangle 2: var rectangles = shapes 3: .Select(s => (Rectangle)s) 4: .ToList(); But that’s a bit verbose, and fortunately there is already a facility built in and ready to use in the form of the Cast<TResult>() extension method: 1: // cast each item to Rectangle and store in a List<Rectangle> 2: var rectangles = shapes 3: .Cast<Rectangle>() 4: .ToList(); However, we should note that if anything in the list cannot be cast to a Rectangle, you will get an InvalidCastException thrown at runtime.  Thus, if our Shape sequence had a Circle in it, the call to Cast<Rectangle>() would have failed.  As such, you should only do this when you are reasonably sure of what the sequence actually contains (or are willing to handle an exception if you’re wrong). Another handy use of Cast<TResult>() is using it to convert an IEnumerable to an IEnumerable<T>.  If you look at the signature, you’ll see that the Cast<TResult>() extension method actually extends the older, object-based IEnumerable interface instead of the newer, generic IEnumerable<T>.  This is your gateway method for being able to use LINQ on older, non-generic sequences.  For example, consider the following: 1: // the older, non-generic collections are sequence of object 2: var shapes = new ArrayList 3: { 4: new Rectangle { Width = 3, Height = 13 }, 5: new Rectangle { Width = 10, Height = 20 }, 6: // ... 7: }; Since this is an older, object based collection, we cannot use the LINQ extension methods on it directly.  For example, if I wanted to query the Shape sequence for only those Rectangles whose Width is > 5, I can’t do this: 1: // compiler error, Where() operates on IEnumerable<T>, not IEnumerable 2: var bigRectangles = shapes.Where(r => r.Width > 5); However, I can use Cast<Rectangle>() to treat my ArrayList as an IEnumerable<Rectangle> and then do the query! 1: // ah, that’s better! 2: var bigRectangles = shapes.Cast<Rectangle>().Where(r => r.Width > 5); Or, if you prefer, in LINQ query expression syntax: 1: var bigRectangles = from s in shapes.Cast<Rectangle>() 2: where s.Width > 5 3: select s; One quick warning: Cast<TResult>() only attempts to cast, it won’t perform a cast conversion.  That is, consider this: 1: var intList = new List<int> { 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89 }; 2:  3: // casting ints to longs, this should work, right? 4: var asLong = intList.Cast<long>().ToList(); Will the code above work?  No, you’ll get a InvalidCastException. Remember that Cast<TResult>() is an extension of IEnumerable, thus it is a sequence of object, which means that it will box every int as an object as it enumerates over it, and there is no cast conversion from object to long, and thus the cast fails.  In other words, a cast from int to long will succeed because there is a conversion from int to long.  But a cast from int to object to long will not, because you can only unbox an item by casting it to its exact type. For more information on why cast-converting boxed values doesn’t work, see this post on The Dangers of Casting Boxed Values (here). OfType<TResult>() – Filter sequence to only items of type TResult So, we’ve seen how we can use Cast<TResult>() to change the type of our sequence, when we expect all the items of the sequence to be of a specific type.  But what do we do when a sequence contains many different types, and we are only concerned with a subset of a given type? For example, what if a sequence of Shape contains Rectangle and Circle instances, and we just want to select all of the Rectangle instances?  Well, let’s say we had this sequence of Shape: 1: var shapes = new List<Shape> 2: { 3: new Rectangle { Width = 3, Height = 5 }, 4: new Rectangle { Width = 10, Height = 13 }, 5: new Circle { Radius = 10 }, 6: new Square { Side = 13 }, 7: // ... 8: }; Well, we could get the rectangles using Select(), like: 1: var onlyRectangles = shapes.Where(s => s is Rectangle).ToList(); But fortunately, an easier way has already been written for us in the form of the OfType<T>() extension method: 1: // returns only a sequence of the shapes that are Rectangles 2: var onlyRectangles = shapes.OfType<Rectangle>().ToList(); Now we have a sequence of only the Rectangles in the original sequence, we can also use this to chain other queries that depend on Rectangles, such as: 1: // select only Rectangles, then filter to only those more than 2: // 5 units wide... 3: var onlyBigRectangles = shapes.OfType<Rectangle>() 4: .Where(r => r.Width > 5) 5: .ToList(); The OfType<Rectangle>() will filter the sequence to only the items that are of type Rectangle (or a subclass of it), and that results in an IEnumerable<Rectangle>, we can then apply the other LINQ extension methods to query that list further. Just as Cast<TResult>() is an extension method on IEnumerable (and not IEnumerable<T>), the same is true for OfType<T>().  This means that you can use OfType<TResult>() on object-based collections as well. For example, given an ArrayList containing Shapes, as below: 1: // object-based collections are a sequence of object 2: var shapes = new ArrayList 3: { 4: new Rectangle { Width = 3, Height = 5 }, 5: new Rectangle { Width = 10, Height = 13 }, 6: new Circle { Radius = 10 }, 7: new Square { Side = 13 }, 8: // ... 9: }; We can use OfType<Rectangle> to filter the sequence to only Rectangle items (and subclasses), and then chain other LINQ expressions, since we will then be of type IEnumerable<Rectangle>: 1: // OfType() converts the sequence of object to a new sequence 2: // containing only Rectangle or sub-types of Rectangle. 3: var onlyBigRectangles = shapes.OfType<Rectangle>() 4: .Where(r => r.Width > 5) 5: .ToList(); Summary So now we’ve seen two different ways to get a sequence of a superclass or interface down to a more specific sequence of a subclass or implementation.  The Cast<TResult>() method casts every item in the source sequence to type TResult, and the OfType<TResult>() method selects only those items in the source sequence that are of type TResult. You can use these to downcast sequences, or adapt older types and sequences that only implement IEnumerable (such as DataTable, ArrayList, etc.). Technorati Tags: C#,CSharp,.NET,LINQ,Little Wonders,TypeOf,Cast,IEnumerable<T>

    Read the article

  • C#: A "Dumbed-Down" C++?

    - by James Michael Hare
    I was spending a lovely day this last weekend watching my sons play outside in one of the better weekends we've had here in Saint Louis for quite some time, and whilst watching them and making sure no limbs were broken or eyes poked out with sticks and other various potential injuries, I was perusing (in the correct sense of the word) this month's MSDN magazine to get a sense of the latest VS2010 features in both IDE and in languages. When I got to the back pages, I saw a wonderful article by David S. Platt entitled, "In Praise of Dumbing Down"  (msdn.microsoft.com/en-us/magazine/ee336129.aspx).  The title captivated me and I read it and found myself agreeing with it completely especially as it related to my first post on divorcing C++ as my favorite language. Unfortunately, as Mr. Platt mentions, the term dumbing-down has negative connotations, but is really and truly a good thing.  You are, in essence, taking something that is extremely complex and reducing it to something that is much easier to use and far less error prone.  Adding safeties to power tools and anti-kick mechanisms to chainsaws are in some sense "dumbing them down" to the common user -- but that also makes them safer and more accessible for the common user.  This was exactly my point with C++ and C#.  I did not mean to infer that C++ was not a useful or good language, but that in a very high percentage of cases, is too complex and error prone for the job at hand. Choosing the correct programming language for a job is a lot like choosing any other tool for a task.  For example: if I want to dig a French drain in my lawn, I can attempt to use a huge tractor-like backhoe and the job would be done far quicker than if I would dig it by hand.  I can't deny that the backhoe has the raw power and speed to perform.  But you also cannot deny that my chances of injury or chances of severing utility lines or other resources climb at an exponential rate inverse to the amount of training I may have on that machinery. Is C++ a powerful tool?  Oh yes, and it's great for those tasks where speed and performance are paramount.  But for most of us, it's the wrong tool.  And keep in mind, I say this even though I have 17 years of experience in using it and feel myself highly adept in utilizing its features both in the standard libraries, the STL, and in supplemental libraries such as BOOST.  Which, although greatly help with adding powerful features quickly, do very little to curb the relative dangers of the language. So, you may say, the fault is in the developer, that if the developer had some higher skills or if we only hired C++ experts this would not be an issue.  Now, I will concede there is some truth to this.  Obviously, the higher skilled C++ developers you hire the better the chance they will produce highly performant and error-free code.  However, what good is that to the average developer who cannot afford a full stable of C++ experts? That's my point with C#:  It's like a kinder, gentler C++.  It gives you nearly the same speed, and in many ways even more power than C++, and it gives you a much softer cushion for novices to fall against if they code less-than-optimally.  A bug is a bug, of course, in any language, but C# does a good job of hiding and taking on the task of handling almost all of the resource issues that make C++ so tricky.  For my money, C# is much more maintainable, more feature-rich, second only slightly in performance, faster to market, and -- last but not least -- safer and easier to use.  That's why, where I work, I much prefer to see the developers moving to C#.  The quantity of bugs is much lower, and we don't need to hire "experts" to achieve the same results since the language itself handles those resource pitfalls so prevalent in poorly written C++ code.  C++ will still have its place in the world, and I'm sure I'll still use it now and again where it is truly the correct tool for the job, but for nearly every other project C# is a wonderfully "dumbed-down" version of C++ -- in the very best sense -- and to me, that's the smart choice.

    Read the article

  • C#: Handling Notifications: inheritance, events, or delegates?

    - by James Michael Hare
    Often times as developers we have to design a class where we get notification when certain things happen. In older object-oriented code this would often be implemented by overriding methods -- with events, delegates, and interfaces, however, we have far more elegant options. So, when should you use each of these methods and what are their strengths and weaknesses? Now, for the purposes of this article when I say notification, I'm just talking about ways for a class to let a user know that something has occurred. This can be through any programmatic means such as inheritance, events, delegates, etc. So let's build some context. I'm sitting here thinking about a provider neutral messaging layer for the place I work, and I got to the point where I needed to design the message subscriber which will receive messages from the message bus. Basically, what we want is to be able to create a message listener and have it be called whenever a new message arrives. Now, back before the flood we would have done this via inheritance and an abstract class: 1:  2: // using inheritance - omitting argument null checks and halt logic 3: public abstract class MessageListener 4: { 5: private ISubscriber _subscriber; 6: private bool _isHalted = false; 7: private Thread _messageThread; 8:  9: // assign the subscriber and start the messaging loop 10: public MessageListener(ISubscriber subscriber) 11: { 12: _subscriber = subscriber; 13: _messageThread = new Thread(MessageLoop); 14: _messageThread.Start(); 15: } 16:  17: // user will override this to process their messages 18: protected abstract void OnMessageReceived(Message msg); 19:  20: // handle the looping in the thread 21: private void MessageLoop() 22: { 23: while(!_isHalted) 24: { 25: // as long as processing, wait 1 second for message 26: Message msg = _subscriber.Receive(TimeSpan.FromSeconds(1)); 27: if(msg != null) 28: { 29: OnMessageReceived(msg); 30: } 31: } 32: } 33: ... 34: } It seems so odd to write this kind of code now. Does it feel odd to you? Maybe it's just because I've gotten so used to delegation that I really don't like the feel of this. To me it is akin to saying that if I want to drive my car I need to derive a new instance of it just to put myself in the driver's seat. And yet, unquestionably, five years ago I would have probably written the code as you see above. To me, inheritance is a flawed approach for notifications due to several reasons: Inheritance is one of the HIGHEST forms of coupling. You can't seal the listener class because it depends on sub-classing to work. Because C# does not allow multiple-inheritance, I've spent my one inheritance implementing this class. Every time you need to listen to a bus, you have to derive a class which leads to lots of trivial sub-classes. The act of consuming a message should be a separate responsibility than the act of listening for a message (SRP). Inheritance is such a strong statement (this IS-A that) that it should only be used in building type hierarchies and not for overriding use-specific behaviors and notifications. Chances are, if a class needs to be inherited to be used, it most likely is not designed as well as it could be in today's modern programming languages. So lets look at the other tools available to us for getting notified instead. Here's a few other choices to consider. Have the listener expose a MessageReceived event. Have the listener accept a new IMessageHandler interface instance. Have the listener accept an Action<Message> delegate. Really, all of these are different forms of delegation. Now, .NET events are a bit heavier than the other types of delegates in terms of run-time execution, but they are a great way to allow others using your class to subscribe to your events: 1: // using event - ommiting argument null checks and halt logic 2: public sealed class MessageListener 3: { 4: private ISubscriber _subscriber; 5: private bool _isHalted = false; 6: private Thread _messageThread; 7:  8: // assign the subscriber and start the messaging loop 9: public MessageListener(ISubscriber subscriber) 10: { 11: _subscriber = subscriber; 12: _messageThread = new Thread(MessageLoop); 13: _messageThread.Start(); 14: } 15:  16: // user will override this to process their messages 17: public event Action<Message> MessageReceived; 18:  19: // handle the looping in the thread 20: private void MessageLoop() 21: { 22: while(!_isHalted) 23: { 24: // as long as processing, wait 1 second for message 25: Message msg = _subscriber.Receive(TimeSpan.FromSeconds(1)); 26: if(msg != null && MessageReceived != null) 27: { 28: MessageReceived(msg); 29: } 30: } 31: } 32: } Note, now we can seal the class to avoid changes and the user just needs to provide a message handling method: 1: theListener.MessageReceived += CustomReceiveMethod; However, personally I don't think events hold up as well in this case because events are largely optional. To me, what is the point of a listener if you create one with no event listeners? So in my mind, use events when handling the notification is optional. So how about the delegation via interface? I personally like this method quite a bit. Basically what it does is similar to inheritance method mentioned first, but better because it makes it easy to split the part of the class that doesn't change (the base listener behavior) from the part that does change (the user-specified action after receiving a message). So assuming we had an interface like: 1: public interface IMessageHandler 2: { 3: void OnMessageReceived(Message receivedMessage); 4: } Our listener would look like this: 1: // using delegation via interface - omitting argument null checks and halt logic 2: public sealed class MessageListener 3: { 4: private ISubscriber _subscriber; 5: private IMessageHandler _handler; 6: private bool _isHalted = false; 7: private Thread _messageThread; 8:  9: // assign the subscriber and start the messaging loop 10: public MessageListener(ISubscriber subscriber, IMessageHandler handler) 11: { 12: _subscriber = subscriber; 13: _handler = handler; 14: _messageThread = new Thread(MessageLoop); 15: _messageThread.Start(); 16: } 17:  18: // handle the looping in the thread 19: private void MessageLoop() 20: { 21: while(!_isHalted) 22: { 23: // as long as processing, wait 1 second for message 24: Message msg = _subscriber.Receive(TimeSpan.FromSeconds(1)); 25: if(msg != null) 26: { 27: _handler.OnMessageReceived(msg); 28: } 29: } 30: } 31: } And they would call it by creating a class that implements IMessageHandler and pass that instance into the constructor of the listener. I like that this alleviates the issues of inheritance and essentially forces you to provide a handler (as opposed to events) on construction. Well, this is good, but personally I think we could go one step further. While I like this better than events or inheritance, it still forces you to implement a specific method name. What if that name collides? Furthermore if you have lots of these you end up either with large classes inheriting multiple interfaces to implement one method, or lots of small classes. Also, if you had one class that wanted to manage messages from two different subscribers differently, it wouldn't be able to because the interface can't be overloaded. This brings me to using delegates directly. In general, every time I think about creating an interface for something, and if that interface contains only one method, I start thinking a delegate is a better approach. Now, that said delegates don't accomplish everything an interface can. Obviously having the interface allows you to refer to the classes that implement the interface which can be very handy. In this case, though, really all you want is a method to handle the messages. So let's look at a method delegate: 1: // using delegation via delegate - omitting argument null checks and halt logic 2: public sealed class MessageListener 3: { 4: private ISubscriber _subscriber; 5: private Action<Message> _handler; 6: private bool _isHalted = false; 7: private Thread _messageThread; 8:  9: // assign the subscriber and start the messaging loop 10: public MessageListener(ISubscriber subscriber, Action<Message> handler) 11: { 12: _subscriber = subscriber; 13: _handler = handler; 14: _messageThread = new Thread(MessageLoop); 15: _messageThread.Start(); 16: } 17:  18: // handle the looping in the thread 19: private void MessageLoop() 20: { 21: while(!_isHalted) 22: { 23: // as long as processing, wait 1 second for message 24: Message msg = _subscriber.Receive(TimeSpan.FromSeconds(1)); 25: if(msg != null) 26: { 27: _handler(msg); 28: } 29: } 30: } 31: } Here the MessageListener now takes an Action<Message>.  For those of you unfamiliar with the pre-defined delegate types in .NET, that is a method with the signature: void SomeMethodName(Message). The great thing about delegates is it gives you a lot of power. You could create an anonymous delegate, a lambda, or specify any other method as long as it satisfies the Action<Message> signature. This way, you don't need to define an arbitrary helper class or name the method a specific thing. Incidentally, we could combine both the interface and delegate approach to allow maximum flexibility. Doing this, the user could either pass in a delegate, or specify a delegate interface: 1: // using delegation - give users choice of interface or delegate 2: public sealed class MessageListener 3: { 4: private ISubscriber _subscriber; 5: private Action<Message> _handler; 6: private bool _isHalted = false; 7: private Thread _messageThread; 8:  9: // assign the subscriber and start the messaging loop 10: public MessageListener(ISubscriber subscriber, Action<Message> handler) 11: { 12: _subscriber = subscriber; 13: _handler = handler; 14: _messageThread = new Thread(MessageLoop); 15: _messageThread.Start(); 16: } 17:  18: // passes the interface method as a delegate using method group 19: public MessageListener(ISubscriber subscriber, IMessageHandler handler) 20: : this(subscriber, handler.OnMessageReceived) 21: { 22: } 23:  24: // handle the looping in the thread 25: private void MessageLoop() 26: { 27: while(!_isHalted) 28: { 29: // as long as processing, wait 1 second for message 30: Message msg = _subscriber.Receive(TimeSpan.FromSeconds(1)); 31: if(msg != null) 32: { 33: _handler(msg); 34: } 35: } 36: } 37: } } This is the method I tend to prefer because it allows the user of the class to choose which method works best for them. You may be curious about the actual performance of these different methods. 1: Enter iterations: 2: 1000000 3:  4: Inheritance took 4 ms. 5: Events took 7 ms. 6: Interface delegation took 4 ms. 7: Lambda delegate took 5 ms. Before you get too caught up in the numbers, however, keep in mind that this is performance over over 1,000,000 iterations. Since they are all < 10 ms which boils down to fractions of a micro-second per iteration so really any of them are a fine choice performance wise. As such, I think the choice of what to do really boils down to what you're trying to do. Here's my guidelines: Inheritance should be used only when defining a collection of related types with implementation specific behaviors, it should not be used as a hook for users to add their own functionality. Events should be used when subscription is optional or multi-cast is desired. Interface delegation should be used when you wish to refer to implementing classes by the interface type or if the type requires several methods to be implemented. Delegate method delegation should be used when you only need to provide one method and do not need to refer to implementers by the interface name.

    Read the article

  • C#/.NET Little Wonders &ndash; Cross Calling Constructors

    - by James Michael Hare
    Just a small post today, it’s the final iteration before our release and things are crazy here!  This is another little tidbit that I love using, and it should be fairly common knowledge, yet I’ve noticed many times that less experienced developers tend to have redundant constructor code when they overload their constructors. The Problem – repetitive code is less maintainable Let’s say you were designing a messaging system, and so you want to create a class to represent the properties for a Receiver, so perhaps you design a ReceiverProperties class to represent this collection of properties. Perhaps, you decide to make ReceiverProperties immutable, and so you have several constructors that you can use for alternative construction: 1: // Constructs a set of receiver properties. 2: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable, bool isBuffered) 3: { 4: ReceiverType = receiverType; 5: Source = source; 6: IsDurable = isDurable; 7: IsBuffered = isBuffered; 8: } 9: 10: // Constructs a set of receiver properties with buffering on by default. 11: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable) 12: { 13: ReceiverType = receiverType; 14: Source = source; 15: IsDurable = isDurable; 16: IsBuffered = true; 17: } 18:  19: // Constructs a set of receiver properties with buffering on and durability off. 20: public ReceiverProperties(ReceiverType receiverType, string source) 21: { 22: ReceiverType = receiverType; 23: Source = source; 24: IsDurable = false; 25: IsBuffered = true; 26: } Note: keep in mind this is just a simple example for illustration, and in same cases default parameters can also help clean this up, but they have issues of their own. While strictly speaking, there is nothing wrong with this code, logically, it suffers from maintainability flaws.  Consider what happens if you add a new property to the class?  You have to remember to guarantee that it is set appropriately in every constructor call. This can cause subtle bugs and becomes even uglier when the constructors do more complex logic, error handling, or there are numerous potential overloads (especially if you can’t easily see them all on one screen’s height). The Solution – cross-calling constructors I’d wager nearly everyone knows how to call your base class’s constructor, but you can also cross-call to one of the constructors in the same class by using the this keyword in the same way you use base to call a base constructor. 1: // Constructs a set of receiver properties. 2: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable, bool isBuffered) 3: { 4: ReceiverType = receiverType; 5: Source = source; 6: IsDurable = isDurable; 7: IsBuffered = isBuffered; 8: } 9: 10: // Constructs a set of receiver properties with buffering on by default. 11: public ReceiverProperties(ReceiverType receiverType, string source, bool isDurable) 12: : this(receiverType, source, isDurable, true) 13: { 14: } 15:  16: // Constructs a set of receiver properties with buffering on and durability off. 17: public ReceiverProperties(ReceiverType receiverType, string source) 18: : this(receiverType, source, false, true) 19: { 20: } Notice, there is much less code.  In addition, the code you have has no repetitive logic.  You can define the main constructor that takes all arguments, and the remaining constructors with defaults simply cross-call the main constructor, passing in the defaults. Yes, in some cases default parameters can ease some of this for you, but default parameters only work for compile-time constants (null, string and number literals).  For example, if you were creating a TradingDataAdapter that relied on an implementation of ITradingDao which is the data access object to retreive records from the database, you might want two constructors: one that takes an ITradingDao reference, and a default constructor which constructs a specific ITradingDao for ease of use: 1: public TradingDataAdapter(ITradingDao dao) 2: { 3: _tradingDao = dao; 4:  5: // other constructor logic 6: } 7:  8: public TradingDataAdapter() 9: { 10: _tradingDao = new SqlTradingDao(); 11:  12: // same constructor logic as above 13: }   As you can see, this isn’t something we can solve with a default parameter, but we could with cross-calling constructors: 1: public TradingDataAdapter(ITradingDao dao) 2: { 3: _tradingDao = dao; 4:  5: // other constructor logic 6: } 7:  8: public TradingDataAdapter() 9: : this(new SqlTradingDao()) 10: { 11: }   So in cases like this where you have constructors with non compiler-time constant defaults, default parameters can’t help you and cross-calling constructors is one of your best options. Summary When you have just one constructor doing the job of initializing the class, you can consolidate all your logic and error-handling in one place, thus ensuring that your behavior will be consistent across the constructor calls. This makes the code more maintainable and even easier to read.  There will be some cases where cross-calling constructors may be sub-optimal or not possible (if, for example, the overloaded constructors take completely different types and are not just “defaulting” behaviors). You can also use default parameters, of course, but default parameter behavior in a class hierarchy can be problematic (default values are not inherited and in fact can differ) so sometimes multiple constructors are actually preferable. Regardless of why you may need to have multiple constructors, consider cross-calling where you can to reduce redundant logic and clean up the code.   Technorati Tags: C#,.NET,Little Wonders

    Read the article

  • C#/.NET Little Wonders: Skip() and Take()

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. I’ve covered many valuable methods from System.Linq class library before, so you already know it’s packed with extension-method goodness.  Today I’d like to cover two small families I’ve neglected to mention before: Skip() and Take().  While these methods seem so simple, they are an easy way to create sub-sequences for IEnumerable<T>, much the way GetRange() creates sub-lists for List<T>. Skip() and SkipWhile() The Skip() family of methods is used to ignore items in a sequence until either a certain number are passed, or until a certain condition becomes false.  This makes the methods great for starting a sequence at a point possibly other than the first item of the original sequence.   The Skip() family of methods contains the following methods (shown below in extension method syntax): Skip(int count) Ignores the specified number of items and returns a sequence starting at the item after the last skipped item (if any).  SkipWhile(Func<T, bool> predicate) Ignores items as long as the predicate returns true and returns a sequence starting with the first item to invalidate the predicate (if any).  SkipWhile(Func<T, int, bool> predicate) Same as above, but passes not only the item itself to the predicate, but also the index of the item.  For example: 1: var list = new[] { 3.14, 2.72, 42.0, 9.9, 13.0, 101.0 }; 2:  3: // sequence contains { 2.72, 42.0, 9.9, 13.0, 101.0 } 4: var afterSecond = list.Skip(1); 5: Console.WriteLine(string.Join(", ", afterSecond)); 6:  7: // sequence contains { 42.0, 9.9, 13.0, 101.0 } 8: var afterFirstDoubleDigit = list.SkipWhile(v => v < 10.0); 9: Console.WriteLine(string.Join(", ", afterFirstDoubleDigit)); Note that the SkipWhile() stops skipping at the first item that returns false and returns from there to the rest of the sequence, even if further items in that sequence also would satisfy the predicate (otherwise, you’d probably be using Where() instead, of course). If you do use the form of SkipWhile() which also passes an index into the predicate, then you should keep in mind that this is the index of the item in the sequence you are calling SkipWhile() from, not the index in the original collection.  That is, consider the following: 1: var list = new[] { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2:  3: // Get all items < 10, then 4: var whatAmI = list 5: .Skip(2) 6: .SkipWhile((i, x) => i > x); For this example the result above is 2.4, and not 1.2, 2.2, 2.3, 2.4 as some might expect.  The key is knowing what the index is that’s passed to the predicate in SkipWhile().  In the code above, because Skip(2) skips 1.0 and 1.1, the sequence passed to SkipWhile() begins at 1.2 and thus it considers the “index” of 1.2 to be 0 and not 2.  This same logic applies when using any of the extension methods that have an overload that allows you to pass an index into the delegate, such as SkipWhile(), TakeWhile(), Select(), Where(), etc.  It should also be noted, that it’s fine to Skip() more items than exist in the sequence (an empty sequence is the result), or even to Skip(0) which results in the full sequence.  So why would it ever be useful to return Skip(0) deliberately?  One reason might be to return a List<T> as an immutable sequence.  Consider this class: 1: public class MyClass 2: { 3: private List<int> _myList = new List<int>(); 4:  5: // works on surface, but one can cast back to List<int> and mutate the original... 6: public IEnumerable<int> OneWay 7: { 8: get { return _myList; } 9: } 10:  11: // works, but still has Add() etc which throw at runtime if accidentally called 12: public ReadOnlyCollection<int> AnotherWay 13: { 14: get { return new ReadOnlyCollection<int>(_myList); } 15: } 16:  17: // immutable, can't be cast back to List<int>, doesn't have methods that throw at runtime 18: public IEnumerable<int> YetAnotherWay 19: { 20: get { return _myList.Skip(0); } 21: } 22: } This code snippet shows three (among many) ways to return an internal sequence in varying levels of immutability.  Obviously if you just try to return as IEnumerable<T> without doing anything more, there’s always the danger the caller could cast back to List<T> and mutate your internal structure.  You could also return a ReadOnlyCollection<T>, but this still has the mutating methods, they just throw at runtime when called instead of giving compiler errors.  Finally, you can return the internal list as a sequence using Skip(0) which skips no items and just runs an iterator through the list.  The result is an iterator, which cannot be cast back to List<T>.  Of course, there’s many ways to do this (including just cloning the list, etc.) but the point is it illustrates a potential use of using an explicit Skip(0). Take() and TakeWhile() The Take() and TakeWhile() methods can be though of as somewhat of the inverse of Skip() and SkipWhile().  That is, while Skip() ignores the first X items and returns the rest, Take() returns a sequence of the first X items and ignores the rest.  Since they are somewhat of an inverse of each other, it makes sense that their calling signatures are identical (beyond the method name obviously): Take(int count) Returns a sequence containing up to the specified number of items. Anything after the count is ignored. TakeWhile(Func<T, bool> predicate) Returns a sequence containing items as long as the predicate returns true.  Anything from the point the predicate returns false and beyond is ignored. TakeWhile(Func<T, int, bool> predicate) Same as above, but passes not only the item itself to the predicate, but also the index of the item. So, for example, we could do the following: 1: var list = new[] { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2:  3: // sequence contains 1.0 and 1.1 4: var firstTwo = list.Take(2); 5:  6: // sequence contains 1.0, 1.1, 1.2 7: var underTwo = list.TakeWhile(i => i < 2.0); The same considerations for SkipWhile() with index apply to TakeWhile() with index, of course.  Using Skip() and Take() for sub-sequences A few weeks back, I talked about The List<T> Range Methods and showed how they could be used to get a sub-list of a List<T>.  This works well if you’re dealing with List<T>, or don’t mind converting to List<T>.  But if you have a simple IEnumerable<T> sequence and want to get a sub-sequence, you can also use Skip() and Take() to much the same effect: 1: var list = new List<double> { 1.0, 1.1, 1.2, 2.2, 2.3, 2.4 }; 2:  3: // results in List<T> containing { 1.2, 2.2, 2.3 } 4: var subList = list.GetRange(2, 3); 5:  6: // results in sequence containing { 1.2, 2.2, 2.3 } 7: var subSequence = list.Skip(2).Take(3); I say “much the same effect” because there are some differences.  First of all GetRange() will throw if the starting index or the count are greater than the number of items in the list, but Skip() and Take() do not.  Also GetRange() is a method off of List<T>, thus it can use direct indexing to get to the items much more efficiently, whereas Skip() and Take() operate on sequences and may actually have to walk through the items they skip to create the resulting sequence.  So each has their pros and cons.  My general rule of thumb is if I’m already working with a List<T> I’ll use GetRange(), but for any plain IEnumerable<T> sequence I’ll tend to prefer Skip() and Take() instead. Summary The Skip() and Take() families of LINQ extension methods are handy for producing sub-sequences from any IEnumerable<T> sequence.  Skip() will ignore the specified number of items and return the rest of the sequence, whereas Take() will return the specified number of items and ignore the rest of the sequence.  Similarly, the SkipWhile() and TakeWhile() methods can be used to skip or take items, respectively, until a given predicate returns false.    Technorati Tags: C#, CSharp, .NET, LINQ, IEnumerable<T>, Skip, Take, SkipWhile, TakeWhile

    Read the article

  • C#/.NET Little Wonders: Constraining Generics with Where Clause

    - by James Michael Hare
    Back when I was primarily a C++ developer, I loved C++ templates.  The power of writing very reusable generic classes brought the art of programming to a brand new level.  Unfortunately, when .NET 1.0 came about, they didn’t have a template equivalent.  With .NET 2.0 however, we finally got generics, which once again let us spread our wings and program more generically in the world of .NET However, C# generics behave in some ways very differently from their C++ template cousins.  There is a handy clause, however, that helps you navigate these waters to make your generics more powerful. The Problem – C# Assumes Lowest Common Denominator In C++, you can create a template and do nearly anything syntactically possible on the template parameter, and C++ will not check if the method/fields/operations invoked are valid until you declare a realization of the type.  Let me illustrate with a C++ example: 1: // compiles fine, C++ makes no assumptions as to T 2: template <typename T> 3: class ReverseComparer 4: { 5: public: 6: int Compare(const T& lhs, const T& rhs) 7: { 8: return rhs.CompareTo(lhs); 9: } 10: }; Notice that we are invoking a method CompareTo() off of template type T.  Because we don’t know at this point what type T is, C++ makes no assumptions and there are no errors. C++ tends to take the path of not checking the template type usage until the method is actually invoked with a specific type, which differs from the behavior of C#: 1: // this will NOT compile! C# assumes lowest common denominator. 2: public class ReverseComparer<T> 3: { 4: public int Compare(T lhs, T rhs) 5: { 6: return lhs.CompareTo(rhs); 7: } 8: } So why does C# give us a compiler error even when we don’t yet know what type T is?  This is because C# took a different path in how they made generics.  Unless you specify otherwise, for the purposes of the code inside the generic method, T is basically treated like an object (notice I didn’t say T is an object). That means that any operations, fields, methods, properties, etc that you attempt to use of type T must be available at the lowest common denominator type: object.  Now, while object has the broadest applicability, it also has the fewest specific.  So how do we allow our generic type placeholder to do things more than just what object can do? Solution: Constraint the Type With Where Clause So how do we get around this in C#?  The answer is to constrain the generic type placeholder with the where clause.  Basically, the where clause allows you to specify additional constraints on what the actual type used to fill the generic type placeholder must support. You might think that narrowing the scope of a generic means a weaker generic.  In reality, though it limits the number of types that can be used with the generic, it also gives the generic more power to deal with those types.  In effect these constraints says that if the type meets the given constraint, you can perform the activities that pertain to that constraint with the generic placeholders. Constraining Generic Type to Interface or Superclass One of the handiest where clause constraints is the ability to specify the type generic type must implement a certain interface or be inherited from a certain base class. For example, you can’t call CompareTo() in our first C# generic without constraints, but if we constrain T to IComparable<T>, we can: 1: public class ReverseComparer<T> 2: where T : IComparable<T> 3: { 4: public int Compare(T lhs, T rhs) 5: { 6: return lhs.CompareTo(rhs); 7: } 8: } Now that we’ve constrained T to an implementation of IComparable<T>, this means that our variables of generic type T may now call any members specified in IComparable<T> as well.  This means that the call to CompareTo() is now legal. If you constrain your type, also, you will get compiler warnings if you attempt to use a type that doesn’t meet the constraint.  This is much better than the syntax error you would get within C++ template code itself when you used a type not supported by a C++ template. Constraining Generic Type to Only Reference Types Sometimes, you want to assign an instance of a generic type to null, but you can’t do this without constraints, because you have no guarantee that the type used to realize the generic is not a value type, where null is meaningless. Well, we can fix this by specifying the class constraint in the where clause.  By declaring that a generic type must be a class, we are saying that it is a reference type, and this allows us to assign null to instances of that type: 1: public static class ObjectExtensions 2: { 3: public static TOut Maybe<TIn, TOut>(this TIn value, Func<TIn, TOut> accessor) 4: where TOut : class 5: where TIn : class 6: { 7: return (value != null) ? accessor(value) : null; 8: } 9: } In the example above, we want to be able to access a property off of a reference, and if that reference is null, pass the null on down the line.  To do this, both the input type and the output type must be reference types (yes, nullable value types could also be considered applicable at a logical level, but there’s not a direct constraint for those). Constraining Generic Type to only Value Types Similarly to constraining a generic type to be a reference type, you can also constrain a generic type to be a value type.  To do this you use the struct constraint which specifies that the generic type must be a value type (primitive, struct, enum, etc). Consider the following method, that will convert anything that is IConvertible (int, double, string, etc) to the value type you specify, or null if the instance is null. 1: public static T? ConvertToNullable<T>(IConvertible value) 2: where T : struct 3: { 4: T? result = null; 5:  6: if (value != null) 7: { 8: result = (T)Convert.ChangeType(value, typeof(T)); 9: } 10:  11: return result; 12: } Because T was constrained to be a value type, we can use T? (System.Nullable<T>) where we could not do this if T was a reference type. Constraining Generic Type to Require Default Constructor You can also constrain a type to require existence of a default constructor.  Because by default C# doesn’t know what constructors a generic type placeholder does or does not have available, it can’t typically allow you to call one.  That said, if you give it the new() constraint, it will mean that the type used to realize the generic type must have a default (no argument) constructor. Let’s assume you have a generic adapter class that, given some mappings, will adapt an item from type TFrom to type TTo.  Because it must create a new instance of type TTo in the process, we need to specify that TTo has a default constructor: 1: // Given a set of Action<TFrom,TTo> mappings will map TFrom to TTo 2: public class Adapter<TFrom, TTo> : IEnumerable<Action<TFrom, TTo>> 3: where TTo : class, new() 4: { 5: // The list of translations from TFrom to TTo 6: public List<Action<TFrom, TTo>> Translations { get; private set; } 7:  8: // Construct with empty translation and reverse translation sets. 9: public Adapter() 10: { 11: // did this instead of auto-properties to allow simple use of initializers 12: Translations = new List<Action<TFrom, TTo>>(); 13: } 14:  15: // Add a translator to the collection, useful for initializer list 16: public void Add(Action<TFrom, TTo> translation) 17: { 18: Translations.Add(translation); 19: } 20:  21: // Add a translator that first checks a predicate to determine if the translation 22: // should be performed, then translates if the predicate returns true 23: public void Add(Predicate<TFrom> conditional, Action<TFrom, TTo> translation) 24: { 25: Translations.Add((from, to) => 26: { 27: if (conditional(from)) 28: { 29: translation(from, to); 30: } 31: }); 32: } 33:  34: // Translates an object forward from TFrom object to TTo object. 35: public TTo Adapt(TFrom sourceObject) 36: { 37: var resultObject = new TTo(); 38:  39: // Process each translation 40: Translations.ForEach(t => t(sourceObject, resultObject)); 41:  42: return resultObject; 43: } 44:  45: // Returns an enumerator that iterates through the collection. 46: public IEnumerator<Action<TFrom, TTo>> GetEnumerator() 47: { 48: return Translations.GetEnumerator(); 49: } 50:  51: // Returns an enumerator that iterates through a collection. 52: IEnumerator IEnumerable.GetEnumerator() 53: { 54: return GetEnumerator(); 55: } 56: } Notice, however, you can’t specify any other constructor, you can only specify that the type has a default (no argument) constructor. Summary The where clause is an excellent tool that gives your .NET generics even more power to perform tasks higher than just the base "object level" behavior.  There are a few things you cannot specify with constraints (currently) though: Cannot specify the generic type must be an enum. Cannot specify the generic type must have a certain property or method without specifying a base class or interface – that is, you can’t say that the generic must have a Start() method. Cannot specify that the generic type allows arithmetic operations. Cannot specify that the generic type requires a specific non-default constructor. In addition, you cannot overload a template definition with different, opposing constraints.  For example you can’t define a Adapter<T> where T : struct and Adapter<T> where T : class.  Hopefully, in the future we will get some of these things to make the where clause even more useful, but until then what we have is extremely valuable in making our generics more user friendly and more powerful!   Technorati Tags: C#,.NET,Little Wonders,BlackRabbitCoder,where,generics

    Read the article

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >