Search Results

Search found 80 results on 4 pages for 'southpaw hare'.

Page 1/4 | 1 2 3 4  | Next Page >

  • The Tortoise and the Hare

    <b>Legal World and Childhood Dreams:</b> "Summary: The paper explains how computer software is protected and the relationship between open source software and copyright."

    Read the article

  • My Latest Hare-Brained Scheme

    - by Liam McLennan
    I have not had a significant side project for a while but I have been working on a product idea. Its an analytics application that analyses twitter data and reports on market sentiment. The target market is companies who want to track trends in consumer sentiment. My idea is to teach the application to divide relevant tweets into ‘positive’ and ‘negative’ categories. If the input was the set of tweets featuring the word ‘telstra’ the application would find the following tweet:   and put it in the ‘negative’ category. Collecting data in this fashion facilitates the creation of graphs such as: which can then be correlated against events, such as a share offer or new product release. I may go ahead and build this, just because I am a programmer and it amuses me to do so. My concerns are: There  is no market for this tool There is a market, but I don’t understand it and have no way to reach it.

    Read the article

  • The Tortoise and the Cyber-Hare

    The relentless pace of life we now live gives many the impression that everything works at this pace and that millionaires are made overnight. Unfortunately, it's not like that with SEO - time is you biggest friend and your biggest enemy.

    Read the article

  • Can my tortoise vs. hare race be improved?

    - by FredOverflow
    Here is my code for detecting cycles in a linked list: do { hare = hare.next(); if (hare == back) return; hare = hare.next(); if (hare == back) return; tortoise = tortoise.next(); } while (tortoise != hare); throw new AssertionError("cyclic linkage"); Is there a way to get rid of the code duplication inside the loop? Am I right in assuming that I don't need a check after making the tortoise take a step forward? As I see it, the tortoise can never reach the end of the list before the hare (contrary to the fable). Any other ways to simplify/beautify this code?

    Read the article

  • Difference between Sound and Music

    - by Southpaw Hare
    What are the key differences between the Sound and Music classes in Pygame? What are the limitations of each? In what situation would one use one or the other? Is there a benefit to using them in an unintuitive way such as using Sound objects to play music files or visa-versa? Are there specifically issues with channel limitations, and do one or both have the potential to be dropped from their channel unreliably? What are the risks of playing music as a Sound?

    Read the article

  • What's the difference between Pygame's Sound and Music classes?

    - by Southpaw Hare
    What are the key differences between the Sound and Music classes in Pygame? What are the limitations of each? In what situation would one use one or the other? Is there a benefit to using them in an unintuitive way such as using Sound objects to play music files or visa-versa? Are there specifically issues with channel limitations, and do one or both have the potential to be dropped from their channel unreliably? What are the risks of playing music as a Sound?

    Read the article

  • C#/.NET Little Wonders: The Concurrent Collections (1 of 3)

    - by James Michael Hare
    Once again we consider some of the lesser known classes and keywords of C#.  In the next few weeks, we will discuss the concurrent collections and how they have changed the face of concurrent programming. This week’s post will begin with a general introduction and discuss the ConcurrentStack<T> and ConcurrentQueue<T>.  Then in the following post we’ll discuss the ConcurrentDictionary<T> and ConcurrentBag<T>.  Finally, we shall close on the third post with a discussion of the BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. A brief history of collections In the beginning was the .NET 1.0 Framework.  And out of this framework emerged the System.Collections namespace, and it was good.  It contained all the basic things a growing programming language needs like the ArrayList and Hashtable collections.  The main problem, of course, with these original collections is that they held items of type object which means you had to be disciplined enough to use them correctly or you could end up with runtime errors if you got an object of a type you weren't expecting. Then came .NET 2.0 and generics and our world changed forever!  With generics the C# language finally got an equivalent of the very powerful C++ templates.  As such, the System.Collections.Generic was born and we got type-safe versions of all are favorite collections.  The List<T> succeeded the ArrayList and the Dictionary<TKey,TValue> succeeded the Hashtable and so on.  The new versions of the library were not only safer because they checked types at compile-time, in many cases they were more performant as well.  So much so that it's Microsoft's recommendation that the System.Collections original collections only be used for backwards compatibility. So we as developers came to know and love the generic collections and took them into our hearts and embraced them.  The problem is, thread safety in both the original collections and the generic collections can be problematic, for very different reasons. Now, if you are only doing single-threaded development you may not care – after all, no locking is required.  Even if you do have multiple threads, if a collection is “load-once, read-many” you don’t need to do anything to protect that container from multi-threaded access, as illustrated below: 1: public static class OrderTypeTranslator 2: { 3: // because this dictionary is loaded once before it is ever accessed, we don't need to synchronize 4: // multi-threaded read access 5: private static readonly Dictionary<string, char> _translator = new Dictionary<string, char> 6: { 7: {"New", 'N'}, 8: {"Update", 'U'}, 9: {"Cancel", 'X'} 10: }; 11:  12: // the only public interface into the dictionary is for reading, so inherently thread-safe 13: public static char? Translate(string orderType) 14: { 15: char charValue; 16: if (_translator.TryGetValue(orderType, out charValue)) 17: { 18: return charValue; 19: } 20:  21: return null; 22: } 23: } Unfortunately, most of our computer science problems cannot get by with just single-threaded applications or with multi-threading in a load-once manner.  Looking at  today's trends, it's clear to see that computers are not so much getting faster because of faster processor speeds -- we've nearly reached the limits we can push through with today's technologies -- but more because we're adding more cores to the boxes.  With this new hardware paradigm, it is even more important to use multi-threaded applications to take full advantage of parallel processing to achieve higher application speeds. So let's look at how to use collections in a thread-safe manner. Using historical collections in a concurrent fashion The early .NET collections (System.Collections) had a Synchronized() static method that could be used to wrap the early collections to make them completely thread-safe.  This paradigm was dropped in the generic collections (System.Collections.Generic) because having a synchronized wrapper resulted in atomic locks for all operations, which could prove overkill in many multithreading situations.  Thus the paradigm shifted to having the user of the collection specify their own locking, usually with an external object: 1: public class OrderAggregator 2: { 3: private static readonly Dictionary<string, List<Order>> _orders = new Dictionary<string, List<Order>>(); 4: private static readonly _orderLock = new object(); 5:  6: public void Add(string accountNumber, Order newOrder) 7: { 8: List<Order> ordersForAccount; 9:  10: // a complex operation like this should all be protected 11: lock (_orderLock) 12: { 13: if (!_orders.TryGetValue(accountNumber, out ordersForAccount)) 14: { 15: _orders.Add(accountNumber, ordersForAccount = new List<Order>()); 16: } 17:  18: ordersForAccount.Add(newOrder); 19: } 20: } 21: } Notice how we’re performing several operations on the dictionary under one lock.  With the Synchronized() static methods of the early collections, you wouldn’t be able to specify this level of locking (a more macro-level).  So in the generic collections, it was decided that if a user needed synchronization, they could implement their own locking scheme instead so that they could provide synchronization as needed. The need for better concurrent access to collections Here’s the problem: it’s relatively easy to write a collection that locks itself down completely for access, but anything more complex than that can be difficult and error-prone to write, and much less to make it perform efficiently!  For example, what if you have a Dictionary that has frequent reads but in-frequent updates?  Do you want to lock down the entire Dictionary for every access?  This would be overkill and would prevent concurrent reads.  In such cases you could use something like a ReaderWriterLockSlim which allows for multiple readers in a lock, and then once a writer grabs the lock it blocks all further readers until the writer is done (in a nutshell).  This is all very complex stuff to consider. Fortunately, this is where the Concurrent Collections come in.  The Parallel Computing Platform team at Microsoft went through great pains to determine how to make a set of concurrent collections that would have the best performance characteristics for general case multi-threaded use. Now, as in all things involving threading, you should always make sure you evaluate all your container options based on the particular usage scenario and the degree of parallelism you wish to acheive. This article should not be taken to understand that these collections are always supperior to the generic collections. Each fills a particular need for a particular situation. Understanding what each container is optimized for is key to the success of your application whether it be single-threaded or multi-threaded. General points to consider with the concurrent collections The MSDN points out that the concurrent collections all support the ICollection interface. However, since the collections are already synchronized, the IsSynchronized property always returns false, and SyncRoot always returns null.  Thus you should not attempt to use these properties for synchronization purposes. Note that since the concurrent collections also may have different operations than the traditional data structures you may be used to.  Now you may ask why they did this, but it was done out of necessity to keep operations safe and atomic.  For example, in order to do a Pop() on a stack you have to know the stack is non-empty, but between the time you check the stack’s IsEmpty property and then do the Pop() another thread may have come in and made the stack empty!  This is why some of the traditional operations have been changed to make them safe for concurrent use. In addition, some properties and methods in the concurrent collections achieve concurrency by creating a snapshot of the collection, which means that some operations that were traditionally O(1) may now be O(n) in the concurrent models.  I’ll try to point these out as we talk about each collection so you can be aware of any potential performance impacts.  Finally, all the concurrent containers are safe for enumeration even while being modified, but some of the containers support this in different ways (snapshot vs. dirty iteration).  Once again I’ll highlight how thread-safe enumeration works for each collection. ConcurrentStack<T>: The thread-safe LIFO container The ConcurrentStack<T> is the thread-safe counterpart to the System.Collections.Generic.Stack<T>, which as you may remember is your standard last-in-first-out container.  If you think of algorithms that favor stack usage (for example, depth-first searches of graphs and trees) then you can see how using a thread-safe stack would be of benefit. The ConcurrentStack<T> achieves thread-safe access by using System.Threading.Interlocked operations.  This means that the multi-threaded access to the stack requires no traditional locking and is very, very fast! For the most part, the ConcurrentStack<T> behaves like it’s Stack<T> counterpart with a few differences: Pop() was removed in favor of TryPop() Returns true if an item existed and was popped and false if empty. PushRange() and TryPopRange() were added Allows you to push multiple items and pop multiple items atomically. Count takes a snapshot of the stack and then counts the items. This means it is a O(n) operation, if you just want to check for an empty stack, call IsEmpty instead which is O(1). ToArray() and GetEnumerator() both also take snapshots. This means that iteration over a stack will give you a static view at the time of the call and will not reflect updates. Pushing on a ConcurrentStack<T> works just like you’d expect except for the aforementioned PushRange() method that was added to allow you to push a range of items concurrently. 1: var stack = new ConcurrentStack<string>(); 2:  3: // adding to stack is much the same as before 4: stack.Push("First"); 5:  6: // but you can also push multiple items in one atomic operation (no interleaves) 7: stack.PushRange(new [] { "Second", "Third", "Fourth" }); For looking at the top item of the stack (without removing it) the Peek() method has been removed in favor of a TryPeek().  This is because in order to do a peek the stack must be non-empty, but between the time you check for empty and the time you execute the peek the stack contents may have changed.  Thus the TryPeek() was created to be an atomic check for empty, and then peek if not empty: 1: // to look at top item of stack without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (stack.TryPeek(out item)) 5: { 6: Console.WriteLine("Top item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Stack was empty."); 11: } Finally, to remove items from the stack, we have the TryPop() for single, and TryPopRange() for multiple items.  Just like the TryPeek(), these operations replace Pop() since we need to ensure atomically that the stack is non-empty before we pop from it: 1: // to remove items, use TryPop or TryPopRange to get multiple items atomically (no interleaves) 2: if (stack.TryPop(out item)) 3: { 4: Console.WriteLine("Popped " + item); 5: } 6:  7: // TryPopRange will only pop up to the number of spaces in the array, the actual number popped is returned. 8: var poppedItems = new string[2]; 9: int numPopped = stack.TryPopRange(poppedItems); 10:  11: foreach (var theItem in poppedItems.Take(numPopped)) 12: { 13: Console.WriteLine("Popped " + theItem); 14: } Finally, note that as stated before, GetEnumerator() and ToArray() gets a snapshot of the data at the time of the call.  That means if you are enumerating the stack you will get a snapshot of the stack at the time of the call.  This is illustrated below: 1: var stack = new ConcurrentStack<string>(); 2:  3: // adding to stack is much the same as before 4: stack.Push("First"); 5:  6: var results = stack.GetEnumerator(); 7:  8: // but you can also push multiple items in one atomic operation (no interleaves) 9: stack.PushRange(new [] { "Second", "Third", "Fourth" }); 10:  11: while(results.MoveNext()) 12: { 13: Console.WriteLine("Stack only has: " + results.Current); 14: } The only item that will be printed out in the above code is "First" because the snapshot was taken before the other items were added. This may sound like an issue, but it’s really for safety and is more correct.  You don’t want to enumerate a stack and have half a view of the stack before an update and half a view of the stack after an update, after all.  In addition, note that this is still thread-safe, whereas iterating through a non-concurrent collection while updating it in the old collections would cause an exception. ConcurrentQueue<T>: The thread-safe FIFO container The ConcurrentQueue<T> is the thread-safe counterpart of the System.Collections.Generic.Queue<T> class.  The concurrent queue uses an underlying list of small arrays and lock-free System.Threading.Interlocked operations on the head and tail arrays.  Once again, this allows us to do thread-safe operations without the need for heavy locks! The ConcurrentQueue<T> (like the ConcurrentStack<T>) has some departures from the non-concurrent counterpart.  Most notably: Dequeue() was removed in favor of TryDequeue(). Returns true if an item existed and was dequeued and false if empty. Count does not take a snapshot It subtracts the head and tail index to get the count.  This results overall in a O(1) complexity which is quite good.  It’s still recommended, however, that for empty checks you call IsEmpty instead of comparing Count to zero. ToArray() and GetEnumerator() both take snapshots. This means that iteration over a queue will give you a static view at the time of the call and will not reflect updates. The Enqueue() method on the ConcurrentQueue<T> works much the same as the generic Queue<T>: 1: var queue = new ConcurrentQueue<string>(); 2:  3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: queue.Enqueue("Second"); 6: queue.Enqueue("Third"); For front item access, the TryPeek() method must be used to attempt to see the first item if the queue.  There is no Peek() method since, as you’ll remember, we can only peek on a non-empty queue, so we must have an atomic TryPeek() that checks for empty and then returns the first item if the queue is non-empty. 1: // to look at first item in queue without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (queue.TryPeek(out item)) 5: { 6: Console.WriteLine("First item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Queue was empty."); 11: } Then, to remove items you use TryDequeue().  Once again this is for the same reason we have TryPeek() and not Peek(): 1: // to remove items, use TryDequeue. If queue is empty returns false. 2: if (queue.TryDequeue(out item)) 3: { 4: Console.WriteLine("Dequeued first item " + item); 5: } Just like the concurrent stack, the ConcurrentQueue<T> takes a snapshot when you call ToArray() or GetEnumerator() which means that subsequent updates to the queue will not be seen when you iterate over the results.  Thus once again the code below will only show the first item, since the other items were added after the snapshot. 1: var queue = new ConcurrentQueue<string>(); 2:  3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5:  6: var iterator = queue.GetEnumerator(); 7:  8: queue.Enqueue("Second"); 9: queue.Enqueue("Third"); 10:  11: // only shows First 12: while (iterator.MoveNext()) 13: { 14: Console.WriteLine("Dequeued item " + iterator.Current); 15: } Using collections concurrently You’ll notice in the examples above I stuck to using single-threaded examples so as to make them deterministic and the results obvious.  Of course, if we used these collections in a truly multi-threaded way the results would be less deterministic, but would still be thread-safe and with no locking on your part required! For example, say you have an order processor that takes an IEnumerable<Order> and handles each other in a multi-threaded fashion, then groups the responses together in a concurrent collection for aggregation.  This can be done easily with the TPL’s Parallel.ForEach(): 1: public static IEnumerable<OrderResult> ProcessOrders(IEnumerable<Order> orderList) 2: { 3: var proxy = new OrderProxy(); 4: var results = new ConcurrentQueue<OrderResult>(); 5:  6: // notice that we can process all these in parallel and put the results 7: // into our concurrent collection without needing any external locking! 8: Parallel.ForEach(orderList, 9: order => 10: { 11: var result = proxy.PlaceOrder(order); 12:  13: results.Enqueue(result); 14: }); 15:  16: return results; 17: } Summary Obviously, if you do not need multi-threaded safety, you don’t need to use these collections, but when you do need multi-threaded collections these are just the ticket! The plethora of features (I always think of the movie The Three Amigos when I say plethora) built into these containers and the amazing way they acheive thread-safe access in an efficient manner is wonderful to behold. Stay tuned next week where we’ll continue our discussion with the ConcurrentBag<T> and the ConcurrentDictionary<TKey,TValue>. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here.   Tweet Technorati Tags: C#,.NET,Concurrent Collections,Collections,Multi-Threading,Little Wonders,BlackRabbitCoder,James Michael Hare

    Read the article

  • C#/.NET Little Wonders: The ConcurrentDictionary

    - by James Michael Hare
    Once again we consider some of the lesser known classes and keywords of C#.  In this series of posts, we will discuss how the concurrent collections have been developed to help alleviate these multi-threading concerns.  Last week’s post began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>.  Today's post discusses the ConcurrentDictionary<T> (originally I had intended to discuss ConcurrentBag this week as well, but ConcurrentDictionary had enough information to create a very full post on its own!).  Finally next week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. Recap As you'll recall from the previous post, the original collections were object-based containers that accomplished synchronization through a Synchronized member.  While these were convenient because you didn't have to worry about writing your own synchronization logic, they were a bit too finely grained and if you needed to perform multiple operations under one lock, the automatic synchronization didn't buy much. With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization.  This cuts both ways in that you have a lot more control as a developer over when and how fine-grained you want to synchronize, but on the other hand if you just want simple synchronization it creates more work. With .NET 4.0, we get the best of both worlds in generic collections.  A new breed of collections was born called the concurrent collections in the System.Collections.Concurrent namespace.  These amazing collections are fine-tuned to have best overall performance for situations requiring concurrent access.  They are not meant to replace the generic collections, but to simply be an alternative to creating your own locking mechanisms. Among those concurrent collections were the ConcurrentStack<T> and ConcurrentQueue<T> which provide classic LIFO and FIFO collections with a concurrent twist.  As we saw, some of the traditional methods that required calls to be made in a certain order (like checking for not IsEmpty before calling Pop()) were replaced in favor of an umbrella operation that combined both under one lock (like TryPop()). Now, let's take a look at the next in our series of concurrent collections!For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentDictionary – the fully thread-safe dictionary The ConcurrentDictionary<TKey,TValue> is the thread-safe counterpart to the generic Dictionary<TKey, TValue> collection.  Obviously, both are designed for quick – O(1) – lookups of data based on a key.  If you think of algorithms where you need lightning fast lookups of data and don’t care whether the data is maintained in any particular ordering or not, the unsorted dictionaries are generally the best way to go. Note: as a side note, there are sorted implementations of IDictionary, namely SortedDictionary and SortedList which are stored as an ordered tree and a ordered list respectively.  While these are not as fast as the non-sorted dictionaries – they are O(log2 n) – they are a great combination of both speed and ordering -- and still greatly outperform a linear search. Now, once again keep in mind that if all you need to do is load a collection once and then allow multi-threaded reading you do not need any locking.  Examples of this tend to be situations where you load a lookup or translation table once at program start, then keep it in memory for read-only reference.  In such cases locking is completely non-productive. However, most of the time when we need a concurrent dictionary we are interleaving both reads and updates.  This is where the ConcurrentDictionary really shines!  It achieves its thread-safety with no common lock to improve efficiency.  It actually uses a series of locks to provide concurrent updates, and has lockless reads!  This means that the ConcurrentDictionary gets even more efficient the higher the ratio of reads-to-writes you have. ConcurrentDictionary and Dictionary differences For the most part, the ConcurrentDictionary<TKey,TValue> behaves like it’s Dictionary<TKey,TValue> counterpart with a few differences.  Some notable examples of which are: Add() does not exist in the concurrent dictionary. This means you must use TryAdd(), AddOrUpdate(), or GetOrAdd().  It also means that you can’t use a collection initializer with the concurrent dictionary. TryAdd() replaced Add() to attempt atomic, safe adds. Because Add() only succeeds if the item doesn’t already exist, we need an atomic operation to check if the item exists, and if not add it while still under an atomic lock. TryUpdate() was added to attempt atomic, safe updates. If we want to update an item, we must make sure it exists first and that the original value is what we expected it to be.  If all these are true, we can update the item under one atomic step. TryRemove() was added to attempt atomic, safe removes. To safely attempt to remove a value we need to see if the key exists first, this checks for existence and removes under an atomic lock. AddOrUpdate() was added to attempt an thread-safe “upsert”. There are many times where you want to insert into a dictionary if the key doesn’t exist, or update the value if it does.  This allows you to make a thread-safe add-or-update. GetOrAdd() was added to attempt an thread-safe query/insert. Sometimes, you want to query for whether an item exists in the cache, and if it doesn’t insert a starting value for it.  This allows you to get the value if it exists and insert if not. Count, Keys, Values properties take a snapshot of the dictionary. Accessing these properties may interfere with add and update performance and should be used with caution. ToArray() returns a static snapshot of the dictionary. That is, the dictionary is locked, and then copied to an array as a O(n) operation.  GetEnumerator() is thread-safe and efficient, but allows dirty reads. Because reads require no locking, you can safely iterate over the contents of the dictionary.  The only downside is that, depending on timing, you may get dirty reads. Dirty reads during iteration The last point on GetEnumerator() bears some explanation.  Picture a scenario in which you call GetEnumerator() (or iterate using a foreach, etc.) and then, during that iteration the dictionary gets updated.  This may not sound like a big deal, but it can lead to inconsistent results if used incorrectly.  The problem is that items you already iterated over that are updated a split second after don’t show the update, but items that you iterate over that were updated a split second before do show the update.  Thus you may get a combination of items that are “stale” because you iterated before the update, and “fresh” because they were updated after GetEnumerator() but before the iteration reached them. Let’s illustrate with an example, let’s say you load up a concurrent dictionary like this: 1: // load up a dictionary. 2: var dictionary = new ConcurrentDictionary<string, int>(); 3:  4: dictionary["A"] = 1; 5: dictionary["B"] = 2; 6: dictionary["C"] = 3; 7: dictionary["D"] = 4; 8: dictionary["E"] = 5; 9: dictionary["F"] = 6; Then you have one task (using the wonderful TPL!) to iterate using dirty reads: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); And one task to attempt updates in a separate thread (probably): 1: // attempt updates in a separate thread 2: var updateTask = new Task(() => 3: { 4: // iterates, and updates the value by one 5: foreach (var pair in dictionary) 6: { 7: dictionary[pair.Key] = pair.Value + 1; 8: } 9: }); Now that we’ve done this, we can fire up both tasks and wait for them to complete: 1: // start both tasks 2: updateTask.Start(); 3: iterationTask.Start(); 4:  5: // wait for both to complete. 6: Task.WaitAll(updateTask, iterationTask); Now, if I you didn’t know about the dirty reads, you may have expected to see the iteration before the updates (such as A:1, B:2, C:3, D:4, E:5, F:6).  However, because the reads are dirty, we will quite possibly get a combination of some updated, some original.  My own run netted this result: 1: F:6 2: E:6 3: D:5 4: C:4 5: B:3 6: A:2 Note that, of course, iteration is not in order because ConcurrentDictionary, like Dictionary, is unordered.  Also note that both E and F show the value 6.  This is because the output task reached F before the update, but the updates for the rest of the items occurred before their output (probably because console output is very slow, comparatively). If we want to always guarantee that we will get a consistent snapshot to iterate over (that is, at the point we ask for it we see precisely what is in the dictionary and no subsequent updates during iteration), we should iterate over a call to ToArray() instead: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary.ToArray()) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); The atomic Try…() methods As you can imagine TryAdd() and TryRemove() have few surprises.  Both first check the existence of the item to determine if it can be added or removed based on whether or not the key currently exists in the dictionary: 1: // try add attempts an add and returns false if it already exists 2: if (dictionary.TryAdd("G", 7)) 3: Console.WriteLine("G did not exist, now inserted with 7"); 4: else 5: Console.WriteLine("G already existed, insert failed."); TryRemove() also has the virtue of returning the value portion of the removed entry matching the given key: 1: // attempt to remove the value, if it exists it is removed and the original is returned 2: int removedValue; 3: if (dictionary.TryRemove("C", out removedValue)) 4: Console.WriteLine("Removed C and its value was " + removedValue); 5: else 6: Console.WriteLine("C did not exist, remove failed."); Now TryUpdate() is an interesting creature.  You might think from it’s name that TryUpdate() first checks for an item’s existence, and then updates if the item exists, otherwise it returns false.  Well, note quite... It turns out when you call TryUpdate() on a concurrent dictionary, you pass it not only the new value you want it to have, but also the value you expected it to have before the update.  If the item exists in the dictionary, and it has the value you expected, it will update it to the new value atomically and return true.  If the item is not in the dictionary or does not have the value you expected, it is not modified and false is returned. 1: // attempt to update the value, if it exists and if it has the expected original value 2: if (dictionary.TryUpdate("G", 42, 7)) 3: Console.WriteLine("G existed and was 7, now it's 42."); 4: else 5: Console.WriteLine("G either didn't exist, or wasn't 7."); The composite Add methods The ConcurrentDictionary also has composite add methods that can be used to perform updates and gets, with an add if the item is not existing at the time of the update or get. The first of these, AddOrUpdate(), allows you to add a new item to the dictionary if it doesn’t exist, or update the existing item if it does.  For example, let’s say you are creating a dictionary of counts of stock ticker symbols you’ve subscribed to from a market data feed: 1: public sealed class SubscriptionManager 2: { 3: private readonly ConcurrentDictionary<string, int> _subscriptions = new ConcurrentDictionary<string, int>(); 4:  5: // adds a new subscription, or increments the count of the existing one. 6: public void AddSubscription(string tickerKey) 7: { 8: // add a new subscription with count of 1, or update existing count by 1 if exists 9: var resultCount = _subscriptions.AddOrUpdate(tickerKey, 1, (symbol, count) => count + 1); 10:  11: // now check the result to see if we just incremented the count, or inserted first count 12: if (resultCount == 1) 13: { 14: // subscribe to symbol... 15: } 16: } 17: } Notice the update value factory Func delegate.  If the key does not exist in the dictionary, the add value is used (in this case 1 representing the first subscription for this symbol), but if the key already exists, it passes the key and current value to the update delegate which computes the new value to be stored in the dictionary.  The return result of this operation is the value used (in our case: 1 if added, existing value + 1 if updated). Likewise, the GetOrAdd() allows you to attempt to retrieve a value from the dictionary, and if the value does not currently exist in the dictionary it will insert a value.  This can be handy in cases where perhaps you wish to cache data, and thus you would query the cache to see if the item exists, and if it doesn’t you would put the item into the cache for the first time: 1: public sealed class PriceCache 2: { 3: private readonly ConcurrentDictionary<string, double> _cache = new ConcurrentDictionary<string, double>(); 4:  5: // adds a new subscription, or increments the count of the existing one. 6: public double QueryPrice(string tickerKey) 7: { 8: // check for the price in the cache, if it doesn't exist it will call the delegate to create value. 9: return _cache.GetOrAdd(tickerKey, symbol => GetCurrentPrice(symbol)); 10: } 11:  12: private double GetCurrentPrice(string tickerKey) 13: { 14: // do code to calculate actual true price. 15: } 16: } There are other variations of these two methods which vary whether a value is provided or a factory delegate, but otherwise they work much the same. Oddities with the composite Add methods The AddOrUpdate() and GetOrAdd() methods are totally thread-safe, on this you may rely, but they are not atomic.  It is important to note that the methods that use delegates execute those delegates outside of the lock.  This was done intentionally so that a user delegate (of which the ConcurrentDictionary has no control of course) does not take too long and lock out other threads. This is not necessarily an issue, per se, but it is something you must consider in your design.  The main thing to consider is that your delegate may get called to generate an item, but that item may not be the one returned!  Consider this scenario: A calls GetOrAdd and sees that the key does not currently exist, so it calls the delegate.  Now thread B also calls GetOrAdd and also sees that the key does not currently exist, and for whatever reason in this race condition it’s delegate completes first and it adds its new value to the dictionary.  Now A is done and goes to get the lock, and now sees that the item now exists.  In this case even though it called the delegate to create the item, it will pitch it because an item arrived between the time it attempted to create one and it attempted to add it. Let’s illustrate, assume this totally contrived example program which has a dictionary of char to int.  And in this dictionary we want to store a char and it’s ordinal (that is, A = 1, B = 2, etc).  So for our value generator, we will simply increment the previous value in a thread-safe way (perhaps using Interlocked): 1: public static class Program 2: { 3: private static int _nextNumber = 0; 4:  5: // the holder of the char to ordinal 6: private static ConcurrentDictionary<char, int> _dictionary 7: = new ConcurrentDictionary<char, int>(); 8:  9: // get the next id value 10: public static int NextId 11: { 12: get { return Interlocked.Increment(ref _nextNumber); } 13: } Then, we add a method that will perform our insert: 1: public static void Inserter() 2: { 3: for (int i = 0; i < 26; i++) 4: { 5: _dictionary.GetOrAdd((char)('A' + i), key => NextId); 6: } 7: } Finally, we run our test by starting two tasks to do this work and get the results… 1: public static void Main() 2: { 3: // 3 tasks attempting to get/insert 4: var tasks = new List<Task> 5: { 6: new Task(Inserter), 7: new Task(Inserter) 8: }; 9:  10: tasks.ForEach(t => t.Start()); 11: Task.WaitAll(tasks.ToArray()); 12:  13: foreach (var pair in _dictionary.OrderBy(p => p.Key)) 14: { 15: Console.WriteLine(pair.Key + ":" + pair.Value); 16: } 17: } If you run this with only one task, you get the expected A:1, B:2, ..., Z:26.  But running this in parallel you will get something a bit more complex.  My run netted these results: 1: A:1 2: B:3 3: C:4 4: D:5 5: E:6 6: F:7 7: G:8 8: H:9 9: I:10 10: J:11 11: K:12 12: L:13 13: M:14 14: N:15 15: O:16 16: P:17 17: Q:18 18: R:19 19: S:20 20: T:21 21: U:22 22: V:23 23: W:24 24: X:25 25: Y:26 26: Z:27 Notice that B is 3?  This is most likely because both threads attempted to call GetOrAdd() at roughly the same time and both saw that B did not exist, thus they both called the generator and one thread got back 2 and the other got back 3.  However, only one of those threads can get the lock at a time for the actual insert, and thus the one that generated the 3 won and the 3 was inserted and the 2 got discarded.  This is why on these methods your factory delegates should be careful not to have any logic that would be unsafe if the value they generate will be pitched in favor of another item generated at roughly the same time.  As such, it is probably a good idea to keep those generators as stateless as possible. Summary The ConcurrentDictionary is a very efficient and thread-safe version of the Dictionary generic collection.  It has all the benefits of type-safety that it’s generic collection counterpart does, and in addition is extremely efficient especially when there are more reads than writes concurrently. Tweet Technorati Tags: C#, .NET, Concurrent Collections, Collections, Little Wonders, Black Rabbit Coder,James Michael Hare

    Read the article

  • C#: System.Collections.Concurrent.ConcurrentQueue vs. Queue

    - by James Michael Hare
    I love new toys, so of course when .NET 4.0 came out I felt like the proverbial kid in the candy store!  Now, some people get all excited about the IDE and it’s new features or about changes to WPF and Silver Light and yes, those are all very fine and grand.  But me, I get all excited about things that tend to affect my life on the backside of development.  That’s why when I heard there were going to be concurrent container implementations in the latest version of .NET I was salivating like Pavlov’s dog at the dinner bell. They seem so simple, really, that one could easily overlook them.  Essentially they are implementations of containers (many that mirror the generic collections, others are new) that have either been optimized with very efficient, limited, or no locking but are still completely thread safe -- and I just had to see what kind of an improvement that would translate into. Since part of my job as a solutions architect here where I work is to help design, develop, and maintain the systems that process tons of requests each second, the thought of extremely efficient thread-safe containers was extremely appealing.  Of course, they also rolled out a whole parallel development framework which I won’t get into in this post but will cover bits and pieces of as time goes by. This time, I was mainly curious as to how well these new concurrent containers would perform compared to areas in our code where we manually synchronize them using lock or some other mechanism.  So I set about to run a processing test with a series of producers and consumers that would be either processing a traditional System.Collections.Generic.Queue or a System.Collection.Concurrent.ConcurrentQueue. Now, I wanted to keep the code as common as possible to make sure that the only variance was the container, so I created a test Producer and a test Consumer.  The test Producer takes an Action<string> delegate which is responsible for taking a string and placing it on whichever queue we’re testing in a thread-safe manner: 1: internal class Producer 2: { 3: public int Iterations { get; set; } 4: public Action<string> ProduceDelegate { get; set; } 5: 6: public void Produce() 7: { 8: for (int i = 0; i < Iterations; i++) 9: { 10: ProduceDelegate(“Hello”); 11: } 12: } 13: } Then likewise, I created a consumer that took a Func<string> that would read from whichever queue we’re testing and return either the string if data exists or null if not.  Then, if the item doesn’t exist, it will do a 10 ms wait before testing again.  Once all the producers are done and join the main thread, a flag will be set in each of the consumers to tell them once the queue is empty they can shut down since no other data is coming: 1: internal class Consumer 2: { 3: public Func<string> ConsumeDelegate { get; set; } 4: public bool HaltWhenEmpty { get; set; } 5: 6: public void Consume() 7: { 8: bool processing = true; 9: 10: while (processing) 11: { 12: string result = ConsumeDelegate(); 13: 14: if(result == null) 15: { 16: if (HaltWhenEmpty) 17: { 18: processing = false; 19: } 20: else 21: { 22: Thread.Sleep(TimeSpan.FromMilliseconds(10)); 23: } 24: } 25: else 26: { 27: DoWork(); // do something non-trivial so consumers lag behind a bit 28: } 29: } 30: } 31: } Okay, now that we’ve done that, we can launch threads of varying numbers using lambdas for each different method of production/consumption.  First let's look at the lambdas for a typical System.Collections.Generics.Queue with locking: 1: // lambda for putting to typical Queue with locking... 2: var productionDelegate = s => 3: { 4: lock (_mutex) 5: { 6: _mutexQueue.Enqueue(s); 7: } 8: }; 9:  10: // and lambda for typical getting from Queue with locking... 11: var consumptionDelegate = () => 12: { 13: lock (_mutex) 14: { 15: if (_mutexQueue.Count > 0) 16: { 17: return _mutexQueue.Dequeue(); 18: } 19: } 20: return null; 21: }; Nothing new or interesting here.  Just typical locks on an internal object instance.  Now let's look at using a ConcurrentQueue from the System.Collections.Concurrent library: 1: // lambda for putting to a ConcurrentQueue, notice it needs no locking! 2: var productionDelegate = s => 3: { 4: _concurrentQueue.Enqueue(s); 5: }; 6:  7: // lambda for getting from a ConcurrentQueue, once again, no locking required. 8: var consumptionDelegate = () => 9: { 10: string s; 11: return _concurrentQueue.TryDequeue(out s) ? s : null; 12: }; So I pass each of these lambdas and the number of producer and consumers threads to launch and take a look at the timing results.  Basically I’m timing from the time all threads start and begin producing/consuming to the time that all threads rejoin.  I won't bore you with the test code, basically it just launches code that creates the producers and consumers and launches them in their own threads, then waits for them all to rejoin.  The following are the timings from the start of all threads to the Join() on all threads completing.  The producers create 10,000,000 items evenly between themselves and then when all producers are done they trigger the consumers to stop once the queue is empty. These are the results in milliseconds from the ordinary Queue with locking: 1: Consumers Producers 1 2 3 Time (ms) 2: ---------- ---------- ------ ------ ------ --------- 3: 1 1 4284 5153 4226 4554.33 4: 10 10 4044 3831 5010 4295.00 5: 100 100 5497 5378 5612 5495.67 6: 1000 1000 24234 25409 27160 25601.00 And the following are the results in milliseconds from the ConcurrentQueue with no locking necessary: 1: Consumers Producers 1 2 3 Time (ms) 2: ---------- ---------- ------ ------ ------ --------- 3: 1 1 3647 3643 3718 3669.33 4: 10 10 2311 2136 2142 2196.33 5: 100 100 2480 2416 2190 2362.00 6: 1000 1000 7289 6897 7061 7082.33 Note that even though obviously 2000 threads is quite extreme, the concurrent queue actually scales really well, whereas the traditional queue with simple locking scales much more poorly. I love the new concurrent collections, they look so much simpler without littering your code with the locking logic, and they perform much better.  All in all, a great new toy to add to your arsenal of multi-threaded processing!

    Read the article

  • C# 4: The Curious ConcurrentDictionary

    - by James Michael Hare
    In my previous post (here) I did a comparison of the new ConcurrentQueue versus the old standard of a System.Collections.Generic Queue with simple locking.  The results were exactly what I would have hoped, that the ConcurrentQueue was faster with multi-threading for most all situations.  In addition, concurrent collections have the added benefit that you can enumerate them even if they're being modified. So I set out to see what the improvements would be for the ConcurrentDictionary, would it have the same performance benefits as the ConcurrentQueue did?  Well, after running some tests and multiple tweaks and tunes, I have good and bad news. But first, let's look at the tests.  Obviously there's many things we can do with a dictionary.  One of the most notable uses, of course, in a multi-threaded environment is for a small, local in-memory cache.  So I set about to do a very simple simulation of a cache where I would create a test class that I'll just call an Accessor.  This accessor will attempt to look up a key in the dictionary, and if the key exists, it stops (i.e. a cache "hit").  However, if the lookup fails, it will then try to add the key and value to the dictionary (i.e. a cache "miss").  So here's the Accessor that will run the tests: 1: internal class Accessor 2: { 3: public int Hits { get; set; } 4: public int Misses { get; set; } 5: public Func<int, string> GetDelegate { get; set; } 6: public Action<int, string> AddDelegate { get; set; } 7: public int Iterations { get; set; } 8: public int MaxRange { get; set; } 9: public int Seed { get; set; } 10:  11: public void Access() 12: { 13: var randomGenerator = new Random(Seed); 14:  15: for (int i=0; i<Iterations; i++) 16: { 17: // give a wide spread so will have some duplicates and some unique 18: var target = randomGenerator.Next(1, MaxRange); 19:  20: // attempt to grab the item from the cache 21: var result = GetDelegate(target); 22:  23: // if the item doesn't exist, add it 24: if(result == null) 25: { 26: AddDelegate(target, target.ToString()); 27: Misses++; 28: } 29: else 30: { 31: Hits++; 32: } 33: } 34: } 35: } Note that so I could test different implementations, I defined a GetDelegate and AddDelegate that will call the appropriate dictionary methods to add or retrieve items in the cache using various techniques. So let's examine the three techniques I decided to test: Dictionary with mutex - Just your standard generic Dictionary with a simple lock construct on an internal object. Dictionary with ReaderWriterLockSlim - Same Dictionary, but now using a lock designed to let multiple readers access simultaneously and then locked when a writer needs access. ConcurrentDictionary - The new ConcurrentDictionary from System.Collections.Concurrent that is supposed to be optimized to allow multiple threads to access safely. So the approach to each of these is also fairly straight-forward.  Let's look at the GetDelegate and AddDelegate implementations for the Dictionary with mutex lock: 1: var addDelegate = (key,val) => 2: { 3: lock (_mutex) 4: { 5: _dictionary[key] = val; 6: } 7: }; 8: var getDelegate = (key) => 9: { 10: lock (_mutex) 11: { 12: string val; 13: return _dictionary.TryGetValue(key, out val) ? val : null; 14: } 15: }; Nothing new or fancy here, just your basic lock on a private object and then query/insert into the Dictionary. Now, for the Dictionary with ReadWriteLockSlim it's a little more complex: 1: var addDelegate = (key,val) => 2: { 3: _readerWriterLock.EnterWriteLock(); 4: _dictionary[key] = val; 5: _readerWriterLock.ExitWriteLock(); 6: }; 7: var getDelegate = (key) => 8: { 9: string val; 10: _readerWriterLock.EnterReadLock(); 11: if(!_dictionary.TryGetValue(key, out val)) 12: { 13: val = null; 14: } 15: _readerWriterLock.ExitReadLock(); 16: return val; 17: }; And finally, the ConcurrentDictionary, which since it does all it's own concurrency control, is remarkably elegant and simple: 1: var addDelegate = (key,val) => 2: { 3: _concurrentDictionary[key] = val; 4: }; 5: var getDelegate = (key) => 6: { 7: string s; 8: return _concurrentDictionary.TryGetValue(key, out s) ? s : null; 9: };                    Then, I set up a test harness that would simply ask the user for the number of concurrent Accessors to attempt to Access the cache (as specified in Accessor.Access() above) and then let them fly and see how long it took them all to complete.  Each of these tests was run with 10,000,000 cache accesses divided among the available Accessor instances.  All times are in milliseconds. 1: Dictionary with Mutex Locking 2: --------------------------------------------------- 3: Accessors Mostly Misses Mostly Hits 4: 1 7916 3285 5: 10 8293 3481 6: 100 8799 3532 7: 1000 8815 3584 8:  9:  10: Dictionary with ReaderWriterLockSlim Locking 11: --------------------------------------------------- 12: Accessors Mostly Misses Mostly Hits 13: 1 8445 3624 14: 10 11002 4119 15: 100 11076 3992 16: 1000 14794 4861 17:  18:  19: Concurrent Dictionary 20: --------------------------------------------------- 21: Accessors Mostly Misses Mostly Hits 22: 1 17443 3726 23: 10 14181 1897 24: 100 15141 1994 25: 1000 17209 2128 The first test I did across the board is the Mostly Misses category.  The mostly misses (more adds because data requested was not in the dictionary) shows an interesting trend.  In both cases the Dictionary with the simple mutex lock is much faster, and the ConcurrentDictionary is the slowest solution.  But this got me thinking, and a little research seemed to confirm it, maybe the ConcurrentDictionary is more optimized to concurrent "gets" than "adds".  So since the ratio of misses to hits were 2 to 1, I decided to reverse that and see the results. So I tweaked the data so that the number of keys were much smaller than the number of iterations to give me about a 2 to 1 ration of hits to misses (twice as likely to already find the item in the cache than to need to add it).  And yes, indeed here we see that the ConcurrentDictionary is indeed faster than the standard Dictionary here.  I have a strong feeling that as the ration of hits-to-misses gets higher and higher these number gets even better as well.  This makes sense since the ConcurrentDictionary is read-optimized. Also note that I tried the tests with capacity and concurrency hints on the ConcurrentDictionary but saw very little improvement, I think this is largely because on the 10,000,000 hit test it quickly ramped up to the correct capacity and concurrency and thus the impact was limited to the first few milliseconds of the run. So what does this tell us?  Well, as in all things, ConcurrentDictionary is not a panacea.  It won't solve all your woes and it shouldn't be the only Dictionary you ever use.  So when should we use each? Use System.Collections.Generic.Dictionary when: You need a single-threaded Dictionary (no locking needed). You need a multi-threaded Dictionary that is loaded only once at creation and never modified (no locking needed). You need a multi-threaded Dictionary to store items where writes are far more prevalent than reads (locking needed). And use System.Collections.Concurrent.ConcurrentDictionary when: You need a multi-threaded Dictionary where the writes are far more prevalent than reads. You need to be able to iterate over the collection without locking it even if its being modified. Both Dictionaries have their strong suits, I have a feeling this is just one where you need to know from design what you hope to use it for and make your decision based on that criteria.

    Read the article

  • C#: System.Lazy&lt;T&gt; and the Singleton Design Pattern

    - by James Michael Hare
    So we've all coded a Singleton at one time or another.  It's a really simple pattern and can be a slightly more elegant alternative to global variables.  Make no mistake, Singletons can be abused and are often over-used -- but occasionally you find a Singleton is the most elegant solution. For those of you not familiar with a Singleton, the basic Design Pattern is that a Singleton class is one where there is only ever one instance of the class created.  This means that constructors must be private to avoid users creating their own instances, and a static property (or method in languages without properties) is defined that returns a single static instance. 1: public class Singleton 2: { 3: // the single instance is defined in a static field 4: private static readonly Singleton _instance = new Singleton(); 5:  6: // constructor private so users can't instantiate on their own 7: private Singleton() 8: { 9: } 10:  11: // read-only property that returns the static field 12: public static Singleton Instance 13: { 14: get 15: { 16: return _instance; 17: } 18: } 19: } This is the most basic singleton, notice the key features: Static readonly field that contains the one and only instance. Constructor is private so it can only be called by the class itself. Static property that returns the single instance. Looks like it satisfies, right?  There's just one (potential) problem.  C# gives you no guarantee of when the static field _instance will be created.  This is because the C# standard simply states that classes (which are marked in the IL as BeforeFieldInit) can have their static fields initialized any time before the field is accessed.  This means that they may be initialized on first use, they may be initialized at some other time before, you can't be sure when. So what if you want to guarantee your instance is truly lazy.  That is, that it is only created on first call to Instance?  Well, there's a few ways to do this.  First we'll show the old ways, and then talk about how .Net 4.0's new System.Lazy<T> type can help make the lazy-Singleton cleaner. Obviously, we could take on the lazy construction ourselves, but being that our Singleton may be accessed by many different threads, we'd need to lock it down. 1: public class LazySingleton1 2: { 3: // lock for thread-safety laziness 4: private static readonly object _mutex = new object(); 5:  6: // static field to hold single instance 7: private static LazySingleton1 _instance = null; 8:  9: // property that does some locking and then creates on first call 10: public static LazySingleton1 Instance 11: { 12: get 13: { 14: if (_instance == null) 15: { 16: lock (_mutex) 17: { 18: if (_instance == null) 19: { 20: _instance = new LazySingleton1(); 21: } 22: } 23: } 24:  25: return _instance; 26: } 27: } 28:  29: private LazySingleton1() 30: { 31: } 32: } This is a standard double-check algorithm so that you don't lock if the instance has already been created.  However, because it's possible two threads can go through the first if at the same time the first time back in, you need to check again after the lock is acquired to avoid creating two instances. Pretty straightforward, but ugly as all heck.  Well, you could also take advantage of the C# standard's BeforeFieldInit and define your class with a static constructor.  It need not have a body, just the presence of the static constructor will remove the BeforeFieldInit attribute on the class and guarantee that no fields are initialized until the first static field, property, or method is called.   1: public class LazySingleton2 2: { 3: // because of the static constructor, this won't get created until first use 4: private static readonly LazySingleton2 _instance = new LazySingleton2(); 5:  6: // Returns the singleton instance using lazy-instantiation 7: public static LazySingleton2 Instance 8: { 9: get { return _instance; } 10: } 11:  12: // private to prevent direct instantiation 13: private LazySingleton2() 14: { 15: } 16:  17: // removes BeforeFieldInit on class so static fields not 18: // initialized before they are used 19: static LazySingleton2() 20: { 21: } 22: } Now, while this works perfectly, I hate it.  Why?  Because it's relying on a non-obvious trick of the IL to guarantee laziness.  Just looking at this code, you'd have no idea that it's doing what it's doing.  Worse yet, you may decide that the empty static constructor serves no purpose and delete it (which removes your lazy guarantee).  Worse-worse yet, they may alter the rules around BeforeFieldInit in the future which could change this. So, what do I propose instead?  .Net 4.0 adds the System.Lazy type which guarantees thread-safe lazy-construction.  Using System.Lazy<T>, we get: 1: public class LazySingleton3 2: { 3: // static holder for instance, need to use lambda to construct since constructor private 4: private static readonly Lazy<LazySingleton3> _instance 5: = new Lazy<LazySingleton3>(() => new LazySingleton3()); 6:  7: // private to prevent direct instantiation. 8: private LazySingleton3() 9: { 10: } 11:  12: // accessor for instance 13: public static LazySingleton3 Instance 14: { 15: get 16: { 17: return _instance.Value; 18: } 19: } 20: } Note, you need your lambda to call the private constructor as Lazy's default constructor can only call public constructors of the type passed in (which we can't have by definition of a Singleton).  But, because the lambda is defined inside our type, it has access to the private members so it's perfect. Note how the Lazy<T> makes it obvious what you're doing (lazy construction), instead of relying on an IL generation side-effect.  This way, it's more maintainable.  Lazy<T> has many other uses as well, obviously, but I really love how elegant and readable it makes the lazy Singleton.

    Read the article

  • Revisiting ANTS Performance Profiler 7.4

    - by James Michael Hare
    Last year, I did a small review on the ANTS Performance Profiler 6.3, now that it’s a year later and a major version number higher, I thought I’d revisit the review and revise my last post. This post will take the same examples as the original post and update them to show what’s new in version 7.4 of the profiler. Background A performance profiler’s main job is to keep track of how much time is typically spent in each unit of code. This helps when we have a program that is not running at the performance we expect, and we want to know where the program is experiencing issues. There are many profilers out there of varying capabilities. Red Gate’s typically seem to be the very easy to “jump in” and get started with very little training required. So let’s dig into the Performance Profiler. I’ve constructed a very crude program with some obvious inefficiencies. It’s a simple program that generates random order numbers (or really could be any unique identifier), adds it to a list, sorts the list, then finds the max and min number in the list. Ignore the fact it’s very contrived and obviously inefficient, we just want to use it as an example to show off the tool: 1: // our test program 2: public static class Program 3: { 4: // the number of iterations to perform 5: private static int _iterations = 1000000; 6: 7: // The main method that controls it all 8: public static void Main() 9: { 10: var list = new List<string>(); 11: 12: for (int i = 0; i < _iterations; i++) 13: { 14: var x = GetNextId(); 15: 16: AddToList(list, x); 17: 18: var highLow = GetHighLow(list); 19: 20: if ((i % 1000) == 0) 21: { 22: Console.WriteLine("{0} - High: {1}, Low: {2}", i, highLow.Item1, highLow.Item2); 23: Console.Out.Flush(); 24: } 25: } 26: } 27: 28: // gets the next order id to process (random for us) 29: public static string GetNextId() 30: { 31: var random = new Random(); 32: var num = random.Next(1000000, 9999999); 33: return num.ToString(); 34: } 35: 36: // add it to our list - very inefficiently! 37: public static void AddToList(List<string> list, string item) 38: { 39: list.Add(item); 40: list.Sort(); 41: } 42: 43: // get high and low of order id range - very inefficiently! 44: public static Tuple<int,int> GetHighLow(List<string> list) 45: { 46: return Tuple.Create(list.Max(s => Convert.ToInt32(s)), list.Min(s => Convert.ToInt32(s))); 47: } 48: } So let’s run it through the profiler and see what happens! Visual Studio Integration First, let’s look at how the ANTS profilers integrate with Visual Studio’s menu system. Once you install the ANTS profilers, you will get an ANTS menu item with several options: Notice that you can either Profile Performance or Launch ANTS Performance Profiler. These sound similar but achieve two slightly different actions: Profile Performance: this immediately launches the profiler with all defaults selected to profile the active project in Visual Studio. Launch ANTS Performance Profiler: this launches the profiler much the same way as starting it from the Start Menu. The profiler will pre-populate the application and path information, but allow you to change the settings before beginning the profile run. So really, the main difference is that Profile Performance immediately begins profiling with the default selections, where Launch ANTS Performance Profiler allows you to change the defaults and attach to an already-running application. Let’s Fire it Up! So when you fire up ANTS either via Start Menu or Launch ANTS Performance Profiler menu in Visual Studio, you are presented with a very simple dialog to get you started: Notice you can choose from many different options for application type. You can profile executables, services, web applications, or just attach to a running process. In fact, in version 7.4 we see two new options added: ASP.NET Web Application (IIS Express) SharePoint web application (IIS) So this gives us an additional way to profile ASP.NET applications and the ability to profile SharePoint applications as well. You can also choose your level of detail in the Profiling Mode drop down. If you choose Line-Level and method-level timings detail, you will get a lot more detail on the method durations, but this will also slow down profiling somewhat. If you really need the profiler to be as unintrusive as possible, you can change it to Sample method-level timings. This is performing very light profiling, where basically the profiler collects timings of a method by examining the call-stack at given intervals. Which method you choose depends a lot on how much detail you need to find the issue and how sensitive your program issues are to timing. So for our example, let’s just go with the line and method timing detail. So, we check that all the options are correct (if you launch from VS2010, the executable and path are filled in already), and fire it up by clicking the [Start Profiling] button. Profiling the Application Once you start profiling the application, you will see a real-time graph of CPU usage that will indicate how much your application is using the CPU(s) on your system. During this time, you can select segments of the graph and bookmark them, giving them mnemonic names. This can be useful if you want to compare performance in one part of the run to another part of the run. Notice that once you select a block, it will give you the call tree breakdown for that selection only, and the relative performance of those calls. Once you feel you have collected enough information, you can click [Stop Profiling] to stop the application run and information collection and begin a more thorough analysis. Analyzing Method Timings So now that we’ve halted the run, we can look around the GUI and see what we can see. By default, the times are shown in terms of percentage of time of the total run of the application, though you can change it in the View menu item to milliseconds, ticks, or seconds as well. This won’t affect the percentages of methods, it only affects what units the times are shown. Notice also that the major hotspot seems to be in a method without source, ANTS Profiler will filter these out by default, but you can right-click on the line and remove the filter to see more detail. This proves especially handy when a bottleneck is due to a method in the BCL. So now that we’ve removed the filter, we see a bit more detail: In addition, ANTS Performance Profiler gives you the ability to decompile the methods without source so that you can dive even deeper, though typically this isn’t necessary for our purposes. When looking at timings, there are generally two types of timings for each method call: Time: This is the time spent ONLY in this method, not including calls this method makes to other methods. Time With Children: This is the total of time spent in both this method AND including calls this method makes to other methods. In other words, the Time tells you how much work is being done exclusively in this method, and the Time With Children tells you how much work is being done inclusively in this method and everything it calls. You can also choose to display the methods in a tree or in a grid. The tree view is the default and it shows the method calls arranged in terms of the tree representing all method calls and the parent method that called them, etc. This is useful for when you find a hot-spot method, you can see who is calling it to determine if the problem is the method itself, or if it is being called too many times. The grid method represents each method only once with its totals and is useful for quickly seeing what method is the trouble spot. In addition, you can choose to display Methods with source which are generally the methods you wrote (as opposed to native or BCL code), or Any Method which shows not only your methods, but also native calls, JIT overhead, synchronization waits, etc. So these are just two ways of viewing the same data, and you’re free to choose the organization that best suits what information you are after. Analyzing Method Source If we look at the timings above, we see that our AddToList() method (and in particular, it’s call to the List<T>.Sort() method in the BCL) is the hot-spot in this analysis. If ANTS sees a method that is consuming the most time, it will flag it as a hot-spot to help call out potential areas of concern. This doesn’t mean the other statistics aren’t meaningful, but that the hot-spot is most likely going to be your biggest bang-for-the-buck to concentrate on. So let’s select the AddToList() method, and see what it shows in the source window below: Notice the source breakout in the bottom pane when you select a method (from either tree or grid view). This shows you the timings in this method per line of code. This gives you a major indicator of where the trouble-spot in this method is. So in this case, we see that performing a Sort() on the List<T> after every Add() is killing our performance! Of course, this was a very contrived, duh moment, but you’d be surprised how many performance issues become duh moments. Note that this one line is taking up 86% of the execution time of this application! If we eliminate this bottleneck, we should see drastic improvement in the performance. So to fix this, if we still wanted to maintain the List<T> we’d have many options, including: delay Sort() until after all Add() methods, using a SortedSet, SortedList, or SortedDictionary depending on which is most appropriate, or forgoing the sorting all together and using a Dictionary. Rinse, Repeat! So let’s just change all instances of List<string> to SortedSet<string> and run this again through the profiler: Now we see the AddToList() method is no longer our hot-spot, but now the Max() and Min() calls are! This is good because we’ve eliminated one hot-spot and now we can try to correct this one as well. As before, we can then optimize this part of the code (possibly by taking advantage of the fact the list is now sorted and returning the first and last elements). We can then rinse and repeat this process until we have eliminated as many bottlenecks as possible. Calls by Web Request Another feature that was added recently is the ability to view .NET methods grouped by the HTTP requests that caused them to run. This can be helpful in determining which pages, web services, etc. are causing hot spots in your web applications. Summary If you like the other ANTS tools, you’ll like the ANTS Performance Profiler as well. It is extremely easy to use with very little product knowledge required to get up and running. There are profilers built into the higher product lines of Visual Studio, of course, which are also powerful and easy to use. But for quickly jumping in and finding hot spots rapidly, Red Gate’s Performance Profiler 7.4 is an excellent choice. Technorati Tags: Influencers,ANTS,Performance Profiler,Profiler

    Read the article

  • C#: Adding Functionality to 3rd Party Libraries With Extension Methods

    - by James Michael Hare
    Ever have one of those third party libraries that you love but it's missing that one feature or one piece of syntactical candy that would make it so much more useful?  This, I truly think, is one of the best uses of extension methods.  I began discussing extension methods in my last post (which you find here) where I expounded upon what I thought were some rules of thumb for using extension methods correctly.  As long as you keep in line with those (or similar) rules, they can often be useful for adding that little extra functionality or syntactical simplification for a library that you have little or no control over. Oh sure, you could take an open source project, download the source and add the methods you want, but then every time the library is updated you have to re-add your changes, which can be cumbersome and error prone.  And yes, you could possibly extend a class in a third party library and override features, but that's only if the class is not sealed, static, or constructed via factories. This is the perfect place to use an extension method!  And the best part is, you and your development team don't need to change anything!  Simply add the using for the namespace the extensions are in! So let's consider this example.  I love log4net!  Of all the logging libraries I've played with, it, to me, is one of the most flexible and configurable logging libraries and it performs great.  But this isn't about log4net, well, not directly.  So why would I want to add functionality?  Well, it's missing one thing I really want in the ILog interface: ability to specify logging level at runtime. For example, let's say I declare my ILog instance like so:     using log4net;     public class LoggingTest     {         private static readonly ILog _log = LogManager.GetLogger(typeof(LoggingTest));         ...     }     If you don't know log4net, the details aren't important, just to show that the field _log is the logger I have gotten from log4net. So now that I have that, I can log to it like so:     _log.Debug("This is the lowest level of logging and just for debugging output.");     _log.Info("This is an informational message.  Usual normal operation events.");     _log.Warn("This is a warning, something suspect but not necessarily wrong.");     _log.Error("This is an error, some sort of processing problem has happened.");     _log.Fatal("Fatals usually indicate the program is dying hideously."); And there's many flavors of each of these to log using string formatting, to log exceptions, etc.  But one thing there isn't: the ability to easily choose the logging level at runtime.  Notice, the logging levels above are chosen at compile time.  Of course, you could do some fun stuff with lambdas and wrap it, but that would obscure the simplicity of the interface.  And yes there is a Logger property you can dive down into where you can specify a Level, but the Level properties don't really match the ILog interface exactly and then you have to manually build a LogEvent and... well, it gets messy.  I want something simple and sexy so I can say:     _log.Log(someLevel, "This will be logged at whatever level I choose at runtime!");     Now, some purists out there might say you should always know what level you want to log at, and for the most part I agree with them.  For the most party the ILog interface satisfies 99% of my needs.  In fact, for most application logging yes you do always know the level you will be logging at, but when writing a utility class, you may not always know what level your user wants. I'll tell you, one of my favorite things is to write reusable components.  If I had my druthers I'd write framework libraries and shared components all day!  And being able to easily log at a runtime-chosen level is a big need for me.  After all, if I want my code to really be re-usable, I shouldn't force a user to deal with the logging level I choose. One of my favorite uses for this is in Interceptors -- I'll describe Interceptors in my next post and some of my favorites -- for now just know that an Interceptor wraps a class and allows you to add functionality to an existing method without changing it's signature.  At the risk of over-simplifying, it's a very generic implementation of the Decorator design pattern. So, say for example that you were writing an Interceptor that would time method calls and emit a log message if the method call execution time took beyond a certain threshold of time.  For instance, maybe if your database calls take more than 5,000 ms, you want to log a warning.  Or if a web method call takes over 1,000 ms, you want to log an informational message.  This would be an excellent use of logging at a generic level. So here was my personal wish-list of requirements for my task: Be able to determine if a runtime-specified logging level is enabled. Be able to log generically at a runtime-specified logging level. Have the same look-and-feel of the existing Debug, Info, Warn, Error, and Fatal calls.    Having the ability to also determine if logging for a level is on at runtime is also important so you don't spend time building a potentially expensive logging message if that level is off.  Consider an Interceptor that may log parameters on entrance to the method.  If you choose to log those parameter at DEBUG level and if DEBUG is not on, you don't want to spend the time serializing those parameters. Now, mine may not be the most elegant solution, but it performs really well since the enum I provide all uses contiguous values -- while it's never guaranteed, contiguous switch values usually get compiled into a jump table in IL which is VERY performant - O(1) - but even if it doesn't, it's still so fast you'd never need to worry about it. So first, I need a way to let users pass in logging levels.  Sure, log4net has a Level class, but it's a class with static members and plus it provides way too many options compared to ILog interface itself -- and wouldn't perform as well in my level-check -- so I define an enum like below.     namespace Shared.Logging.Extensions     {         // enum to specify available logging levels.         public enum LoggingLevel         {             Debug,             Informational,             Warning,             Error,             Fatal         }     } Now, once I have this, writing the extension methods I need is trivial.  Once again, I would typically /// comment fully, but I'm eliminating for blogging brevity:     namespace Shared.Logging.Extensions     {         // the extension methods to add functionality to the ILog interface         public static class LogExtensions         {             // Determines if logging is enabled at a given level.             public static bool IsLogEnabled(this ILog logger, LoggingLevel level)             {                 switch (level)                 {                     case LoggingLevel.Debug:                         return logger.IsDebugEnabled;                     case LoggingLevel.Informational:                         return logger.IsInfoEnabled;                     case LoggingLevel.Warning:                         return logger.IsWarnEnabled;                     case LoggingLevel.Error:                         return logger.IsErrorEnabled;                     case LoggingLevel.Fatal:                         return logger.IsFatalEnabled;                 }                                 return false;             }             // Logs a simple message - uses same signature except adds LoggingLevel             public static void Log(this ILog logger, LoggingLevel level, object message)             {                 switch (level)                 {                     case LoggingLevel.Debug:                         logger.Debug(message);                         break;                     case LoggingLevel.Informational:                         logger.Info(message);                         break;                     case LoggingLevel.Warning:                         logger.Warn(message);                         break;                     case LoggingLevel.Error:                         logger.Error(message);                         break;                     case LoggingLevel.Fatal:                         logger.Fatal(message);                         break;                 }             }             // Logs a message and exception to the log at specified level.             public static void Log(this ILog logger, LoggingLevel level, object message, Exception exception)             {                 switch (level)                 {                     case LoggingLevel.Debug:                         logger.Debug(message, exception);                         break;                     case LoggingLevel.Informational:                         logger.Info(message, exception);                         break;                     case LoggingLevel.Warning:                         logger.Warn(message, exception);                         break;                     case LoggingLevel.Error:                         logger.Error(message, exception);                         break;                     case LoggingLevel.Fatal:                         logger.Fatal(message, exception);                         break;                 }             }             // Logs a formatted message to the log at the specified level.              public static void LogFormat(this ILog logger, LoggingLevel level, string format,                                          params object[] args)             {                 switch (level)                 {                     case LoggingLevel.Debug:                         logger.DebugFormat(format, args);                         break;                     case LoggingLevel.Informational:                         logger.InfoFormat(format, args);                         break;                     case LoggingLevel.Warning:                         logger.WarnFormat(format, args);                         break;                     case LoggingLevel.Error:                         logger.ErrorFormat(format, args);                         break;                     case LoggingLevel.Fatal:                         logger.FatalFormat(format, args);                         break;                 }             }         }     } So there it is!  I didn't have to modify the log4net source code, so if a new version comes out, i can just add the new assembly with no changes.  I didn't have to subclass and worry about developers not calling my sub-class instead of the original.  I simply provide the extension methods and it's as if the long lost extension methods were always a part of the ILog interface! Consider a very contrived example using the original interface:     // using the original ILog interface     public class DatabaseUtility     {         private static readonly ILog _log = LogManager.Create(typeof(DatabaseUtility));                 // some theoretical method to time         IDataReader Execute(string statement)         {             var timer = new System.Diagnostics.Stopwatch();                         // do DB magic                                    // this is hard-coded to warn, if want to change at runtime tough luck!             if (timer.ElapsedMilliseconds > 5000 && _log.IsWarnEnabled)             {                 _log.WarnFormat("Statement {0} took too long to execute.", statement);             }             ...         }     }     Now consider this alternate call where the logging level could be perhaps a property of the class          // using the original ILog interface     public class DatabaseUtility     {         private static readonly ILog _log = LogManager.Create(typeof(DatabaseUtility));                 // allow logging level to be specified by user of class instead         public LoggingLevel ThresholdLogLevel { get; set; }                 // some theoretical method to time         IDataReader Execute(string statement)         {             var timer = new System.Diagnostics.Stopwatch();                         // do DB magic                                    // this is hard-coded to warn, if want to change at runtime tough luck!             if (timer.ElapsedMilliseconds > 5000 && _log.IsLogEnabled(ThresholdLogLevel))             {                 _log.LogFormat(ThresholdLogLevel, "Statement {0} took too long to execute.",                     statement);             }             ...         }     } Next time, I'll show one of my favorite uses for these extension methods in an Interceptor.

    Read the article

  • C#/.NET Little Wonders: The Joy of Anonymous Types

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. In the .NET 3 Framework, Microsoft introduced the concept of anonymous types, which provide a way to create a quick, compiler-generated types at the point of instantiation.  These may seem trivial, but are very handy for concisely creating lightweight, strongly-typed objects containing only read-only properties that can be used within a given scope. Creating an Anonymous Type In short, an anonymous type is a reference type that derives directly from object and is defined by its set of properties base on their names, number, types, and order given at initialization.  In addition to just holding these properties, it is also given appropriate overridden implementations for Equals() and GetHashCode() that take into account all of the properties to correctly perform property comparisons and hashing.  Also overridden is an implementation of ToString() which makes it easy to display the contents of an anonymous type instance in a fairly concise manner. To construct an anonymous type instance, you use basically the same initialization syntax as with a regular type.  So, for example, if we wanted to create an anonymous type to represent a particular point, we could do this: 1: var point = new { X = 13, Y = 7 }; Note the similarity between anonymous type initialization and regular initialization.  The main difference is that the compiler generates the type name and the properties (as readonly) based on the names and order provided, and inferring their types from the expressions they are assigned to. It is key to remember that all of those factors (number, names, types, order of properties) determine the anonymous type.  This is important, because while these two instances share the same anonymous type: 1: // same names, types, and order 2: var point1 = new { X = 13, Y = 7 }; 3: var point2 = new { X = 5, Y = 0 }; These similar ones do not: 1: var point3 = new { Y = 3, X = 5 }; // different order 2: var point4 = new { X = 3, Y = 5.0 }; // different type for Y 3: var point5 = new {MyX = 3, MyY = 5 }; // different names 4: var point6 = new { X = 1, Y = 2, Z = 3 }; // different count Limitations on Property Initialization Expressions The expression for a property in an anonymous type initialization cannot be null (though it can evaluate to null) or an anonymous function.  For example, the following are illegal: 1: // Null can't be used directly. Null reference of what type? 2: var cantUseNull = new { Value = null }; 3:  4: // Anonymous methods cannot be used. 5: var cantUseAnonymousFxn = new { Value = () => Console.WriteLine(“Can’t.”) }; Note that the restriction on null is just that you can’t use it directly as the expression, because otherwise how would it be able to determine the type?  You can, however, use it indirectly assigning a null expression such as a typed variable with the value null, or by casting null to a specific type: 1: string str = null; 2: var fineIndirectly = new { Value = str }; 3: var fineCast = new { Value = (string)null }; All of the examples above name the properties explicitly, but you can also implicitly name properties if they are being set from a property, field, or variable.  In these cases, when a field, property, or variable is used alone, and you don’t specify a property name assigned to it, the new property will have the same name.  For example: 1: int variable = 42; 2:  3: // creates two properties named varriable and Now 4: var implicitProperties = new { variable, DateTime.Now }; Is the same type as: 1: var explicitProperties = new { variable = variable, Now = DateTime.Now }; But this only works if you are using an existing field, variable, or property directly as the expression.  If you use a more complex expression then the name cannot be inferred: 1: // can't infer the name variable from variable * 2, must name explicitly 2: var wontWork = new { variable * 2, DateTime.Now }; In the example above, since we typed variable * 2, it is no longer just a variable and thus we would have to assign the property a name explicitly. ToString() on Anonymous Types One of the more trivial overrides that an anonymous type provides you is a ToString() method that prints the value of the anonymous type instance in much the same format as it was initialized (except actual values instead of expressions as appropriate of course). For example, if you had: 1: var point = new { X = 13, Y = 42 }; And then print it out: 1: Console.WriteLine(point.ToString()); You will get: 1: { X = 13, Y = 42 } While this isn’t necessarily the most stunning feature of anonymous types, it can be handy for debugging or logging values in a fairly easy to read format. Comparing Anonymous Type Instances Because anonymous types automatically create appropriate overrides of Equals() and GetHashCode() based on the underlying properties, we can reliably compare two instances or get hash codes.  For example, if we had the following 3 points: 1: var point1 = new { X = 1, Y = 2 }; 2: var point2 = new { X = 1, Y = 2 }; 3: var point3 = new { Y = 2, X = 1 }; If we compare point1 and point2 we’ll see that Equals() returns true because they overridden version of Equals() sees that the types are the same (same number, names, types, and order of properties) and that the values are the same.   In addition, because all equal objects should have the same hash code, we’ll see that the hash codes evaluate to the same as well: 1: // true, same type, same values 2: Console.WriteLine(point1.Equals(point2)); 3:  4: // true, equal anonymous type instances always have same hash code 5: Console.WriteLine(point1.GetHashCode() == point2.GetHashCode()); However, if we compare point2 and point3 we get false.  Even though the names, types, and values of the properties are the same, the order is not, thus they are two different types and cannot be compared (and thus return false).  And, since they are not equal objects (even though they have the same value) there is a good chance their hash codes are different as well (though not guaranteed): 1: // false, different types 2: Console.WriteLine(point2.Equals(point3)); 3:  4: // quite possibly false (was false on my machine) 5: Console.WriteLine(point2.GetHashCode() == point3.GetHashCode()); Using Anonymous Types Now that we’ve created instances of anonymous types, let’s actually use them.  The property names (whether implicit or explicit) are used to access the individual properties of the anonymous type.  The main thing, once again, to keep in mind is that the properties are readonly, so you cannot assign the properties a new value (note: this does not mean that instances referred to by a property are immutable – for more information check out C#/.NET Fundamentals: Returning Data Immutably in a Mutable World). Thus, if we have the following anonymous type instance: 1: var point = new { X = 13, Y = 42 }; We can get the properties as you’d expect: 1: Console.WriteLine(“The point is: ({0},{1})”, point.X, point.Y); But we cannot alter the property values: 1: // compiler error, properties are readonly 2: point.X = 99; Further, since the anonymous type name is only known by the compiler, there is no easy way to pass anonymous type instances outside of a given scope.  The only real choices are to pass them as object or dynamic.  But really that is not the intention of using anonymous types.  If you find yourself needing to pass an anonymous type outside of a given scope, you should really consider making a POCO (Plain Old CLR Type – i.e. a class that contains just properties to hold data with little/no business logic) instead. Given that, why use them at all?  Couldn’t you always just create a POCO to represent every anonymous type you needed?  Sure you could, but then you might litter your solution with many small POCO classes that have very localized uses. It turns out this is the key to when to use anonymous types to your advantage: when you just need a lightweight type in a local context to store intermediate results, consider an anonymous type – but when that result is more long-lived and used outside of the current scope, consider a POCO instead. So what do we mean by intermediate results in a local context?  Well, a classic example would be filtering down results from a LINQ expression.  For example, let’s say we had a List<Transaction>, where Transaction is defined something like: 1: public class Transaction 2: { 3: public string UserId { get; set; } 4: public DateTime At { get; set; } 5: public decimal Amount { get; set; } 6: // … 7: } And let’s say we had this data in our List<Transaction>: 1: var transactions = new List<Transaction> 2: { 3: new Transaction { UserId = "Jim", At = DateTime.Now, Amount = 2200.00m }, 4: new Transaction { UserId = "Jim", At = DateTime.Now, Amount = -1100.00m }, 5: new Transaction { UserId = "Jim", At = DateTime.Now.AddDays(-1), Amount = 900.00m }, 6: new Transaction { UserId = "John", At = DateTime.Now.AddDays(-2), Amount = 300.00m }, 7: new Transaction { UserId = "John", At = DateTime.Now, Amount = -10.00m }, 8: new Transaction { UserId = "Jane", At = DateTime.Now, Amount = 200.00m }, 9: new Transaction { UserId = "Jane", At = DateTime.Now, Amount = -50.00m }, 10: new Transaction { UserId = "Jaime", At = DateTime.Now.AddDays(-3), Amount = -100.00m }, 11: new Transaction { UserId = "Jaime", At = DateTime.Now.AddDays(-3), Amount = 300.00m }, 12: }; So let’s say we wanted to get the transactions for each day for each user.  That is, for each day we’d want to see the transactions each user performed.  We could do this very simply with a nice LINQ expression, without the need of creating any POCOs: 1: // group the transactions based on an anonymous type with properties UserId and Date: 2: byUserAndDay = transactions 3: .GroupBy(tx => new { tx.UserId, tx.At.Date }) 4: .OrderBy(grp => grp.Key.Date) 5: .ThenBy(grp => grp.Key.UserId); Now, those of you who have attempted to use custom classes as a grouping type before (such as GroupBy(), Distinct(), etc.) may have discovered the hard way that LINQ gets a lot of its speed by utilizing not on Equals(), but also GetHashCode() on the type you are grouping by.  Thus, when you use custom types for these purposes, you generally end up having to write custom Equals() and GetHashCode() implementations or you won’t get the results you were expecting (the default implementations of Equals() and GetHashCode() are reference equality and reference identity based respectively). As we said before, it turns out that anonymous types already do these critical overrides for you.  This makes them even more convenient to use!  Instead of creating a small POCO to handle this grouping, and then having to implement a custom Equals() and GetHashCode() every time, we can just take advantage of the fact that anonymous types automatically override these methods with appropriate implementations that take into account the values of all of the properties. Now, we can look at our results: 1: foreach (var group in byUserAndDay) 2: { 3: // the group’s Key is an instance of our anonymous type 4: Console.WriteLine("{0} on {1:MM/dd/yyyy} did:", group.Key.UserId, group.Key.Date); 5:  6: // each grouping contains a sequence of the items. 7: foreach (var tx in group) 8: { 9: Console.WriteLine("\t{0}", tx.Amount); 10: } 11: } And see: 1: Jaime on 06/18/2012 did: 2: -100.00 3: 300.00 4:  5: John on 06/19/2012 did: 6: 300.00 7:  8: Jim on 06/20/2012 did: 9: 900.00 10:  11: Jane on 06/21/2012 did: 12: 200.00 13: -50.00 14:  15: Jim on 06/21/2012 did: 16: 2200.00 17: -1100.00 18:  19: John on 06/21/2012 did: 20: -10.00 Again, sure we could have just built a POCO to do this, given it an appropriate Equals() and GetHashCode() method, but that would have bloated our code with so many extra lines and been more difficult to maintain if the properties change.  Summary Anonymous types are one of those Little Wonders of the .NET language that are perfect at exactly that time when you need a temporary type to hold a set of properties together for an intermediate result.  While they are not very useful beyond the scope in which they are defined, they are excellent in LINQ expressions as a way to create and us intermediary values for further expressions and analysis. Anonymous types are defined by the compiler based on the number, type, names, and order of properties created, and they automatically implement appropriate Equals() and GetHashCode() overrides (as well as ToString()) which makes them ideal for LINQ expressions where you need to create a set of properties to group, evaluate, etc. Technorati Tags: C#,CSharp,.NET,Little Wonders,Anonymous Types,LINQ

    Read the article

  • C#/.NET Little Pitfalls: The Dangers of Casting Boxed Values

    - by James Michael Hare
    Starting a new series to parallel the Little Wonders series.  In this series, I will examine some of the small pitfalls that can occasionally trip up developers. Introduction: Of Casts and Conversions What happens when we try to assign from an int and a double and vice-versa? 1: double pi = 3.14; 2: int theAnswer = 42; 3:  4: // implicit widening conversion, compiles! 5: double doubleAnswer = theAnswer; 6:  7: // implicit narrowing conversion, compiler error! 8: int intPi = pi; As you can see from the comments above, a conversion from a value type where there is no potential data loss is can be done with an implicit conversion.  However, when converting from one value type to another may result in a loss of data, you must make the conversion explicit so the compiler knows you accept this risk.  That is why the conversion from double to int will not compile with an implicit conversion, we can make the conversion explicit by adding a cast: 1: // explicit narrowing conversion using a cast, compiler 2: // succeeds, but results may have data loss: 3: int intPi = (int)pi; So for value types, the conversions (implicit and explicit) both convert the original value to a new value of the given type.  With widening and narrowing references, however, this is not the case.  Converting reference types is a bit different from converting value types.  First of all when you perform a widening or narrowing you don’t really convert the instance of the object, you just convert the reference itself to the wider or narrower reference type, but both the original and new reference type both refer back to the same object. Secondly, widening and narrowing for reference types refers the going down and up the class hierarchy instead of referring to precision as in value types.  That is, a narrowing conversion for a reference type means you are going down the class hierarchy (for example from Shape to Square) whereas a widening conversion means you are going up the class hierarchy (from Square to Shape).  1: var square = new Square(); 2:  3: // implicitly convers because all squares are shapes 4: // (that is, all subclasses can be referenced by a superclass reference) 5: Shape myShape = square; 6:  7: // implicit conversion not possible, not all shapes are squares! 8: // (that is, not all superclasses can be referenced by a subclass reference) 9: Square mySquare = (Square) myShape; So we had to cast the Shape back to Square because at that point the compiler has no way of knowing until runtime whether the Shape in question is truly a Square.  But, because the compiler knows that it’s possible for a Shape to be a Square, it will compile.  However, if the object referenced by myShape is not truly a Square at runtime, you will get an invalid cast exception. Of course, there are other forms of conversions as well such as user-specified conversions and helper class conversions which are beyond the scope of this post.  The main thing we want to focus on is this seemingly innocuous casting method of widening and narrowing conversions that we come to depend on every day and, in some cases, can bite us if we don’t fully understand what is going on!  The Pitfall: Conversions on Boxed Value Types Can Fail What if you saw the following code and – knowing nothing else – you were asked if it was legal or not, what would you think: 1: // assuming x is defined above this and this 2: // assignment is syntactically legal. 3: x = 3.14; 4:  5: // convert 3.14 to int. 6: int truncated = (int)x; You may think that since x is obviously a double (can’t be a float) because 3.14 is a double literal, but this is inaccurate.  Our x could also be dynamic and this would work as well, or there could be user-defined conversions in play.  But there is another, even simpler option that can often bite us: what if x is object? 1: object x; 2:  3: x = 3.14; 4:  5: int truncated = (int) x; On the surface, this seems fine.  We have a double and we place it into an object which can be done implicitly through boxing (no cast) because all types inherit from object.  Then we cast it to int.  This theoretically should be possible because we know we can explicitly convert a double to an int through a conversion process which involves truncation. But here’s the pitfall: when casting an object to another type, we are casting a reference type, not a value type!  This means that it will attempt to see at runtime if the value boxed and referred to by x is of type int or derived from type int.  Since it obviously isn’t (it’s a double after all) we get an invalid cast exception! Now, you may say this looks awfully contrived, but in truth we can run into this a lot if we’re not careful.  Consider using an IDataReader to read from a database, and then attempting to select a result row of a particular column type: 1: using (var connection = new SqlConnection("some connection string")) 2: using (var command = new SqlCommand("select * from employee", connection)) 3: using (var reader = command.ExecuteReader()) 4: { 5: while (reader.Read()) 6: { 7: // if the salary is not an int32 in the SQL database, this is an error! 8: // doesn't matter if short, long, double, float, reader [] returns object! 9: total += (int) reader["annual_salary"]; 10: } 11: } Notice that since the reader indexer returns object, if we attempt to convert using a cast to a type, we have to make darn sure we use the true, actual type or this will fail!  If the SQL database column is a double, float, short, etc this will fail at runtime with an invalid cast exception because it attempts to convert the object reference! So, how do you get around this?  There are two ways, you could first cast the object to its actual type (double), and then do a narrowing cast to on the value to int.  Or you could use a helper class like Convert which analyzes the actual run-time type and will perform a conversion as long as the type implements IConvertible. 1: object x; 2:  3: x = 3.14; 4:  5: // if you want to cast, must cast out of object to double, then 6: // cast convert. 7: int truncated = (int)(double) x; 8:  9: // or you can call a helper class like Convert which examines runtime 10: // type of the value being converted 11: int anotherTruncated = Convert.ToInt32(x); Summary You should always be careful when performing a conversion cast from values boxed in object that you are actually casting to the true type (or a sub-type). Since casting from object is a widening of the reference, be careful that you either know the exact, explicit type you expect to be held in the object, or instead avoid the cast and use a helper class to perform a safe conversion to the type you desire. Technorati Tags: C#,.NET,Pitfalls,Little Pitfalls,BlackRabbitCoder

    Read the article

  • C#/.NET Fundamentals: Choosing the Right Collection Class

    - by James Michael Hare
    The .NET Base Class Library (BCL) has a wide array of collection classes at your disposal which make it easy to manage collections of objects. While it's great to have so many classes available, it can be daunting to choose the right collection to use for any given situation. As hard as it may be, choosing the right collection can be absolutely key to the performance and maintainability of your application! This post will look at breaking down any confusion between each collection and the situations in which they excel. We will be spending most of our time looking at the System.Collections.Generic namespace, which is the recommended set of collections. The Generic Collections: System.Collections.Generic namespace The generic collections were introduced in .NET 2.0 in the System.Collections.Generic namespace. This is the main body of collections you should tend to focus on first, as they will tend to suit 99% of your needs right up front. It is important to note that the generic collections are unsynchronized. This decision was made for performance reasons because depending on how you are using the collections its completely possible that synchronization may not be required or may be needed on a higher level than simple method-level synchronization. Furthermore, concurrent read access (all writes done at beginning and never again) is always safe, but for concurrent mixed access you should either synchronize the collection or use one of the concurrent collections. So let's look at each of the collections in turn and its various pros and cons, at the end we'll summarize with a table to help make it easier to compare and contrast the different collections. The Associative Collection Classes Associative collections store a value in the collection by providing a key that is used to add/remove/lookup the item. Hence, the container associates the value with the key. These collections are most useful when you need to lookup/manipulate a collection using a key value. For example, if you wanted to look up an order in a collection of orders by an order id, you might have an associative collection where they key is the order id and the value is the order. The Dictionary<TKey,TVale> is probably the most used associative container class. The Dictionary<TKey,TValue> is the fastest class for associative lookups/inserts/deletes because it uses a hash table under the covers. Because the keys are hashed, the key type should correctly implement GetHashCode() and Equals() appropriately or you should provide an external IEqualityComparer to the dictionary on construction. The insert/delete/lookup time of items in the dictionary is amortized constant time - O(1) - which means no matter how big the dictionary gets, the time it takes to find something remains relatively constant. This is highly desirable for high-speed lookups. The only downside is that the dictionary, by nature of using a hash table, is unordered, so you cannot easily traverse the items in a Dictionary in order. The SortedDictionary<TKey,TValue> is similar to the Dictionary<TKey,TValue> in usage but very different in implementation. The SortedDictionary<TKey,TValye> uses a binary tree under the covers to maintain the items in order by the key. As a consequence of sorting, the type used for the key must correctly implement IComparable<TKey> so that the keys can be correctly sorted. The sorted dictionary trades a little bit of lookup time for the ability to maintain the items in order, thus insert/delete/lookup times in a sorted dictionary are logarithmic - O(log n). Generally speaking, with logarithmic time, you can double the size of the collection and it only has to perform one extra comparison to find the item. Use the SortedDictionary<TKey,TValue> when you want fast lookups but also want to be able to maintain the collection in order by the key. The SortedList<TKey,TValue> is the other ordered associative container class in the generic containers. Once again SortedList<TKey,TValue>, like SortedDictionary<TKey,TValue>, uses a key to sort key-value pairs. Unlike SortedDictionary, however, items in a SortedList are stored as an ordered array of items. This means that insertions and deletions are linear - O(n) - because deleting or adding an item may involve shifting all items up or down in the list. Lookup time, however is O(log n) because the SortedList can use a binary search to find any item in the list by its key. So why would you ever want to do this? Well, the answer is that if you are going to load the SortedList up-front, the insertions will be slower, but because array indexing is faster than following object links, lookups are marginally faster than a SortedDictionary. Once again I'd use this in situations where you want fast lookups and want to maintain the collection in order by the key, and where insertions and deletions are rare. The Non-Associative Containers The other container classes are non-associative. They don't use keys to manipulate the collection but rely on the object itself being stored or some other means (such as index) to manipulate the collection. The List<T> is a basic contiguous storage container. Some people may call this a vector or dynamic array. Essentially it is an array of items that grow once its current capacity is exceeded. Because the items are stored contiguously as an array, you can access items in the List<T> by index very quickly. However inserting and removing in the beginning or middle of the List<T> are very costly because you must shift all the items up or down as you delete or insert respectively. However, adding and removing at the end of a List<T> is an amortized constant operation - O(1). Typically List<T> is the standard go-to collection when you don't have any other constraints, and typically we favor a List<T> even over arrays unless we are sure the size will remain absolutely fixed. The LinkedList<T> is a basic implementation of a doubly-linked list. This means that you can add or remove items in the middle of a linked list very quickly (because there's no items to move up or down in contiguous memory), but you also lose the ability to index items by position quickly. Most of the time we tend to favor List<T> over LinkedList<T> unless you are doing a lot of adding and removing from the collection, in which case a LinkedList<T> may make more sense. The HashSet<T> is an unordered collection of unique items. This means that the collection cannot have duplicates and no order is maintained. Logically, this is very similar to having a Dictionary<TKey,TValue> where the TKey and TValue both refer to the same object. This collection is very useful for maintaining a collection of items you wish to check membership against. For example, if you receive an order for a given vendor code, you may want to check to make sure the vendor code belongs to the set of vendor codes you handle. In these cases a HashSet<T> is useful for super-quick lookups where order is not important. Once again, like in Dictionary, the type T should have a valid implementation of GetHashCode() and Equals(), or you should provide an appropriate IEqualityComparer<T> to the HashSet<T> on construction. The SortedSet<T> is to HashSet<T> what the SortedDictionary<TKey,TValue> is to Dictionary<TKey,TValue>. That is, the SortedSet<T> is a binary tree where the key and value are the same object. This once again means that adding/removing/lookups are logarithmic - O(log n) - but you gain the ability to iterate over the items in order. For this collection to be effective, type T must implement IComparable<T> or you need to supply an external IComparer<T>. Finally, the Stack<T> and Queue<T> are two very specific collections that allow you to handle a sequential collection of objects in very specific ways. The Stack<T> is a last-in-first-out (LIFO) container where items are added and removed from the top of the stack. Typically this is useful in situations where you want to stack actions and then be able to undo those actions in reverse order as needed. The Queue<T> on the other hand is a first-in-first-out container which adds items at the end of the queue and removes items from the front. This is useful for situations where you need to process items in the order in which they came, such as a print spooler or waiting lines. So that's the basic collections. Let's summarize what we've learned in a quick reference table.  Collection Ordered? Contiguous Storage? Direct Access? Lookup Efficiency Manipulate Efficiency Notes Dictionary No Yes Via Key Key: O(1) O(1) Best for high performance lookups. SortedDictionary Yes No Via Key Key: O(log n) O(log n) Compromise of Dictionary speed and ordering, uses binary search tree. SortedList Yes Yes Via Key Key: O(log n) O(n) Very similar to SortedDictionary, except tree is implemented in an array, so has faster lookup on preloaded data, but slower loads. List No Yes Via Index Index: O(1) Value: O(n) O(n) Best for smaller lists where direct access required and no ordering. LinkedList No No No Value: O(n) O(1) Best for lists where inserting/deleting in middle is common and no direct access required. HashSet No Yes Via Key Key: O(1) O(1) Unique unordered collection, like a Dictionary except key and value are same object. SortedSet Yes No Via Key Key: O(log n) O(log n) Unique ordered collection, like SortedDictionary except key and value are same object. Stack No Yes Only Top Top: O(1) O(1)* Essentially same as List<T> except only process as LIFO Queue No Yes Only Front Front: O(1) O(1) Essentially same as List<T> except only process as FIFO   The Original Collections: System.Collections namespace The original collection classes are largely considered deprecated by developers and by Microsoft itself. In fact they indicate that for the most part you should always favor the generic or concurrent collections, and only use the original collections when you are dealing with legacy .NET code. Because these collections are out of vogue, let's just briefly mention the original collection and their generic equivalents: ArrayList A dynamic, contiguous collection of objects. Favor the generic collection List<T> instead. Hashtable Associative, unordered collection of key-value pairs of objects. Favor the generic collection Dictionary<TKey,TValue> instead. Queue First-in-first-out (FIFO) collection of objects. Favor the generic collection Queue<T> instead. SortedList Associative, ordered collection of key-value pairs of objects. Favor the generic collection SortedList<T> instead. Stack Last-in-first-out (LIFO) collection of objects. Favor the generic collection Stack<T> instead. In general, the older collections are non-type-safe and in some cases less performant than their generic counterparts. Once again, the only reason you should fall back on these older collections is for backward compatibility with legacy code and libraries only. The Concurrent Collections: System.Collections.Concurrent namespace The concurrent collections are new as of .NET 4.0 and are included in the System.Collections.Concurrent namespace. These collections are optimized for use in situations where multi-threaded read and write access of a collection is desired. The concurrent queue, stack, and dictionary work much as you'd expect. The bag and blocking collection are more unique. Below is the summary of each with a link to a blog post I did on each of them. ConcurrentQueue Thread-safe version of a queue (FIFO). For more information see: C#/.NET Little Wonders: The ConcurrentStack and ConcurrentQueue ConcurrentStack Thread-safe version of a stack (LIFO). For more information see: C#/.NET Little Wonders: The ConcurrentStack and ConcurrentQueue ConcurrentBag Thread-safe unordered collection of objects. Optimized for situations where a thread may be bother reader and writer. For more information see: C#/.NET Little Wonders: The ConcurrentBag and BlockingCollection ConcurrentDictionary Thread-safe version of a dictionary. Optimized for multiple readers (allows multiple readers under same lock). For more information see C#/.NET Little Wonders: The ConcurrentDictionary BlockingCollection Wrapper collection that implement producers & consumers paradigm. Readers can block until items are available to read. Writers can block until space is available to write (if bounded). For more information see C#/.NET Little Wonders: The ConcurrentBag and BlockingCollection Summary The .NET BCL has lots of collections built in to help you store and manipulate collections of data. Understanding how these collections work and knowing in which situations each container is best is one of the key skills necessary to build more performant code. Choosing the wrong collection for the job can make your code much slower or even harder to maintain if you choose one that doesn’t perform as well or otherwise doesn’t exactly fit the situation. Remember to avoid the original collections and stick with the generic collections.  If you need concurrent access, you can use the generic collections if the data is read-only, or consider the concurrent collections for mixed-access if you are running on .NET 4.0 or higher.   Tweet Technorati Tags: C#,.NET,Collecitons,Generic,Concurrent,Dictionary,List,Stack,Queue,SortedList,SortedDictionary,HashSet,SortedSet

    Read the article

  • C#: String Concatenation vs Format vs StringBuilder

    - by James Michael Hare
    I was looking through my groups’ C# coding standards the other day and there were a couple of legacy items in there that caught my eye.  They had been passed down from committee to committee so many times that no one even thought to second guess and try them for a long time.  It’s yet another example of how micro-optimizations can often get the best of us and cause us to write code that is not as maintainable as it could be for the sake of squeezing an extra ounce of performance out of our software. So the two standards in question were these, in paraphrase: Prefer StringBuilder or string.Format() to string concatenation. Prefer string.Equals() with case-insensitive option to string.ToUpper().Equals(). Now some of you may already know what my results are going to show, as these items have been compared before on many blogs, but I think it’s always worth repeating and trying these yourself.  So let’s dig in. The first test was a pretty standard one.  When concattenating strings, what is the best choice: StringBuilder, string concattenation, or string.Format()? So before we being I read in a number of iterations from the console and a length of each string to generate.  Then I generate that many random strings of the given length and an array to hold the results.  Why am I so keen to keep the results?  Because I want to be able to snapshot the memory and don’t want garbage collection to collect the strings, hence the array to keep hold of them.  I also didn’t want the random strings to be part of the allocation, so I pre-allocate them and the array up front before the snapshot.  So in the code snippets below: num – Number of iterations. strings – Array of randomly generated strings. results – Array to hold the results of the concatenation tests. timer – A System.Diagnostics.Stopwatch() instance to time code execution. start – Beginning memory size. stop – Ending memory size. after – Memory size after final GC. So first, let’s look at the concatenation loop: 1: // build num strings using concattenation. 2: for (int i = 0; i < num; i++) 3: { 4: results[i] = "This is test #" + i + " with a result of " + strings[i]; 5: } Pretty standard, right?  Next for string.Format(): 1: // build strings using string.Format() 2: for (int i = 0; i < num; i++) 3: { 4: results[i] = string.Format("This is test #{0} with a result of {1}", i, strings[i]); 5: }   Finally, StringBuilder: 1: // build strings using StringBuilder 2: for (int i = 0; i < num; i++) 3: { 4: var builder = new StringBuilder(); 5: builder.Append("This is test #"); 6: builder.Append(i); 7: builder.Append(" with a result of "); 8: builder.Append(strings[i]); 9: results[i] = builder.ToString(); 10: } So I take each of these loops, and time them by using a block like this: 1: // get the total amount of memory used, true tells it to run GC first. 2: start = System.GC.GetTotalMemory(true); 3:  4: // restart the timer 5: timer.Reset(); 6: timer.Start(); 7:  8: // *** code to time and measure goes here. *** 9:  10: // get the current amount of memory, stop the timer, then get memory after GC. 11: stop = System.GC.GetTotalMemory(false); 12: timer.Stop(); 13: other = System.GC.GetTotalMemory(true); So let’s look at what happens when I run each of these blocks through the timer and memory check at 500,000 iterations: 1: Operator + - Time: 547, Memory: 56104540/55595960 - 500000 2: string.Format() - Time: 749, Memory: 57295812/55595960 - 500000 3: StringBuilder - Time: 608, Memory: 55312888/55595960 – 500000   Egad!  string.Format brings up the rear and + triumphs, well, at least in terms of speed.  The concat burns more memory than StringBuilder but less than string.Format().  This shows two main things: StringBuilder is not always the panacea many think it is. The difference between any of the three is miniscule! The second point is extremely important!  You will often here people who will grasp at results and say, “look, operator + is 10% faster than StringBuilder so always use StringBuilder.”  Statements like this are a disservice and often misleading.  For example, if I had a good guess at what the size of the string would be, I could have preallocated my StringBuffer like so:   1: for (int i = 0; i < num; i++) 2: { 3: // pre-declare StringBuilder to have 100 char buffer. 4: var builder = new StringBuilder(100); 5: builder.Append("This is test #"); 6: builder.Append(i); 7: builder.Append(" with a result of "); 8: builder.Append(strings[i]); 9: results[i] = builder.ToString(); 10: }   Now let’s look at the times: 1: Operator + - Time: 551, Memory: 56104412/55595960 - 500000 2: string.Format() - Time: 753, Memory: 57296484/55595960 - 500000 3: StringBuilder - Time: 525, Memory: 59779156/55595960 - 500000   Whoa!  All of the sudden StringBuilder is back on top again!  But notice, it takes more memory now.  This makes perfect sense if you examine the IL behind the scenes.  Whenever you do a string concat (+) in your code, it examines the lengths of the arguments and creates a StringBuilder behind the scenes of the appropriate size for you. But even IF we know the approximate size of our StringBuilder, look how much less readable it is!  That’s why I feel you should always take into account both readability and performance.  After all, consider all these timings are over 500,000 iterations.   That’s at best  0.0004 ms difference per call which is neglidgable at best.  The key is to pick the best tool for the job.  What do I mean?  Consider these awesome words of wisdom: Concatenate (+) is best at concatenating.  StringBuilder is best when you need to building. Format is best at formatting. Totally Earth-shattering, right!  But if you consider it carefully, it actually has a lot of beauty in it’s simplicity.  Remember, there is no magic bullet.  If one of these always beat the others we’d only have one and not three choices. The fact is, the concattenation operator (+) has been optimized for speed and looks the cleanest for joining together a known set of strings in the simplest manner possible. StringBuilder, on the other hand, excels when you need to build a string of inderterminant length.  Use it in those times when you are looping till you hit a stop condition and building a result and it won’t steer you wrong. String.Format seems to be the looser from the stats, but consider which of these is more readable.  Yes, ignore the fact that you could do this with ToString() on a DateTime.  1: // build a date via concatenation 2: var date1 = (month < 10 ? string.Empty : "0") + month + '/' 3: + (day < 10 ? string.Empty : "0") + '/' + year; 4:  5: // build a date via string builder 6: var builder = new StringBuilder(10); 7: if (month < 10) builder.Append('0'); 8: builder.Append(month); 9: builder.Append('/'); 10: if (day < 10) builder.Append('0'); 11: builder.Append(day); 12: builder.Append('/'); 13: builder.Append(year); 14: var date2 = builder.ToString(); 15:  16: // build a date via string.Format 17: var date3 = string.Format("{0:00}/{1:00}/{2:0000}", month, day, year); 18:  So the strength in string.Format is that it makes constructing a formatted string easy to read.  Yes, it’s slower, but look at how much more elegant it is to do zero-padding and anything else string.Format does. So my lesson is, don’t look for the silver bullet!  Choose the best tool.  Micro-optimization almost always bites you in the end because you’re sacrificing readability for performance, which is almost exactly the wrong choice 90% of the time. I love the rules of optimization.  They’ve been stated before in many forms, but here’s how I always remember them: For Beginners: Do not optimize. For Experts: Do not optimize yet. It’s so true.  Most of the time on today’s modern hardware, a micro-second optimization at the sake of readability will net you nothing because it won’t be your bottleneck.  Code for readability, choose the best tool for the job which will usually be the most readable and maintainable as well.  Then, and only then, if you need that extra performance boost after profiling your code and exhausting all other options… then you can start to think about optimizing.

    Read the article

  • C#/.NET Little Wonders: The EventHandler and EventHandler&lt;TEventArgs&gt; delegates

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. In the last two weeks, we examined the Action family of delegates (and delegates in general), and the Func family of delegates and how they can be used to support generic, reusable algorithms and classes. So this week, we are going to look at a handy pair of delegates that can be used to eliminate the need for defining custom delegates when creating events: the EventHandler and EventHandler<TEventArgs> delegates. Events and delegates Before we begin, let’s quickly consider events in .NET.  According to the MSDN: An event in C# is a way for a class to provide notifications to clients of that class when some interesting thing happens to an object. So, basically, you can create an event in a type so that users of that type can subscribe to notifications of things of interest.  How is this different than some of the delegate programming that we talked about in the last two weeks?  Well, you can think of an event as a special access modifier on a delegate.  Some differences between the two are: Events are a special access case of delegates They behave much like delegates instances inside the type they are declared in, but outside of that type they can only be (un)subscribed to. Events can specify add/remove behavior explicitly If you want to do additional work when someone subscribes or unsubscribes to an event, you can specify the add and remove actions explicitly. Events have access modifiers, but these only specify the access level of those who can (un)subscribe A public event, for example, means anyone can (un)subscribe, but it does not mean that anyone can raise (invoke) the event directly.  Events can only be raised by the type that contains them In contrast, if a delegate is visible, it can be invoked outside of the object (not even in a sub-class!). Events tend to be for notifications only, and should be treated as optional Semantically speaking, events typically don’t perform work on the the class directly, but tend to just notify subscribers when something of note occurs. My basic rule-of-thumb is that if you are just wanting to notify any listeners (who may or may not care) that something has happened, use an event.  However, if you want the caller to provide some function to perform to direct the class about how it should perform work, make it a delegate. Declaring events using custom delegates To declare an event in a type, we simply use the event keyword and specify its delegate type.  For example, let’s say you wanted to create a new TimeOfDayTimer that triggers at a given time of the day (as opposed to on an interval).  We could write something like this: 1: public delegate void TimeOfDayHandler(object source, ElapsedEventArgs e); 2:  3: // A timer that will fire at time of day each day. 4: public class TimeOfDayTimer : IDisposable 5: { 6: // Event that is triggered at time of day. 7: public event TimeOfDayHandler Elapsed; 8:  9: // ... 10: } The first thing to note is that the event is a delegate type, which tells us what types of methods may subscribe to it.  The second thing to note is the signature of the event handler delegate, according to the MSDN: The standard signature of an event handler delegate defines a method that does not return a value, whose first parameter is of type Object and refers to the instance that raises the event, and whose second parameter is derived from type EventArgs and holds the event data. If the event does not generate event data, the second parameter is simply an instance of EventArgs. Otherwise, the second parameter is a custom type derived from EventArgs and supplies any fields or properties needed to hold the event data. So, in a nutshell, the event handler delegates should return void and take two parameters: An object reference to the object that raised the event. An EventArgs (or a subclass of EventArgs) reference to event specific information. Even if your event has no additional information to provide, you are still expected to provide an EventArgs instance.  In this case, feel free to pass the EventArgs.Empty singleton instead of creating new instances of EventArgs (to avoid generating unneeded memory garbage). The EventHandler delegate Because many events have no additional information to pass, and thus do not require custom EventArgs, the signature of the delegates for subscribing to these events is typically: 1: // always takes an object and an EventArgs reference 2: public delegate void EventHandler(object sender, EventArgs e) It would be insane to recreate this delegate for every class that had a basic event with no additional event data, so there already exists a delegate for you called EventHandler that has this very definition!  Feel free to use it to define any events which supply no additional event information: 1: public class Cache 2: { 3: // event that is raised whenever the cache performs a cleanup 4: public event EventHandler OnCleanup; 5:  6: // ... 7: } This will handle any event with the standard EventArgs (no additional information).  But what of events that do need to supply additional information?  Does that mean we’re out of luck for subclasses of EventArgs?  That’s where the generic for of EventHandler comes into play… The generic EventHandler<TEventArgs> delegate Starting with the introduction of generics in .NET 2.0, we have a generic delegate called EventHandler<TEventArgs>.  Its signature is as follows: 1: public delegate void EventHandler<TEventArgs>(object sender, TEventArgs e) 2: where TEventArgs : EventArgs This is similar to EventHandler except it has been made generic to support the more general case.  Thus, it will work for any delegate where the first argument is an object (the sender) and the second argument is a class derived from EventArgs (the event data). For example, let’s say we wanted to create a message receiver, and we wanted it to have a few events such as OnConnected that will tell us when a connection is established (probably with no additional information) and OnMessageReceived that will tell us when a new message arrives (probably with a string for the new message text). So for OnMessageReceived, our MessageReceivedEventArgs might look like this: 1: public sealed class MessageReceivedEventArgs : EventArgs 2: { 3: public string Message { get; set; } 4: } And since OnConnected needs no event argument type defined, our class might look something like this: 1: public class MessageReceiver 2: { 3: // event that is called when the receiver connects with sender 4: public event EventHandler OnConnected; 5:  6: // event that is called when a new message is received. 7: public event EventHandler<MessageReceivedEventArgs> OnMessageReceived; 8:  9: // ... 10: } Notice, nowhere did we have to define a delegate to fit our event definition, the EventHandler and generic EventHandler<TEventArgs> delegates fit almost anything we’d need to do with events. Sidebar: Thread-safety and raising an event When the time comes to raise an event, we should always check to make sure there are subscribers, and then only raise the event if anyone is subscribed.  This is important because if no one is subscribed to the event, then the instance will be null and we will get a NullReferenceException if we attempt to raise the event. 1: // This protects against NullReferenceException... or does it? 2: if (OnMessageReceived != null) 3: { 4: OnMessageReceived(this, new MessageReceivedEventArgs(aMessage)); 5: } The above code seems to handle the null reference if no one is subscribed, but there’s a problem if this is being used in multi-threaded environments.  For example, assume we have thread A which is about to raise the event, and it checks and clears the null check and is about to raise the event.  However, before it can do that thread B unsubscribes to the event, which sets the delegate to null.  Now, when thread A attempts to raise the event, this causes the NullReferenceException that we were hoping to avoid! To counter this, the simplest best-practice method is to copy the event (just a multicast delegate) to a temporary local variable just before we raise it.  Since we are inside the class where this event is being raised, we can copy it to a local variable like this, and it will protect us from multi-threading since multicast delegates are immutable and assignments are atomic: 1: // always make copy of the event multi-cast delegate before checking 2: // for null to avoid race-condition between the null-check and raising it. 3: var handler = OnMessageReceived; 4: 5: if (handler != null) 6: { 7: handler(this, new MessageReceivedEventArgs(aMessage)); 8: } The very slight trade-off is that it’s possible a class may get an event after it unsubscribes in a multi-threaded environment, but this is a small risk and classes should be prepared for this possibility anyway.  For a more detailed discussion on this, check out this excellent Eric Lippert blog post on Events and Races. Summary Generic delegates give us a lot of power to make generic algorithms and classes, and the EventHandler delegate family gives us the flexibility to create events easily, without needing to redefine delegates over and over.  Use them whenever you need to define events with or without specialized EventArgs.   Tweet Technorati Tags: .NET, C#, CSharp, Little Wonders, Generics, Delegates, EventHandler

    Read the article

  • C#/.NET Little Wonders: A Redux

    - by James Michael Hare
    I gave my Little Wonders presentation to the Topeka Dot Net Users' Group today, so re-posting the links to all the previous posts for them. The Presentation: C#/.NET Little Wonders: A Presentation The Original Trilogy: C#/.NET Five Little Wonders (part 1) C#/.NET Five More Little Wonders (part 2) C#/.NET Five Final Little Wonders (part 3) The Subsequent Sequels: C#/.NET Little Wonders: ToDictionary() and ToList() C#/.NET Little Wonders: DateTime is Packed With Goodies C#/.NET Little Wonders: Fun With Enum Methods C#/.NET Little Wonders: Cross-Calling Constructors C#/.NET Little Wonders: Constraining Generics With Where Clause C#/.NET Little Wonders: Comparer<T>.Default C#/.NET Little Wonders: The Useful (But Overlooked) Sets The Concurrent Wonders: C#/.NET Little Wonders: The Concurrent Collections (1 of 3) - ConcurrentQueue and ConcurrentStack C#/.NET Little Wonders: The Concurrent Collections (2 of 3) - ConcurrentDictionary Tweet   Technorati Tags: .NET,C#,Little Wonders

    Read the article

  • C#/.NET Little Wonders: Of LINQ and Lambdas - A Presentation

    - by James Michael Hare
    Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. Today I’m giving a brief beginner’s guide to LINQ and Lambdas at the St. Louis .NET User’s Group so I thought I’d post the presentation here as well.  I updated the presentation a bit as well as added some notes on the query syntax.  Enjoy! The C#/.NET Fundaments: Of Lambdas and LINQ Presentation Of Lambdas and LINQ View more presentations from BlackRabbitCoder   Technorati Tags: C#, CSharp, .NET, Little Wonders, LINQ, Lambdas

    Read the article

  • C#: Optional Parameters - Pros and Pitfalls

    - by James Michael Hare
    When Microsoft rolled out Visual Studio 2010 with C# 4, I was very excited to learn how I could apply all the new features and enhancements to help make me and my team more productive developers. Default parameters have been around forever in C++, and were intentionally omitted in Java in favor of using overloading to satisfy that need as it was though that having too many default parameters could introduce code safety issues.  To some extent I can understand that move, as I’ve been bitten by default parameter pitfalls before, but at the same time I feel like Java threw out the baby with the bathwater in that move and I’m glad to see C# now has them. This post briefly discusses the pros and pitfalls of using default parameters.  I’m avoiding saying cons, because I really don’t believe using default parameters is a negative thing, I just think there are things you must watch for and guard against to avoid abuses that can cause code safety issues. Pro: Default Parameters Can Simplify Code Let’s start out with positives.  Consider how much cleaner it is to reduce all the overloads in methods or constructors that simply exist to give the semblance of optional parameters.  For example, we could have a Message class defined which allows for all possible initializations of a Message: 1: public class Message 2: { 3: // can either cascade these like this or duplicate the defaults (which can introduce risk) 4: public Message() 5: : this(string.Empty) 6: { 7: } 8:  9: public Message(string text) 10: : this(text, null) 11: { 12: } 13:  14: public Message(string text, IDictionary<string, string> properties) 15: : this(text, properties, -1) 16: { 17: } 18:  19: public Message(string text, IDictionary<string, string> properties, long timeToLive) 20: { 21: // ... 22: } 23: }   Now consider the same code with default parameters: 1: public class Message 2: { 3: // can either cascade these like this or duplicate the defaults (which can introduce risk) 4: public Message(string text = "", IDictionary<string, string> properties = null, long timeToLive = -1) 5: { 6: // ... 7: } 8: }   Much more clean and concise and no repetitive coding!  In addition, in the past if you wanted to be able to cleanly supply timeToLive and accept the default on text and properties above, you would need to either create another overload, or pass in the defaults explicitly.  With named parameters, though, we can do this easily: 1: var msg = new Message(timeToLive: 100);   Pro: Named Parameters can Improve Readability I must say one of my favorite things with the default parameters addition in C# is the named parameters.  It lets code be a lot easier to understand visually with no comments.  Think how many times you’ve run across a TimeSpan declaration with 4 arguments and wondered if they were passing in days/hours/minutes/seconds or hours/minutes/seconds/milliseconds.  A novice running through your code may wonder what it is.  Named arguments can help resolve the visual ambiguity: 1: // is this days/hours/minutes/seconds (no) or hours/minutes/seconds/milliseconds (yes) 2: var ts = new TimeSpan(1, 2, 3, 4); 3:  4: // this however is visually very explicit 5: var ts = new TimeSpan(days: 1, hours: 2, minutes: 3, seconds: 4);   Or think of the times you’ve run across something passing a Boolean literal and wondered what it was: 1: // what is false here? 2: var sub = CreateSubscriber(hostname, port, false); 3:  4: // aha! Much more visibly clear 5: var sub = CreateSubscriber(hostname, port, isBuffered: false);   Pitfall: Don't Insert new Default Parameters In Between Existing Defaults Now let’s consider a two potential pitfalls.  The first is really an abuse.  It’s not really a fault of the default parameters themselves, but a fault in the use of them.  Let’s consider that Message constructor again with defaults.  Let’s say you want to add a messagePriority to the message and you think this is more important than a timeToLive value, so you decide to put messagePriority before it in the default, this gives you: 1: public class Message 2: { 3: public Message(string text = "", IDictionary<string, string> properties = null, int priority = 5, long timeToLive = -1) 4: { 5: // ... 6: } 7: }   Oh boy have we set ourselves up for failure!  Why?  Think of all the code out there that could already be using the library that already specified the timeToLive, such as this possible call: 1: var msg = new Message(“An error occurred”, myProperties, 1000);   Before this specified a message with a TTL of 1000, now it specifies a message with a priority of 1000 and a time to live of -1 (infinite).  All of this with NO compiler errors or warnings. So the rule to take away is if you are adding new default parameters to a method that’s currently in use, make sure you add them to the end of the list or create a brand new method or overload. Pitfall: Beware of Default Parameters in Inheritance and Interface Implementation Now, the second potential pitfalls has to do with inheritance and interface implementation.  I’ll illustrate with a puzzle: 1: public interface ITag 2: { 3: void WriteTag(string tagName = "ITag"); 4: } 5:  6: public class BaseTag : ITag 7: { 8: public virtual void WriteTag(string tagName = "BaseTag") { Console.WriteLine(tagName); } 9: } 10:  11: public class SubTag : BaseTag 12: { 13: public override void WriteTag(string tagName = "SubTag") { Console.WriteLine(tagName); } 14: } 15:  16: public static class Program 17: { 18: public static void Main() 19: { 20: SubTag subTag = new SubTag(); 21: BaseTag subByBaseTag = subTag; 22: ITag subByInterfaceTag = subTag; 23:  24: // what happens here? 25: subTag.WriteTag(); 26: subByBaseTag.WriteTag(); 27: subByInterfaceTag.WriteTag(); 28: } 29: }   What happens?  Well, even though the object in each case is SubTag whose tag is “SubTag”, you will get: 1: SubTag 2: BaseTag 3: ITag   Why?  Because default parameter are resolved at compile time, not runtime!  This means that the default does not belong to the object being called, but by the reference type it’s being called through.  Since the SubTag instance is being called through an ITag reference, it will use the default specified in ITag. So the moral of the story here is to be very careful how you specify defaults in interfaces or inheritance hierarchies.  I would suggest avoiding repeating them, and instead concentrating on the layer of classes or interfaces you must likely expect your caller to be calling from. For example, if you have a messaging factory that returns an IMessage which can be either an MsmqMessage or JmsMessage, it only makes since to put the defaults at the IMessage level since chances are your user will be using the interface only. So let’s sum up.  In general, I really love default and named parameters in C# 4.0.  I think they’re a great tool to help make your code easier to read and maintain when used correctly. On the plus side, default parameters: Reduce redundant overloading for the sake of providing optional calling structures. Improve readability by being able to name an ambiguous argument. But remember to make sure you: Do not insert new default parameters in the middle of an existing set of default parameters, this may cause unpredictable behavior that may not necessarily throw a syntax error – add to end of list or create new method. Be extremely careful how you use default parameters in inheritance hierarchies and interfaces – choose the most appropriate level to add the defaults based on expected usage. Technorati Tags: C#,.NET,Software,Default Parameters

    Read the article

  • C#/.NET Little Wonders: ConcurrentBag and BlockingCollection

    - by James Michael Hare
    In the first week of concurrent collections, began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>.  The last post discussed the ConcurrentDictionary<T> .  Finally this week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see C#/.NET Little Wonders: A Redux. Recap As you'll recall from the previous posts, the original collections were object-based containers that accomplished synchronization through a Synchronized member.  With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization.  With .NET 4.0, a new breed of collections was born in the System.Collections.Concurrent namespace.  Of these, the final concurrent collection we will examine is the ConcurrentBag and a very useful wrapper class called the BlockingCollection. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this informative whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentBag<T> – Thread-safe unordered collection. Unlike the other concurrent collections, the ConcurrentBag<T> has no non-concurrent counterpart in the .NET collections libraries.  Items can be added and removed from a bag just like any other collection, but unlike the other collections, the items are not maintained in any order.  This makes the bag handy for those cases when all you care about is that the data be consumed eventually, without regard for order of consumption or even fairness – that is, it’s possible new items could be consumed before older items given the right circumstances for a period of time. So why would you ever want a container that can be unfair?  Well, to look at it another way, you can use a ConcurrentQueue and get the fairness, but it comes at a cost in that the ordering rules and synchronization required to maintain that ordering can affect scalability a bit.  Thus sometimes the bag is great when you want the fastest way to get the next item to process, and don’t care what item it is or how long its been waiting. The way that the ConcurrentBag works is to take advantage of the new ThreadLocal<T> type (new in System.Threading for .NET 4.0) so that each thread using the bag has a list local to just that thread.  This means that adding or removing to a thread-local list requires very low synchronization.  The problem comes in where a thread goes to consume an item but it’s local list is empty.  In this case the bag performs “work-stealing” where it will rob an item from another thread that has items in its list.  This requires a higher level of synchronization which adds a bit of overhead to the take operation. So, as you can imagine, this makes the ConcurrentBag good for situations where each thread both produces and consumes items from the bag, but it would be less-than-idea in situations where some threads are dedicated producers and the other threads are dedicated consumers because the work-stealing synchronization would outweigh the thread-local optimization for a thread taking its own items. Like the other concurrent collections, there are some curiosities to keep in mind: IsEmpty(), Count, ToArray(), and GetEnumerator() lock collection Each of these needs to take a snapshot of whole bag to determine if empty, thus they tend to be more expensive and cause Add() and Take() operations to block. ToArray() and GetEnumerator() are static snapshots Because it is based on a snapshot, will not show subsequent updates after snapshot. Add() is lightweight Since adding to the thread-local list, there is very little overhead on Add. TryTake() is lightweight if items in thread-local list As long as items are in the thread-local list, TryTake() is very lightweight, much more so than ConcurrentStack() and ConcurrentQueue(), however if the local thread list is empty, it must steal work from another thread, which is more expensive. Remember, a bag is not ideal for all situations, it is mainly ideal for situations where a process consumes an item and either decomposes it into more items to be processed, or handles the item partially and places it back to be processed again until some point when it will complete.  The main point is that the bag works best when each thread both takes and adds items. For example, we could create a totally contrived example where perhaps we want to see the largest power of a number before it crosses a certain threshold.  Yes, obviously we could easily do this with a log function, but bare with me while I use this contrived example for simplicity. So let’s say we have a work function that will take a Tuple out of a bag, this Tuple will contain two ints.  The first int is the original number, and the second int is the last multiple of that number.  So we could load our bag with the initial values (let’s say we want to know the last multiple of each of 2, 3, 5, and 7 under 100. 1: var bag = new ConcurrentBag<Tuple<int, int>> 2: { 3: Tuple.Create(2, 1), 4: Tuple.Create(3, 1), 5: Tuple.Create(5, 1), 6: Tuple.Create(7, 1) 7: }; Then we can create a method that given the bag, will take out an item, apply the multiplier again, 1: public static void FindHighestPowerUnder(ConcurrentBag<Tuple<int,int>> bag, int threshold) 2: { 3: Tuple<int,int> pair; 4:  5: // while there are items to take, this will prefer local first, then steal if no local 6: while (bag.TryTake(out pair)) 7: { 8: // look at next power 9: var result = Math.Pow(pair.Item1, pair.Item2 + 1); 10:  11: if (result < threshold) 12: { 13: // if smaller than threshold bump power by 1 14: bag.Add(Tuple.Create(pair.Item1, pair.Item2 + 1)); 15: } 16: else 17: { 18: // otherwise, we're done 19: Console.WriteLine("Highest power of {0} under {3} is {0}^{1} = {2}.", 20: pair.Item1, pair.Item2, Math.Pow(pair.Item1, pair.Item2), threshold); 21: } 22: } 23: } Now that we have this, we can load up this method as an Action into our Tasks and run it: 1: // create array of tasks, start all, wait for all 2: var tasks = new[] 3: { 4: new Task(() => FindHighestPowerUnder(bag, 100)), 5: new Task(() => FindHighestPowerUnder(bag, 100)), 6: }; 7:  8: Array.ForEach(tasks, t => t.Start()); 9:  10: Task.WaitAll(tasks); Totally contrived, I know, but keep in mind the main point!  When you have a thread or task that operates on an item, and then puts it back for further consumption – or decomposes an item into further sub-items to be processed – you should consider a ConcurrentBag as the thread-local lists will allow for quick processing.  However, if you need ordering or if your processes are dedicated producers or consumers, this collection is not ideal.  As with anything, you should performance test as your mileage will vary depending on your situation! BlockingCollection<T> – A producers & consumers pattern collection The BlockingCollection<T> can be treated like a collection in its own right, but in reality it adds a producers and consumers paradigm to any collection that implements the interface IProducerConsumerCollection<T>.  If you don’t specify one at the time of construction, it will use a ConcurrentQueue<T> as its underlying store. If you don’t want to use the ConcurrentQueue, the ConcurrentStack and ConcurrentBag also implement the interface (though ConcurrentDictionary does not).  In addition, you are of course free to create your own implementation of the interface. So, for those who don’t remember the producers and consumers classical computer-science problem, the gist of it is that you have one (or more) processes that are creating items (producers) and one (or more) processes that are consuming these items (consumers).  Now, the crux of the problem is that there is a bin (queue) where the produced items are placed, and typically that bin has a limited size.  Thus if a producer creates an item, but there is no space to store it, it must wait until an item is consumed.  Also if a consumer goes to consume an item and none exists, it must wait until an item is produced. The BlockingCollection makes it trivial to implement any standard producers/consumers process set by providing that “bin” where the items can be produced into and consumed from with the appropriate blocking operations.  In addition, you can specify whether the bin should have a limited size or can be (theoretically) unbounded, and you can specify timeouts on the blocking operations. As far as your choice of “bin”, for the most part the ConcurrentQueue is the right choice because it is fairly light and maximizes fairness by ordering items so that they are consumed in the same order they are produced.  You can use the concurrent bag or stack, of course, but your ordering would be random-ish in the case of the former and LIFO in the case of the latter. So let’s look at some of the methods of note in BlockingCollection: BoundedCapacity returns capacity of the “bin” If the bin is unbounded, the capacity is int.MaxValue. Count returns an internally-kept count of items This makes it O(1), but if you modify underlying collection directly (not recommended) it is unreliable. CompleteAdding() is used to cut off further adds. This sets IsAddingCompleted and begins to wind down consumers once empty. IsAddingCompleted is true when producers are “done”. Once you are done producing, should complete the add process to alert consumers. IsCompleted is true when producers are “done” and “bin” is empty. Once you mark the producers done, and all items removed, this will be true. Add() is a blocking add to collection. If bin is full, will wait till space frees up Take() is a blocking remove from collection. If bin is empty, will wait until item is produced or adding is completed. GetConsumingEnumerable() is used to iterate and consume items. Unlike the standard enumerator, this one consumes the items instead of iteration. TryAdd() attempts add but does not block completely If adding would block, returns false instead, can specify TimeSpan to wait before stopping. TryTake() attempts to take but does not block completely Like TryAdd(), if taking would block, returns false instead, can specify TimeSpan to wait. Note the use of CompleteAdding() to signal the BlockingCollection that nothing else should be added.  This means that any attempts to TryAdd() or Add() after marked completed will throw an InvalidOperationException.  In addition, once adding is complete you can still continue to TryTake() and Take() until the bin is empty, and then Take() will throw the InvalidOperationException and TryTake() will return false. So let’s create a simple program to try this out.  Let’s say that you have one process that will be producing items, but a slower consumer process that handles them.  This gives us a chance to peek inside what happens when the bin is bounded (by default, the bin is NOT bounded). 1: var bin = new BlockingCollection<int>(5); Now, we create a method to produce items: 1: public static void ProduceItems(BlockingCollection<int> bin, int numToProduce) 2: { 3: for (int i = 0; i < numToProduce; i++) 4: { 5: // try for 10 ms to add an item 6: while (!bin.TryAdd(i, TimeSpan.FromMilliseconds(10))) 7: { 8: Console.WriteLine("Bin is full, retrying..."); 9: } 10: } 11:  12: // once done producing, call CompleteAdding() 13: Console.WriteLine("Adding is completed."); 14: bin.CompleteAdding(); 15: } And one to consume them: 1: public static void ConsumeItems(BlockingCollection<int> bin) 2: { 3: // This will only be true if CompleteAdding() was called AND the bin is empty. 4: while (!bin.IsCompleted) 5: { 6: int item; 7:  8: if (!bin.TryTake(out item, TimeSpan.FromMilliseconds(10))) 9: { 10: Console.WriteLine("Bin is empty, retrying..."); 11: } 12: else 13: { 14: Console.WriteLine("Consuming item {0}.", item); 15: Thread.Sleep(TimeSpan.FromMilliseconds(20)); 16: } 17: } 18: } Then we can fire them off: 1: // create one producer and two consumers 2: var tasks = new[] 3: { 4: new Task(() => ProduceItems(bin, 20)), 5: new Task(() => ConsumeItems(bin)), 6: new Task(() => ConsumeItems(bin)), 7: }; 8:  9: Array.ForEach(tasks, t => t.Start()); 10:  11: Task.WaitAll(tasks); Notice that the producer is faster than the consumer, thus it should be hitting a full bin often and displaying the message after it times out on TryAdd(). 1: Consuming item 0. 2: Consuming item 1. 3: Bin is full, retrying... 4: Bin is full, retrying... 5: Consuming item 3. 6: Consuming item 2. 7: Bin is full, retrying... 8: Consuming item 4. 9: Consuming item 5. 10: Bin is full, retrying... 11: Consuming item 6. 12: Consuming item 7. 13: Bin is full, retrying... 14: Consuming item 8. 15: Consuming item 9. 16: Bin is full, retrying... 17: Consuming item 10. 18: Consuming item 11. 19: Bin is full, retrying... 20: Consuming item 12. 21: Consuming item 13. 22: Bin is full, retrying... 23: Bin is full, retrying... 24: Consuming item 14. 25: Adding is completed. 26: Consuming item 15. 27: Consuming item 16. 28: Consuming item 17. 29: Consuming item 19. 30: Consuming item 18. Also notice that once CompleteAdding() is called and the bin is empty, the IsCompleted property returns true, and the consumers will exit. Summary The ConcurrentBag is an interesting collection that can be used to optimize concurrency scenarios where tasks or threads both produce and consume items.  In this way, it will choose to consume its own work if available, and then steal if not.  However, in situations where you want fair consumption or ordering, or in situations where the producers and consumers are distinct processes, the bag is not optimal. The BlockingCollection is a great wrapper around all of the concurrent queue, stack, and bag that allows you to add producer and consumer semantics easily including waiting when the bin is full or empty. That’s the end of my dive into the concurrent collections.  I’d also strongly recommend, once again, you read this excellent Microsoft white paper that goes into much greater detail on the efficiencies you can gain using these collections judiciously (here). Tweet Technorati Tags: C#,.NET,Concurrent Collections,Little Wonders

    Read the article

  • C++ Little Wonders: The C++11 auto keyword redux

    - by James Michael Hare
    I’ve decided to create a sub-series of my Little Wonders posts to focus on C++.  Just like their C# counterparts, these posts will focus on those features of the C++ language that can help improve code by making it easier to write and maintain.  The index of the C# Little Wonders can be found here. This has been a busy week with a rollout of some new website features here at my work, so I don’t have a big post for this week.  But I wanted to write something up, and since lately I’ve been renewing my C++ skills in a separate project, it seemed like a good opportunity to start a C++ Little Wonders series.  Most of my development work still tends to focus on C#, but it was great to get back into the saddle and renew my C++ knowledge.  Today I’m going to focus on a new feature in C++11 (formerly known as C++0x, which is a major move forward in the C++ language standard).  While this small keyword can seem so trivial, I feel it is a big step forward in improving readability in C++ programs. The auto keyword If you’ve worked on C++ for a long time, you probably have some passing familiarity with the old auto keyword as one of those rarely used C++ keywords that was almost never used because it was the default. That is, in the code below (before C++11): 1: int foo() 2: { 3: // automatic variables (allocated and deallocated on stack) 4: int x; 5: auto int y; 6:  7: // static variables (retain their value across calls) 8: static int z; 9:  10: return 0; 11: } The variable x is assumed to be auto because that is the default, thus it is unnecessary to specify it explicitly as in the declaration of y below that.  Basically, an auto variable is one that is allocated and de-allocated on the stack automatically.  Contrast this to static variables, that are allocated statically and exist across the lifetime of the program. Because auto was so rarely (if ever) used since it is the norm, they decided to remove it for this purpose and give it new meaning in C++11.  The new meaning of auto: implicit typing Now, if your compiler supports C++ 11 (or at least a good subset of C++11 or 0x) you can take advantage of type inference in C++.  For those of you from the C# world, this means that the auto keyword in C++ now behaves a lot like the var keyword in C#! For example, many of us have had to declare those massive type declarations for an iterator before.  Let’s say we have a std::map of std::string to int which will map names to ages: 1: std::map<std::string, int> myMap; And then let’s say we want to find the age of a given person: 1: // Egad that's a long type... 2: std::map<std::string, int>::const_iterator pos = myMap.find(targetName); Notice that big ugly type definition to declare variable pos?  Sure, we could shorten this by creating a typedef of our specific map type if we wanted, but now with the auto keyword there’s no need: 1: // much shorter! 2: auto pos = myMap.find(targetName); The auto now tells the compiler to determine what type pos should be based on what it’s being assigned to.  This is not dynamic typing, it still determines the type as if it were explicitly declared and once declared that type cannot be changed.  That is, this is invalid: 1: // x is type int 2: auto x = 42; 3:  4: // can't assign string to int 5: x = "Hello"; Once the compiler determines x is type int it is exactly as if we typed int x = 42; instead, so don’t' confuse it with dynamic typing, it’s still very type-safe. An interesting feature of the auto keyword is that you can modify the inferred type: 1: // declare method that returns int* 2: int* GetPointer(); 3:  4: // p1 is int*, auto inferred type is int 5: auto *p1 = GetPointer(); 6:  7: // ps is int*, auto inferred type is int* 8: auto p2 = GetPointer(); Notice in both of these cases, p1 and p2 are determined to be int* but in each case the inferred type was different.  because we declared p1 as auto *p1 and GetPointer() returns int*, it inferred the type int was needed to complete the declaration.  In the second case, however, we declared p2 as auto p2 which means the inferred type was int*.  Ultimately, this make p1 and p2 the same type, but which type is inferred makes a difference, if you are chaining multiple inferred declarations together.  In these cases, the inferred type of each must match the first: 1: // Type inferred is int 2: // p1 is int* 3: // p2 is int 4: // p3 is int& 5: auto *p1 = GetPointer(), p2 = 42, &p3 = p2; Note that this works because the inferred type was int, if the inferred type was int* instead: 1: // syntax error, p1 was inferred to be int* so p2 and p3 don't make sense 2: auto p1 = GetPointer(), p2 = 42, &p3 = p2; You could also use const or static to modify the inferred type: 1: // inferred type is an int, theAnswer is a const int 2: const auto theAnswer = 42; 3:  4: // inferred type is double, Pi is a static double 5: static auto Pi = 3.1415927; Thus in the examples above it inferred the types int and double respectively, which were then modified to const and static. Summary The auto keyword has gotten new life in C++11 to allow you to infer the type of a variable from it’s initialization.  This simple little keyword can be used to cut down large declarations for complex types into a much more readable form, where appropriate.   Technorati Tags: C++, C++11, Little Wonders, auto

    Read the article

1 2 3 4  | Next Page >