Search Results

Search found 17770 results on 711 pages for 'inside the concurrent collections'.

Page 1/711 | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

Lazy non-modifiable list in Google Collections

- by mindas

I was looking for a decent implementation of a generic lazy non-modifiable list implementation to wrap my search result entries. The unmodifiable part of the task is easy as it can be achieved by Collections.unmodifiableList() so I only need to sort out the the lazy part. Surprisingly, google-collections doesn't have anything to offer; while LazyList from Apache Commons Collections does not support generics. I have found an attempt to build something on top of google-collections but it seems to be incomplete (e.g. does not support size()), outdated (does not compile with 1.0 final) and requiring some external classes, but could be used as a good starting point to build my own class. Is anybody aware of any good implementation of a LazyList? If not, which option do you think is better: write my own implementation, based on google-collections ForwardingList, similar to what Peter Maas did; write my own wrapper around Commons Collections LazyList (the wrapper would only add generics so I don't have to cast everywhere but only in the wrapper itself); just write something on top of java.util.AbstractList; Any other suggestions are welcome.

Read the article
C#/.NET Little Wonders: The Concurrent Collections (1 of 3)

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. In the next few weeks, we will discuss the concurrent collections and how they have changed the face of concurrent programming. This week’s post will begin with a general introduction and discuss the ConcurrentStack<T> and ConcurrentQueue<T>. Then in the following post we’ll discuss the ConcurrentDictionary<T> and ConcurrentBag<T>. Finally, we shall close on the third post with a discussion of the BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. A brief history of collections In the beginning was the .NET 1.0 Framework. And out of this framework emerged the System.Collections namespace, and it was good. It contained all the basic things a growing programming language needs like the ArrayList and Hashtable collections. The main problem, of course, with these original collections is that they held items of type object which means you had to be disciplined enough to use them correctly or you could end up with runtime errors if you got an object of a type you weren't expecting. Then came .NET 2.0 and generics and our world changed forever! With generics the C# language finally got an equivalent of the very powerful C++ templates. As such, the System.Collections.Generic was born and we got type-safe versions of all are favorite collections. The List<T> succeeded the ArrayList and the Dictionary<TKey,TValue> succeeded the Hashtable and so on. The new versions of the library were not only safer because they checked types at compile-time, in many cases they were more performant as well. So much so that it's Microsoft's recommendation that the System.Collections original collections only be used for backwards compatibility. So we as developers came to know and love the generic collections and took them into our hearts and embraced them. The problem is, thread safety in both the original collections and the generic collections can be problematic, for very different reasons. Now, if you are only doing single-threaded development you may not care – after all, no locking is required. Even if you do have multiple threads, if a collection is “load-once, read-many” you don’t need to do anything to protect that container from multi-threaded access, as illustrated below: 1: public static class OrderTypeTranslator 2: { 3: // because this dictionary is loaded once before it is ever accessed, we don't need to synchronize 4: // multi-threaded read access 5: private static readonly Dictionary<string, char> _translator = new Dictionary<string, char> 6: { 7: {"New", 'N'}, 8: {"Update", 'U'}, 9: {"Cancel", 'X'} 10: }; 11: 12: // the only public interface into the dictionary is for reading, so inherently thread-safe 13: public static char? Translate(string orderType) 14: { 15: char charValue; 16: if (_translator.TryGetValue(orderType, out charValue)) 17: { 18: return charValue; 19: } 20: 21: return null; 22: } 23: } Unfortunately, most of our computer science problems cannot get by with just single-threaded applications or with multi-threading in a load-once manner. Looking at today's trends, it's clear to see that computers are not so much getting faster because of faster processor speeds -- we've nearly reached the limits we can push through with today's technologies -- but more because we're adding more cores to the boxes. With this new hardware paradigm, it is even more important to use multi-threaded applications to take full advantage of parallel processing to achieve higher application speeds. So let's look at how to use collections in a thread-safe manner. Using historical collections in a concurrent fashion The early .NET collections (System.Collections) had a Synchronized() static method that could be used to wrap the early collections to make them completely thread-safe. This paradigm was dropped in the generic collections (System.Collections.Generic) because having a synchronized wrapper resulted in atomic locks for all operations, which could prove overkill in many multithreading situations. Thus the paradigm shifted to having the user of the collection specify their own locking, usually with an external object: 1: public class OrderAggregator 2: { 3: private static readonly Dictionary<string, List<Order>> _orders = new Dictionary<string, List<Order>>(); 4: private static readonly _orderLock = new object(); 5: 6: public void Add(string accountNumber, Order newOrder) 7: { 8: List<Order> ordersForAccount; 9: 10: // a complex operation like this should all be protected 11: lock (_orderLock) 12: { 13: if (!_orders.TryGetValue(accountNumber, out ordersForAccount)) 14: { 15: _orders.Add(accountNumber, ordersForAccount = new List<Order>()); 16: } 17: 18: ordersForAccount.Add(newOrder); 19: } 20: } 21: } Notice how we’re performing several operations on the dictionary under one lock. With the Synchronized() static methods of the early collections, you wouldn’t be able to specify this level of locking (a more macro-level). So in the generic collections, it was decided that if a user needed synchronization, they could implement their own locking scheme instead so that they could provide synchronization as needed. The need for better concurrent access to collections Here’s the problem: it’s relatively easy to write a collection that locks itself down completely for access, but anything more complex than that can be difficult and error-prone to write, and much less to make it perform efficiently! For example, what if you have a Dictionary that has frequent reads but in-frequent updates? Do you want to lock down the entire Dictionary for every access? This would be overkill and would prevent concurrent reads. In such cases you could use something like a ReaderWriterLockSlim which allows for multiple readers in a lock, and then once a writer grabs the lock it blocks all further readers until the writer is done (in a nutshell). This is all very complex stuff to consider. Fortunately, this is where the Concurrent Collections come in. The Parallel Computing Platform team at Microsoft went through great pains to determine how to make a set of concurrent collections that would have the best performance characteristics for general case multi-threaded use. Now, as in all things involving threading, you should always make sure you evaluate all your container options based on the particular usage scenario and the degree of parallelism you wish to acheive. This article should not be taken to understand that these collections are always supperior to the generic collections. Each fills a particular need for a particular situation. Understanding what each container is optimized for is key to the success of your application whether it be single-threaded or multi-threaded. General points to consider with the concurrent collections The MSDN points out that the concurrent collections all support the ICollection interface. However, since the collections are already synchronized, the IsSynchronized property always returns false, and SyncRoot always returns null. Thus you should not attempt to use these properties for synchronization purposes. Note that since the concurrent collections also may have different operations than the traditional data structures you may be used to. Now you may ask why they did this, but it was done out of necessity to keep operations safe and atomic. For example, in order to do a Pop() on a stack you have to know the stack is non-empty, but between the time you check the stack’s IsEmpty property and then do the Pop() another thread may have come in and made the stack empty! This is why some of the traditional operations have been changed to make them safe for concurrent use. In addition, some properties and methods in the concurrent collections achieve concurrency by creating a snapshot of the collection, which means that some operations that were traditionally O(1) may now be O(n) in the concurrent models. I’ll try to point these out as we talk about each collection so you can be aware of any potential performance impacts. Finally, all the concurrent containers are safe for enumeration even while being modified, but some of the containers support this in different ways (snapshot vs. dirty iteration). Once again I’ll highlight how thread-safe enumeration works for each collection. ConcurrentStack<T>: The thread-safe LIFO container The ConcurrentStack<T> is the thread-safe counterpart to the System.Collections.Generic.Stack<T>, which as you may remember is your standard last-in-first-out container. If you think of algorithms that favor stack usage (for example, depth-first searches of graphs and trees) then you can see how using a thread-safe stack would be of benefit. The ConcurrentStack<T> achieves thread-safe access by using System.Threading.Interlocked operations. This means that the multi-threaded access to the stack requires no traditional locking and is very, very fast! For the most part, the ConcurrentStack<T> behaves like it’s Stack<T> counterpart with a few differences: Pop() was removed in favor of TryPop() Returns true if an item existed and was popped and false if empty. PushRange() and TryPopRange() were added Allows you to push multiple items and pop multiple items atomically. Count takes a snapshot of the stack and then counts the items. This means it is a O(n) operation, if you just want to check for an empty stack, call IsEmpty instead which is O(1). ToArray() and GetEnumerator() both also take snapshots. This means that iteration over a stack will give you a static view at the time of the call and will not reflect updates. Pushing on a ConcurrentStack<T> works just like you’d expect except for the aforementioned PushRange() method that was added to allow you to push a range of items concurrently. 1: var stack = new ConcurrentStack<string>(); 2: 3: // adding to stack is much the same as before 4: stack.Push("First"); 5: 6: // but you can also push multiple items in one atomic operation (no interleaves) 7: stack.PushRange(new [] { "Second", "Third", "Fourth" }); For looking at the top item of the stack (without removing it) the Peek() method has been removed in favor of a TryPeek(). This is because in order to do a peek the stack must be non-empty, but between the time you check for empty and the time you execute the peek the stack contents may have changed. Thus the TryPeek() was created to be an atomic check for empty, and then peek if not empty: 1: // to look at top item of stack without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (stack.TryPeek(out item)) 5: { 6: Console.WriteLine("Top item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Stack was empty."); 11: } Finally, to remove items from the stack, we have the TryPop() for single, and TryPopRange() for multiple items. Just like the TryPeek(), these operations replace Pop() since we need to ensure atomically that the stack is non-empty before we pop from it: 1: // to remove items, use TryPop or TryPopRange to get multiple items atomically (no interleaves) 2: if (stack.TryPop(out item)) 3: { 4: Console.WriteLine("Popped " + item); 5: } 6: 7: // TryPopRange will only pop up to the number of spaces in the array, the actual number popped is returned. 8: var poppedItems = new string[2]; 9: int numPopped = stack.TryPopRange(poppedItems); 10: 11: foreach (var theItem in poppedItems.Take(numPopped)) 12: { 13: Console.WriteLine("Popped " + theItem); 14: } Finally, note that as stated before, GetEnumerator() and ToArray() gets a snapshot of the data at the time of the call. That means if you are enumerating the stack you will get a snapshot of the stack at the time of the call. This is illustrated below: 1: var stack = new ConcurrentStack<string>(); 2: 3: // adding to stack is much the same as before 4: stack.Push("First"); 5: 6: var results = stack.GetEnumerator(); 7: 8: // but you can also push multiple items in one atomic operation (no interleaves) 9: stack.PushRange(new [] { "Second", "Third", "Fourth" }); 10: 11: while(results.MoveNext()) 12: { 13: Console.WriteLine("Stack only has: " + results.Current); 14: } The only item that will be printed out in the above code is "First" because the snapshot was taken before the other items were added. This may sound like an issue, but it’s really for safety and is more correct. You don’t want to enumerate a stack and have half a view of the stack before an update and half a view of the stack after an update, after all. In addition, note that this is still thread-safe, whereas iterating through a non-concurrent collection while updating it in the old collections would cause an exception. ConcurrentQueue<T>: The thread-safe FIFO container The ConcurrentQueue<T> is the thread-safe counterpart of the System.Collections.Generic.Queue<T> class. The concurrent queue uses an underlying list of small arrays and lock-free System.Threading.Interlocked operations on the head and tail arrays. Once again, this allows us to do thread-safe operations without the need for heavy locks! The ConcurrentQueue<T> (like the ConcurrentStack<T>) has some departures from the non-concurrent counterpart. Most notably: Dequeue() was removed in favor of TryDequeue(). Returns true if an item existed and was dequeued and false if empty. Count does not take a snapshot It subtracts the head and tail index to get the count. This results overall in a O(1) complexity which is quite good. It’s still recommended, however, that for empty checks you call IsEmpty instead of comparing Count to zero. ToArray() and GetEnumerator() both take snapshots. This means that iteration over a queue will give you a static view at the time of the call and will not reflect updates. The Enqueue() method on the ConcurrentQueue<T> works much the same as the generic Queue<T>: 1: var queue = new ConcurrentQueue<string>(); 2: 3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: queue.Enqueue("Second"); 6: queue.Enqueue("Third"); For front item access, the TryPeek() method must be used to attempt to see the first item if the queue. There is no Peek() method since, as you’ll remember, we can only peek on a non-empty queue, so we must have an atomic TryPeek() that checks for empty and then returns the first item if the queue is non-empty. 1: // to look at first item in queue without removing it, can use TryPeek. 2: // Note that there is no Peek(), this is because you need to check for empty first. TryPeek does. 3: string item; 4: if (queue.TryPeek(out item)) 5: { 6: Console.WriteLine("First item was " + item); 7: } 8: else 9: { 10: Console.WriteLine("Queue was empty."); 11: } Then, to remove items you use TryDequeue(). Once again this is for the same reason we have TryPeek() and not Peek(): 1: // to remove items, use TryDequeue. If queue is empty returns false. 2: if (queue.TryDequeue(out item)) 3: { 4: Console.WriteLine("Dequeued first item " + item); 5: } Just like the concurrent stack, the ConcurrentQueue<T> takes a snapshot when you call ToArray() or GetEnumerator() which means that subsequent updates to the queue will not be seen when you iterate over the results. Thus once again the code below will only show the first item, since the other items were added after the snapshot. 1: var queue = new ConcurrentQueue<string>(); 2: 3: // adding to queue is much the same as before 4: queue.Enqueue("First"); 5: 6: var iterator = queue.GetEnumerator(); 7: 8: queue.Enqueue("Second"); 9: queue.Enqueue("Third"); 10: 11: // only shows First 12: while (iterator.MoveNext()) 13: { 14: Console.WriteLine("Dequeued item " + iterator.Current); 15: } Using collections concurrently You’ll notice in the examples above I stuck to using single-threaded examples so as to make them deterministic and the results obvious. Of course, if we used these collections in a truly multi-threaded way the results would be less deterministic, but would still be thread-safe and with no locking on your part required! For example, say you have an order processor that takes an IEnumerable<Order> and handles each other in a multi-threaded fashion, then groups the responses together in a concurrent collection for aggregation. This can be done easily with the TPL’s Parallel.ForEach(): 1: public static IEnumerable<OrderResult> ProcessOrders(IEnumerable<Order> orderList) 2: { 3: var proxy = new OrderProxy(); 4: var results = new ConcurrentQueue<OrderResult>(); 5: 6: // notice that we can process all these in parallel and put the results 7: // into our concurrent collection without needing any external locking! 8: Parallel.ForEach(orderList, 9: order => 10: { 11: var result = proxy.PlaceOrder(order); 12: 13: results.Enqueue(result); 14: }); 15: 16: return results; 17: } Summary Obviously, if you do not need multi-threaded safety, you don’t need to use these collections, but when you do need multi-threaded collections these are just the ticket! The plethora of features (I always think of the movie The Three Amigos when I say plethora) built into these containers and the amazing way they acheive thread-safe access in an efficient manner is wonderful to behold. Stay tuned next week where we’ll continue our discussion with the ConcurrentBag<T> and the ConcurrentDictionary<TKey,TValue>. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. Tweet Technorati Tags: C#,.NET,Concurrent Collections,Collections,Multi-Threading,Little Wonders,BlackRabbitCoder,James Michael Hare

Read the article
Inside the Concurrent Collections: ConcurrentBag

- by Simon Cooper

Unlike the other concurrent collections, ConcurrentBag does not really have a non-concurrent analogy. As stated in the MSDN documentation, ConcurrentBag is optimised for the situation where the same thread is both producing and consuming items from the collection. We'll see how this is the case as we take a closer look. Again, I recommend you have ConcurrentBag open in a decompiler for reference. Thread Statics ConcurrentBag makes heavy use of thread statics - static variables marked with ThreadStaticAttribute. This is a special attribute that instructs the CLR to scope any values assigned to or read from the variable to the executing thread, not globally within the AppDomain. This means that if two different threads assign two different values to the same thread static variable, one value will not overwrite the other, and each thread will see the value they assigned to the variable, separately to any other thread. This is a very useful function that allows for ConcurrentBag's concurrency properties. You can think of a thread static variable: [ThreadStatic] private static int m_Value; as doing the same as: private static Dictionary<Thread, int> m_Values; where the executing thread's identity is used to automatically set and retrieve the corresponding value in the dictionary. In .NET 4, this usage of ThreadStaticAttribute is encapsulated in the ThreadLocal class. Lists of lists ConcurrentBag, at its core, operates as a linked list of linked lists: Each outer list node is an instance of ThreadLocalList, and each inner list node is an instance of Node. Each outer ThreadLocalList is owned by a particular thread, accessible through the thread local m_locals variable: private ThreadLocal<ThreadLocalList<T>> m_locals It is important to note that, although the m_locals variable is thread-local, that only applies to accesses through that variable. The objects referenced by the thread (each instance of the ThreadLocalList object) are normal heap objects that are not specific to any thread. Thinking back to the Dictionary analogy above, if each value stored in the dictionary could be accessed by other means, then any thread could access the value belonging to other threads using that mechanism. Only reads and writes to the variable defined as thread-local are re-routed by the CLR according to the executing thread's identity. So, although m_locals is defined as thread-local, the m_headList, m_nextList and m_tailList variables aren't. This means that any thread can access all the thread local lists in the collection by doing a linear search through the outer linked list defined by these variables. Adding items So, onto the collection operations. First, adding items. This one's pretty simple. If the current thread doesn't already own an instance of ThreadLocalList, then one is created (or, if there are lists owned by threads that have stopped, it takes control of one of those). Then the item is added to the head of that thread's list. That's it. Don't worry, it'll get more complicated when we account for the other operations on the list! Taking & Peeking items This is where it gets tricky. If the current thread's list has items in it, then it peeks or removes the head item (not the tail item) from the local list and returns that. However, if the local list is empty, it has to go and steal another item from another list, belonging to a different thread. It iterates through all the thread local lists in the collection using the m_headList and m_nextList variables until it finds one that has items in it, and it steals one item from that list. Up to this point, the two threads had been operating completely independently. To steal an item from another thread's list, the stealing thread has to do it in such a way as to not step on the owning thread's toes. Recall how adding and removing items both operate on the head of the thread's linked list? That gives us an easy way out - a thread trying to steal items from another thread can pop in round the back of another thread's list using the m_tail variable, and steal an item from the back without the owning thread knowing anything about it. The owning thread can carry on completely independently, unaware that one of its items has been nicked. However, this only works when there are at least 3 items in the list, as that guarantees there will be at least one node between the owning thread performing operations on the list head and the thread stealing items from the tail - there's no chance of the two threads operating on the same node at the same time and causing a race condition. If there's less than three items in the list, then there does need to be some synchronization between the two threads. In this case, the lock on the ThreadLocalList object is used to mediate access to a thread's list when there's the possibility of contention. Thread synchronization In ConcurrentBag, this is done using several mechanisms: Operations performed by the owner thread only take out the lock when there are less than three items in the collection. With three or greater items, there won't be any conflict with a stealing thread operating on the tail of the list. If a lock isn't taken out, the owning thread sets the list's m_currentOp variable to a non-zero value for the duration of the operation. This indicates to all other threads that there is a non-locked operation currently occuring on that list. The stealing thread always takes out the lock, to prevent two threads trying to steal from the same list at the same time. After taking out the lock, the stealing thread spinwaits until m_currentOp has been set to zero before actually performing the steal. This ensures there won't be a conflict with the owning thread when the number of items in the list is on the 2-3 item borderline. If any add or remove operations are started in the meantime, and the list is below 3 items, those operations try to take out the list's lock and are blocked until the stealing thread has finished. This allows a thread to steal an item from another thread's list without corrupting it. What about synchronization in the collection as a whole? Collection synchronization Any thread that operates on the collection's global structure (accessing anything outside the thread local lists) has to take out the collection's global lock - m_globalListsLock. This single lock is sufficient when adding a new thread local list, as the items inside each thread's list are unaffected. However, what about operations (such as Count or ToArray) that need to access every item in the collection? In order to ensure a consistent view, all operations on the collection are stopped while the count or ToArray is performed. This is done by freezing the bag at the start, performing the global operation, and unfreezing at the end: The global lock is taken out, to prevent structural alterations to the collection. m_needSync is set to true. This notifies all the threads that they need to take out their list's lock irregardless of what operation they're doing. All the list locks are taken out in order. This blocks all locking operations on the lists. The freezing thread waits for all current lockless operations to finish by spinwaiting on each m_currentOp field. The global operation can then be performed while the bag is frozen, but no other operations can take place at the same time, as all other threads are blocked on a list's lock. Then, once the global operation has finished, the locks are released, m_needSync is unset, and normal concurrent operation resumes. Concurrent principles That's the essence of how ConcurrentBag operates. Each thread operates independently on its own local list, except when they have to steal items from another list. When stealing, only the stealing thread is forced to take out the lock; the owning thread only has to when there is the possibility of contention. And a global lock controls accesses to the structure of the collection outside the thread lists. Operations affecting the entire collection take out all locks in the collection to freeze the contents at a single point in time. So, what principles can we extract here? Threads operate independently Thread-static variables and ThreadLocal makes this easy. Threads operate entirely concurrently on their own structures; only when they need to grab data from another thread is there any thread contention. Minimised lock-taking Even when two threads need to operate on the same data structures (one thread stealing from another), they do so in such a way such that the probability of actually blocking on a lock is minimised; the owning thread always operates on the head of the list, and the stealing thread always operates on the tail. Management of lockless operations Any operations that don't take out a lock still have a 'hook' to force them to lock when necessary. This allows all operations on the collection to be stopped temporarily while a global snapshot is taken. Hopefully, such operations will be short-lived and infrequent. That's all the concurrent collections covered. I hope you've found it as informative and interesting as I have. Next, I'll be taking a closer look at ThreadLocal, which I came across while analyzing ConcurrentBag. As you'll see, the operation of this class deserves a much closer look.

Read the article
Inside the Concurrent Collections: ConcurrentDictionary

- by Simon Cooper

Using locks to implement a thread-safe collection is rather like using a sledgehammer - unsubtle, easy to understand, and tends to make any other tool redundant. Unlike the previous two collections I looked at, ConcurrentStack and ConcurrentQueue, ConcurrentDictionary uses locks quite heavily. However, it is careful to wield locks only where necessary to ensure that concurrency is maximised. This will, by necessity, be a higher-level look than my other posts in this series, as there is quite a lot of code and logic in ConcurrentDictionary. Therefore, I do recommend that you have ConcurrentDictionary open in a decompiler to have a look at all the details that I skip over. The problem with locks There's several things to bear in mind when using locks, as encapsulated by the lock keyword in C# and the System.Threading.Monitor class in .NET (if you're unsure as to what lock does in C#, I briefly covered it in my first post in the series): Locks block threads The most obvious problem is that threads waiting on a lock can't do any work at all. No preparatory work, no 'optimistic' work like in ConcurrentQueue and ConcurrentStack, nothing. It sits there, waiting to be unblocked. This is bad if you're trying to maximise concurrency. Locks are slow Whereas most of the methods on the Interlocked class can be compiled down to a single CPU instruction, ensuring atomicity at the hardware level, taking out a lock requires some heavy lifting by the CLR and the operating system. There's quite a bit of work required to take out a lock, block other threads, and wake them up again. If locks are used heavily, this impacts performance. Deadlocks When using locks there's always the possibility of a deadlock - two threads, each holding a lock, each trying to aquire the other's lock. Fortunately, this can be avoided with careful programming and structured lock-taking, as we'll see. So, it's important to minimise where locks are used to maximise the concurrency and performance of the collection. Implementation As you might expect, ConcurrentDictionary is similar in basic implementation to the non-concurrent Dictionary, which I studied in a previous post. I'll be using some concepts introduced there, so I recommend you have a quick read of it. So, if you were implementing a thread-safe dictionary, what would you do? The naive implementation is to simply have a single lock around all methods accessing the dictionary. This would work, but doesn't allow much concurrency. Fortunately, the bucketing used by Dictionary allows a simple but effective improvement to this - one lock per bucket. This allows different threads modifying different buckets to do so in parallel. Any thread making changes to the contents of a bucket takes the lock for that bucket, ensuring those changes are thread-safe. The method that maps each bucket to a lock is the GetBucketAndLockNo method: private void GetBucketAndLockNo( int hashcode, out int bucketNo, out int lockNo, int bucketCount) { // the bucket number is the hashcode (without the initial sign bit) // modulo the number of buckets bucketNo = (hashcode & 0x7fffffff) % bucketCount; // and the lock number is the bucket number modulo the number of locks lockNo = bucketNo % m_locks.Length; } However, this does require some changes to how the buckets are implemented. The 'implicit' linked list within a single backing array used by the non-concurrent Dictionary adds a dependency between separate buckets, as every bucket uses the same backing array. Instead, ConcurrentDictionary uses a strict linked list on each bucket: This ensures that each bucket is entirely separate from all other buckets; adding or removing an item from a bucket is independent to any changes to other buckets. Modifying the dictionary All the operations on the dictionary follow the same basic pattern: void AlterBucket(TKey key, ...) { int bucketNo, lockNo; 1: GetBucketAndLockNo( key.GetHashCode(), out bucketNo, out lockNo, m_buckets.Length); 2: lock (m_locks[lockNo]) { 3: Node headNode = m_buckets[bucketNo]; 4: Mutate the node linked list as appropriate } } For example, when adding another entry to the dictionary, you would iterate through the linked list to check whether the key exists already, and add the new entry as the head node. When removing items, you would find the entry to remove (if it exists), and remove the node from the linked list. Adding, updating, and removing items all follow this pattern. Performance issues There is a problem we have to address at this point. If the number of buckets in the dictionary is fixed in the constructor, then the performance will degrade from O(1) to O(n) when a large number of items are added to the dictionary. As more and more items get added to the linked lists in each bucket, the lookup operations will spend most of their time traversing a linear linked list. To fix this, the buckets array has to be resized once the number of items in each bucket has gone over a certain limit. (In ConcurrentDictionary this limit is when the size of the largest bucket is greater than the number of buckets for each lock. This check is done at the end of the TryAddInternal method.) Resizing the bucket array and re-hashing everything affects every bucket in the collection. Therefore, this operation needs to take out every lock in the collection. Taking out mutiple locks at once inevitably summons the spectre of the deadlock; two threads each hold a lock, and each trying to acquire the other lock. How can we eliminate this? Simple - ensure that threads never try to 'swap' locks in this fashion. When taking out multiple locks, always take them out in the same order, and always take out all the locks you need before starting to release them. In ConcurrentDictionary, this is controlled by the AcquireLocks, AcquireAllLocks and ReleaseLocks methods. Locks are always taken out and released in the order they are in the m_locks array, and locks are all released right at the end of the method in a finally block. At this point, it's worth pointing out that the locks array is never re-assigned, even when the buckets array is increased in size. The number of locks is fixed in the constructor by the concurrencyLevel parameter. This simplifies programming the locks; you don't have to check if the locks array has changed or been re-assigned before taking out a lock object. And you can be sure that when a thread takes out a lock, another thread isn't going to re-assign the lock array. This would create a new series of lock objects, thus allowing another thread to ignore the existing locks (and any threads controlling them), breaking thread-safety. Consequences of growing the array Just because we're using locks doesn't mean that race conditions aren't a problem. We can see this by looking at the GrowTable method. The operation of this method can be boiled down to: private void GrowTable(Node[] buckets) { try { 1: Acquire first lock in the locks array // this causes any other thread trying to take out // all the locks to block because the first lock in the array // is always the one taken out first // check if another thread has already resized the buckets array // while we were waiting to acquire the first lock 2: if (buckets != m_buckets) return; 3: Calculate the new size of the backing array 4: Node[] array = new array[size]; 5: Acquire all the remaining locks 6: Re-hash the contents of the existing buckets into array 7: m_buckets = array; } finally { 8: Release all locks } } As you can see, there's already a check for a race condition at step 2, for the case when the GrowTable method is called twice in quick succession on two separate threads. One will successfully resize the buckets array (blocking the second in the meantime), when the second thread is unblocked it'll see that the array has already been resized & exit without doing anything. There is another case we need to consider; looking back at the AlterBucket method above, consider the following situation: Thread 1 calls AlterBucket; step 1 is executed to get the bucket and lock numbers. Thread 2 calls GrowTable and executes steps 1-5; thread 1 is blocked when it tries to take out the lock in step 2. Thread 2 re-hashes everything, re-assigns the buckets array, and releases all the locks (steps 6-8). Thread 1 is unblocked and continues executing, but the calculated bucket and lock numbers are no longer valid. Between calculating the correct bucket and lock number and taking out the lock, another thread has changed where everything is. Not exactly thread-safe. Well, a similar problem was solved in ConcurrentStack and ConcurrentQueue by storing a local copy of the state, doing the necessary calculations, then checking if that state is still valid. We can use a similar idea here: void AlterBucket(TKey key, ...) { while (true) { Node[] buckets = m_buckets; int bucketNo, lockNo; GetBucketAndLockNo( key.GetHashCode(), out bucketNo, out lockNo, buckets.Length); lock (m_locks[lockNo]) { // if the state has changed, go back to the start if (buckets != m_buckets) continue; Node headNode = m_buckets[bucketNo]; Mutate the node linked list as appropriate } break; } } TryGetValue and GetEnumerator And so, finally, we get onto TryGetValue and GetEnumerator. I've left these to the end because, well, they don't actually use any locks. How can this be? Whenever you change a bucket, you need to take out the corresponding lock, yes? Indeed you do. However, it is important to note that TryGetValue and GetEnumerator don't actually change anything. Just as immutable objects are, by definition, thread-safe, read-only operations don't need to take out a lock because they don't change anything. All lockless methods can happily iterate through the buckets and linked lists without worrying about locking anything. However, this does put restrictions on how the other methods operate. Because there could be another thread in the middle of reading the dictionary at any time (even if a lock is taken out), the dictionary has to be in a valid state at all times. Every change to state has to be made visible to other threads in a single atomic operation (all relevant variables are marked volatile to help with this). This restriction ensures that whatever the reading threads are doing, they never read the dictionary in an invalid state (eg items that should be in the collection temporarily removed from the linked list, or reading a node that has had it's key & value removed before the node itself has been removed from the linked list). Fortunately, all the operations needed to change the dictionary can be done in that way. Bucket resizes are made visible when the new array is assigned back to the m_buckets variable. Any additions or modifications to a node are done by creating a new node, then splicing it into the existing list using a single variable assignment. Node removals are simply done by re-assigning the node's m_next pointer. Because the dictionary can be changed by another thread during execution of the lockless methods, the GetEnumerator method is liable to return dirty reads - changes made to the dictionary after GetEnumerator was called, but before the enumeration got to that point in the dictionary. It's worth listing at this point which methods are lockless, and which take out all the locks in the dictionary to ensure they get a consistent view of the dictionary: Lockless: TryGetValue GetEnumerator The indexer getter ContainsKey Takes out every lock (lockfull?): Count IsEmpty Keys Values CopyTo ToArray Concurrent principles That covers the overall implementation of ConcurrentDictionary. I haven't even begun to scratch the surface of this sophisticated collection. That I leave to you. However, we've looked at enough to be able to extract some useful principles for concurrent programming: Partitioning When using locks, the work is partitioned into independant chunks, each with its own lock. Each partition can then be modified concurrently to other partitions. Ordered lock-taking When a method does need to control the entire collection, locks are taken and released in a fixed order to prevent deadlocks. Lockless reads Read operations that don't care about dirty reads don't take out any lock; the rest of the collection is implemented so that any reading thread always has a consistent view of the collection. That leads us to the final collection in this little series - ConcurrentBag. Lacking a non-concurrent analogy, it is quite different to any other collection in the class libraries. Prepare your thinking hats!

Read the article
Modern programming language with intuitive concurrent programming abstractions

- by faif

I am interested in learning concurrent programming, focusing on the application/user level (not system programming). I am looking for a modern high level programming language that provides intuitive abstractions for writing concurrent applications. I want to focus on languages that increase productivity and hide the complexity of concurrent programming. To give some examples, I don't consider a good option writing multithreaded code in C, C++, or Java because IMHO my productivity is reduced and their programming model is not intuitive. On the other hand, languages that increase productivity and offer more intuitive abstractions such as Python and the multiprocessing module, Erlang, Clojure, Scala, etc. would be good options. What would you recommend based on your experience and why?

Read the article
Functional Methods on Collections

- by GlenPeterson

I'm learning Scala and am a little bewildered by all the methods (higher-order functions) available on the collections. Which ones produce more results than the original collection, which ones produce less, and which are most appropriate for a given problem? Though I'm studying Scala, I think this would pertain to most modern functional languages (Clojure, Haskell) and also to Java 8 which introduces these methods on Java collections. Specifically, right now I'm wondering about map with filter vs. fold/reduce. I was delighted that using foldRight() can yield the same result as a map(...).filter(...) with only one traversal of the underlying collection. But a friend pointed out that foldRight() may force sequential processing while map() is friendlier to being processed by multiple processors in parallel. Maybe this is why mapReduce() is so popular? More generally, I'm still sometimes surprised when I chain several of these methods together to get back a List(List()) or to pass a List(List()) and get back just a List(). For instance, when would I use: collection.map(a => a.map(b => ...)) vs. collection.map(a => ...).map(b => ...) The for/yield command does nothing to help this confusion. Am I asking about the difference between a "fold" and "unfold" operation? Am I trying to jam too many questions into one? I think there may be an underlying concept that, if I understood it, might answer all these questions, or at least tie the answers together.

Read the article
Creating a custom collections class using System.Collections.CollectionBase

This article will show you how you can create a class that can behave just like a collection class.

Read the article
Need a Holistic view of your Concurrent Processing?

- by cwarticki

Need a Holistic view of your Concurrent Processing? Choose CP AnalyzerGo to Doc 1411723.1 for more details and script download. The Concurrent Processing Analyzer is a Self-Service Health-Check script which reviews the overall Concurrent Processing Footprint, analyzes the current configurations and settings for the environment providing feedback and recommendations on Best Practices. This is a non-invasive script which provides recommended actions to be performed on the instance it was run on. For production instances, always apply any changes to a recent clone to ensure an expected outcome. E-Business Applications Concurrent Processing Analyzer Overview E-Business Applications Concurrent Request Analysis E-Business Applications Concurrent Manager Analysis Identifies Concurrent System Setup and configurations Identifies and recommends Concurrent Best Practices Easy to add Tool for regular Concurrent Maintenance Execute Analysis anytime to compare trending from past outputs Feedback welcome!

Read the article
Concurrent Business Events

- by Manoj Madhusoodanan

This blog describes the various business events related to concurrent requests.In the concurrent program definition screen we can see the various business events which are attached to concurrent processing. Following are the actual definition of above business events. Each event will have following parameters. Create subscriptions to above business events.Before testing enable profile option 'Concurrent: Business Intelligence Integration Enable' to Yes. ExampleI have created a scenario.Whenever my concurrent request completes normally I want to send out file as attachment to my mail.So following components I have created.1) Host file deployed on $XXCUST_TOP/bin to send mail.It accepts mail ids,subject and output file.(Code here)2) Concurrent Program to send mail which points to above host file.3) Subscription package to oracle.apps.fnd.concurrent.request.completed.(Code here)Choose a concurrent program which you want to send the out file as attachment.Check Request Completed check box.Submit the program.If it completes normally the business event subscription program will send the out file as attachment to the specified mail id.

Read the article
Comparing two collections for equality

- by Crossbrowser

I would like to compare two collections (in C#), but I'm not sure of the best way to implement this efficiently. I've read the other thread about Enumerable.SequenceEqual, but it's not exactly what I'm looking for. In my case, two collections would be equal if they both contain the same items (no matter the order). Example: collection1 = {1, 2, 3, 4}; collection2 = {2, 4, 1, 3}; collection1 == collection2; // true What I usually do is to loop through each item of one collection and see if it exists in the other collection, then loop through each item of the other collection and see if it exists in the first collection. (I start by comparing the lengths). if (collection1.Count != collection2.Count) return false; // the collections are not equal foreach (Item item in collection1) { if (!collection2.Contains(item)) return false; // the collections are not equal } foreach (Item item in collection2) { if (!collection1.Contains(item)) return false; // the collections are not equal } return true; // the collections are equal However, this is not entirely correct, and it's probably not the most efficient way to do compare two collections for equality. An example I can think of that would be wrong is: collection1 = {1, 2, 3, 3, 4} collection2 = {1, 2, 2, 3, 4} Which would be equal with my implementation. Should I just count the number of times each item is found and make sure the counts are equal in both collections? The examples are in some sort of C# (let's call it pseudo-C#), but give your answer in whatever language you wish, it does not matter. Note: I used integers in the examples for simplicity, but I want to be able to use reference-type objects too (they do not behave correctly as keys because only the reference of the object is compared, not the content).

Read the article
Type-safe, generic, empty Collections with static generics

- by Droo

I return empty collections vs. null whenever possible. I switch between two methods for doing so using java.util.Collections: return Collections.EMPTY_LIST; return Collections.emptyList(); where emptyList() is supposed to be type-safe. But I recently discovered: return Collections.<ComplexObject> emptyList(); return Collections.<ComplexObject> singletonList(new ComplexObject()); etc. I see this method in Eclipse Package Explorer: <clinit> () : void but I don't see how this is done in the source code (1.5). How is this magic tomfoolerie happening!!

Read the article
Inside Red Gate - Exercises in Leanness

- by Simon Cooper

There's a new movement rumbling around Red Gate Towers - the Lean Startup. At its core is the idea that you don't have to be in a company with single-digit employees to be an entrepreneur; you simply have to (being blunt) not know what you should be doing. Specifically, you accept that you don't know everything you need to know in order to create a useful, successful & profitable product. This is something that Red Gate has had problems with in the past; we've created products that weren't aimed at the correct market, or didn't solve the problem the user had (although they solved the problem we thought the users had, or the problem the users thought they had). As a result, these products weren't as successful as they could have been. The ideas at the core of the Lean Startup help to combat this tendency to build large, well-engineered products that solve the wrong problem. You need to actually test your hypotheses about what the users and the market needs, rather than just running a project based on those untested assumptions. Furthermore, these tests need to be done as fast as possible (on the order of a week) so that, if necessary, you can change the direction of the project without wasting effort going down a dead end. Over time, as more tests are done and more hypotheses are confirmed or refuted, the project moves towards something that solves users' actual problems. However, re-aligning the development teams that operate within Red Gate along these lines does itself have some issues; we've got very good at doing large, monolithic releases, with a feature set decided well in advance. Currently it takes about 2 weeks to do install & release testing before a release; this is clearly not practicable for a team doing weekly, or even daily releases. There's also many infrastructure issues to be solved; in our source control, build system, release mechanism, support pages & documentation, licensing system, update system, and download pages. All these need modifications to allow the fast releases necessary for each experiment. Not only do we have to change our infrastructure, we have to change our mindset. Doing daily releases means each release won't get nearly as much testing as 'standard' releases. As a team, we have to be prepared that there will be releases that have bugs and issues with them; not only do we have to be prepared to change direction with every experiment we do, but we have to be ready to fix any bugs that are reported very quickly as well. The SmartAssembly team is spearheading this move towards leanness within the company, using Feature Usage Reporting (FUR). We think this is a cracking feature that will really help developers learn how people use their products, but we need to confirm this hypothesis. So, over the next few weeks, we'll be running a variety of experiments on SmartAssembly to either confirm or refute our hypotheses concerning how people use SmartAssembly and apply FUR to their own products. In the rest of this series, I'll be documenting how the experiments we perform get on, and our experiences with applying the Lean Startup model to a mature product like SmartAssembly.

Read the article
Inside Red Gate - Divisions

- by Simon Cooper

When I joined Red Gate back in 2007, there were around 80 people in the company. Now, around 3 years later, it's grown to more than 200. It's a constant battle against Dunbar's number; the maximum number of people you can keep track of in a social group, to try and maintain that 'small company' feel that attracted myself and so many others to apply in the first place. There are several strategies the company's developed over the years to try and mitigate the effects of Dunbar's number. One of the main ones has been divisionalisation. Divisions The first division, .NET, appeared around the same time that I started in 2007. This combined the development, sales, marketing and management of the .NET tools (then, ANTS Profiler v3) into a separate section of the office. The idea was to increase the cohesion and communication between the different people involved in the entire lifecycle of the tools; from initial product development, through to marketing, then to customer support, who would feed back to the development team. This was such a success that the other development teams were re-worked around this model in 2009. Nowadays there are 4 divisions - SQL Tools, DBA, .NET, and New Business. Along the way there have been various tweaks to the details - the sales teams have been merged into the divisions, marketing and product support have been (mostly) centralised - but the same basic model remains. So, how has this helped? As Red Gate has continued to grow over the years, divisionalisation has turned Red Gate from a monolithic software company into what one person described as a 'federation of small businesses'. Each division is free to structure itself as it sees fit, it's free to decide what to concentrate development work on, organise its own newsletters and webinars, decide its own release schedule. Each division is its own small business. In terms of numbers, the size of each division varies from 20 people (.NET) to 52 (SQL Tools); well below Dunbar's number. From a developer's perspective, this means organisational structure is very flat & wide - there's only 2 layers between myself and the CEOs (not that it matters much; everyone can go and have a chat to Neil or Simon, or anyone else inbetween, whenever they want. Provided you can catch them at their desk!). As Red Gate grows, and expands into new areas, new divisions will be created as needed, old ones merged or disbanded, but the division structure will help to maintain that small-company feel that keeps Red Gate working as it does.

Read the article
Inside Red Gate - Introduction

- by Simon Cooper

I work for Red Gate Software, a software company based in Cambridge, UK. In this series of posts, I'll be discussing how we develop software at Red Gate, and what we get up to, all from a dev's perspective. Before I start the series proper, in this post I'll give you a brief background to what I have done and continue to do as part of my job. The initial few posts will be giving an overview of how the development sections of the company work. There is much more to a software company than writing the products, but as I'm a developer my experience is biased towards that, and so that is what this series will concentrate on. My background Red Gate was founded in 1999 by Neil Davidson & Simon Galbraith, who continue to be joint CEOs. I joined in September 2007, and immediately set to work writing a new Check for Updates client and server (CfU), as part of a team of 2. That was finished at the end of 2007. I then joined the SQL Compare team. The first large project I worked on was updating SQL Compare for SQL Server 2008, resulting in SQL Compare 7, followed by a UI redesign in SQL Compare 8. By the end of this project in early 2009 I had become the 'go-to' guy for the SQL Compare Engine (I'll explain what that means in a later post), which is used by most of the other tools in the SQL Tools division in one way or another. After that, we decided to expand into Oracle, and I wrote the prototype for what became the engine of Schema Compare for Oracle (SCO). In the latter half of 2009 a full project was started, resulting in the release of SCO v1 in early 2010. Near the end of 2010 I moved to the .NET division, where I joined the team working on SmartAssembly. That's what I continue to work on today. The posts in this series will cover my experience in software development at Red Gate, within the SQL Tools and .NET divisions. Hopefully, you'll find this series an interesting look at what exactly goes into producing the software at Red Gate.

Read the article
Inside Red Gate - The Office

- by Simon Cooper

The vast majority of Red Gate is on the first and second floors (the second and third floors in US parlance) of an office building in Cambridge Business Park (here we are!). As you can see, the building is split into three sections; the two wings, and the section between them. As well as being organisationally separate, the four divisions are also split up in the office; each division has it's own floor and wing, so everyone in the division is working together in the same area (.NET and DBA on the left, SQL Tools and New Business on the right). The non-divisional parts of the business share wings with the smaller divisions, again keeping each group together. The canteen One of the downsides of divisionalisation is that communication between people in different decisions is greatly reduced. This is where the canteen (aka the SQL Servery) comes in. Occupying most of the central section on the first floor, the canteen provides free cooked lunch every day, and is where everyone in the company gathers for lunch. The idea is to encourage communication between the divisions; having lunch with people in a different division you wouldn't otherwise talk to helps people keep track of what's going on elsewhere in the company. (I'm still amazed at how the canteen staff provide a wide range of superbly cooked food for over 200 people out of a kitchen in which, if you were to swing a cat, it would get severe head injuries.). There's also table tennis and table football tables that anyone can use, provided you can grab them when they're free! Office layout Cubicles are practically unheard of in the UK, and no one, including the CEOs, has separate offices. The entire office is open-plan, as you can see in this youtube video from when we first moved in (although all the empty desks are now full!). Neil & Simon, instead of having dedicated offices, move between the different divisions every few months to keep up to date with what's going on around the company; sitting with a division gives you a much better overall impression of how the division's doing than written status reports from the division heads. There's also the usual plethora of meeting rooms scattered around the place; when we first moved in in 2009 we had a competition to name them all. We've got Afoxalypse A & B, Seagulls A & B, Traffic Jam, Thinking Hats, Camelids A & B, Horses, etc. All the meeting rooms have pictures on the walls corresponding to their theme, which adds a nice bit of individuality to otherwise fairly drab meeting rooms. Generally, any meeting room can be booked by anyone at any time, although some groups have priority in certain rooms (Camelids B is used a lot for UX testing, the Interview Room is used for, well, interviews). And, as you can see from the video, each area has various pictures, post-its, notes, signs, on the walls to try and stop it being a dull office space. Yes, it's still an office, but it's designed to be as interesting and as individual as possible.

Read the article
Inside Red Gate - Ricky Leeks

- by Simon Cooper

So, one of our profilers has a problem. Red Gate produces two .NET profilers - ANTS Performance Profiler (APP) and ANTS Memory Profiler (AMP). Both products help .NET developers solve problems they are virtually guaranteed to encounter at some point in their careers - slow code, and high memory usage, respectively. Everyone understands slow code - the symptoms are very obvious (an operation takes 2 hours when it should take 10 seconds), you know when you've solved it (the same operation now takes 15 seconds), and everyone understands how you can use a profiler like APP to help solve your particular problem. High memory usage is a much more subtle and misunderstood concept. How can .NET have memory leaks? The garbage collector, and how the CLR uses and frees memory, is one of the most misunderstood concepts in .NET. There's hundreds of blog posts out there covering various aspects of the GC and .NET memory, some of them helpful, some of them confusing, and some of them are just plain wrong. There's a lot of misconceptions out there. And, if you have got an application that uses far too much memory, it can be hard to wade through all the contradictory information available to even get an idea as to what's going on, let alone trying to solve it. That's where a memory profiler, like AMP, comes into play. Unfortunately, that's not the end of the issue. .NET memory management is a large, complicated, and misunderstood problem. Even armed with a profiler, you need to understand what .NET is doing with your objects, how it processes them, and how it frees them, to be able to use the profiler effectively to solve your particular problem. And that's what's wrong with AMP - even with all the thought, designs, UX sessions, and research we've put into AMP itself, some users simply don't have the knowledge required to be able to understand what AMP is telling them about how their application uses memory, and so they have problems understanding & solving their memory problem. Ricky Leeks This is where Ricky Leeks comes in. Created by one of the many...colourful...people in Red Gate, he headlines and promotes several tutorials, pages, and articles all with information on how .NET memory management actually works, with the goal to help educate developers on .NET memory management. And educating us all on how far you can push various vegetable-based puns. This, in turn, not only helps them understand and solve any memory issues they may be having, but helps them proactively code against such memory issues in their existing code. Ricky's latest outing is an interview on .NET Rocks, providing information on the Top 5 .NET Memory Management Gotchas, along with information on a free ebook on .NET Memory Management. Don't worry, there's loads more vegetable-based jokes where those came from...

Read the article
Inside Red Gate - Be Reasonable!

- by Simon Cooper

As I discussed in my previous posts, divisions and project teams within Red Gate are allowed a lot of autonomy to manage themselves. It's not just the teams though, there's an awful lot of freedom given to individual employees within the company as well. Reasonableness How Red Gate treats it's employees is embodied in the phrase 'You will be reasonable with us, and we will be reasonable with you'. As an employee, you are trusted to do your job to the best of you ability. There's no one looking over your shoulder, no one clocking you in and out each day. Everyone is working at the company because they want to, and one of the core ideas of Red Gate is that the company exists to 'let people do the best work of their lives'. Everything is geared towards that. To help you do your job, office services and the IT department are there. If you need something to help you work better (a third or fourth monitor, footrests, or a new keyboard) then ask people in Information Systems (IS) or Office Services and you will be given it, no questions asked. Everyone has administrator access to their own machines, and you can install whatever you want on it. If there's a particular bit of software you need, then ask IS and they will buy it. As an example, last year I wanted to replace my main hard drive with an SSD; I had a summer job at school working in a computer repair shop, so knew what to do. I went to IS and asked for 'an SSD, a SATA cable, and a screwdriver'. And I got it there and then, even the screwdriver. Awesome. I screwed it in myself, copied all my main drive files across, and I was good to go. Of course, if you're not happy doing that yourself, then IS will sort it all out for you, no problems. If you need something that the company doesn't have (say, a book off Amazon, or you need some specifications printing off & bound), then everyone has a expense limit of £100 that you can use without any sign-off needed from your managers. If you need a company credit card for whatever reason, then you can get it. This freedom extends to working hours and holiday; you're expected to be in the office 11am-3pm each day, but outside those times you can work whenever you want. If you need a half-day holiday on a days notice, or even the same day, then you'll get it, unless there's a good reason you're needed that day. If you need to work from home for a day or so for whatever reason, then you can. If it's reasonable, then it's allowed. Trust issues? A lot of trust, and a lot of leeway, is given to all the people in Red Gate. Everyone is expected to work hard, do their jobs to the best of their ability, and there will be a minimum of bureaucratic obstacles that stop you doing your work. What happens if you abuse this trust? Well, an example is company trip expenses. You're free to expense what you like; food, drink, transport, etc, but if you expenses are not reasonable, then you will never travel with the company again. Simple as that. Everyone knows when they're abusing the system, so simply don't do it. Along with reasonableness, another phrase used is 'Don't be a ***'. If you act like a ***, and abuse any of the trust placed in you, even if you're the best tester, salesperson, dev, or manager in the company, then you won't be a part of the company any more. From what I know about other companies, employee trust is highly variable between companies, all the way up to CCTV trained on employee's monitors. As a dev, I want to produce well-written & useful code that solves people's problems. Being able to get whatever I need - install whatever tools I need, get time off when I need to, obtain reference books within a day - all let me do my job, and so let Red Gate help other people do their own jobs through the tools we produce. Plus, I don't think I would like working for a company that doesn't allow admin access to your own machine and blocks Facebook!

Read the article
Inside Red Gate - Experimenting In Public

- by Simon Cooper

Over the next few weeks, we'll be performing experiments on SmartAssembly to confirm or refute various hypotheses we have about how people use the product, what is stopping them from using it to its full extent, and what we can change to make it more useful and easier to use. Some of these experiments can be done within the team, some within Red Gate, and some need to be done on external users. External testing Some external testing can be done by standard usability tests and surveys, however, there are some hypotheses that can only be tested by building a version of SmartAssembly with some things in the UI or implementation changed. We'll then be able to look at how the experimental build is used compared to the 'mainline' build, which forms our baseline or control group, and use this data to confirm or refute the relevant hypotheses. However, there are several issues we need to consider before running experiments using separate builds: Ideally, the user wouldn't know they're running an experimental SmartAssembly. We don't want users to use the experimental build like it's an experimental build, we want them to use it like it's the real mainline build. Only then will we get valid, useful, and informative data concerning our hypotheses. There's no point running the experiments if we can't find out what happens after the download. To confirm or refute some of our hypotheses, we need to find out how the tool is used once it is installed. Fortunately, we've applied feature usage reporting to the SmartAssembly codebase itself to provide us with that information. Of course, this then makes the experimental data conditional on the user agreeing to send that data back to us in the first place. Unfortunately, even though this does limit the amount of useful data we'll be getting back, and possibly skew the data, there's not much we can do about this; we don't collect feature usage data without the user's consent. Looks like we'll simply have to live with this. What if the user tries to buy the experiment? This is something that isn't really covered by the Lean Startup book; how do you support users who give you money for an experiment? If the experiment is a new feature, and the user buys a license for SmartAssembly based on that feature, then what do we do if we later decide to pivot & scrap that feature? We've either got to spend time and money bringing that feature up to production quality and into the mainline anyway, or we've got disgruntled customers. Either way is bad. Again, there's not really any good solution to this. Similarly, what if we've removed some features for an experiment and a potential new user downloads the experimental build? (As I said above, there's no indication the build is an experimental build, as we want to see what users really do with it). The crucial feature they need is missing, causing a bad trial experience, a lost potential customer, and a lost chance to help the customer with their problem. Again, this is something not really covered by the Lean Startup book, and something that doesn't have a good solution. So, some tricky issues there, not all of them with nice easy answers. Turns out the practicalities of running Lean Startup experiments are more complicated than they first seem!

Read the article
Inside Red Gate - Project teams

- by Simon Cooper

Within each division in Red Gate, development effort is structured around one or more project teams; currently, each division contains 2-3 separate teams. These are self contained units responsible for a particular development project. Project team structure The typical size of a development team varies, but is normally around 4-7 people - one project manager, two developers, one or two testers, a technical author (who is responsible for the text within the application, website content, and help documentation) and a user experience designer (who designs and prototypes the UIs) . However, team sizes can vary from 3 up to 12, depending on the division and project. As an rule, all the team sits together in the same area of the office. (Again, this is my experience of what happens. I haven't worked in the DBA division, and SQL Tools might have changed completely since I moved to .NET. As I mentioned in my previous post, each division is free to structure itself as it sees fit.) Depending on the project, and the other needs in the division, the tech author and UX designer may be shared between several projects. Generally, developers and testers work on one project at a time. If the project is a simple point release, then it might not need a UX designer at all. However, if it's a brand new product, then a UX designer and tech author will be involved right from the start. Developers, testers, and the project manager will normally stay together in the same team as they work on different projects, unless there's a good reason to split or merge teams for a particular project. Technical authors and UX designers will normally go wherever they are needed in the division, depending on what each project needs at the time. In my case, I was working with more or less the same people for over 2 years, all the way through SQL Compare 7, 8, and Schema Compare for Oracle. This helped to build a great sense of camaraderie wihin the team, and helped to form and maintain a team identity. This, in turn, meant we worked very well together, and so the final result was that much better (as well as making the work more fun). How is a project started and run? The product manager within each division collates user feedback and ideas, does lots of research, throws in a few ideas from people within the company, and then comes up with a list of what the division should work on in the next few years. This is split up into projects, and after each project is greenlit (I'll be discussing this later on) it is then assigned to a project team, as and when they become available (I'm sure there's lots of discussions and meetings at this point that I'm not aware of!). From that point, it's entirely up to the project team. Just as divisions are autonomous, project teams are also given a high degree of autonomy. All the teams in Red Gate use some sort of vaguely agile methodology; most use some variations on SCRUM, some have experimented with Kanban. Some store the project progress on a whiteboard, some use our bug tracker, others use different methods. It all depends on what the team members think will work best for them to get the best result at the end. From that point, the project proceeds as you would expect; code gets written, tests pass and fail, discussions about how to resolve various problems are had and decided upon, and out pops a new product, new point release, new internal tool, or whatever the project's goal was. The project manager ensures that everyone works together without too much bloodshed and that thrown missiles are constrained to Nerf bullets, the developers write the code, the testers ensure it actually works, and the tech author and UX designer ensure that people will be able to use the final product to solve their problem (after all, developers make lousy UI designers and technical authors). Projects in Red Gate last a relatively short amount of time; most projects are less than 6 months. The longest was 18 months. This has evolved as the company has grown, and I suspect is a side effect of the type of software Red Gate produces. As an ISV, we sell packaged software; we only get revenue when customers purchase the ready-made tools. As a result, we only get a sellable piece of software right at the end of a project. Therefore, the longer the project lasts, the more time and money has to be invested by the company before we get any revenue from it, and the riskier the project becomes. This drives the average project time down. Small project teams are the core of how Red Gate produces software, and are what the whole development effort of the company is built around. In my next post, I'll be looking at the office itself, and how all 200 of us manage to fit on two floors of a small office building.

Read the article
Why does using Collections.emptySet() with generics work in assignment but not as a method parameter

- by Karl von L

So, I have a class with a constructor like this: public FilterList(Set<Integer> labels) { ... } and I want to construct a new FilterList object with an empty set. Following Joshua Bloch's advice in his book Effective Java, I don't want to create a new object for the empty set; I'll just use Collections.emptySet() instead: FilterList emptyList = new FilterList(Collections.emptySet()); This gives me an error, complaining that java.util.Set<java.lang.Object> is not a java.util.Set<java.lang.Integer>. OK, how about this: FilterList emptyList = new FilterList((Set<Integer>)Collections.emptySet()); This also gives me an error! Ok, how about this: Set<Integer> empty = Collections.emptySet(); FilterList emptyList = new FilterList(empty); Hey, it works! But why? After all, Java doesn't have type inference, which is why you get an unchecked conversion warning if you do Set<Integer> foo = new TreeSet() instead of Set<Integer> foo = new TreeSet<Integer>(). But Set<Integer> empty = Collections.emptySet(); works without even a warning. Why is that?

Read the article
Parallel Classloading Revisited: Fully Concurrent Loading

- by davidholmes

Java 7 introduced support for parallel classloading. A description of that project and its goals can be found here: http://openjdk.java.net/groups/core-libs/ClassLoaderProposal.html The solution for parallel classloading was to add to each class loader a ConcurrentHashMap, referenced through a new field, parallelLockMap. This contains a mapping from class names to Objects to use as a classloading lock for that class name. This was then used in the following way: protected Class loadClass(String name, boolean resolve) throws ClassNotFoundException { synchronized (getClassLoadingLock(name)) { // First, check if the class has already been loaded Class c = findLoadedClass(name); if (c == null) { long t0 = System.nanoTime(); try { if (parent != null) { c = parent.loadClass(name, false); } else { c = findBootstrapClassOrNull(name); } } catch (ClassNotFoundException e) { // ClassNotFoundException thrown if class not found // from the non-null parent class loader } if (c == null) { // If still not found, then invoke findClass in order // to find the class. long t1 = System.nanoTime(); c = findClass(name); // this is the defining class loader; record the stats sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0); sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1); sun.misc.PerfCounter.getFindClasses().increment(); } } if (resolve) { resolveClass(c); } return c; } } Where getClassLoadingLock simply does: protected Object getClassLoadingLock(String className) { Object lock = this; if (parallelLockMap != null) { Object newLock = new Object(); lock = parallelLockMap.putIfAbsent(className, newLock); if (lock == null) { lock = newLock; } } return lock; } This approach is very inefficient in terms of the space used per map and the number of maps. First, there is a map per-classloader. As per the code above under normal delegation the current classloader creates and acquires a lock for the given class, checks if it is already loaded, then asks its parent to load it; the parent in turn creates another lock in its own map, checks if the class is already loaded and then delegates to its parent and so on till the boot loader is invoked for which there is no map and no lock. So even in the simplest of applications, you will have two maps (in the system and extensions loaders) for every class that has to be loaded transitively from the application's main class. If you knew before hand which loader would actually load the class the locking would only need to be performed in that loader. As it stands the locking is completely unnecessary for all classes loaded by the boot loader. Secondly, once loading has completed and findClass will return the class, the lock and the map entry is completely unnecessary. But as it stands, the lock objects and their associated entries are never removed from the map. It is worth understanding exactly what the locking is intended to achieve, as this will help us understand potential remedies to the above inefficiencies. Given this is the support for parallel classloading, the class loader itself is unlikely to need to guard against concurrent load attempts - and if that were not the case it is likely that the classloader would need a different means to protect itself rather than a lock per class. Ultimately when a class file is located and the class has to be loaded, defineClass is called which calls into the VM - the VM does not require any locking at the Java level and uses its own mutexes for guarding its internal data structures (such as the system dictionary). The classloader locking is primarily needed to address the following situation: if two threads attempt to load the same class, one will initiate the request through the appropriate loader and eventually cause defineClass to be invoked. Meanwhile the second attempt will block trying to acquire the lock. Once the class is loaded the first thread will release the lock, allowing the second to acquire it. The second thread then sees that the class has now been loaded and will return that class. Neither thread can tell which did the loading and they both continue successfully. Consider if no lock was acquired in the classloader. Both threads will eventually locate the file for the class, read in the bytecodes and call defineClass to actually load the class. In this case the first to call defineClass will succeed, while the second will encounter an exception due to an attempted redefinition of an existing class. It is solely for this error condition that the lock has to be used. (Note that parallel capable classloaders should not need to be doing old deadlock-avoidance tricks like doing a wait() on the lock object\!). There are a number of obvious things we can try to solve this problem and they basically take three forms: Remove the need for locking. This might be achieved by having a new version of defineClass which acts like defineClassIfNotPresent - simply returning an existing Class rather than triggering an exception. Increase the coarseness of locking to reduce the number of lock objects and/or maps. For example, using a single shared lockMap instead of a per-loader lockMap. Reduce the lifetime of lock objects so that entries are removed from the map when no longer needed (eg remove after loading, use weak references to the lock objects and cleanup the map periodically). There are pros and cons to each of these approaches. Unfortunately a significant "con" is that the API introduced in Java 7 to support parallel classloading has essentially mandated that these locks do in fact exist, and they are accessible to the application code (indirectly through the classloader if it exposes them - which a custom loader might do - and regardless they are accessible to custom classloaders). So while we can reason that we could do parallel classloading with no locking, we can not implement this without breaking the specification for parallel classloading that was put in place for Java 7. Similarly we might reason that we can remove a mapping (and the lock object) because the class is already loaded, but this would again violate the specification because it can be reasoned that the following assertion should hold true: Object lock1 = loader.getClassLoadingLock(name); loader.loadClass(name); Object lock2 = loader.getClassLoadingLock(name); assert lock1 == lock2; Without modifying the specification, or at least doing some creative wordsmithing on it, options 1 and 3 are precluded. Even then there are caveats, for example if findLoadedClass is not atomic with respect to defineClass, then you can have concurrent calls to findLoadedClass from different threads and that could be expensive (this is also an argument against moving findLoadedClass outside the locked region - it may speed up the common case where the class is already loaded, but the cost of re-executing after acquiring the lock could be prohibitive. Even option 2 might need some wordsmithing on the specification because the specification for getClassLoadingLock states "returns a dedicated object associated with the specified class name". The question is, what does "dedicated" mean here? Does it mean unique in the sense that the returned object is only associated with the given class in the current loader? Or can the object actually guard loading of multiple classes, possibly across different class loaders? So it seems that changing the specification will be inevitable if we wish to do something here. In which case lets go for something that more cleanly defines what we want to be doing: fully concurrent class-loading. Note: defineClassIfNotPresent is already implemented in the VM as find_or_define_class. It is only used if the AllowParallelDefineClass flag is set. This gives us an easy hook into existing VM mechanics. Proposal: Fully Concurrent ClassLoaders The proposal is that we expand on the notion of a parallel capable class loader and define a "fully concurrent parallel capable class loader" or fully concurrent loader, for short. A fully concurrent loader uses no synchronization in loadClass and the VM uses the "parallel define class" mechanism. For a fully concurrent loader getClassLoadingLock() can return null (or perhaps not - it doesn't matter as we won't use the result anyway). At present we have not made any changes to this method. All the parallel capable JDK classloaders become fully concurrent loaders. This doesn't require any code re-design as none of the mechanisms implemented rely on the per-name locking provided by the parallelLockMap. This seems to give us a path to remove all locking at the Java level during classloading, while retaining full compatibility with Java 7 parallel capable loaders. Fully concurrent loaders will still encounter the performance penalty associated with concurrent attempts to find and prepare a class's bytecode for definition by the VM. What this penalty is depends on the number of concurrent load attempts possible (a function of the number of threads and the application logic, and dependent on the number of processors), and the costs associated with finding and preparing the bytecodes. This obviously has to be measured across a range of applications. Preliminary webrevs: http://cr.openjdk.java.net/~dholmes/concurrent-loaders/webrev.hotspot/ http://cr.openjdk.java.net/~dholmes/concurrent-loaders/webrev.jdk/ Please direct all comments to the mailing list [email protected].

Read the article
EBS Concurrent Processing Information Center

- by LuciaC

Do you have problems or questions about concurrent request processing? Do you want to know: How and when to run CP Diagnostics? What are the latest Hot Topics being looked at for Concurrent Processing? All about the Concurrent Process Analyzer self-service Health-Check script? Go to the EBS Concurrent Processing Information Center (Doc ID 1304305.1) and find out the above and lots more!

Read the article
Lazy non-modifiable list

- by mindas

I was looking for a decent implementation of a generic lazy non-modifiable list implementation to wrap my search result entries. The unmodifiable part of the task is easy as it can be achieved by Collections.unmodifiableList() so I only need to sort out the the lazy part. Surprisingly, google-collections doesn't have anything to offer; while LazyList from Apache Commons Collections does not support generics. I have found an attempt to build something on top of google-collections but it seems to be incomplete (e.g. does not support size()), outdated (does not compile with 1.0 final) and requiring some external classes, but could be used as a good starting point to build my own class. Is anybody aware of any good implementation of a LazyList? If not, which option do you think is better: write my own implementation, based on google-collections ForwardingList, similar to what Peter Maas did; write my own wrapper around Commons Collections LazyList (the wrapper would only add generics so I don't have to cast everywhere but only in the wrapper itself); just write something on top of java.util.AbstractList; Any other suggestions are welcome.

Read the article
Architecture choice about representation of collections in Business Objects

- by Rajarshi

I have made certain choices in my architecture which I request the community to review and comment. I am breaking up the post in smaller sections to make it easier to understand the context and then suggest/comment. I am sorry that the post is long, but is required to explain the context. What am I building A typical business application where there are application users, security roles, business operation/action rights based on roles and several business modules like Stock Receive, Stock Transfer, Sale Order, Sale Invoice, Sale Return, Stock Audit etc. and several reports. The application is a WinForm application since it has a lot of rich and responsive UI requirements and has to operate in disconnected mode (with a local SQL Server), most of the time. What have I done I have built a framework - nothing to boast about, but just a set of libraries that serves the repetative requirements of my application, e.g. authentication, role based authorization, data access, validation, exception handling, logging, change status tracking, presentation model compliance and reasonable loose coupling between components. No, I have not written everything from scratch, you can say I have consolidated many things together like some concepts from CSLA, Martin Fowler for Presentation Model, blocks from Enterprise Library, Unity etc. to build a set of libraries that will help my developers be productive quickly without having to look up Google for many of the technical requirements. I have tried to keep the framework generic so that it can be used in typical business applications and also tried to follow some best practices that will support the same Business Objects to be used in an ASP.NET MVC environment also. My present architecture serves my objectives well, and have built several modules (on WinForm) without much trouble. The architecture also lent itself well to build some usable prototype on ASP.NET MVC with the same set of business objects, without changing a single line of code. My Dilemma I have used Custom Business Objects since that gives me a clearer OOP representation of the problem scope in my solution scope, and helps me visualize my entire solution as collection of objects with data and behavior rather than having a set of relational data (DataSet) and implement behaviours (business logic, validation) etc. separately. With rich databinding support in .NET 2.0 binding Custom Business Objects to UI was a breeze. Now while building my business objects, I am still in a dilemma about representation of collections in business objects. Currently I am using DataSets to represent collections while I have seen many suggestions to implement custom collections. For example, in my vision, a typical Sale Invoice Object will contain 'Sales Invoice Items' as a collection. Now theoritically, I can accept that the each 'Sales Invoice Item' should have its own behavior along with their data (ItemCode, Name, Qty, Price etc.) but typically managing of Sale Invoice Items in a Sale Invoice is handled by the Sale Invoice Object itself, e.g. adding/removing Items from collection. Additionally, we can also put business logic/rules for the Sales Invoice Items like "Qty should not be greater than the ordered qty", "Price should be max 10% above the price in Sale Order" etc. in the Sale Invoice object itself. With that kind of a vision, I felt that most business object child collections can be managed by the parent itself, including add/remove from collection as well and implementing business logic for the collection items, hence the collection items hold nothing but data. Additionally, typical collections are represented in UI in Grids, where ability to support DataBinding becomes very important for any collection. Implementing a custom collection, in that case would also mean, I have to implement robust DataBinding support as well, for the collection, which is of course time consuming. Now, considering child collection behaviors are implemented in the parent and the need for DataBinding of child collections, I chose DataSet to represent any child collection in my business objects. In the above example of Sale Invoice I will have 'Invoice Number', 'Date', 'Customer' etc. as attributes of the 'Sale Invoice' but 'InvoiceItems' as a DataSet. Of course, when I say DataSet, it is not a vanilla dataset but an extended DataSet that supports business rule validation and the same role based security model of my framework to allow/deny any business operation to rows/columns of the DataSet, automatically. This approach has allowed easier collection management and databinding in my business objects and my developers are able to deliver modules rapidly. Questions Do you feel that the approach is reasonable? Do you see any shortcomings of this approach? I am recently thinking of using 'Typed DataSets' as child collections, for easier representation in code, that will allow me to write 'currentInvoice.InvoiceItems' (for the DataTable) and 'invoiceItem.ProductCode' or 'invoiceItem.Qty', instead of 'drow["ProductCode"].ToString()' or '(int)drow["Qty"]' etc. Does this choice have any demerits? Thank you if you have read so far and a salute if you still have the Energy to answer.

Read the article
Java - Collections.binarySearch with PriorityQueue?

- by msr

Hello, Can I use Collections.binarySearch() method to search elements in a PriorityQueue? Otherwise, how can I apply search algorithms to a PriorityQueue? I have this (class Evento implements Comparable): public class PriorityQueueCAP extends PriorityQueue<Evento>{ // (...) public void removeEventos(Evento evento){ Collections.binarySearch(this, evento); // ERROR! } } And I got this error: "The method binarySearch(List, T) in the type Collections is not applicable for the arguments (PriorityQueueCAP, Evento)" Why? Thanks in advance!

Read the article

Search Results

Search found 17770 results on 711 pages for 'inside the concurrent collections'.

Page 1/711 | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by mindas

- by James Michael Hare

- by Simon Cooper

- by Simon Cooper

- by faif

- by GlenPeterson

- by cwarticki

- by Manoj Madhusoodanan

- by Crossbrowser

- by Droo

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Simon Cooper

- by Karl von L

- by davidholmes

- by LuciaC

- by mindas

- by Rajarshi

- by msr

1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >