Search Results

Search found 7097 results on 284 pages for 'calls'.

Page 103/284 | < Previous Page | 99 100 101 102 103 104 105 106 107 108 109 110 | Next Page >

Is Berkeley DB a NoSQL solution?

- by Gregory Burd

Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API, the calls across these APIs generally mimic the Berkeley DB C-API which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s as BSD UNIX was transitioning from version 4.3 to 4.4 and retrofitting commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage. At that time databases almost always lived on a single node, even the most sophisticated databases only had simple fail-over two node solutions. If you had a lot of data to store you would choose between the few commercial RDBMS solutions or to write your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM" mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage. Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheep, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law, processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes cause a massive explosion of data and a need to analyze and understand that data. Taken together this resulted in an entirely different landscape for database storage, new solutions were needed. A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities in common. There were designed to address massive amounts of data, millions of requests per second, and scale out across multiple systems. The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record, it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple, key/value look-ups or simple range queries with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations. The success of Dynamo, and it's design, inspired the next generation of Non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was, "reliability at massive scale" so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database; either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, choose Berkeley DB Java Edition for it's node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only log-based on-disk storage similar type of storage as Berkeley DB except that it is based on a hash table which must reside entirely in-memory rather than a btree which can live in-memory or on disk. Berkeley DB evolved too, we added high availability (HA) and a replication manager that makes it easy to setup replica groups. Berkeley DB's replication doesn't partitioned the data, every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales-out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a PAXOS algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group and some allow writes as well as reads at any node, Berkeley DB HA does not. So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM; the distributed, sometimes-consistent, partitioned, scale-out storage that manage key/value or document sets and generally have some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really. Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing, we could jump into this No SQL market withing a single product development cycle. Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

Read the article
Dynamically creating a Generic Type at Runtime

- by Rick Strahl

I learned something new today. Not uncommon, but it's a core .NET runtime feature I simply did not know although I know I've run into this issue a few times and worked around it in other ways. Today there was no working around it and a few folks on Twitter pointed me in the right direction. The question I ran into is: How do I create a type instance of a generic type when I have dynamically acquired the type at runtime? Yup it's not something that you do everyday, but when you're writing code that parses objects dynamically at runtime it comes up from time to time. In my case it's in the bowels of a custom JSON parser. After some thought triggered by a comment today I realized it would be fairly easy to implement two-way Dictionary parsing for most concrete dictionary types. I could use a custom Dictionary serialization format that serializes as an array of key/value objects. Basically I can use a custom type (that matches the JSON signature) to hold my parsed dictionary data and then add it to the actual dictionary when parsing is complete. Generic Types at Runtime One issue that came up in the process was how to figure out what type the Dictionary<K,V> generic parameters take. Reflection actually makes it fairly easy to figure out generic types at runtime with code like this: if (arrayType.GetInterface("IDictionary") != null) { if (arrayType.IsGenericType) { var keyType = arrayType.GetGenericArguments()[0]; var valueType = arrayType.GetGenericArguments()[1]; … } } The GetArrayType method gets passed a type instance that is the array or array-like object that is rendered in JSON as an array (which includes IList, IDictionary, IDataReader and a few others). In my case the type passed would be something like Dictionary<string, CustomerEntity>. So I know what the parent container class type is. Based on the the container type using it's then possible to use GetGenericTypeArguments() to retrieve all the generic types in sequential order of definition (ie. string, CustomerEntity). That's the easy part. Creating a Generic Type and Providing Generic Parameters at RunTime The next problem is how do I get a concrete type instance for the generic type? I know what the type name and I have a type instance is but it's generic, so how do I get a type reference to keyvaluepair<K,V> that is specific to the keyType and valueType above? Here are a couple of things that come to mind but that don't work (and yes I tried that unsuccessfully first): Type elementType = typeof(keyvalue<keyType, valueType>); Type elementType = typeof(keyvalue<typeof(keyType), typeof(valueType)>); The problem is that this explicit syntax expects a type literal not some dynamic runtime value, so both of the above won't even compile. I turns out the way to create a generic type at runtime is using a fancy bit of syntax that until today I was completely unaware of: Type elementType = typeof(keyvalue<,>).MakeGenericType(keyType, valueType); The key is the type(keyvalue<,>) bit which looks weird at best. It works however and produces a non-generic type reference. You can see the difference between the full generic type and the non-typed (?) generic type in the debugger: The nonGenericType doesn't show any type specialization, while the elementType type shows the string, CustomerEntity (truncated above) in the type name. Once the full type reference exists (elementType) it's then easy to create an instance. In my case the parser parses through the JSON and when it completes parsing the value/object it creates a new keyvalue<T,V> instance. Now that I know the element type that's pretty trivial with: // Objects start out null until we find the opening tag resultObject = Activator.CreateInstance(elementType); Here the result object is picked up by the JSON array parser which creates an instance of the child object (keyvalue<K,V>) and then parses and assigns values from the JSON document using the types key/value property signature. Internally the parser then takes each individually parsed item and adds it to a list of List<keyvalue<K,V>> items. Parsing through a Generic type when you only have Runtime Type Information When parsing of the JSON array is done, the List needs to be turned into a defacto Dictionary<K,V>. This should be easy since I know that I'm dealing with an IDictionary, and I know the generic types for the key and value. The problem is again though that this needs to happen at runtime which would mean using several Convert.ChangeType() calls in the code to dynamically cast at runtime. Yuk. In the end I decided the easier and probably only slightly slower way to do this is a to use the dynamic type to collect the items and assign them to avoid all the dynamic casting madness: else if (IsIDictionary) { IDictionary dict = Activator.CreateInstance(arrayType) as IDictionary; foreach (dynamic item in items) { dict.Add(item.key, item.value); } return dict; } This code creates an instance of the generic dictionary type first, then loops through all of my custom keyvalue<K,V> items and assigns them to the actual dictionary. By using Dynamic here I can side step all the explicit type conversions that would be required in the three highlighted areas (not to mention that this nested method doesn't have access to the dictionary item generic types here). Static <- -> Dynamic Dynamic casting in a static language like C# is a bitch to say the least. This is one of the few times when I've cursed static typing and the arcane syntax that's required to coax types into the right format. It works but it's pretty nasty code. If it weren't for dynamic that last bit of code would have been a pretty ugly as well with a bunch of Convert.ChangeType() calls to litter the code. Fortunately this type of type convulsion is rather rare and reserved for system level code. It's not every day that you create a string to object parser after all :-)© Rick Strahl, West Wind Technologies, 2005-2011Posted in .NET CSharp Tweet (function() { var po = document.createElement('script'); po.type = 'text/javascript'; po.async = true; po.src = 'https://apis.google.com/js/plusone.js'; var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(po, s); })();

Read the article
AdvancedFormatProvider: Making string.format do more

- by plblum

When I have an integer that I want to format within the String.Format() and ToString(format) methods, I’m always forgetting the format symbol to use with it. That’s probably because its not very intuitive. Use {0:N0} if you want it with group (thousands) separators. text = String.Format("{0:N0}", 1000); // returns "1,000" int value1 = 1000; text = value1.ToString("N0"); Use {0:D} or {0:G} if you want it without group separators. text = String.Format("{0:D}", 1000); // returns "1000" int value2 = 1000; text2 = value2.ToString("D"); The {0:D} is especially confusing because Microsoft gives the token the name “Decimal”. I thought it reasonable to have a new format symbol for String.Format, "I" for integer, and the ability to tell it whether it shows the group separators. Along the same lines, why not expand the format symbols for currency ({0:C}) and percent ({0:P}) to let you omit the currency or percent symbol, omit the group separator, and even to drop the decimal part when the value is equal to the whole number? My solution is an open source project called AdvancedFormatProvider, a group of classes that provide the new format symbols, continue to support the rest of the native symbols and makes it easy to plug in additional format symbols. Please visit https://github.com/plblum/AdvancedFormatProvider to learn about it in detail and explore how its implemented. The rest of this post will explore some of the concepts it takes to expand String.Format() and ToString(format). AdvancedFormatProvider benefits: Supports {0:I} token for integers. It offers the {0:I-,} option to omit the group separator. Supports {0:C} token with several options. {0:C-$} omits the currency symbol. {0:C-,} omits group separators, and {0:C-0} hides the decimal part when the value would show “.00”. For example, 1000.0 becomes “$1000” while 1000.12 becomes “$1000.12”. Supports {0:P} token with several options. {0:P-%} omits the percent symbol. {0:P-,} omits group separators, and {0:P-0} hides the decimal part when the value would show “.00”. For example, 1 becomes “100 %” while 1.1223 becomes “112.23 %”. Provides a plug in framework that lets you create new formatters to handle specific format symbols. You register them globally so you can just pass the AdvancedFormatProvider object into String.Format and ToString(format) without having to figure out which plug ins to add. text = String.Format(AdvancedFormatProvider.Current, "{0:I}", 1000); // returns "1,000" text2 = String.Format(AdvancedFormatProvider.Current, "{0:I-,}", 1000); // returns "1000" text3 = String.Format(AdvancedFormatProvider.Current, "{0:C-$-,}", 1000.0); // returns "1000.00" The IFormatProvider parameter Microsoft has made String.Format() and ToString(format) format expandable. They each take an additional parameter that takes an object that implements System.IFormatProvider. This interface has a single member, the GetFormat() method, which returns an object that knows how to convert the format symbol and value into the desired string. There are already a number of web-based resources to teach you about IFormatProvider and the companion interface ICustomFormatter. I’ll defer to them if you want to dig more into the topic. The only thing I want to point out is what I think are implementation considerations. Why GetFormat() always tests for ICustomFormatter When you see examples of implementing IFormatProviders, the GetFormat() method always tests the parameter against the ICustomFormatter type. Why is that? public object GetFormat(Type formatType) { if (formatType == typeof(ICustomFormatter)) return this; else return null; } The value of formatType is already predetermined by the .net framework. String.Format() uses the StringBuilder.AppendFormat() method to parse the string, extracting the tokens and calling GetFormat() with the ICustomFormatter type. (The .net framework also calls GetFormat() with the types of System.Globalization.NumberFormatInfo and System.Globalization.DateTimeFormatInfo but these are exclusive to how the System.Globalization.CultureInfo class handles its implementation of IFormatProvider.) Your code replaces instead of expands I would have expected the caller to pass in the format string to GetFormat() to allow your code to determine if it handles the request. My vision would be to return null when the format string is not supported. The caller would iterate through IFormatProviders until it finds one that handles the format string. Unfortunatley that is not the case. The reason you write GetFormat() as above is because the caller is expecting an object that handles all formatting cases. You are effectively supposed to write enough code in your formatter to handle your new cases and call .net functions (like String.Format() and ToString(format)) to handle the original cases. Its not hard to support the native functions from within your ICustomFormatter.Format function. Just test the format string to see if it applies to you. If not, call String.Format() with a token using the format passed in. public string Format(string format, object arg, IFormatProvider formatProvider) { if (format.StartsWith("I")) { // handle "I" formatter } else return String.Format(formatProvider, "{0:" + format + "}", arg); } Formatters are only used by explicit request Each time you write a custom formatter (implementer of ICustomFormatter), it is not used unless you explicitly passed an IFormatProvider object that supports your formatter into String.Format() or ToString(). This has several disadvantages: Suppose you have several ICustomFormatters. In order to have all available to String.Format() and ToString(format), you have to merge their code and create an IFormatProvider to return an instance of your new class. You have to remember to utilize the IFormatProvider parameter. Its easy to overlook, especially when you have existing code that calls String.Format() without using it. Some APIs may call String.Format() themselves. If those APIs do not offer an IFormatProvider parameter, your ICustomFormatter will not be available to them. The AdvancedFormatProvider solves the first two of these problems by providing a plug-in architecture.

Read the article
Asserting with JustMock

- by mehfuzh

In this post, i will be digging in a bit deep on Mock.Assert. This is the continuation from previous post and covers up the ways you can use assert for your mock expectations. I have used another traditional sample of Talisker that has a warehouse [Collaborator] and an order class [SUT] that will call upon the warehouse to see the stock and fill it up with items. Our sample, interface of warehouse and order looks similar to : public interface IWarehouse { bool HasInventory(string productName, int quantity); void Remove(string productName, int quantity); } public class Order { public string ProductName { get; private set; } public int Quantity { get; private set; } public bool IsFilled { get; private set; } public Order(string productName, int quantity) { this.ProductName = productName; this.Quantity = quantity; } public void Fill(IWarehouse warehouse) { if (warehouse.HasInventory(ProductName, Quantity)) { warehouse.Remove(ProductName, Quantity); IsFilled = true; } } } Our first example deals with mock object assertion [my take] / assert all scenario. This will only act on the setups that has this “MustBeCalled” flag associated. To be more specific , let first consider the following test code: var order = new Order(TALISKER, 0); var wareHouse = Mock.Create<IWarehouse>(); Mock.Arrange(() => wareHouse.HasInventory(Arg.Any<string>(), 0)).Returns(true).MustBeCalled(); Mock.Arrange(() => wareHouse.Remove(Arg.Any<string>(), 0)).Throws(new InvalidOperationException()).MustBeCalled(); Mock.Arrange(() => wareHouse.Remove(Arg.Any<string>(), 100)).Throws(new InvalidOperationException()); //exercise Assert.Throws<InvalidOperationException>(() => order.Fill(wareHouse)); // it will assert first and second setup. Mock.Assert(wareHouse); Here, we have created the order object, created the mock of IWarehouse , then I setup our HasInventory and Remove calls of IWarehouse with my expected, which is called by the order.Fill internally. Now both of these setups are marked as “MustBeCalled”. There is one additional IWarehouse.Remove that is invalid and is not marked. On line 9 , as we do order.Fill , the first and second setups will be invoked internally where the third one is left un-invoked. Here, Mock.Assert will pass successfully as both of the required ones are called as expected. But, if we marked the third one as must then it would fail with an proper exception. Here, we can also see that I have used the same call for two different setups, this feature is called sequential mocking and will be covered later on. Moving forward, let’s say, we don’t want this must call, when we want to do it specifically with lamda. For that let’s consider the following code: //setup - data var order = new Order(TALISKER, 50); var wareHouse = Mock.Create<IWarehouse>(); Mock.Arrange(() => wareHouse.HasInventory(TALISKER, 50)).Returns(true); //exercise order.Fill(wareHouse); //verify state Assert.True(order.IsFilled); //verify interaction Mock.Assert(()=> wareHouse.HasInventory(TALISKER, 50)); Here, the snippet shows a case for successful order, i haven’t used “MustBeCalled” rather i used lamda specifically to assert the call that I have made, which is more justified for the cases where we exactly know the user code will behave. But, here goes a question that how we are going assert a mock call if we don’t know what item a user code may request for. In that case, we can combine the matchers with our assert calls like we do it for arrange: //setup - data var order = new Order(TALISKER, 50); var wareHouse = Mock.Create<IWarehouse>(); Mock.Arrange(() => wareHouse.HasInventory(TALISKER, Arg.Matches<int>( x => x <= 50))).Returns(true); //exercise order.Fill(wareHouse); //verify state Assert.True(order.IsFilled); //verify interaction Mock.Assert(() => wareHouse.HasInventory(Arg.Any<string>(), Arg.Matches<int>(x => x <= 50))); Here, i have asserted a mock call for which i don’t know the item name, but i know that number of items that user will request is less than 50. This kind of expression based assertion is now possible with JustMock. We can extent this sample for properties as well, which will be covered shortly [in other posts]. In addition to just simple assertion, we can also use filters to limit to times a call has occurred or if ever occurred. Like for the first test code, we have one setup that is never invoked. For such, it is always valid to use the following assert call: Mock.Assert(() => wareHouse.Remove(Arg.Any<string>(), 100), Occurs.Never()); Or ,for warehouse.HasInventory we can do the following: Mock.Assert(() => wareHouse.HasInventory(Arg.Any<string>(), 0), Occurs.Once()); Or, to be more specific, it’s even better with: Mock.Assert(() => wareHouse.HasInventory(Arg.Any<string>(), 0), Occurs.Exactly(1)); There are other filters that you can apply here using AtMost, AtLeast and AtLeastOnce but I left those to the readers. You can try the above sample that is provided in the examples shipped with JustMock.Please, do check it out and feel free to ping me for any issues. Enjoy!!

Read the article
C#/.NET Little Wonders: The Timeout static class

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. When I started the “Little Wonders” series, I really wanted to pay homage to parts of the .NET Framework that are often small but can help in big ways. The item I have to discuss today really is a very small item in the .NET BCL, but once again I feel it can help make the intention of code much clearer and thus is worthy of note. The Problem - Magic numbers aren’t very readable or maintainable In my first Little Wonders Post (Five Little Wonders That Make Code Better) I mention the TimeSpan factory methods which, I feel, really help the readability of constructed TimeSpan instances. Just to quickly recap that discussion, ask yourself what the TimeSpan specified in each case below is 1: // Five minutes? Five Seconds? 2: var fiveWhat1 = new TimeSpan(0, 0, 5); 3: var fiveWhat2 = new TimeSpan(0, 0, 5, 0); 4: var fiveWhat3 = new TimeSpan(0, 0, 5, 0, 0); You’d think they’d all be the same unit of time, right? After all, most overloads tend to tack additional arguments on the end. But this is not the case with TimeSpan, where the constructor forms are: TimeSpan(int hours, int minutes, int seconds); TimeSpan(int days, int hours, int minutes, int seconds); TimeSpan(int days, int hours, int minutes, int seconds, int milliseconds); Notice how in the 4 and 5 parameter version we suddenly have the parameter days slipping in front of hours? This can make reading constructors like those above much harder. Fortunately, there are TimeSpan factory methods to help make your intention crystal clear: 1: // Ah! Much clearer! 2: var fiveSeconds = TimeSpan.FromSeconds(5); These are great because they remove all ambiguity from the reader! So in short, magic numbers in constructors and methods can be ambiguous, and anything we can do to clean up the intention of the developer will make the code much easier to read and maintain. Timeout – Readable identifiers for infinite timeout values In a similar way to TimeSpan, let’s consider specifying timeouts for some of .NET’s (or our own) many methods that allow you to specify timeout periods. For example, in the TPL Task class, there is a family of Wait() methods that can take TimeSpan or int for timeouts. Typically, if you want to specify an infinite timeout, you’d just call the version that doesn’t take a timeout parameter at all: 1: myTask.Wait(); // infinite wait But there are versions that take the int or TimeSpan for timeout as well: 1: // Wait for 100 ms 2: myTask.Wait(100); 3: 4: // Wait for 5 seconds 5: myTask.Wait(TimeSpan.FromSeconds(5); Now, if we want to specify an infinite timeout to wait on the Task, we could pass –1 (or a TimeSpan set to –1 ms), which what the .NET BCL methods with timeouts use to represent an infinite timeout: 1: // Also infinite timeouts, but harder to read/maintain 2: myTask.Wait(-1); 3: myTask.Wait(TimeSpan.FromMilliseconds(-1)); However, these are not as readable or maintainable. If you were writing this code, you might make the mistake of thinking 0 or int.MaxValue was an infinite timeout, and you’d be incorrect. Also, reading the code above it isn’t as clear that –1 is infinite unless you happen to know that is the specified behavior. To make the code like this easier to read and maintain, there is a static class called Timeout in the System.Threading namespace which contains definition for infinite timeouts specified as both int and TimeSpan forms: Timeout.Infinite An integer constant with a value of –1 Timeout.InfiniteTimeSpan A static readonly TimeSpan which represents –1 ms (only available in .NET 4.5+) This makes our calls to Task.Wait() (or any other calls with timeouts) much more clear: 1: // intention to wait indefinitely is quite clear now 2: myTask.Wait(Timeout.Infinite); 3: myTask.Wait(Timeout.InfiniteTimeSpan); But wait, you may say, why would we care at all? Why not use the version of Wait() that takes no arguments? Good question! When you’re directly calling the method with an infinite timeout that’s what you’d most likely do, but what if you are just passing along a timeout specified by a caller from higher up? Or perhaps storing a timeout value from a configuration file, and want to default it to infinite? For example, perhaps you are designing a communications module and want to be able to shutdown gracefully, but if you can’t gracefully finish in a specified amount of time you want to force the connection closed. You could create a Shutdown() method in your class, and take a TimeSpan or an int for the amount of time to wait for a clean shutdown – perhaps waiting for client to acknowledge – before terminating the connection. So, assume we had a pub/sub system with a class to broadcast messages: 1: // Some class to broadcast messages to connected clients 2: public class Broadcaster 3: { 4: // ... 5: 6: // Shutdown connection to clients, wait for ack back from clients 7: // until all acks received or timeout, whichever happens first 8: public void Shutdown(int timeout) 9: { 10: // Kick off a task here to send shutdown request to clients and wait 11: // for the task to finish below for the specified time... 12: 13: if (!shutdownTask.Wait(timeout)) 14: { 15: // If Wait() returns false, we timed out and task 16: // did not join in time. 17: } 18: } 19: } We could even add an overload to allow us to use TimeSpan instead of int, to give our callers the flexibility to specify timeouts either way: 1: // overload to allow them to specify Timeout in TimeSpan, would 2: // just call the int version passing in the TotalMilliseconds... 3: public void Shutdown(TimeSpan timeout) 4: { 5: Shutdown(timeout.TotalMilliseconds); 6: } Notice in case of this class, we don’t assume the caller wants infinite timeouts, we choose to rely on them to tell us how long to wait. So now, if they choose an infinite timeout, they could use the –1, which is more cryptic, or use Timeout class to make the intention clear: 1: // shutdown the broadcaster, waiting until all clients ack back 2: // without timing out. 3: myBroadcaster.Shutdown(Timeout.Infinite); We could even add a default argument using the int parameter version so that specifying no arguments to Shutdown() assumes an infinite timeout: 1: // Modified original Shutdown() method to add a default of 2: // Timeout.Infinite, works because Timeout.Infinite is a compile 3: // time constant. 4: public void Shutdown(int timeout = Timeout.Infinite) 5: { 6: // same code as before 7: } Note that you can’t default the ShutDown(TimeSpan) overload with Timeout.InfiniteTimeSpan since it is not a compile-time constant. The only acceptable default for a TimeSpan parameter would be default(TimeSpan) which is zero milliseconds, which specified no wait, not infinite wait. Summary While Timeout.Infinite and Timeout.InfiniteTimeSpan are not earth-shattering classes in terms of functionality, they do give you very handy and readable constant values that you can use in your programs to help increase readability and maintainability when specifying infinite timeouts for various timeouts in the BCL and your own applications. Technorati Tags: C#,CSharp,.NET,Little Wonders,Timeout,Task

Read the article
Segfault when iterating over a map<string, string> and drawing its contents using SDL_TTF

- by Michael Stahre

I'm not entirely sure this question belongs on gamedev.stackexchange, but I'm technically working on a game and working with SDL, so it might not be entirely offtopic. I've written a class called DebugText. The point of the class is to have a nice way of printing values of variables to the game screen. The idea is to call SetDebugText() with the variables in question every time they change or, as is currently the case, every time the game's Update() is called. The issue is that when iterating over the map that contains my variables and their latest updated values, I get segfaults. See the comments in DrawDebugText() below, it specifies where the error happens. I've tried splitting the calls to it-first and it-second into separate lines and found that the problem doesn't always happen when calling it-first. It alters between it-first and it-second. I can't find a pattern. It doesn't fail on every call to DrawDebugText() either. It might fail on the third time DrawDebugText() is called, or it might fail on the fourth. Class header: #ifndef CLIENT_DEBUGTEXT_H #define CLIENT_DEBUGTEXT_H #include <Map> #include <Math.h> #include <sstream> #include <SDL.h> #include <SDL_ttf.h> #include "vector2.h" using std::string; using std::stringstream; using std::map; using std::pair; using game::Vector2; namespace game { class DebugText { private: TTF_Font* debug_text_font; map<string, string>* debug_text_list; public: void SetDebugText(string var, bool value); void SetDebugText(string var, float value); void SetDebugText(string var, int value); void SetDebugText(string var, Vector2 value); void SetDebugText(string var, string value); int DrawDebugText(SDL_Surface*, SDL_Rect*); void InitDebugText(); void Clear(); }; } #endif Class source file: #include "debugtext.h" namespace game { // Copypasta function for handling the toString conversion template <class T> inline string to_string (const T& t) { stringstream ss (stringstream::in | stringstream::out); ss << t; return ss.str(); } // Initializes SDL_TTF and sets its font void DebugText::InitDebugText() { if(TTF_WasInit()) TTF_Quit(); TTF_Init(); debug_text_font = TTF_OpenFont("LiberationSans-Regular.ttf", 16); TTF_SetFontStyle(debug_text_font, TTF_STYLE_NORMAL); } // Iterates over the current debug_text_list and draws every element on the screen. // After drawing with SDL you need to get a rect specifying the area on the screen that was changed and tell SDL that this part of the screen needs to be updated. this is done in the game's Draw() function // This function sets rects_to_update to the new list of rects provided by all of the surfaces and returns the number of rects in the list. These two parameters are used in Draw() when calling on SDL_UpdateRects(), which takes an SDL_Rect* and a list length int DebugText::DrawDebugText(SDL_Surface* screen, SDL_Rect* rects_to_update) { if(debug_text_list == NULL) return 0; if(!TTF_WasInit()) InitDebugText(); rects_to_update = NULL; // Specifying the font color SDL_Color font_color = {0xff, 0x00, 0x00, 0x00}; // r, g, b, unused int row_count = 0; string line; // The iterator variable map<string, string>::iterator it; // Gets the iterator and iterates over it for(it = debug_text_list->begin(); it != debug_text_list->end(); it++) { // Takes the first value (the name of the variable) and the second value (the value of the parameter in string form) //---------THIS LINE GIVES ME SEGFAULTS----- line = it->first + ": " + it->second; //------------------------------------------ // Creates a surface with the text on it that in turn can be rendered to the screen itself later SDL_Surface* debug_surface = TTF_RenderText_Solid(debug_text_font, line.c_str(), font_color); if(debug_surface == NULL) { // A standard check for errors fprintf(stderr, "Error: %s", TTF_GetError()); return NULL; } else { // If SDL_TTF did its job right, then we now set a destination rect row_count++; SDL_Rect dstrect = {5, 5, 0, 0}; // x, y, w, h dstrect.x = 20; dstrect.y = 20*row_count; // Draws the surface with the text on it to the screen int res = SDL_BlitSurface(debug_surface,NULL,screen,&dstrect); if(res != 0) { //Just an error check fprintf(stderr, "Error: %s", SDL_GetError()); return NULL; } // Creates a new rect to specify the area that needs to be updated with SDL_Rect* new_rect_to_update = (SDL_Rect*) malloc(sizeof(SDL_Rect)); new_rect_to_update->h = debug_surface->h; new_rect_to_update->w = debug_surface->w; new_rect_to_update->x = dstrect.x; new_rect_to_update->y = dstrect.y; // Just freeing the surface since it isn't necessary anymore SDL_FreeSurface(debug_surface); // Creates a new list of rects with room for the new rect SDL_Rect* newtemp = (SDL_Rect*) malloc(row_count*sizeof(SDL_Rect)); // Copies the data from the old list of rects to the new one memcpy(newtemp, rects_to_update, (row_count-1)*sizeof(SDL_Rect)); // Adds the new rect to the new list newtemp[row_count-1] = *new_rect_to_update; // Frees the memory used by the old list free(rects_to_update); // And finally redirects the pointer to the old list to the new list rects_to_update = newtemp; newtemp = NULL; } } // When the entire map has been iterated over, return the number of lines that were drawn, ie. the number of rects in the returned rect list return row_count; } // The SetDebugText used by all the SetDebugText overloads // Takes two strings, inserts them into the map as a pair void DebugText::SetDebugText(string var, string value) { if (debug_text_list == NULL) { debug_text_list = new map<string, string>(); } debug_text_list->erase(var); debug_text_list->insert(pair<string, string>(var, value)); } // Writes the bool to a string and calls SetDebugText(string, string) void DebugText::SetDebugText(string var, bool value) { string result; if (value) result = "True"; else result = "False"; SetDebugText(var, result); } // Does the same thing, but uses to_string() to convert the float void DebugText::SetDebugText(string var, float value) { SetDebugText(var, to_string(value)); } // Same as above, but int void DebugText::SetDebugText(string var, int value) { SetDebugText(var, to_string(value)); } // Vector2 is a struct of my own making. It contains the two float vars x and y void DebugText::SetDebugText(string var, Vector2 value) { SetDebugText(var + ".x", to_string(value.x)); SetDebugText(var + ".y", to_string(value.y)); } // Empties the list. I don't actually use this in my code. Shame on me for writing something I don't use. void DebugText::Clear() { if(debug_text_list != NULL) debug_text_list->clear(); } }

Read the article
Diagnosing ADF Mobile iOS deployment problems

- by Chris Muir

From time to time I encounter customers who have taken possession of a brand new Apple Mac, have that excited "I've just spent more on a computer then I ever wanted to but it's okay" crazy gleam in their eye, but on pre-loading all the necessary software for Oracle's ADF Mobile to start their mobile campaign, following Oracle's setup instructions and deploying their first app to Apple's XCode iPhone Simulator they hit this error message in the JDeveloper Log-Deployment window: [01:36:46 PM] Deployment cancelled. [01:36:46 PM] ---- Deployment incomplete ----. [01:36:46 PM] Failed to build the iOS application bundle. [01:36:46 PM] Deployment failed due to one or more errors returned by '/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild'. The following is a summary of the returned error(s): Command-line execution failed (Return code: 69) "Oh, return code 69, I know that well" I hear you say. Admittedly the error code is less than useful besides drawing some titters from the peanut gallery. Before explaining what's gone wrong, I think it's useful to teach customers how to diagnose these issues themselves. When ADF Mobile commences a deployment, be it to Apple's iOS or Google's Android platforms, JDeveloper and ADF Mobile do a good job in the Log window of showing you what the deployment process entails. In the case of deploying to iOS the log window will literally include the XCode commands executed to complete the deployment cycle. As example here's the log output that was produced before the error message was raised.... take the opportunity to read this line by line and note the command line calls highlighted in blue: (Note some of the following lines have been split over multiple lines to suit reading on this blog, each original line is preceded by a timestamp. Ensure to check the exact commands from JDev) [01:36:33 PM] Target platform is (iOS). [01:36:33 PM] Beginning deployment of ADF Mobile application 'LayoutDemo' to iOS using profile 'IOS_MOBILE_NATIVE_archive1'. [01:36:34 PM] Command-line executed: [/Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild, -version] [01:36:34 PM] Command-line execution succeeded. [01:36:34 PM] Running dependency analysis... [01:36:34 PM] Building... [01:36:34 PM] Deploying 3 profiles... [01:36:35 PM] Wrote Archive Module to /Users/chris/fmw/jdeveloper/jdev/extensions/ oracle.adf.mobile/Samples/PublicSamples/LayoutDemo/ApplicationController/ deploy/ApplicationController.jar [01:36:35 PM] WARNING: No Resource Catalog enabled ADF components found to package [01:36:36 PM] Wrote Archive Module to /Users/chris/fmw/jdeveloper/jdev/extensions/ oracle.adf.mobile/Samples/PublicSamples/LayoutDemo/ViewController/ deploy/ViewController.jar [01:36:36 PM] Verifying existence of the .adf source directory of the ADF Mobile application... [01:36:36 PM] Verifying Application Controller project exists... [01:36:36 PM] Verifying application dependencies... [01:36:36 PM] The application may not function correctly because the following dependent libraries are missing: /Users/chris/jdev/jdeveloper/jdeveloper/jdev/extensions/oracle.adf.mobile/ lib/adfmf.springboard.jar [01:36:36 PM] Verifying project dependencies... [01:36:36 PM] Validating application XML files... [01:36:36 PM] Validating XML files in project ApplicationController... [01:36:36 PM] Validating XML files in project ViewController... [01:36:40 PM] Copying common javascript files... [01:36:41 PM] Copying FARs to the ADF Mobile Framework application... [01:36:41 PM] Extracting Feature Archive file, "ApplicationController.jar" to deployment folder, "ApplicationController". [01:36:42 PM] Extracting Feature Archive file, "ViewController.jar" to deployment folder, "ViewController". [01:36:42 PM] Deploying skinning files... [01:36:43 PM] Copying the CVM SDK files built for the x86 processor... [01:36:43 PM] Copying the CVM JDK files built for the x86 processor... [01:36:43 PM] Command-line executed: [cp, -R, -p, /Users/chris/fmw/jdeveloper/jdev/extensions/oracle.adf.mobile/iOS/jvmti/x86/, /Users/chris/fmw/jdeveloper/jdev/extensions/oracle.adf.mobile/ Samples/PublicSamples/ LayoutDemo/deploy/IOS_MOBILE_NATIVE_archive1/temporary_xcode_project/lib] [01:36:43 PM] Command-line execution succeeded. [01:36:43 PM] Command-line executed: [cp, -R, -p, /Users/chris/fmw/jdeveloper/jdev/extensions/oracle.adf.mobile/iOS/jvmti/jar/, /Users/chris/fmw/jdeveloper/jdev/extensions/oracle.adf.mobile/Samples/ PublicSamples/LayoutDemo/deploy/IOS_MOBILE_NATIVE_archive1/ temporary_xcode_project/lib] [01:36:43 PM] Command-line execution succeeded. [01:36:43 PM] Copying security related files to the ADF Mobile Framework application... [01:36:44 PM] Command-line executed from path: /Users/chris/fmw/jdeveloper/jdev/extensions/oracle.adf.mobile/Samples/ PublicSamples/LayoutDemo/deploy/IOS_MOBILE_NATIVE_archive1/temporary_xcode_project/ [01:36:44 PM] Command-line executed: /Applications/Xcode.app/Contents/Developer/usr/bin/xcodebuild clean install -configuration Debug -sdk /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform/ Developer/SDKs/iPhoneSimulator6.1.sdk DSTROOT=/Users/chris/fmw/jdeveloper/jdev/extensions/oracle.adf.mobile/Samples/ PublicSamples/LayoutDemo/deploy/IOS_MOBILE_NATIVE_archive1/Destination_Root/ IPHONEOS_DEPLOYMENT_TARGET=5.0 TARGETED_DEVICE_FAMILY=1,2 PRODUCT_NAME=LayoutDemo ADD_SETTINGS_BUNDLE=NO As you can see when we move from JDeveloper undertaking its work, it then passes the code off in the last few lines for Apple's XCode to assemble and deploy the required .ipa file. From the original error message which followed this complaining about xcodebuild failing with return code 69, we can quickly see the exact command line used to call xcodebuild. As this is the exact command line call with all its options, you're free to open a Terminal window in Mac OSX and execute the same command by simply copying and pasting the command line. And via this you'll then find out what return code actually 69 means. Unfortunately it's not that exciting. For Macs that have just been installed and configured with XCode, XCode (and for that matter iTunes) which is required by ADF Mobile to deploy must have been run at least once before hand on your brand new Mac (to be clear that's once ever, not once every restart). On doing so you will be presented with a license agreement from Apple that you must accept. Only once you've done this will the command line calls work. They're currently failing as you haven't accepted the legal terms and conditions. (arguably you an also accept the terms and conditions from the command line too, but ADF Mobile cannot do this on your behalf, so it's just easier to open the tools and confirm the legal requirements that way). Putting aside the error code and its meaning, watching the log window, watching what commands are executed, learning what they do, this will assist you to diagnose issues yourself and solve these sort of issues more relatively quickly. From my perspective as an Oracle Product Manager, it allows me to say "this is the stuff you don't need to worry about when you use ADF Mobile when it's configured correctly" .... as you can see my salesman qualities shine through. For anyone who is happily using ADF Mobile on a Mac and wondering why you didn't hit these issues, it's quite likely that you already accepted the license conditions before deploying via ADF Mobile. For instance, though I'm not a fan of iTunes itself, iTunes was one of the first things I loaded on my Mac to access my Justin Bieber albums. Image courtesy of winnond / FreeDigitalPhotos.net

Read the article
Writing the tests for FluentPath

- by Bertrand Le Roy

Writing the tests for FluentPath is a challenge. The library is a wrapper around a legacy API (System.IO) that wasn’t designed to be easily testable. If it were more testable, the sensible testing methodology would be to tell System.IO to act against a mock file system, which would enable me to verify that my code is doing the expected file system operations without having to manipulate the actual, physical file system: what we are testing here is FluentPath, not System.IO. Unfortunately, that is not an option as nothing in System.IO enables us to plug a mock file system in. As a consequence, we are left with few options. A few people have suggested me to abstract my calls to System.IO away so that I could tell FluentPath – not System.IO – to use a mock instead of the real thing. That in turn is getting a little silly: FluentPath already is a thin abstraction around System.IO, so layering another abstraction between them would double the test surface while bringing little or no value. I would have to test that new abstraction layer, and that would bring us back to square one. Unless I’m missing something, the only option I have here is to bite the bullet and test against the real file system. Of course, the tests that do that can hardly be called unit tests. They are more integration tests as they don’t only test bits of my code. They really test the successful integration of my code with the underlying System.IO. In order to write such tests, the techniques of BDD work particularly well as they enable you to express scenarios in natural language, from which test code is generated. Integration tests are being better expressed as scenarios orchestrating a few basic behaviors, so this is a nice fit. The Orchard team has been successfully using SpecFlow for integration tests for a while and I thought it was pretty cool so that’s what I decided to use. Consider for example the following scenario: Scenario: Change extension Given a clean test directory When I change the extension of bar\notes.txt to foo Then bar\notes.txt should not exist And bar\notes.foo should exist This is human readable and tells you everything you need to know about what you’re testing, but it is also executable code. What happens when SpecFlow compiles this scenario is that it executes a bunch of regular expressions that identify the known Given (set-up phases), When (actions) and Then (result assertions) to identify the code to run, which is then translated into calls into the appropriate methods. Nothing magical. Here is the code generated by SpecFlow: [NUnit.Framework.TestAttribute()] [NUnit.Framework.DescriptionAttribute("Change extension")] public virtual void ChangeExtension() { TechTalk.SpecFlow.ScenarioInfo scenarioInfo = new TechTalk.SpecFlow.ScenarioInfo("Change extension", ((string[])(null))); #line 6 this.ScenarioSetup(scenarioInfo); #line 7 testRunner.Given("a clean test directory"); #line 8 testRunner.When("I change the extension of " + "bar\\notes.txt to foo"); #line 9 testRunner.Then("bar\\notes.txt should not exist"); #line 10 testRunner.And("bar\\notes.foo should exist"); #line hidden testRunner.CollectScenarioErrors();} The #line directives are there to give clues to the debugger, because yes, you can put breakpoints into a scenario: The way you usually write tests with SpecFlow is that you write the scenario first, let it fail, then write the translation of your Given, When and Then into code if they don’t already exist, which results in running but failing tests, and then you write the code to make your tests pass (you implement the scenario). In the case of FluentPath, I built a simple Given method that builds a simple file hierarchy in a temporary directory that all scenarios are going to work with: [Given("a clean test directory")] public void GivenACleanDirectory() { _path = new Path(SystemIO.Path.GetTempPath()) .CreateSubDirectory("FluentPathSpecs") .MakeCurrent(); _path.GetFileSystemEntries() .Delete(true); _path.CreateFile("foo.txt", "This is a text file named foo."); var bar = _path.CreateSubDirectory("bar"); bar.CreateFile("baz.txt", "bar baz") .SetLastWriteTime(DateTime.Now.AddSeconds(-2)); bar.CreateFile("notes.txt", "This is a text file containing notes."); var barbar = bar.CreateSubDirectory("bar"); barbar.CreateFile("deep.txt", "Deep thoughts"); var sub = _path.CreateSubDirectory("sub"); sub.CreateSubDirectory("subsub"); sub.CreateFile("baz.txt", "sub baz") .SetLastWriteTime(DateTime.Now); sub.CreateFile("binary.bin", new byte[] {0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0xFF}); } Then, to implement the scenario that you can read above, I had to write the following When: [When("I change the extension of (.*) to (.*)")] public void WhenIChangeTheExtension( string path, string newExtension) { var oldPath = Path.Current.Combine(path.Split('\\')); oldPath.Move(p => p.ChangeExtension(newExtension)); } As you can see, the When attribute is specifying the regular expression that will enable the SpecFlow engine to recognize what When method to call and also how to map its parameters. For our scenario, “bar\notes.txt” will get mapped to the path parameter, and “foo” to the newExtension parameter. And of course, the code that verifies the assumptions of the scenario: [Then("(.*) should exist")] public void ThenEntryShouldExist(string path) { Assert.IsTrue(_path.Combine(path.Split('\\')).Exists); } [Then("(.*) should not exist")] public void ThenEntryShouldNotExist(string path) { Assert.IsFalse(_path.Combine(path.Split('\\')).Exists); } These steps should be written with reusability in mind. They are building blocks for your scenarios, not implementation of a specific scenario. Think small and fine-grained. In the case of the above steps, I could reuse each of those steps in other scenarios. Those tests are easy to write and easier to read, which means that they also constitute a form of documentation. Oh, and SpecFlow is just one way to do this. Rob wrote a long time ago about this sort of thing (but using a different framework) and I highly recommend this post if I somehow managed to pique your interest: http://blog.wekeroad.com/blog/make-bdd-your-bff-2/ And this screencast (Rob always makes excellent screencasts): http://blog.wekeroad.com/mvc-storefront/kona-3/ (click the “Download it here” link)

Read the article
CodePlex Daily Summary for Thursday, June 17, 2010

CodePlex Daily Summary for Thursday, June 17, 2010New ProjectsAstalanumerator: A JavaScript based recursive DOM/JS object inspector. Uses a simple tree menu to enumerate all properties of a object.BDD Log Converter: A simple .NET class and console application that will convert BDD logs (MDT) into XML format.CastleInvestProj: Castle Investigating project Easy Callback: This library facilitates the use of multiple asynchronous calls on the same page, and asynchronous calls from a user control also have a clean cod...Easy Wings: Small webApp to manage aircraft booking in flying club. French only for the moment.EPiServer Template Foundation: EPiServer Template Foundation builds on top of Page Type Builder to provide a framework for common site features such as basic page type properties...guidebook: a project to plan your road trip.Look into documents for e-discovery: Search, browse, tag, annotate documents such as MS Word, PDF, e-mail, etc. Good for legal professionals do e-discovery. One Bus Away for Windows Phone: A Windows Phone 7 application written in Silverlight for the OneBusAway (www.onebusaway.org) website. Allows mobile users to search for public tra...OneBusAway for Windows Phone 7: OneBusAway is a service with transit information for the Seattle, WA region. We are creating a mobile application for Windows Phone 7 utilizing th...PoFabLab - Poetry Generation Library and Editor in .NET: PoFabLab is an open source library and word processor designed for digital poets. The library can scan lines, perform Markov analysis, filter text...Project Axure: More details coming soon.Чат кутежа 2.0: ИРЦ чат специально для форума ЕНЕ简易代码生成器: 初次使用CodePlex，这只是一个测试项目。打算用WPF做一个简单的代码生成器，兼具SQL Server Client功能。使用.Net 4.0, C#开发。运营工作系统: TRAS(Team resource assist system) is a toolkit that help the studio to manage and distribute the daily work, like publish the news, GM broadcast a...New ReleasesAmuse - A New MU* Client For Windows: 2010 June: Important Notice to TestersPlease uninstall any previous versions of Amuse prior to this one before installing. Changes and InformationFirst relea...ASP.NET Generic Data Source Control: V1.0: GenericDataSource - Version 1.0Binary This is the first official binary release of the GenericDataSource for ASP.NET - stable and ready for product...Astalanumerator: Astalanumerator 0.7: I wanted to map all properties in javascript and inspect them regardless if they were objects or not. IE doesn’t support for(i in..) for native pro...BDD Log Converter: BDD Log Converter 0.1.0: First release (0.1.0).DVD Swarm: 0.8.10.616: Major update with improvements to encoding speed.Easy Callback: Easy Callback 1.0.0.0: Easy Callback library 1.0.0.0Facebook Connect Authentication for ASP.NET: Facebook Connect Authentication for ASP.NET - v1.0: Now supporting Facebook's new Open Graph API JavaScript SDK, this release of FBConnectAuth also adds support for running in partially trusted envir...FlickrNet API Library: 3.0 Beta 3: Another small Beta. Changed parsing code so exceptions aren't raised when new attributes are added by Flickr. This affects searches where you are ...Infragistics Analytics Framework: Infragistics Analytics Framework 10.2: An updated version of Infragistics Analytics Framework, which utilizes the newest version (v.1.4.4) of MSAF as well as the newest release (v.10.2) ...NUnit Add-in for Growl Notifications: NUnit Add-in for Growl Notifications 1.0 build 1: Version 1.0 build 1:[change] Test run failure notification now disappears automaticallyOpen Source PLM Activities: 3dxml player integration for Aras Innovator: This is just a simple html file you need to add to your Aras Innovator install directory. It loads the 3Dxml player for your 3dxml files. Tested o...patterns & practices - Windows Azure Guidance: WAAG - Part 2 - Drop 1: First code and docs drop for Part 2 of the Windows Azure Architecture Guide Part 1 of the Guide is released here. Highlights of this release are:...Phalanger - The PHP Language Compiler for the .NET Framework: 2.0 (June 2010): Installer of the latest binaries of Phalanger 2.0 (June 2010) and its integration into Visual Studio 2008 SP1. * Improved compatibility with P...RIA Services Essentials: Book Club Application (June 16, 2010): Added some XAML to hide/show link to BookShelf page based on whether the user is logged in or not. Updated IsBookOwner authorization rule implement...secs4net: Relase 1.01: version 1.01 releasesELedit: sELedit v1.1c: Added: Tool for exporting NPC/Mob database file that is used by sNPCeditSharePoint Ad Rotator: SPAdRotator 2.0 Beta 2: Added: Open tool pane link to default Web Part text Made all images except the first hidden by default, so the Web Part will degrade gracefully w...sMAPtool: sMAPtool v0.7f (without Maps): Added: 3rd party magnifier softwaresNPCedit: sNPCedit v0.9c: Added: npc/mob names and corresponding datbaseSolidWorks Addin Development: GenericAddinFrameworkR1-06.17.2010: .sTASKedit: sTASKedit v0.8: Important BugFix: there was an mistake in the structure, team-member block and get-items block was swapped internally. Tasks that contains both blo...stefvanhooijdonk.com: UnitTesting-SP2010-TFS2010: Files for my post on TFS2010 and NUnit testing with SP2010 projects. see the post here: http://wp.me/pMnlQ-88 The XSLT here is from http://nunit4t...Telerik CAB Enabling Kit for RadControls for WinForms: TCEK 2010.1.10.504: What's new in v2010.1.0610 (Beta): RadDocking component has been replaced with the latest RadDock control Requirements: Visual Studio 2005+ Tele...TFS Buddy: TFS Buddy 1.2: Fixes a problem with notificationsThales Simulator Library: Version 0.9: The Thales Simulator Library is an implementation of a software emulation of the Thales (formerly Zaxus & Racal) Hardware Security Module cryptogra...Triton Application Framework: Tools - Code Generator - Build 1.0: This is the first release of the Generator. This is buggy but works.VCC: Latest build, v2.1.30616.0: Automatic drop of latest buildXsltDb - DotNetNuke Module Builder: 01.01.27: Code completion for XsltDb, HTML and XSL stuff!! Full screen editing Some bugs are still in EditArea component and object lists in code completi...Чат кутежа 2.0: 0.9a build 2 версия: вторая сборка первой альфа-версии ирц-клиента.Most Popular ProjectsWBFS ManagerRawrAJAX Control ToolkitMicrosoft SQL Server Product Samples: DatabaseSilverlight ToolkitWindows Presentation Foundation (WPF)patterns & practices – Enterprise LibraryPHPExcelMicrosoft SQL Server Community & SamplesASP.NETMost Active ProjectsdotSpatialpatterns & practices: Enterprise Library Contribpatterns & practices – Enterprise LibraryBlogEngine.NETLightweight Fluent WorkflowRhyduino - Arduino and Managed CodeSunlit World SchemeNB_Store - Free DotNetNuke Ecommerce Catalog ModuleSolidWorks Addin DevelopmentN2 CMS

Read the article
Reg Gets a Job at Red Gate (and what happens behind the scenes)

- by red(at)work

Mr Reg Gater works at one of Cambridge’s many high-tech companies. He doesn’t love his job, but he puts up with it because... well, it could be worse. Every day he drives to work around the Red Gate roundabout, wondering what his boss is going to blame him for today, and wondering if there could be a better job out there for him. By late morning he already feels like handing his notice in. He got the hacky look from his boss for being 5 minutes late, and then they ran out of tea. Again. He goes to the local sandwich shop for lunch, and picks up a Red Gate job menu and a Book of Red Gate while he’s waiting for his order. That night, he goes along to Cambridge Geek Nights and sees some very enthusiastic Red Gaters talking about the work they do; it sounds interesting and, of all things, fun. He takes a quick look at the job vacancies on the Red Gate website, and an hour later realises he’s still there – looking at videos, photos and people profiles. He especially likes the Red Gate’s Got Talent page, and is very impressed with Simon Johnson’s marathon time. He thinks that he’d quite like to work with such awesome people. It just so happens that Red Gate recently decided that they wanted to hire another hot shot team member. Behind the scenes, the wheels were set in motion: the recruitment team met with the hiring manager to understand exactly what they’re looking for, and to decide what interview tests to do, who will do the interviews, and to kick-start any interview training those people might need. Next up, a job description and job advert were written, and the job was put on the market. Reg applies, and his CV lands in the Recruitment team’s inbox and they open it up with eager anticipation that Reg could be the next awesome new starter. He looks good, and in a jiffy they’ve arranged an interview. Reg arrives for his interview, and is greeted by a smiley receptionist. She offers him a selection of drinks and he feels instantly relaxed. A couple of interviews and an assessment later, he gets a job offer. We make his day and he makes ours by accepting, and becoming one of the 60 new starters so far this year. Behind the scenes, things start moving all over again. The HR team arranges for a “Welcome” goodie box to be whisked out to him, prepares his contract, sends an email to Information Services (Or IS for short - we’ll come back to them), keeps in touch with Reg to make sure he knows what to expect on his first day, and of course asks him to fill in the all-important wiki questionnaire so his new colleagues can start to get to know him before he even joins. Meanwhile, the IS team see an email in SupportWorks from HR. They see that Reg will be starting in the sales team in a few days’ time, and they know exactly what to do. They pull out a new machine, and within minutes have used their automated deployment software to install every piece of software that a new recruit could ever need. They also check with Reg’s new manager to see if he has any special requirements that they could help with. Reg starts and is amazed to find a fully configured machine sitting on his desk, complete with stationery and all the other tools he’ll need to do his job. He feels even more cared for after he gets a workstation assessment, and realises he’d be comfier with an ergonomic keyboard and a footstool. They arrive minutes later, just like that. His manager starts him off on his induction and sales training. Along with job-specific training, he’ll also have a buddy to help him find his feet, and loads of pre-arranged demos and introductions. Reg settles in nicely, and is great at his job. He enjoys the canteen, and regularly eats one of the 40,000 meals provided each year. He gets used to the selection of teas that are available, develops a taste for champagne launch parties, and has his fair share of the 25,000 cups of coffee downed at Red Gate towers each year. He goes along to some Feel Good Fund events, and donates a little something to charity in exchange for a turn on the chocolate fountain. He’s looking a little scruffy, so he decides to get his hair cut in between meetings, just in time for the Red Gate birthday company photo. Reg starts a new project: identifying existing customers to up-sell to new bundles. He talks with the web team to generate lists of qualifying customers who haven’t recently been sent marketing emails, and sends emails out, using a new in-house developed tool to schedule follow-up calls in CRM for the same group. The customer responds, saying they’d like to upgrade but are having a licensing problem – Reg sends the issue to Support, and it gets routed to the web team. The team identifies a workaround, and the bug gets scheduled into the next maintenance release in a fortnight’s time (hey; they got lucky). With all the new stuff Reg is working on, he realises that he’d be way more efficient if he had a third monitor. He speaks to IS and they get him one - no argument. He also needs a test machine and then some extra memory. Done. He then thinks he needs an iPad, and goes to ask for one. He gets told to stop pushing his luck. Some time later, Reg’s wife has a baby, so Reg gets 2 weeks of paid paternity leave and a bunch of flowers sent to his house. He signs up to the childcare scheme so that he doesn’t have to pay National Insurance on the first £243 of his childcare. The accounts team makes it all happen seamlessly, as they did with his Give As You Earn payments, which come out of his wages and go straight to his favorite charity. Reg’s sales career is going well. He’s grateful for the help that he gets from the product support team. How do they answer all those 900-ish support calls so effortlessly each month? He’s impressed with the patches that are sent out to customers who find “interesting behavior” in their tools, and to the customers who just must have that new feature. A little later in his career at Red Gate, Reg decides that he’d like to learn about management. He goes on some management training specially customised for Red Gate, joins the Management Book Club, and gets together with other new managers to brainstorm how to get the most out of one to one meetings with his team. Reg decides to go for a game of Foosball to celebrate his good fortune with his team, and has to wait for Finance to finish. While he’s waiting, he reflects on the wonderful time he’s had at Red Gate. He can’t put his finger on what it is exactly, but he knows he’s on to a good thing. All of the stuff that happened to Reg didn’t just happen magically. We’ve got teams of people working relentlessly behind the scenes to make sure that everyone here is comfortable, safe, well fed and caffeinated to the max.

Read the article
Changing CSS with jQuery syntax in Silverlight using jLight

- by Timmy Kokke

Lately I’ve ran into situations where I had to change elements or had to request a value in the DOM from Silverlight. jLight, which was introduced in an earlier article, can help with that. jQuery offers great ways to change CSS during runtime. Silverlight can access the DOM, but it isn’t as easy as jQuery. All examples shown in this article can be looked at in this online demo. The code can be downloaded here. Part 1: The easy stuff Selecting and changing properties is pretty straight forward. Setting the text color in all <B> </B> elements can be done using the following code: jQuery.Select("b").Css("color", "red"); The Css() method is an extension method on jQueryObject which is return by the jQuery.Select() method. The Css() method takes to parameters. The first is the Css style property. All properties used in Css can be entered in this string. The second parameter is the value you want to give the property. In this case the property is “color” and it is changed to “red”. To specify which element you want to select you can add a :selector parameter to the Select() method as shown in the next example. jQuery.Select("b:first").Css("font-family", "sans-serif"); The “:first” pseudo-class selector selects only the first element. This example changes the “font-family” property of the first <B></B> element to “sans-serif”. To make use of intellisense in Visual Studio I’ve added a extension methods to help with the pseudo-classes. In the example below the “font-weight” of every “Even” <LI></LI> is set to “bold”. jQuery.Select("li".Even()).Css("font-weight", "bold"); Because the Css() extension method returns a jQueryObject it is possible to chain calls to Css(). The following example show setting the “color”, “background-color” and the “font-size” of all headers in one go. jQuery.Select(":header").Css("color", "#12FF70") .Css("background-color", "yellow") .Css("font-size", "25px"); Part 2: More complex stuff In only a few cases you need to change only one style property. More often you want to change an entire set op style properties all in one go. You could chain a lot of Css() methods together. A better way is to add a class to a stylesheet and define all properties in there. With the AddClass() method you can set a style class to a set of elements. This example shows how to add the “demostyle” class to all <B></B> in the document. jQuery.Select("b").AddClass("demostyle"); Removing the class works in the same way: jQuery.Select("b").RemoveClass("demostyle"); jLight is build for interacting with to the DOM from Silverlight using jQuery. A jQueryObjectCss object can be used to define different sets of style properties in Silverlight. The over 60 most common Css style properties are defined in the jQueryObjectCss class. A string indexer can be used to access all style properties ( CssObject1[“background-color”] equals CssObject1.BackgroundColor). In the code below, two jQueryObjectCss objects are defined and instantiated. private jQueryObjectCss CssObject1; private jQueryObjectCss CssObject2; public Demo2() { CssObject1 = new jQueryObjectCss { BackgroundColor = "Lime", Color="Black", FontSize = "12pt", FontFamily = "sans-serif", FontWeight = "bold", MarginLeft = 150, LineHeight = "28px", Border = "Solid 1px #880000" }; CssObject2 = new jQueryObjectCss { FontStyle = "Italic", FontSize = "48", Color = "#225522" }; InitializeComponent(); } Now instead of chaining to set all different properties you can just pass one of the jQueryObjectCss objects to the Css() method. In this case all <LI></LI> elements are set to match this object. jQuery.Select("li").Css(CssObject1); When using the jQueryObjectCss objects chaining is still possible. In the following example all headers are given a blue backgroundcolor and the last is set to match CssObject2. jQuery.Select(":header").Css(new jQueryObjectCss{BackgroundColor = "Blue"}) .Eq(-1).Css(CssObject2); Part 3: The fun stuff Having Silverlight call JavaScript and than having JavaScript to call Silverlight requires a lot of plumbing code. Everything has to be registered and strings are passed back and forth to execute the JavaScript. jLight makes this kind of stuff so easy, it becomes fun to use. In a lot of situations jQuery can call a function to decide what to do, setting a style class based on complex expressions for example. jLight can do the same, but the callback methods are defined in Silverlight. This example calls the function() method for each <LI></LI> element. The callback method has to take a jQueryObject, an integer and a string as parameters. In this case jLight differs a bit from the actual jQuery implementation. jQuery uses only the index and the className parameters. A jQueryObject is added to make it simpler to access the attributes and properties of the element. If the text of the listitem starts with a ‘D’ or an ‘M’ the class is set. Otherwise null is returned and nothing happens. private void button1_Click(object sender, RoutedEventArgs e) { jQuery.Select("li").AddClass(function); } private string function(jQueryObject obj, int index, string className) { if (obj.Text[0] == 'D' || obj.Text[0] == 'M') return "demostyle"; return null; } The last thing I would like to demonstrate uses even more Silverlight and less jLight, but demonstrates the power of the combination. Animating a style property using a Storyboard with easing functions. First a dependency property is defined. In this case it is a double named Intensity. By handling the changed event the color is set using jQuery. public double Intensity { get { return (double)GetValue(IntensityProperty); } set { SetValue(IntensityProperty, value); } } public static readonly DependencyProperty IntensityProperty = DependencyProperty.Register("Intensity", typeof(double), typeof(Demo3), new PropertyMetadata(0.0, IntensityChanged)); private static void IntensityChanged(DependencyObject d, DependencyPropertyChangedEventArgs e) { var i = (byte)(double)e.NewValue; jQuery.Select("span").Css("color", string.Format("#{0:X2}{0:X2}{0:X2}", i)); } An animation has to be created. This code defines a Storyboard with one keyframe that uses a bounce ease as an easing function. The animation is set to target the Intensity dependency property defined earlier. private Storyboard CreateAnimation(double value) { Storyboard storyboard = new Storyboard(); var da = new DoubleAnimationUsingKeyFrames(); var d = new EasingDoubleKeyFrame { EasingFunction = new BounceEase(), KeyTime = KeyTime.FromTimeSpan(TimeSpan.FromSeconds(1.0)), Value = value }; da.KeyFrames.Add(d); Storyboard.SetTarget(da, this); Storyboard.SetTargetProperty(da, new PropertyPath(Demo3.IntensityProperty)); storyboard.Children.Add(da); return storyboard; } Initially the Intensity is set to 128 which results in a gray color. When one of the buttons is pressed, a new animation is created an played. One to animate to black, and one to animate to white. public Demo3() { InitializeComponent(); Intensity = 128; } private void button2_Click(object sender, RoutedEventArgs e) { CreateAnimation(255).Begin(); } private void button3_Click(object sender, RoutedEventArgs e) { CreateAnimation(0).Begin(); } Conclusion As you can see jLight can make the life of a Silverlight developer a lot easier when accessing the DOM. Almost all jQuery functions that are defined in jLight use the same constructions as described above. I’ve tried to stay as close as possible to the real jQuery. Having JavaScript perform callbacks to Silverlight using jLight will be described in more detail in a future tutorial about AJAX or eventing.

Read the article
Premature-Optimization and Performance Anxiety

- by James Michael Hare

While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about. After analyzing the differences of time between a Dictionary with locking versus the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations. Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks. In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly-or-wrongly) performance gain and sacrificing good design and maintainability in the process. At best, the performance gains are usually negligible and at worst, can either negatively impact performance, or can degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above. I'm not talking about valid performance decisions. There are decisions one should make when designing and writing an application that are valid performance decisions. Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performance algorithms (linear search vs. binary search). But these in my mind are macro optimizations. The error is not in deciding to use a better data structure or algorithm, the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking. This would have been a mistake as I would be trading maintainability (ConcurrentDictionary requires no locking which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it. I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer. I began college in in the 90s when C and C++ was king and hardware speed and memory were still relatively priceless commodities and not to be squandered. In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance. This is why in many cases such early code-bases were very hard to maintain. I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead. Now back then, that may have been a valid concern, but with today's modern hardware even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway) the results of removing method calls to speed up performance are negligible for the great majority of applications. Now, obviously, there are those coding applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made. But I would strongly advice against such optimization because of it's cost. Many folks that are performing an optimization think it's always a win-win. That they're simply adding speed to the application, what could possibly be wrong with that? What they don't realize is the cost of their choice. For every piece of straight-forward code that you obfuscate with performance enhancements, you risk the introduction of bugs in the long term technical debt of the application. It will become so fragile over time that maintenance will become a nightmare. I've seen such applications in places I have worked. There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for their application to try to squeeze out every ounce of performance. Unfortunately, the application stability often suffers as a result and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently where I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation). To me this was almost a joke. Twice as slow sounds bad, but it almost never as bad as you think -- especially if you're gaining safety. The only time twice is really that much slower is when once was too slow to begin with. Think about it. 2 minutes is slow as a response time because 1 minute is slow. But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial! The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations). To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can. I know it can be a hard habit to break, especially if you started out your career early or in a language such as C where they are very performance conscious. But in reality, these type of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization. I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true. If you're a beginner, resist the urge to optimize at all costs. And if you are an expert, delay that decision. As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient. Chances are it will be network, database, or disk hits that will be your slow-down, not your code. As they say, 98% of your code's bottleneck is in 2% of your code so premature-optimization may add maintenance and safety debt that won't have any measurable impact. Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, then you should go back and optimize further.

Read the article
Consuming the Amazon S3 service from a Win8 Metro Application

- by cibrax

As many of the existing Http APIs for Cloud Services, AWS also provides a set of different platform SDKs for hiding many of complexities present in the APIs. While there is a platform SDK for .NET, which is open source and available in C#, that SDK does not work in Win8 Metro Applications for the changes introduced in WinRT. WinRT offers a complete different set of APIs for doing I/O operations such as doing http calls or using cryptography for signing or encrypting data, two aspects that are absolutely necessary for consuming AWS. All the I/O APIs available as part of WinRT are asynchronous, and uses the TPL model for .NET applications (HTML and JavaScript Metro applications use a model based in promises, which is similar concept). In the case of S3, the http Authorization header is used for two purposes, authenticating clients and make sure the messages were not altered while they were in transit. For doing that, it uses a signature or hash of the message content and some of the headers using a symmetric key (That's just one of the available mechanisms). Windows Azure for example also uses the same mechanism in many of its APIs. There are three challenges that any developer working for first time in Metro will have to face to consume S3, the new WinRT APIs, the asynchronous nature of them and the complexity introduced for generating the Authorization header. Having said that, I decided to write this post with some of the gotchas I found myself trying to consume this Amazon service. 1. Generating the signature for the Authorization header All the cryptography APIs in WinRT are available under Windows.Security.Cryptography namespace. Many of operations available in these APIs uses the concept of buffers (IBuffer) for representing a chunk of binary data. As you will see in the example below, these buffers are mainly generated with the use of static methods in a WinRT class CryptographicBuffer available as part of the namespace previously mentioned. private string DeriveAuthToken(string resource, string httpMethod, string timestamp) { var stringToSign = string.Format("{0}\n" + "\n" + "\n" + "\n" + "x-amz-date:{1}\n" + "/{2}/", httpMethod, timestamp, resource); var algorithm = MacAlgorithmProvider.OpenAlgorithm("HMAC_SHA1"); var keyMaterial = CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(this.secret)); var hmacKey = algorithm.CreateKey(keyMaterial); var signature = CryptographicEngine.Sign( hmacKey, CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(stringToSign)) ); return CryptographicBuffer.EncodeToBase64String(signature); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The algorithm that determines the information or content you need to use for generating the signature is very well described as part of the AWS documentation. In this case, this method is generating a signature required for creating a new bucket. A HmacSha1 hash is computed using a secret or symetric key provided by AWS in the management console. 2. Sending an Http Request to the S3 service WinRT also ships with the System.Net.Http.HttpClient that was first introduced some months ago with ASP.NET Web API. This client provides a rich interface on top the traditional WebHttpRequest class, and also solves some of limitations found in this last one. There are a few things that don't work with a raw WebHttpRequest such as setting the Host header, which is something absolutely required for consuming S3. Also, HttpClient is more friendly for doing unit tests, as it receives a HttpMessageHandler as part of the constructor that can fake to emulate a real http call. This is how the code for consuming the service with HttpClient looks like, public async Task<S3Response> CreateBucket(string name, string region = null, params string[] acl) { var timestamp = string.Format("{0:r}", DateTime.UtcNow); var auth = DeriveAuthToken(name, "PUT", timestamp); var request = new HttpRequestMessage(HttpMethod.Put, "http://s3.amazonaws.com/"); request.Headers.Host = string.Format("{0}.s3.amazonaws.com", name); request.Headers.TryAddWithoutValidation("Authorization", "AWS " + this.key + ":" + auth); request.Headers.Add("x-amz-date", timestamp); var client = new HttpClient(); var response = await client.SendAsync(request); return new S3Response { Succeed = response.StatusCode == HttpStatusCode.OK, Message = (response.Content != null) ? await response.Content.ReadAsStringAsync() : null }; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You will notice a few additional things in this code. By default, HttpClient validates the values for some well-know headers, and Authorization is one of them. It won't allow you to set a value with ":" on it, which is something that S3 expects. However, that's not a problem at all, as you can skip the validation by using the TryAddWithoutValidation method. Also, the code is heavily relying on the new async and await keywords to transform all the asynchronous calls into synchronous ones. In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, public class FakeHttpMessageHandler : HttpMessageHandler { HttpResponseMessage response; public FakeHttpMessageHandler(HttpResponseMessage response) { this.response = response; } protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) { var tcs = new TaskCompletionSource<HttpResponseMessage>(); tcs.SetResult(response); return tcs.Task; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You can use this handler for injecting any response while you are unit testing the code.

Read the article
Consuming the Amazon S3 service from a Win8 Metro Application

- by cibrax

As many of the existing Http APIs for Cloud Services, AWS also provides a set of different platform SDKs for hiding many of complexities present in the APIs. While there is a platform SDK for .NET, which is open source and available in C#, that SDK does not work in Win8 Metro Applications for the changes introduced in WinRT. WinRT offers a complete different set of APIs for doing I/O operations such as doing http calls or using cryptography for signing or encrypting data, two aspects that are absolutely necessary for consuming AWS. All the I/O APIs available as part of WinRT are asynchronous, and uses the TPL model for .NET applications (HTML and JavaScript Metro applications use a model based in promises, which is similar concept). In the case of S3, the http Authorization header is used for two purposes, authenticating clients and make sure the messages were not altered while they were in transit. For doing that, it uses a signature or hash of the message content and some of the headers using a symmetric key (That's just one of the available mechanisms). Windows Azure for example also uses the same mechanism in many of its APIs. There are three challenges that any developer working for first time in Metro will have to face to consume S3, the new WinRT APIs, the asynchronous nature of them and the complexity introduced for generating the Authorization header. Having said that, I decided to write this post with some of the gotchas I found myself trying to consume this Amazon service. 1. Generating the signature for the Authorization header All the cryptography APIs in WinRT are available under Windows.Security.Cryptography namespace. Many of operations available in these APIs uses the concept of buffers (IBuffer) for representing a chunk of binary data. As you will see in the example below, these buffers are mainly generated with the use of static methods in a WinRT class CryptographicBuffer available as part of the namespace previously mentioned. private string DeriveAuthToken(string resource, string httpMethod, string timestamp) { var stringToSign = string.Format("{0}\n" + "\n" + "\n" + "\n" + "x-amz-date:{1}\n" + "/{2}/", httpMethod, timestamp, resource); var algorithm = MacAlgorithmProvider.OpenAlgorithm("HMAC_SHA1"); var keyMaterial = CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(this.secret)); var hmacKey = algorithm.CreateKey(keyMaterial); var signature = CryptographicEngine.Sign( hmacKey, CryptographicBuffer.CreateFromByteArray(Encoding.UTF8.GetBytes(stringToSign)) ); return CryptographicBuffer.EncodeToBase64String(signature); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } The algorithm that determines the information or content you need to use for generating the signature is very well described as part of the AWS documentation. In this case, this method is generating a signature required for creating a new bucket. A HmacSha1 hash is computed using a secret or symetric key provided by AWS in the management console. 2. Sending an Http Request to the S3 service WinRT also ships with the System.Net.Http.HttpClient that was first introduced some months ago with ASP.NET Web API. This client provides a rich interface on top the traditional WebHttpRequest class, and also solves some of limitations found in this last one. There are a few things that don't work with a raw WebHttpRequest such as setting the Host header, which is something absolutely required for consuming S3. Also, HttpClient is more friendly for doing unit tests, as it receives a HttpMessageHandler as part of the constructor that can fake to emulate a real http call. This is how the code for consuming the service with HttpClient looks like, public async Task<S3Response> CreateBucket(string name, string region = null, params string[] acl) { var timestamp = string.Format("{0:r}", DateTime.UtcNow); var auth = DeriveAuthToken(name, "PUT", timestamp); var request = new HttpRequestMessage(HttpMethod.Put, "http://s3.amazonaws.com/"); request.Headers.Host = string.Format("{0}.s3.amazonaws.com", name); request.Headers.TryAddWithoutValidation("Authorization", "AWS " + this.key + ":" + auth); request.Headers.Add("x-amz-date", timestamp); var client = new HttpClient(); var response = await client.SendAsync(request); return new S3Response { Succeed = response.StatusCode == HttpStatusCode.OK, Message = (response.Content != null) ? await response.Content.ReadAsStringAsync() : null }; } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You will notice a few additional things in this code. By default, HttpClient validates the values for some well-know headers, and Authorization is one of them. It won't allow you to set a value with ":" on it, which is something that S3 expects. However, that's not a problem at all, as you can skip the validation by using the TryAddWithoutValidation method. Also, the code is heavily relying on the new async and await keywords to transform all the asynchronous calls into synchronous ones. In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, In case you would want to unit test this code and faking the call to the real S3 service, you should have to modify it to inject a custom HttpMessageHandler into the HttpClient. The following implementation illustrates this concept, public class FakeHttpMessageHandler : HttpMessageHandler { HttpResponseMessage response; public FakeHttpMessageHandler(HttpResponseMessage response) { this.response = response; } protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, System.Threading.CancellationToken cancellationToken) { var tcs = new TaskCompletionSource<HttpResponseMessage>(); tcs.SetResult(response); return tcs.Task; } } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } You can use this handler for injecting any response while you are unit testing the code.

Read the article
Use a Fake Http Channel to Unit Test with HttpClient

- by Steve Michelotti

Applications get data from lots of different sources. The most common is to get data from a database or a web service. Typically, we encapsulate calls to a database in a Repository object and we create some sort of IRepository interface as an abstraction to decouple between layers and enable easier unit testing by leveraging faking and mocking. This works great for database interaction. However, when consuming a RESTful web service, this is is not always the best approach. The WCF Web APIs that are available on CodePlex (current drop is Preview 3) provide a variety of features to make building HTTP REST services more robust. When you download the latest bits, you’ll also find a new HttpClient which has been updated for .NET 4.0 as compared to the one that shipped for 3.5 in the original REST Starter Kit. The HttpClient currently provides the best API for consuming REST services on the .NET platform and the WCF Web APIs provide a number of extension methods which extend HttpClient and make it even easier to use. Let’s say you have a client application that is consuming an HTTP service – this could be Silverlight, WPF, or any UI technology but for my example I’ll use an MVC application: 1: using System; 2: using System.Net.Http; 3: using System.Web.Mvc; 4: using FakeChannelExample.Models; 5: using Microsoft.Runtime.Serialization; 6: 7: namespace FakeChannelExample.Controllers 8: { 9: public class HomeController : Controller 10: { 11: private readonly HttpClient httpClient; 12: 13: public HomeController(HttpClient httpClient) 14: { 15: this.httpClient = httpClient; 16: } 17: 18: public ActionResult Index() 19: { 20: var response = httpClient.Get("Person(1)"); 21: var person = response.Content.ReadAsDataContract<Person>(); 22: 23: this.ViewBag.Message = person.FirstName + " " + person.LastName; 24: 25: return View(); 26: } 27: } 28: } On line #20 of the code above you can see I’m performing an HTTP GET request to a Person resource exposed by an HTTP service. On line #21, I use the ReadAsDataContract() extension method provided by the WCF Web APIs to serialize to a Person object. In this example, the HttpClient is being passed into the constructor by MVC’s dependency resolver – in this case, I’m using StructureMap as an IoC and my StructureMap initialization code looks like this: 1: using StructureMap; 2: using System.Net.Http; 3: 4: namespace FakeChannelExample 5: { 6: public static class IoC 7: { 8: public static IContainer Initialize() 9: { 10: ObjectFactory.Initialize(x => 11: { 12: x.For<HttpClient>().Use(() => new HttpClient("http://localhost:31614/")); 13: }); 14: return ObjectFactory.Container; 15: } 16: } 17: } My controller code currently depends on a concrete instance of the HttpClient. Now I *could* create some sort of interface and wrap the HttpClient in this interface and use that object inside my controller instead – however, there are a few why reasons that is not desirable: For one thing, the API provided by the HttpClient provides nice features for dealing with HTTP services. I don’t really *want* these to look like C# RPC method calls – when HTTP services have REST features, I may want to inspect HTTP response headers and hypermedia contained within the message so that I can make intelligent decisions as to what to do next in my workflow (although I don’t happen to be doing these things in my example above) – this type of workflow is common in hypermedia REST scenarios. If I just encapsulate HttpClient behind some IRepository interface and make it look like a C# RPC method call, it will become difficult to take advantage of these types of things. Second, it could get pretty mind-numbing to have to create interfaces all over the place just to wrap the HttpClient. Then you’re probably going to have to hard-code HTTP knowledge into your code to formulate requests rather than just “following the links” that the hypermedia in a message might provide. Third, at first glance it might appear that we need to create an interface to facilitate unit testing, but actually it’s unnecessary. Even though the code above is dependent on a concrete type, it’s actually very easy to fake the data in a unit test. The HttpClient provides a Channel property (of type HttpMessageChannel) which allows you to create a fake message channel which can be leveraged in unit testing. In this case, what I want is to be able to write a unit test that just returns fake data. I also want this to be as re-usable as possible for my unit testing. I want to be able to write a unit test that looks like this: 1: [TestClass] 2: public class HomeControllerTest 3: { 4: [TestMethod] 5: public void Index() 6: { 7: // Arrange 8: var httpClient = new HttpClient("http://foo.com"); 9: httpClient.Channel = new FakeHttpChannel<Person>(new Person { FirstName = "Joe", LastName = "Blow" }); 10: 11: HomeController controller = new HomeController(httpClient); 12: 13: // Act 14: ViewResult result = controller.Index() as ViewResult; 15: 16: // Assert 17: Assert.AreEqual("Joe Blow", result.ViewBag.Message); 18: } 19: } Notice on line #9, I’m setting the Channel property of the HttpClient to be a fake channel. I’m also specifying the fake object that I want to be in the response on my “fake” Http request. I don’t need to rely on any mocking frameworks to do this. All I need is my FakeHttpChannel. The code to do this is not complex: 1: using System; 2: using System.IO; 3: using System.Net.Http; 4: using System.Runtime.Serialization; 5: using System.Threading; 6: using FakeChannelExample.Models; 7: 8: namespace FakeChannelExample.Tests 9: { 10: public class FakeHttpChannel<T> : HttpClientChannel 11: { 12: private T responseObject; 13: 14: public FakeHttpChannel(T responseObject) 15: { 16: this.responseObject = responseObject; 17: } 18: 19: protected override HttpResponseMessage Send(HttpRequestMessage request, CancellationToken cancellationToken) 20: { 21: return new HttpResponseMessage() 22: { 23: RequestMessage = request, 24: Content = new StreamContent(this.GetContentStream()) 25: }; 26: } 27: 28: private Stream GetContentStream() 29: { 30: var serializer = new DataContractSerializer(typeof(T)); 31: Stream stream = new MemoryStream(); 32: serializer.WriteObject(stream, this.responseObject); 33: stream.Position = 0; 34: return stream; 35: } 36: } 37: } The HttpClientChannel provides a Send() method which you can override to return any HttpResponseMessage that you want. You can see I’m using the DataContractSerializer to serialize the object and write it to a stream. That’s all you need to do. In the example above, the only thing I’ve chosen to do is to provide a way to return different response objects. But there are many more features you could add to your own re-usable FakeHttpChannel. For example, you might want to provide the ability to add HTTP headers to the message. You might want to use a different serializer other than the DataContractSerializer. You might want to provide custom hypermedia in the response as well as just an object or set HTTP response codes. This list goes on. This is the just one example of the really cool features being added to the next version of WCF to enable various HTTP scenarios. The code sample for this post can be downloaded here.

Read the article
PASS: Election Changes for 2011

- by Bill Graziano

Last year after the election, the PASS Board created an Election Review Committee. This group was charged with reviewing our election procedures and making suggestions to improve the process. You can read about the formation of the group and review some of the intermediate work on the site – especially in the forums. I was one of the members of the group along with Joe Webb (Chair), Lori Edwards, Brian Kelley, Wendy Pastrick, Andy Warren and Allen White. This group worked from October to April on our election process. Along the way we: Interviewed interested parties including former NomCom members, Board candidates and anyone else that came forward. Held a session at the Summit to allow interested parties to discuss the issues Had numerous conference calls and worked through the various topics I can’t thank these people enough for the work they did. They invested a tremendous number of hours thinking, talking and writing about our elections. I’m proud to say I was a member of this group and thoroughly enjoyed working with everyone (even if I did finally get tired of all the calls.) The ERC delivered their recommendations to the PASS Board prior to our May Board meeting. We reviewed those and made a few modifications. I took their recommendations and rewrote them as procedures while incorporating those changes. Their original recommendations as well as our final document are posted at the ERC documents page. Please take a second and read them BEFORE we start the elections. If you have any questions please post them in the forums on the ERC site. (My final document includes a change log at the end that I decided to leave in. If you want to know which areas to pay special attention to that’s a good start.) Many of those recommendations were already posted in the forums or in the blogs of individual ERC members. Hopefully nothing in the ERC document is too surprising. In this post I’m going to walk through some of the key changes and talk about what I remember from both ERC and Board discussions. I’ll pay a little extra attention to things the Board changed from the ERC. I’d also encourage any of the Board or ERC members to blog their thoughts on this. The Nominating Committee will continue to exist. Personally, I was curious to see what the non-Board ERC members would think about the NomCom. There was broad agreement that a group to vet candidates had value to the organization. The NomCom will be composed of five members. Two will be Board members and three will be from the membership at large. The only requirement for the three community members is that you’ve volunteered in some way (and volunteering is defined very broadly). We expect potential at-large NomCom members to participate in a forum on the PASS site to answer questions from the other PASS members. We’re going to hold an election to determine the three community members. It will be closer to voting for Summit sessions than voting for Board members. That means there won’t be multiple dedicated emails. If you’re at all paying attention it will be easy to participate. Personally I wanted it easy for those that cared to participate but not overwhelm those that didn’t care. I think this strikes a good balance. There’s also a clause that in order to be considered a winner in this NomCom election, you must receive 10 votes. This is something I suggested. I have no idea how popular the NomCom election is going to be. I just wanted a fallback that if no one participated and some random person got in with one or two votes. Any open slots will be filled by the NomCom chair (usually the PASS Immediate Past President). My assumption is that they would probably take the next highest vote getters unless they were throwing flames in the forums or clearly unqualified. As a final check, the Board still approves the final NomCom. The NomCom is going to rank candidates instead of rating them. This has interesting implications. This was championed by another ERC member and I’m hoping they write something about it. This will really force the NomCom to make decisions between candidates. You can’t just rate everyone a 3 and be done with it. It may also make candidates appear further apart than they actually are. I’m looking forward talking with the NomCom after this election and getting their feedback on this. The PASS Board added an option to remove a candidate with a unanimous vote of the NomCom. This was primarily put in place to handle people that lied on their application or had a criminal background or some other unusual situation and we figured it out. We list an explicit goal of three candidate per open slot. We also wanted an easy way to find the NomCom candidate rankings from the ballot. Hopefully this will satisfy those that want a broad candidate pool and those that want the NomCom to identify the most qualified candidates. The primary spokesperson for the NomCom is the committee chair. After the issues around the election last year we didn’t have a good communication plan in place. We should have and that was a failure on the part of the Board. If there is criticism of the election this year I hope that falls squarely on the Board. The community members of the NomCom shouldn’t be fielding complaints over the election process. That said, the NomCom is ranking candidates and we are forcing them to rank some lower than others. I’m sure you’ll each find someone that you think should have been ranked differently. I also want to highlight one other change to the process that we started last year and isn’t included in these documents. I think the candidate forums on the PASS site were tremendously helpful last year in helping people to find out more about candidates. That gives our members a way to ask hard questions of the candidates and publicly see their answers. This year we have two important groups to fill. The first is the NomCom. We need three people from our membership to step up and fill this role. It won’t be easy. You will have to make subjective rankings of your fellow community members. Your actions will be important in deciding who the future leaders of PASS will be. There’s a 50/50 chance that one of the people you interview will be the President of PASS someday. This is not a responsibility to be taken lightly. The second is the slate of candidates. If you’ve ever thought about running for the Board this is the year. We’ve never had nine candidates on the ballot before. Your chance of making it through the NomCom are higher than in any previous year. Unfortunately the more of you that run, the more of you that will lose in the election. And hopefully that competition will mean more community involvement and better Board members for PASS. Is this the end of changes to the election process? It isn’t. Every year that I’ve been on the Board the election process has changed. Some years there have been small changes and some years there have been large changes. After this election we’ll look at how the process worked and decide what steps to take – just like we do every year.

Read the article
Exception Handling And Other Contentious Political Topics

- by Justin Jones

So about three years ago, around the time of my last blog post, I promised a friend I would write this post. Keeping promises is a good thing, and this is my first step towards easing back into regular blogging. I fully expect him to return from Pennsylvania to buy me a beer over this. However, it’s been an… ahem… eventful three years or so, and blogging, unfortunately, got pushed to the back burner on my priority list, along with a few other career minded activities. Now that the personal drama of the past three years is more or less resolved, it’s time to put a few things back on the front burner. What I consider to be proper exception handling practices is relatively well known these days. There are plenty of blog posts out there already on this topic which more or less echo my opinions on this topic. I’ll try to include a few links at the bottom of the post. Several years ago I had an argument with a co-worker who posited that exceptions should be caught at every level and logged. This might seem like sanity on the surface, but the resulting error log looked something like this: Error: System.SomeException Followed by small stack trace. Error: System.SomeException Followed by slightly bigger stack trace. Error: System.SomeException Followed by slightly bigger stack trace. Error: System.SomeException Followed by slightly bigger stack trace. Error: System.SomeException Followed by slightly bigger stack trace. Error: System.SomeException Followed by slightly bigger stack trace. Error: System.SomeException Followed by slightly bigger stack trace. Error: System.SomeException Followed by slightly bigger stack trace. These were all the same exception. The problem with this approach is that the error log, if you run any kind of analytics on in, becomes skewed depending on how far up the stack trace your exception was thrown. To mitigate this problem, we came up with the concept of the “PreLoggedException”. Basically, we would log the exception at the very top level and subsequently throw the exception back up the stack encapsulated in this pre-logged type, which our logging system knew to ignore. Now the error log looked like this: Error: System.SomeException Followed by small stack trace. Much cleaner, right? Well, there’s still a problem. When your exception happens in production and you go about trying to figure out what happened, you’ve lost more or less all context for where and how this exception was thrown, because all you really know is what method it was thrown in, but really nothing about who was calling the method or why. What gives you this clue is the entire stack trace, which we’re losing here. I believe that was further mitigated by having the logging system pull a system stack trace and add it to the log entry, but what you’re actually getting is the stack for how you got to the logging code. You’re still losing context about the actual error. Not to mention you’re executing a whole slew of catch blocks which are sloooooooowwwww……… In other words, we started with a bad idea and kept band-aiding it until it didn’t suck quite so bad. When I argued for not catching exceptions at every level but rather catching them following a certain set of rules, my co-worker warned me “do yourself a favor, never express that view in any future interviews.” I suppose this is my ultimate dismissal of that advice, but I’m not too worried. My approach for exception handling follows three basic rules: Only catch an exception if 1. You can do something about it. 2. You can add useful information to it. 3. You’re at an application boundary. Here’s what that means: 1. Only catch an exception if you can do something about it. We’ll start with a trivial example of a login system that uses a file. Please, never actually do this in production code, it’s just concocted example. So if our code goes to open a file and the file isn’t there, we get a FileNotFound exception. If the calling code doesn’t know what to do with this, it should bubble up. However, if we know how to create the file from scratch we can create the file and continue on our merry way. When you run into situations like this though, What should really run through your head is “How can I avoid handling an exception at all?” In this case, it’s a trivial matter to simply check for the existence of the file before trying to open it. If we detect that the file isn’t there, we can accomplish the same thing without having to handle in in a catch block. 2. Only catch an exception if you can do something about it. Continuing with the poorly thought out file based login system we contrived in part 1, if the code calls a Login(…) method and the FileNotFound exception is thrown higher up the stack, the code that calls Login must account for a FileNotFound exception. This is kind of counterintuitive because the calling code should not need to know the internals of the Login method, and the data file is an implementation detail. What makes more sense, assuming that we didn’t implement any of the good advice from step 1, is for Login to catch the FileNotFound exception and wrap it in a new exception. For argument’s sake we’ll say LoginSystemFailureException. (Sorry, couldn’t think of anything better at the moment.) This gives us two stack traces, preserving the original stack trace in the inner exception, and also is much more informative to the calling code. 3. Only catch an exception if you’re at an application boundary. At some point we have to catch all the exceptions, even the ones we don’t know what to do with. WinForms, ASP.Net, and most other UI technologies have some kind of built in mechanism for catching unhandled exceptions without fatally terminating the application. It’s still a good idea to somehow gracefully exit the application in this case if possible though, because you can no longer be sure what state your application is in, but nothing annoys a user more than an application just exploding. These unhandled exceptions need to be logged, and this is a good place to catch them. Ideally you never want this option to be exercised, but code as though it will be. When you log these exceptions, give them a “Fatal” status (e.g. Log4Net) and make sure these bugs get handled in your next release. That’s it in a nutshell. If you do it right each exception will only get logged once and with the largest stack trace possible which will make those 2am emergency severity 1 debugging sessions much shorter and less frustrating. Here’s a few people who also have interesting things to say on this topic: http://blogs.msdn.com/b/ericlippert/archive/2008/09/10/vexing-exceptions.aspx http://www.codeproject.com/Articles/9538/Exception-Handling-Best-Practices-in-NET I know there’s more but I can’t find them at the moment.

Read the article
Recursion in the form of a Recursive Func<T, T>

- by ToStringTheory

I gotta admit, I am kind of surprised that I didn’t realize I could do this sooner. I recently had a problem which required a recursive function call to come up with the answer. After some time messing around with a recursive method, and creating an API that I was not happy with, I was able to create an API that I enjoy, and seems intuitive. Introduction To bring it to a simple example, consider the summation to n: A mathematically identical formula is: In a .NET function, this can be represented by a function: Func<int, int> summation = x => x*(x+1)/2 Calling summation with an input integer will yield the summation to that number: var sum10 = summation(4); //sum10 would be equal to 10 But what if I wanted to get a second level summation… First some to n, and then use that argument as the input to the same function, to find the second level summation: So as an easy example, calculate the summation to 3, which yields 6. Then calculate the summation to 6 which yields 21. Represented as a mathematical formula - So what if I wanted to represent this as .NET functions. I can always do: //using the summation formula from above var sum3 = summation(3); //sets sum3 to 6 var sum3_2 = summation(sum3); //sets sum3 to 21 I could always create a while loop to perform the calculations too: Func<int, int> summation = x => x*(x+1)/2; //for the interests of a smaller example, using shorthand int sumResultTo = 3; int level = 2; while(level-- > 0) { sumResultTo = summation(sumResultTo); } //sumResultTo is equal to 21 now. Or express it as a for-loop, method calls, etc… I really didn’t like any of the options that I tried. Then it dawned on me – since I was using a Func<T, T> anyways, why not use the Func’s output from one call as the input as another directly. Some Code So, I decided that I wanted a recursion class. Something that I would be generic and reusable in case I ever wanted to do something like this again. It is limited to only the Func<T1, T2> level of Func, and T1 must be the same as T2. The first thing in this class is a private field for the function: private readonly Func<T, T> _functionToRecurse; So, I since I want the function to be unchangeable, I have defined it as readonly. Therefore my constructor looks like: public Recursion(Func<T, T> functionToRecurse) { if (functionToRecurse == null) { throw new ArgumentNullException("functionToRecurse", "The function to recurse can not be null"); } _functionToRecurse = functionToRecurse; } Simple enough. If you have any questions, feel free to post them in the comments, and I will be sure to answer them. Next, I want enough. If be able to get the result of a function dependent on how many levels of recursion: private Func<T, T> GetXLevel(int level) { if (level < 1) { throw new ArgumentOutOfRangeException("level", level, "The level of recursion must be greater than 0"); } if (level == 1) return _functionToRecurse; return _GetXLevel(level - 1, _functionToRecurse); } So, if you pass in 1 for the level, you get just the Func<T,T> back. If you say that you want to go deeper down the rabbit hole, it calls a method which accepts the level it is at, and the function which it needs to use to recurse further: private Func<T, T> _GetXLevel(int level, Func<T, T> prevFunc) { if (level == 1) return y => prevFunc(_functionToRecurse(y)); return _GetXLevel(level - 1, y => prevFunc(_functionToRecurse(y))); } That is really all that is needed for this class. If I exposed the GetXLevel function publicly, I could use that to get the function for a level, and pass in the argument.. But I wanted something better. So, I used the ‘this’ array operator for the class: public Func<T,T> this[int level] { get { if (level < 1) { throw new ArgumentOutOfRangeException("level", level, "The level of recursion must be greater than 0"); } return this.GetXLevel(level); } } So, using the same example above of finding the second recursion of the summation of 3: var summator = new Recursion<int>(x => (x * (x + 1)) / 2); var sum_3_level2 = summator[2](3); //yields 21 You can even find just store the delegate to the second level summation, and use it multiple times: var summator = new Recursion<int>(x => (x * (x + 1)) / 2); var sum_level2 = summator[2]; var sum_3_level2 = sum_level2(3); //yields 21 var sum_4_level2 = sum_level2(4); //yields 55 var sum_5_level2 = sum_level2(5); //yields 120 Full Code Don’t think I was just going to hold off on the full file together and make you do the hard work… Copy this into a new class file: public class Recursion<T> { private readonly Func<T, T> _functionToRecurse; public Recursion(Func<T, T> functionToRecurse) { if (functionToRecurse == null) { throw new ArgumentNullException("functionToRecurse", "The function to recurse can not be null"); } _functionToRecurse = functionToRecurse; } public Func<T,T> this[int level] { get { if (level < 1) { throw new ArgumentOutOfRangeException("level", level, "The level of recursion must be greater than 0"); } return this.GetXLevel(level); } } private Func<T, T> GetXLevel(int level) { if (level < 1) { throw new ArgumentOutOfRangeException("level", level, "The level of recursion must be greater than 0"); } if (level == 1) return _functionToRecurse; return _GetXLevel(level - 1, _functionToRecurse); } private Func<T, T> _GetXLevel(int level, Func<T, T> prevFunc) { if (level == 1) return y => prevFunc(_functionToRecurse(y)); return _GetXLevel(level - 1, y => prevFunc(_functionToRecurse(y))); } } Conclusion The great thing about this class, is that it can be used with any function with same input/output parameters. I strived to find an implementation that I found clean and useful, and I finally settled on this. If you have feedback – good or bad, I would love to hear it!

Read the article
Different Not Automatically Implies Better

- by Alois Kraus

Originally posted on: http://geekswithblogs.net/akraus1/archive/2013/11/05/154556.aspxRecently I was digging deeper why some WCF hosted workflow application did consume quite a lot of memory although it did basically only load a xaml workflow. The first tool of choice is Process Explorer or even better Process Hacker (has more options and the best feature copy&paste does work). The three most important numbers of a process with regards to memory are Working Set, Private Working Set and Private Bytes. Working set is the currently consumed physical memory (parts can be shared between processes e.g. loaded dlls which are read only) Private Working Set is the physical memory needed by this process which is not shareable Private Bytes is the number of non shareable which is only visible in the current process (e.g. all new, malloc, VirtualAlloc calls do create private bytes) When you have a bigger workflow it can consume under 64 bit easily 500MB for a 1-2 MB xaml file. This does not look very scalable. Under 64 bit the issue is excessive private bytes consumption and not the managed heap. The picture is quite different for 32 bit which looks a bit strange but it seems that the hosted VB compiler is a lot less memory hungry under 32 bit. I did try to repro the issue with a medium sized xaml file (400KB) which does contain 1000 variables and 1000 if which can be represented by C# code like this: string Var1; string Var2; ... string Var1000; if (!String.IsNullOrEmpty(Var1) ) { Console.WriteLine(“Var1”); } if (!String.IsNullOrEmpty(Var2) ) { Console.WriteLine(“Var2”); } .... Since WF is based on VB.NET expressions you are bound to the hosted VB.NET compiler which does result in (x64) 140 MB of private bytes which is ca. 140 KB for each if clause which is quite a lot if you think about the actually present functionality. But there is hope. .NET 4.5 does allow now C# expressions for WF which is a major step forward for all C# lovers. I did create some simple patcher to “cross compile” my xaml to C# expressions. Lets look at the result: C# Expressions VB Expressions x86 x86 On my home machine I have only 32 bit which gives you quite exactly half of the memory consumption under 64 bit. C# expressions are 10 times more memory hungry than VB.NET expressions! I wanted to do more with less memory but instead it did consume a magnitude more memory. That is surprising to say the least. The workflow does initialize in about the same time under x64 and x86 where the VB code does it in 2s whereas the C# version needs 18s. Also nearly ten times slower. That is a too high price to pay for any bigger sized xaml workflow to convert from VB.NET to C# expressions. If I do reduce the number of expressions to 500 then it does need 400MB which is about half of the memory. It seems that the cost per if does rise linear with the number of total expressions in a xaml workflow. Expression Language Cost per IF Startup Time C# 1000 Ifs x64 1,5 MB 18s C# 500 Ifs x64 750 KB 9s VB 1000 Ifs x64 140 KB 2s VB 500 Ifs x64 70 KB 1s Now we can directly compare two MS implementations. It is clear that the VB.NET compiler uses the same underlying structure but it has much higher offset compared to the highly inefficient C# expression compiler. I have filed a connect bug here with a harsher wording about recent advances in memory consumption. The funniest thing is that one MS employee did give an Azure AppFabric demo around early 2011 which was so slow that he needed to investigate with xperf. He was after startup time and the call stacks with regards to VB.NET expression compilation were remarkably similar. In fact I only found this post by googling for parts of my call stacks. … “C# expressions will be coming soon to WF, and that will have different performance characteristics than VB” … What did he know Jan 2011 what I did no know until today? ;-). He knew that C# expression will come but that they will not be automatically have better footprint. It is about time to fix that. In its current state C# expressions are not usable for bigger workflows. That also explains the headline for today. You can cheat startup time by prestarting workflows so that the demo looks nice and snappy but it does hurt scalability a lot since you do need much more memory than necessary. I did find the stacks by enabling virtual allocation tracking within XPerf which is still the best tool out there. But first you need to look at your process to check where the memory is hiding: For the C# Expression compiler you do not need xperf. You can directly dump the managed heap and check with a profiler of your choice. But if the allocations are happening on the Private Data ( VirtualAlloc ) you can find it with xperf. There is a nice video on channel 9 explaining VirtualAlloc tracking it in greater detail. If your data allocations are on the Heap it does mean that the C/C++ runtime did create a heap for you where all malloc, new calls do allocate from it. You can enable heap tracing with xperf and full call stack support as well which is doable via xperf like it is shown also on channel 9. Or you can use WPRUI directly: To make “Heap Usage” it work you need to set for your executable the tracing flags (before you start it). For example devenv.exe HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\devenv.exe DWORD TracingFlags 1 Do not forget to disable it after you did complete profiling the process or it will impact the startup time quite a lot. You can with xperf attach directly to a running process and collect heap allocation information from a gone wild process. Very handy if you need to find out what a process was doing which has arrived in a funny state. “VirtualAlloc usage” does work without explicitly enabling stuff for a specific process and is always on machine wide. I had issues on my Windows 7 machines with the call stack collection and the latest Windows 8.1 Performance Toolkit. I was told that WPA from Windows 8.0 should work fine but I do not want to downgrade.

Read the article
We've completed the first iteration

- by CliveT

There are a lot of features in C# that are implemented by the compiler and not by the underlying platform. One such feature is a lambda expression. Since local variables cannot be accessed once the current method activation finishes, the compiler has to go out of its way to generate a new class which acts as a home for any variable whose lifetime needs to be extended past the activation of the procedure. Take the following example: Random generator = new Random(); Func func = () = generator.Next(10); In this case, the compiler generates a new class called c_DisplayClass1 which is marked with the CompilerGenerated attribute. [CompilerGenerated] private sealed class c__DisplayClass1 { // Fields public Random generator; // Methods public int b__0() { return this.generator.Next(10); } } Two quick comments on this: (i) A display was the means that compilers for languages like Algol recorded the various lexical contours of the nested procedure activations on the stack. I imagine that this is what has led to the name. (ii) It is a shame that the same attribute is used to mark all compiler generated classes as it makes it hard to figure out what they are being used for. Indeed, you could imagine optimisations that the runtime could perform if it knew that classes corresponded to certain high level concepts. We can see that the local variable generator has been turned into a field in the class, and the body of the lambda expression has been turned into a method of the new class. The code that builds the Func object simply constructs an instance of this class and initialises the fields to their initial values. c__DisplayClass1 class2 = new c__DisplayClass1(); class2.generator = new Random(); Func func = new Func(class2.b__0); Reflector already contains code to spot this pattern of code and reproduce the form containing the lambda expression, so this is example is correctly decompiled. The use of compiler generated code is even more spectacular in the case of iterators. C# introduced the idea of a method that could automatically store its state between calls, so that it can pick up where it left off. The code can express the logical flow with yield return and yield break denoting places where the method should return a particular value and be prepared to resume. { yield return 1; yield return 2; yield return 3; } Of course, there was already a .NET pattern for expressing the idea of returning a sequence of values with the computation proceeding lazily (in the sense that the work for the next value is executed on demand). This is expressed by the IEnumerable interface with its Current property for fetching the current value and the MoveNext method for forcing the computation of the next value. The sequence is terminated when this method returns false. The C# compiler links these two ideas together so that an IEnumerator returning method using the yield keyword causes the compiler to produce the implementation of an Iterator. Take the following piece of code. IEnumerable GetItems() { yield return 1; yield return 2; yield return 3; } The compiler implements this by defining a new class that implements a state machine. This has an integer state that records which yield point we should go to if we are resumed. It also has a field that records the Current value of the enumerator and a field for recording the thread. This latter value is used for optimising the creation of iterator instances. [CompilerGenerated] private sealed class d__0 : IEnumerable, IEnumerable, IEnumerator, IEnumerator, IDisposable { // Fields private int 1__state; private int 2__current; public Program 4__this; private int l__initialThreadId; The body gets converted into the code to construct and initialize this new class. private IEnumerable GetItems() { d__0 d__ = new d__0(-2); d__.4__this = this; return d__; } When the class is constructed we set the state, which was passed through as -2 and the current thread. public d__0(int 1__state) { this.1__state = 1__state; this.l__initialThreadId = Thread.CurrentThread.ManagedThreadId; } The state needs to be set to 0 to represent a valid enumerator and this is done in the GetEnumerator method which optimises for the usual case where the returned enumerator is only used once. IEnumerator IEnumerable.GetEnumerator() { if ((Thread.CurrentThread.ManagedThreadId == this.l__initialThreadId) && (this.1__state == -2)) { this.1__state = 0; return this; } The state machine itself is implemented inside the MoveNext method. private bool MoveNext() { switch (this.1__state) { case 0: this.1__state = -1; this.2__current = 1; this.1__state = 1; return true; case 1: this.1__state = -1; this.2__current = 2; this.1__state = 2; return true; case 2: this.1__state = -1; this.2__current = 3; this.1__state = 3; return true; case 3: this.1__state = -1; break; } return false; } At each stage, the current value of the state is used to determine how far we got, and then we generate the next value which we return after recording the next state. Finally we return false from the MoveNext to signify the end of the sequence. Of course, that example was really simple. The original method body didn't have any local variables. Any local variables need to live between the calls to MoveNext and so they need to be transformed into fields in much the same way that we did in the case of the lambda expression. More complicated MoveNext methods are required to deal with resources that need to be disposed when the iterator finishes, and sometimes the compiler uses a temporary variable to hold the return value. Why all of this explanation? We've implemented the de-compilation of iterators in the current EAP version of Reflector (7). This contrasts with previous version where all you could do was look at the MoveNext method and try to figure out the control flow. There's a fair amount of things we have to do. We have to spot the use of a CompilerGenerated class which implements the Enumerator pattern. We need to go to the class and figure out the fields corresponding to the local variables. We then need to go to the MoveNext method and try to break it into the various possible states and spot the state transitions. We can then take these pieces and put them back together into an object model that uses yield return to show the transition points. After that Reflector can carry on optimising using its usual optimisations. The pattern matching is currently a little too sensitive to changes in the code generation, and we only do a limited analysis of the MoveNext method to determine use of the compiler generated fields. In some ways, it is a pity that iterators are compiled away and there is no metadata that reflects the original intent. Without it, we are always going to dependent on our knowledge of the compiler's implementation. For example, we have noticed that the Async CTP changes the way that iterators are code generated, so we'll have to do some more work to support that. However, with that warning in place, we seem to do a reasonable job of decompiling the iterators that are built into the framework. Hopefully, the EAP will give us a chance to find examples where we don't spot the pattern correctly or regenerate the wrong code, and we can improve things. Please give it a go, and report any problems.

Read the article
Azure Task Scheduling Options

- by charlie.mott

Currently, the Azure PaaS does not offer a distributed\resilient task scheduling service. If you do want to host a task scheduling product\solution off-premise (and ideally use Azure), what are your options? PaaS Option 1: Worker Roles Use a worker role to schedule and execute actions at specific time periods. There are a few frameworks available to assist with this: http://azuretoolkit.codeplex.com https://github.com/Lokad/lokad-cloud/wiki/TaskScheduler http://blog.smarx.com/posts/building-a-task-scheduler-in-windows-azure - This addresses a slightly different set of requirements. It’s a more dynamic approach for queuing up tasks, but not repeatable tasks (e.g. daily). I found the Azure Toolkit option the most simple to implement. Step 1 : Create a domain entity implementing IJob for each job to schedule. In this sample, I asynchronously call a WCF service method. 1: namespace Acme.WorkerRole.Jobs 2: { 3: using AzureToolkit; 4: using ScheduledTasksService; 5: 6: public class UploadEmployeesJob : IJob 7: { 8: public void Run() 9: { 10: // Call Tasks Service 11: var client = new ScheduledTasksServiceClient("BasicHttpBinding_IScheduledTasksService"); 12: client.UploadEmployees(); 13: client.Close(); 14: } 15: } 16: } Step 2 : In the worker role run method, add the jobs to the toolkit engine. 1: namespace Acme.WorkerRole 2: { 3: using AzureToolkit.Engine; 4: using Jobs; 5: 6: public class WorkerRole : WorkerRoleEntryPoint 7: { 8: public override void Run() 9: { 10: var engine = new CloudEngine(); 11: 12: // Add Scheduled Jobs (using CronJob syntax - see http://www.adminschoice.com/crontab-quick-reference). 13: 14: // 1. Upload Employee job - 8.00 PM every weekday (Mon-Fri) 15: engine.WithJobScheduler().ScheduleJob<UploadEmployeesJob>(c => { c.CronSchedule = "0 20 * * 1-5"; }); 16: // 2. Purge Data job - 10 AM every Saturday 17: engine.WithJobScheduler().ScheduleJob<PurgeDataJob>(c => { c.CronSchedule = "0 10 * * 6"; }); 18: // 3. Process Exceptions job - Every 5 minutes 19: engine.WithJobScheduler().ScheduleJob<ProcessExceptionsJob>(c => { c.CronSchedule = "*/5 * * * *"; }); 20: 21: engine.Run(); 22: base.Run(); 23: } 24: } 25: } Pros Cons Azure Toolkit option is simple to implement. For the AzureToolkit option, you are limited to a single worker role. Otherwise, the jobs will be executed multiple times, once for each worker role instance. Paying for a continuously running worker role, even if it just processes a single job once a week. If you only have a few scheduled tasks to run calling asynchronous services hosted in different web roles, an extra small worker role likely to be sufficient. However, for an extra small worker role this still costs $14.40/month (03/09/2012). Option 2: Use Scheduled Task on Azure Web Role calling a console app Setup a Windows Scheduled Task on the Azure Web Role. This calls a console application that calls the WCF service methods that run the task actions. This design is described here: http://www.ronaldwidha.net/2011/02/23/cron-job-on-azure-using-scheduled-task-on-a-web-role-to-replace-azure-worker-role-for-background-job/ http://www.voiceoftech.com/swhitley/index.php/2011/07/windows-azure-task-scheduler/ http://devlicio.us/blogs/vinull/archive/2011/10/23/moving-to-azure-worker-roles-for-nothing-and-tasks-for-free.aspx Pros Cons Fairly easy to implement. Supportability - I RDC’ed onto the Azure server and stopped the scheduled task. I then rebooted the machine and the task was re-started. I also tried deleting the task and rebooting, the same thing occurred. The only way to permanently guarantee that a task is disabled is to do a fresh deployment. I think this is a major supportability concern. Saleability - multiple instances would trigger multiple tasks. You can only have one instance for the scheduled task web role. The guidance implements setup of the scheduled task as part of a web role instance. But if you have more than one instance in a web role, the task will be triggered multiple times for each scheduled action (once per machine). Workaround: If we wanted to use scheduled tasks for another client with a saleable WCF service, then we could include the console & tasks scripts in a separate web role (e.g. a empty WCF service with no real purpose to it). SaaS Option 3: Azure Marketplace I thought that someone might be offering this type of service via the Azure marketplace. At the point of writing this blog post, I did not find anyone doing so. https://datamarket.azure.com/ Pros Cons Nobody currently offers this on the Azure Marketplace. Option 4: Online Job Scheduling Service Provider There are plenty of online providers that offer this type of service on a pay-as-you-go approach. Some of these are free for small usage. Many of these providers are listed here: http://en.wikipedia.org/wiki/Webcron Pros Cons No bespoke development for scheduler. Reliance on third party. IaaS Option 5: Setup Scheduling Software on Azure IaaS VM’s One of job scheduling software offerings could be installed and configured on Azure VM’s. A list of software options is listed here: http://en.wikipedia.org/wiki/List_of_job_scheduler_software Pros Cons Enterprise distributed\resilient task scheduling service VM Setup and maintenance Software Licence Costs Option 6: VM Gallery A the time of writing this blog post, I did not spot a VM in the gallery that included pre-installation of any of the above software options. Pros Cons No current VM template. Summary For my current project that had a small handful of tasks to schedule with a limited project budget I chose option 1 (a worker role using the Azure Toolkit to schedule tasks). If I was building an enterprise scale solution for the future, options 4 and 5 are currently worthy of consideration. Hopefully, Microsoft will include tasks scheduling in the future as part of their PaaS offerings.

Read the article
Restructuring a large Chrome Extension/WebApp

- by A.M.K

I have a very complex Chrome Extension that has gotten too large to maintain in its current format. I'd like to restructure it, but I'm 15 and this is the first webapp or extension of it's type I've built so I have no idea how to do it. TL;DR: I have a large/complex webapp I'd like to restructure and I don't know how to do it. Should I follow my current restructure plan (below)? Does that sound like a good starting point, or is there a different approach that I'm missing? Should I not do any of the things I listed? While it isn't relevant to the question, the actual code is on Github and the extension is on the webstore. The basic structure is as follows: index.html <html> <head> <link href="css/style.css" rel="stylesheet" />  <link href="css/widgets.css" rel="stylesheet" />  </head> <body class="unloaded">  <div class="tab-container" tabindex="-1">  </div>  <template id="template.toolbar">  </template>   <script type="text/javascript" src="js/plugins.js"></script>  <script type="text/javascript" src="js/widgets.js"></script>  <script type="text/javascript" src="js/script.js"></script> </body> </html> widgets.js (initLog || (window.initLog = [])).push([new Date().getTime(), "A log is kept during page load so performance can be analyzed and errors pinpointed"]); // Widgets are stored in an object and extended (with jQuery, but I'll probably switch to underscore if using Backbone) as necessary var Widgets = { 1: { // Widget ID, this is set here so widgets can be retreived by ID id: 1, // Widget ID again, this is used after the widget object is duplicated and detached size: 3, // Default size, medium in this case order: 1, // Order shown in "store" name: "Weather", // Widget name interval: 300000, // Refresh interval nicename: "weather", // HTML and JS safe widget name sizes: ["tiny", "small", "medium"], // Available widget sizes desc: "Short widget description", settings: [ { // Widget setting specifications stored as an array of objects. These are used to dynamically generate widget setting popups. type: "list", nicename: "location", label: "Location(s)", placeholder: "Enter a location and press Enter" } ], config: { // Widget settings as stored in the tabs object (see script.js for storage information) size: "medium", location: ["San Francisco, CA"] }, data: {}, // Cached widget data stored locally, this lets it work offline customFunc: function(cb) {}, // Widgets can optionally define custom functions in any part of their object refresh: function() {}, // This fetches data from the web and caches it locally in data, then calls render. It gets called after the page is loaded for faster loads render: function() {} // This renders the widget only using information from data, it's called on page load. } }; script.js (initLog || (window.initLog = [])).push([new Date().getTime(), "These are also at the end of every file"]); // Plugins, extends and globals go here. i.e. Number.prototype.pad = .... var iChrome = function(refresh) { // The main iChrome init, called with refresh when refreshing to not re-run libs iChrome.Status.log("Starting page generation"); // From now on iChrome.Status.log is defined, it's used in place of the initLog iChrome.CSS(); // Dynamically generate CSS based on settings iChrome.Tabs(); // This takes the tabs stored in the storage (see fetching below) and renders all columns and widgets as necessary iChrome.Status.log("Tabs rendered"); // These will be omitted further along in this excerpt, but they're used everywhere // Checks for justInstalled => show getting started are run here /* The main init runs the bare minimum required to display the page, this sets all non-visible or instantly need things (such as widget dragging) on a timeout */ iChrome.deferredTimeout = setTimeout(function() { iChrome.deferred(refresh); // Pass refresh along, see above }, 200); }; iChrome.deferred = function(refresh) {}; // This calls modules one after the next in the appropriate order to finish rendering the page iChrome.Search = function() {}; // Modules have a base init function and are camel-cased and capitalized iChrome.Search.submit = function(val) {}; // Methods within modules are camel-cased and not capitalized /* Extension storage is async and fetched at the beginning of plugins.js, it's then stored in a variable that iChrome.Storage processes. The fetcher checks to see if processStorage is defined, if it is it gets called, otherwise settings are left in iChromeConfig */ var processStorage = function() { iChrome.Storage(function() { iChrome.Templates(); // Templates are read from their elements and held in a cache iChrome(); // Init is called }); }; if (typeof iChromeConfig == "object") { processStorage(); } Objectives of the restructure Memory usage: Chrome apparently has a memory leak in extensions, they're trying to fix it but memory still keeps on getting increased every time the page is loaded. The app also uses a lot on its own. Code readability: At this point I can't follow what's being called in the code. While rewriting the code I plan on properly commenting everything. Module interdependence: Right now modules call each other a lot, AFAIK that's not good at all since any change you make to one module could affect countless others. Fault tolerance: There's very little fault tolerance or error handling right now. If a widget is causing the rest of the page to stop rendering the user should at least be able to remove it. Speed is currently not an issue and I'd like to keep it that way. How I think I should do it The restructure should be done using Backbone.js and events that call modules (i.e. on storage.loaded = init). Modules should each go in their own file, I'm thinking there should be a set of core files that all modules can rely on and call directly and everything else should be event based. Widget structure should be kept largely the same, but maybe they should also be split into their own files. AFAIK you can't load all templates in a folder, therefore they need to stay inline. Grunt should be used to merge all modules, plugins and widgets into one file. Templates should also all be precompiled. Question: Should I follow my current restructure plan? Does that sound like a good starting point, or is there a different approach that I'm missing? Should I not do any of the things I listed? Do applications written with Backbone tend to be more intensive (memory and speed) than ones written in Vanilla JS? Also, can I expect to improve this with a proper restructure or is my current code about as good as can be expected?

Read the article
The Incremental Architect’s Napkin - #5 - Design functions for extensibility and readability

- by Ralf Westphal

Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/24/the-incremental-architectrsquos-napkin---5---design-functions-for.aspx The functionality of programs is entered via Entry Points. So what we´re talking about when designing software is a bunch of functions handling the requests represented by and flowing in through those Entry Points. Designing software thus consists of at least three phases: Analyzing the requirements to find the Entry Points and their signatures Designing the functionality to be executed when those Entry Points get triggered Implementing the functionality according to the design aka coding I presume, you´re familiar with phase 1 in some way. And I guess you´re proficient in implementing functionality in some programming language. But in my experience developers in general are not experienced in going through an explicit phase 2. “Designing functionality? What´s that supposed to mean?” you might already have thought. Here´s my definition: To design functionality (or functional design for short) means thinking about… well, functions. You find a solution for what´s supposed to happen when an Entry Point gets triggered in terms of functions. A conceptual solution that is, because those functions only exist in your head (or on paper) during this phase. But you may have guess that, because it´s “design” not “coding”. And here is, what functional design is not: It´s not about logic. Logic is expressions (e.g. +, -, && etc.) and control statements (e.g. if, switch, for, while etc.). Also I consider calling external APIs as logic. It´s equally basic. It´s what code needs to do in order to deliver some functionality or quality. Logic is what´s doing that needs to be done by software. Transformations are either done through expressions or API-calls. And then there is alternative control flow depending on the result of some expression. Basically it´s just jumps in Assembler, sometimes to go forward (if, switch), sometimes to go backward (for, while, do). But calling your own function is not logic. It´s not necessary to produce any outcome. Functionality is not enhanced by adding functions (subroutine calls) to your code. Nor is quality increased by adding functions. No performance gain, no higher scalability etc. through functions. Functions are not relevant to functionality. Strange, isn´t it. What they are important for is security of investment. By introducing functions into our code we can become more productive (re-use) and can increase evolvability (higher unterstandability, easier to keep code consistent). That´s no small feat, however. Evolvable code can hardly be overestimated. That´s why to me functional design is so important. It´s at the core of software development. To sum this up: Functional design is on a level of abstraction above (!) logical design or algorithmic design. Functional design is only done until you get to a point where each function is so simple you are very confident you can easily code it. Functional design an logical design (which mostly is coding, but can also be done using pseudo code or flow charts) are complementary. Software needs both. If you start coding right away you end up in a tangled mess very quickly. Then you need back out through refactoring. Functional design on the other hand is bloodless without actual code. It´s just a theory with no experiments to prove it. But how to do functional design? An example of functional design Let´s assume a program to de-duplicate strings. The user enters a number of strings separated by commas, e.g. a, b, a, c, d, b, e, c, a. And the program is supposed to clear this list of all doubles, e.g. a, b, c, d, e. There is only one Entry Point to this program: the user triggers the de-duplication by starting the program with the string list on the command line C:\>deduplicate "a, b, a, c, d, b, e, c, a" a, b, c, d, e …or by clicking on a GUI button. This leads to the Entry Point function to get called. It´s the program´s main function in case of the batch version or a button click event handler in the GUI version. That´s the physical Entry Point so to speak. It´s inevitable. What then happens is a three step process: Transform the input data from the user into a request. Call the request handler. Transform the output of the request handler into a tangible result for the user. Or to phrase it a bit more generally: Accept input. Transform input into output. Present output. This does not mean any of these steps requires a lot of effort. Maybe it´s just one line of code to accomplish it. Nevertheless it´s a distinct step in doing the processing behind an Entry Point. Call it an aspect or a responsibility - and you will realize it most likely deserves a function of its own to satisfy the Single Responsibility Principle (SRP). Interestingly the above list of steps is already functional design. There is no logic, but nevertheless the solution is described - albeit on a higher level of abstraction than you might have done yourself. But it´s still on a meta-level. The application to the domain at hand is easy, though: Accept string list from command line De-duplicate Present de-duplicated strings on standard output And this concrete list of processing steps can easily be transformed into code:static void Main(string[] args) { var input = Accept_string_list(args); var output = Deduplicate(input); Present_deduplicated_string_list(output); } Instead of a big problem there are three much smaller problems now. If you think each of those is trivial to implement, then go for it. You can stop the functional design at this point. But maybe, just maybe, you´re not so sure how to go about with the de-duplication for example. Then just implement what´s easy right now, e.g.private static string Accept_string_list(string[] args) { return args[0]; } private static void Present_deduplicated_string_list( string[] output) { var line = string.Join(", ", output); Console.WriteLine(line); } Accept_string_list() contains logic in the form of an API-call. Present_deduplicated_string_list() contains logic in the form of an expression and an API-call. And then repeat the functional design for the remaining processing step. What´s left is the domain logic: de-duplicating a list of strings. How should that be done? Without any logic at our disposal during functional design you´re left with just functions. So which functions could make up the de-duplication? Here´s a suggestion: De-duplicate Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Processing step 2 obviously was the core of the solution. That´s where real creativity was needed. That´s the core of the domain. But now after this refinement the implementation of each step is easy again:private static string[] Parse_string_list(string input) { return input.Split(',') .Select(s => s.Trim()) .ToArray(); } private static Dictionary<string,object> Compile_unique_strings(string[] strings) { return strings.Aggregate( new Dictionary<string, object>(), (agg, s) => { agg[s] = null; return agg; }); } private static string[] Serialize_unique_strings( Dictionary<string,object> dict) { return dict.Keys.ToArray(); } With these three additional functions Main() now looks like this:static void Main(string[] args) { var input = Accept_string_list(args); var strings = Parse_string_list(input); var dict = Compile_unique_strings(strings); var output = Serialize_unique_strings(dict); Present_deduplicated_string_list(output); } I think that´s very understandable code: just read it from top to bottom and you know how the solution to the problem works. It´s a mirror image of the initial design: Accept string list from command line Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Present de-duplicated strings on standard output You can even re-generate the design by just looking at the code. Code and functional design thus are always in sync - if you follow some simple rules. But about that later. And as a bonus: all the functions making up the process are small - which means easy to understand, too. So much for an initial concrete example. Now it´s time for some theory. Because there is method to this madness ;-) The above has only scratched the surface. Introducing Flow Design Functional design starts with a given function, the Entry Point. Its goal is to describe the behavior of the program when the Entry Point is triggered using a process, not an algorithm. An algorithm consists of logic, a process on the other hand consists just of steps or stages. Each processing step transforms input into output or a side effect. Also it might access resources, e.g. a printer, a database, or just memory. Processing steps thus can rely on state of some sort. This is different from Functional Programming, where functions are supposed to not be stateful and not cause side effects.[1] In its simplest form a process can be written as a bullet point list of steps, e.g. Get data from user Output result to user Transform data Parse data Map result for output Such a compilation of steps - possibly on different levels of abstraction - often is the first artifact of functional design. It can be generated by a team in an initial design brainstorming. Next comes ordering the steps. What should happen first, what next etc.? Get data from user Parse data Transform data Map result for output Output result to user That´s great for a start into functional design. It´s better than starting to code right away on a given function using TDD. Please get me right: TDD is a valuable practice. But it can be unnecessarily hard if the scope of a functionn is too large. But how do you know beforehand without investing some thinking? And how to do this thinking in a systematic fashion? My recommendation: For any given function you´re supposed to implement first do a functional design. Then, once you´re confident you know the processing steps - which are pretty small - refine and code them using TDD. You´ll see that´s much, much easier - and leads to cleaner code right away. For more information on this approach I call “Informed TDD” read my book of the same title. Thinking before coding is smart. And writing down the solution as a bunch of functions possibly is the simplest thing you can do, I´d say. It´s more according to the KISS (Keep It Simple, Stupid) principle than returning constants or other trivial stuff TDD development often is started with. So far so good. A simple ordered list of processing steps will do to start with functional design. As shown in the above example such steps can easily be translated into functions. Moving from design to coding thus is simple. However, such a list does not scale. Processing is not always that simple to be captured in a list. And then the list is just text. Again. Like code. That means the design is lacking visuality. Textual representations need more parsing by your brain than visual representations. Plus they are limited in their “dimensionality”: text just has one dimension, it´s sequential. Alternatives and parallelism are hard to encode in text. In addition the functional design using numbered lists lacks data. It´s not visible what´s the input, output, and state of the processing steps. That´s why functional design should be done using a lightweight visual notation. No tool is necessary to draw such designs. Use pen and paper; a flipchart, a whiteboard, or even a napkin is sufficient. Visualizing processes The building block of the functional design notation is a functional unit. I mostly draw it like this: Something is done, it´s clear what goes in, it´s clear what comes out, and it´s clear what the processing step requires in terms of state or hardware. Whenever input flows into a functional unit it gets processed and output is produced and/or a side effect occurs. Flowing data is the driver of something happening. That´s why I call this approach to functional design Flow Design. It´s about data flow instead of control flow. Control flow like in algorithms is of no concern to functional design. Thinking about control flow simply is too low level. Once you start with control flow you easily get bogged down by tons of details. That´s what you want to avoid during design. Design is supposed to be quick, broad brush, abstract. It should give overview. But what about all the details? As Robert C. Martin rightly said: “Programming is abot detail”. Detail is a matter of code. Once you start coding the processing steps you designed you can worry about all the detail you want. Functional design does not eliminate all the nitty gritty. It just postpones tackling them. To me that´s also an example of the SRP. Function design has the responsibility to come up with a solution to a problem posed by a single function (Entry Point). And later coding has the responsibility to implement the solution down to the last detail (i.e. statement, API-call). TDD unfortunately mixes both responsibilities. It´s just coding - and thereby trying to find detailed implementations (green phase) plus getting the design right (refactoring). To me that´s one reason why TDD has failed to deliver on its promise for many developers. Using functional units as building blocks of functional design processes can be depicted very easily. Here´s the initial process for the example problem: For each processing step draw a functional unit and label it. Choose a verb or an “action phrase” as a label, not a noun. Functional design is about activities, not state or structure. Then make the output of an upstream step the input of a downstream step. Finally think about the data that should flow between the functional units. Write the data above the arrows connecting the functional units in the direction of the data flow. Enclose the data description in brackets. That way you can clearly see if all flows have already been specified. Empty brackets mean “no data is flowing”, but nevertheless a signal is sent. A name like “list” or “strings” in brackets describes the data content. Use lower case labels for that purpose. A name starting with an upper case letter like “String” or “Customer” on the other hand signifies a data type. If you like, you also can combine descriptions with data types by separating them with a colon, e.g. (list:string) or (strings:string[]). But these are just suggestions from my practice with Flow Design. You can do it differently, if you like. Just be sure to be consistent. Flows wired-up in this manner I call one-dimensional (1D). Each functional unit just has one input and/or one output. A functional unit without an output is possible. It´s like a black hole sucking up input without producing any output. Instead it produces side effects. A functional unit without an input, though, does make much sense. When should it start to work? What´s the trigger? That´s why in the above process even the first processing step has an input. If you like, view such 1D-flows as pipelines. Data is flowing through them from left to right. But as you can see, it´s not always the same data. It get´s transformed along its passage: (args) becomes a (list) which is turned into (strings). The Principle of Mutual Oblivion A very characteristic trait of flows put together from function units is: no functional units knows another one. They are all completely independent of each other. Functional units don´t know where their input is coming from (or even when it´s gonna arrive). They just specify a range of values they can process. And they promise a certain behavior upon input arriving. Also they don´t know where their output is going. They just produce it in their own time independent of other functional units. That means at least conceptually all functional units work in parallel. Functional units don´t know their “deployment context”. They now nothing about the overall flow they are place in. They are just consuming input from some upstream, and producing output for some downstream. That makes functional units very easy to test. At least as long as they don´t depend on state or resources. I call this the Principle of Mutual Oblivion (PoMO). Functional units are oblivious of others as well as an overall context/purpose. They are just parts of a whole focused on a single responsibility. How the whole is built, how a larger goal is achieved, is of no concern to the single functional units. By building software in such a manner, functional design interestingly follows nature. Nature´s building blocks for organisms also follow the PoMO. The cells forming your body do not know each other. Take a nerve cell “controlling” a muscle cell for example:[2] The nerve cell does not know anything about muscle cells, let alone the specific muscel cell it is “attached to”. Likewise the muscle cell does not know anything about nerve cells, let a lone a specific nerve cell “attached to” it. Saying “the nerve cell is controlling the muscle cell” thus only makes sense when viewing both from the outside. “Control” is a concept of the whole, not of its parts. Control is created by wiring-up parts in a certain way. Both cells are mutually oblivious. Both just follow a contract. One produces Acetylcholine (ACh) as output, the other consumes ACh as input. Where the ACh is going, where it´s coming from neither cell cares about. Million years of evolution have led to this kind of division of labor. And million years of evolution have produced organism designs (DNA) which lead to the production of these different cell types (and many others) and also to their co-location. The result: the overall behavior of an organism. How and why this happened in nature is a mystery. For our software, though, it´s clear: functional and quality requirements needs to be fulfilled. So we as developers have to become “intelligent designers” of “software cells” which we put together to form a “software organism” which responds in satisfying ways to triggers from it´s environment. My bet is: If nature gets complex organisms working by following the PoMO, who are we to not apply this recipe for success to our much simpler “machines”? So my rule is: Wherever there is functionality to be delivered, because there is a clear Entry Point into software, design the functionality like nature would do it. Build it from mutually oblivious functional units. That´s what Flow Design is about. In that way it´s even universal, I´d say. Its notation can also be applied to biology: Never mind labeling the functional units with nouns. That´s ok in Flow Design. You´ll do that occassionally for functional units on a higher level of abstraction or when their purpose is close to hardware. Getting a cockroach to roam your bedroom takes 1,000,000 nerve cells (neurons). Getting the de-duplication program to do its job just takes 5 “software cells” (functional units). Both, though, follow the same basic principle. Translating functional units into code Moving from functional design to code is no rocket science. In fact it´s straightforward. There are two simple rules: Translate an input port to a function. Translate an output port either to a return statement in that function or to a function pointer visible to that function. The simplest translation of a functional unit is a function. That´s what you saw in the above example. Functions are mutually oblivious. That why Functional Programming likes them so much. It makes them composable. Which is the reason, nature works according to the PoMO. Let´s be clear about one thing: There is no dependency injection in nature. For all of an organism´s complexity no DI container is used. Behavior is the result of smooth cooperation between mutually oblivious building blocks. Functions will often be the adequate translation for the functional units in your designs. But not always. Take for example the case, where a processing step should not always produce an output. Maybe the purpose is to filter input. Here the functional unit consumes words and produces words. But it does not pass along every word flowing in. Some words are swallowed. Think of a spell checker. It probably should not check acronyms for correctness. There are too many of them. Or words with no more than two letters. Such words are called “stop words”. In the above picture the optionality of the output is signified by the astrisk outside the brackets. It means: Any number of (word) data items can flow from the functional unit for each input data item. It might be none or one or even more. This I call a stream of data. Such behavior cannot be translated into a function where output is generated with return. Because a function always needs to return a value. So the output port is translated into a function pointer or continuation which gets passed to the subroutine when called:[3]void filter_stop_words( string word, Action<string> onNoStopWord) { if (...check if not a stop word...) onNoStopWord(word); } If you want to be nitpicky you might call such a function pointer parameter an injection. And technically you´re right. Conceptually, though, it´s not an injection. Because the subroutine is not functionally dependent on the continuation. Firstly continuations are procedures, i.e. subroutines without a return type. Remember: Flow Design is about unidirectional data flow. Secondly the name of the formal parameter is chosen in a way as to not assume anything about downstream processing steps. onNoStopWord describes a situation (or event) within the functional unit only. Translating output ports into function pointers helps keeping functional units mutually oblivious in cases where output is optional or produced asynchronically. Either pass the function pointer to the function upon call. Or make it global by putting it on the encompassing class. Then it´s called an event. In C# that´s even an explicit feature.class Filter { public void filter_stop_words( string word) { if (...check if not a stop word...) onNoStopWord(word); } public event Action<string> onNoStopWord; } When to use a continuation and when to use an event dependens on how a functional unit is used in flows and how it´s packed together with others into classes. You´ll see examples further down the Flow Design road. Another example of 1D functional design Let´s see Flow Design once more in action using the visual notation. How about the famous word wrap kata? Robert C. Martin has posted a much cited solution including an extensive reasoning behind his TDD approach. So maybe you want to compare it to Flow Design. The function signature given is:string WordWrap(string text, int maxLineLength) {...} That´s not an Entry Point since we don´t see an application with an environment and users. Nevertheless it´s a function which is supposed to provide a certain functionality. The text passed in has to be reformatted. The input is a single line of arbitrary length consisting of words separated by spaces. The output should consist of one or more lines of a maximum length specified. If a word is longer than a the maximum line length it can be split in multiple parts each fitting in a line. Flow Design Let´s start by brainstorming the process to accomplish the feat of reformatting the text. What´s needed? Words need to be assembled into lines Words need to be extracted from the input text The resulting lines need to be assembled into the output text Words too long to fit in a line need to be split Does sound about right? I guess so. And it shows a kind of priority. Long words are a special case. So maybe there is a hint for an incremental design here. First let´s tackle “average words” (words not longer than a line). Here´s the Flow Design for this increment: The the first three bullet points turned into functional units with explicit data added. As the signature requires a text is transformed into another text. See the input of the first functional unit and the output of the last functional unit. In between no text flows, but words and lines. That´s good to see because thereby the domain is clearly represented in the design. The requirements are talking about words and lines and here they are. But note the asterisk! It´s not outside the brackets but inside. That means it´s not a stream of words or lines, but lists or sequences. For each text a sequence of words is output. For each sequence of words a sequence of lines is produced. The asterisk is used to abstract from the concrete implementation. Like with streams. Whether the list of words gets implemented as an array or an IEnumerable is not important during design. It´s an implementation detail. Does any processing step require further refinement? I don´t think so. They all look pretty “atomic” to me. And if not… I can always backtrack and refine a process step using functional design later once I´ve gained more insight into a sub-problem. Implementation The implementation is straightforward as you can imagine. The processing steps can all be translated into functions. Each can be tested easily and separately. Each has a focused responsibility. And the process flow becomes just a sequence of function calls: Easy to understand. It clearly states how word wrapping works - on a high level of abstraction. And it´s easy to evolve as you´ll see. Flow Design - Increment 2 So far only texts consisting of “average words” are wrapped correctly. Words not fitting in a line will result in lines too long. Wrapping long words is a feature of the requested functionality. Whether it´s there or not makes a difference to the user. To quickly get feedback I decided to first implement a solution without this feature. But now it´s time to add it to deliver the full scope. Fortunately Flow Design automatically leads to code following the Open Closed Principle (OCP). It´s easy to extend it - instead of changing well tested code. How´s that possible? Flow Design allows for extension of functionality by inserting functional units into the flow. That way existing functional units need not be changed. The data flow arrow between functional units is a natural extension point. No need to resort to the Strategy Pattern. No need to think ahead where extions might need to be made in the future. I just “phase in” the remaining processing step: Since neither Extract words nor Reformat know of their environment neither needs to be touched due to the “detour”. The new processing step accepts the output of the existing upstream step and produces data compatible with the existing downstream step. Implementation - Increment 2 A trivial implementation checking the assumption if this works does not do anything to split long words. The input is just passed on: Note how clean WordWrap() stays. The solution is easy to understand. A developer looking at this code sometime in the future, when a new feature needs to be build in, quickly sees how long words are dealt with. Compare this to Robert C. Martin´s solution:[4] How does this solution handle long words? Long words are not even part of the domain language present in the code. At least I need considerable time to understand the approach. Admittedly the Flow Design solution with the full implementation of long word splitting is longer than Robert C. Martin´s. At least it seems. Because his solution does not cover all the “word wrap situations” the Flow Design solution handles. Some lines would need to be added to be on par, I guess. But even then… Is a difference in LOC that important as long as it´s in the same ball park? I value understandability and openness for extension higher than saving on the last line of code. Simplicity is not just less code, it´s also clarity in design. But don´t take my word for it. Try Flow Design on larger problems and compare for yourself. What´s the easier, more straightforward way to clean code? And keep in mind: You ain´t seen all yet ;-) There´s more to Flow Design than described in this chapter. In closing I hope I was able to give you a impression of functional design that makes you hungry for more. To me it´s an inevitable step in software development. Jumping from requirements to code does not scale. And it leads to dirty code all to quickly. Some thought should be invested first. Where there is a clear Entry Point visible, it´s functionality should be designed using data flows. Because with data flows abstraction is possible. For more background on why that´s necessary read my blog article here. For now let me point out to you - if you haven´t already noticed - that Flow Design is a general purpose declarative language. It´s “programming by intention” (Shalloway et al.). Just write down how you think the solution should work on a high level of abstraction. This breaks down a large problem in smaller problems. And by following the PoMO the solutions to those smaller problems are independent of each other. So they are easy to test. Or you could even think about getting them implemented in parallel by different team members. Flow Design not only increases evolvability, but also helps becoming more productive. All team members can participate in functional design. This goes beyon collective code ownership. We´re talking collective design/architecture ownership. Because with Flow Design there is a common visual language to talk about functional design - which is the foundation for all other design activities. PS: If you like what you read, consider getting my ebook “The Incremental Architekt´s Napkin”. It´s where I compile all the articles in this series for easier reading. I like the strictness of Function Programming - but I also find it quite hard to live by. And it certainly is not what millions of programmers are used to. Also to me it seems, the real world is full of state and side effects. So why give them such a bad image? That´s why functional design takes a more pragmatic approach. State and side effects are ok for processing steps - but be sure to follow the SRP. Don´t put too much of it into a single processing step. ? Image taken from www.physioweb.org ? My code samples are written in C#. C# sports typed function pointers called delegates. Action is such a function pointer type matching functions with signature void someName(T t). Other languages provide similar ways to work with functions as first class citizens - even Java now in version 8. I trust you find a way to map this detail of my translation to your favorite programming language. I know it works for Java, C++, Ruby, JavaScript, Python, Go. And if you´re using a Functional Programming language it´s of course a no brainer. ? Taken from his blog post “The Craftsman 62, The Dark Path”. ?

Read the article
How John Got 15x Improvement Without Really Trying

- by rchrd

The following article was published on a Sun Microsystems website a number of years ago by John Feo. It is still useful and worth preserving. So I'm republishing it here. How I Got 15x Improvement Without Really Trying John Feo, Sun Microsystems Taking ten "personal" program codes used in scientific and engineering research, the author was able to get from 2 to 15 times performance improvement easily by applying some simple general optimization techniques. Introduction Scientific research based on computer simulation depends on the simulation for advancement. The research can advance only as fast as the computational codes can execute. The codes' efficiency determines both the rate and quality of results. In the same amount of time, a faster program can generate more results and can carry out a more detailed simulation of physical phenomena than a slower program. Highly optimized programs help science advance quickly and insure that monies supporting scientific research are used as effectively as possible. Scientific computer codes divide into three broad categories: ISV, community, and personal. ISV codes are large, mature production codes developed and sold commercially. The codes improve slowly over time both in methods and capabilities, and they are well tuned for most vendor platforms. Since the codes are mature and complex, there are few opportunities to improve their performance solely through code optimization. Improvements of 10% to 15% are typical. Examples of ISV codes are DYNA3D, Gaussian, and Nastran. Community codes are non-commercial production codes used by a particular research field. Generally, they are developed and distributed by a single academic or research institution with assistance from the community. Most users just run the codes, but some develop new methods and extensions that feed back into the general release. The codes are available on most vendor platforms. Since these codes are younger than ISV codes, there are more opportunities to optimize the source code. Improvements of 50% are not unusual. Examples of community codes are AMBER, CHARM, BLAST, and FASTA. Personal codes are those written by single users or small research groups for their own use. These codes are not distributed, but may be passed from professor-to-student or student-to-student over several years. They form the primordial ocean of applications from which community and ISV codes emerge. Government research grants pay for the development of most personal codes. This paper reports on the nature and performance of this class of codes. Over the last year, I have looked at over two dozen personal codes from more than a dozen research institutions. The codes cover a variety of scientific fields, including astronomy, atmospheric sciences, bioinformatics, biology, chemistry, geology, and physics. The sources range from a few hundred lines to more than ten thousand lines, and are written in Fortran, Fortran 90, C, and C++. For the most part, the codes are modular, documented, and written in a clear, straightforward manner. They do not use complex language features, advanced data structures, programming tricks, or libraries. I had little trouble understanding what the codes did or how data structures were used. Most came with a makefile. Surprisingly, only one of the applications is parallel. All developers have access to parallel machines, so availability is not an issue. Several tried to parallelize their applications, but stopped after encountering difficulties. Lack of education and a perception that parallelism is difficult prevented most from trying. I parallelized several of the codes using OpenMP, and did not judge any of the codes as difficult to parallelize. Even more surprising than the lack of parallelism is the inefficiency of the codes. I was able to get large improvements in performance in a matter of a few days applying simple optimization techniques. Table 1 lists ten representative codes [names and affiliation are omitted to preserve anonymity]. Improvements on one processor range from 2x to 15.5x with a simple average of 4.75x. I did not use sophisticated performance tools or drill deep into the program's execution character as one would do when tuning ISV or community codes. Using only a profiler and source line timers, I identified inefficient sections of code and improved their performance by inspection. The changes were at a high level. I am sure there is another factor of 2 or 3 in each code, and more if the codes are parallelized. The study’s results show that personal scientific codes are running many times slower than they should and that the problem is pervasive. Computational scientists are not sloppy programmers; however, few are trained in the art of computer programming or code optimization. I found that most have a working knowledge of some programming language and standard software engineering practices; but they do not know, or think about, how to make their programs run faster. They simply do not know the standard techniques used to make codes run faster. In fact, they do not even perceive that such techniques exist. The case studies described in this paper show that applying simple, well known techniques can significantly increase the performance of personal codes. It is important that the scientific community and the Government agencies that support scientific research find ways to better educate academic scientific programmers. The inefficiency of their codes is so bad that it is retarding both the quality and progress of scientific research. # cacheperformance redundantoperations loopstructures performanceimprovement 1 x x 15.5 2 x 2.8 3 x x 2.5 4 x 2.1 5 x x 2.0 6 x 5.0 7 x 5.8 8 x 6.3 9 2.2 10 x x 3.3 Table 1 — Area of improvement and performance gains of 10 codes The remainder of the paper is organized as follows: sections 2, 3, and 4 discuss the three most common sources of inefficiencies in the codes studied. These are cache performance, redundant operations, and loop structures. Each section includes several examples. The last section summaries the work and suggests a possible solution to the issues raised. Optimizing cache performance Commodity microprocessor systems use caches to increase memory bandwidth and reduce memory latencies. Typical latencies from processor to L1, L2, local, and remote memory are 3, 10, 50, and 200 cycles, respectively. Moreover, bandwidth falls off dramatically as memory distances increase. Programs that do not use cache effectively run many times slower than programs that do. When optimizing for cache, the biggest performance gains are achieved by accessing data in cache order and reusing data to amortize the overhead of cache misses. Secondary considerations are prefetching, associativity, and replacement; however, the understanding and analysis required to optimize for the latter are probably beyond the capabilities of the non-expert. Much can be gained simply by accessing data in the correct order and maximizing data reuse. 6 out of the 10 codes studied here benefited from such high level optimizations. Array Accesses The most important cache optimization is the most basic: accessing Fortran array elements in column order and C array elements in row order. Four of the ten codes—1, 2, 4, and 10—got it wrong. Compilers will restructure nested loops to optimize cache performance, but may not do so if the loop structure is too complex, or the loop body includes conditionals, complex addressing, or function calls. In code 1, the compiler failed to invert a key loop because of complex addressing do I = 0, 1010, delta_x IM = I - delta_x IP = I + delta_x do J = 5, 995, delta_x JM = J - delta_x JP = J + delta_x T1 = CA1(IP, J) + CA1(I, JP) T2 = CA1(IM, J) + CA1(I, JM) S1 = T1 + T2 - 4 * CA1(I, J) CA(I, J) = CA1(I, J) + D * S1 end do end do In code 2, the culprit is conditionals do I = 1, N do J = 1, N If (IFLAG(I,J) .EQ. 0) then T1 = Value(I, J-1) T2 = Value(I-1, J) T3 = Value(I, J) T4 = Value(I+1, J) T5 = Value(I, J+1) Value(I,J) = 0.25 * (T1 + T2 + T5 + T4) Delta = ABS(T3 - Value(I,J)) If (Delta .GT. MaxDelta) MaxDelta = Delta endif enddo enddo I fixed both programs by inverting the loops by hand. Code 10 has three-dimensional arrays and triply nested loops. The structure of the most computationally intensive loops is too complex to invert automatically or by hand. The only practical solution is to transpose the arrays so that the dimension accessed by the innermost loop is in cache order. The arrays can be transposed at construction or prior to entering a computationally intensive section of code. The former requires all array references to be modified, while the latter is cost effective only if the cost of the transpose is amortized over many accesses. I used the second approach to optimize code 10. Code 5 has four-dimensional arrays and loops are nested four deep. For all of the reasons cited above the compiler is not able to restructure three key loops. Assume C arrays and let the four dimensions of the arrays be i, j, k, and l. In the original code, the index structure of the three loops is L1: for i L2: for i L3: for i for l for l for j for k for j for k for j for k for l So only L3 accesses array elements in cache order. L1 is a very complex loop—much too complex to invert. I brought the loop into cache alignment by transposing the second and fourth dimensions of the arrays. Since the code uses a macro to compute all array indexes, I effected the transpose at construction and changed the macro appropriately. The dimensions of the new arrays are now: i, l, k, and j. L3 is a simple loop and easily inverted. L2 has a loop-carried scalar dependence in k. By promoting the scalar name that carries the dependence to an array, I was able to invert the third and fourth subloops aligning the loop with cache. Code 5 is by far the most difficult of the four codes to optimize for array accesses; but the knowledge required to fix the problems is no more than that required for the other codes. I would judge this code at the limits of, but not beyond, the capabilities of appropriately trained computational scientists. Array Strides When a cache miss occurs, a line (64 bytes) rather than just one word is loaded into the cache. If data is accessed stride 1, than the cost of the miss is amortized over 8 words. Any stride other than one reduces the cost savings. Two of the ten codes studied suffered from non-unit strides. The codes represent two important classes of "strided" codes. Code 1 employs a multi-grid algorithm to reduce time to convergence. The grids are every tenth, fifth, second, and unit element. Since time to convergence is inversely proportional to the distance between elements, coarse grids converge quickly providing good starting values for finer grids. The better starting values further reduce the time to convergence. The downside is that grids of every nth element, n > 1, introduce non-unit strides into the computation. In the original code, much of the savings of the multi-grid algorithm were lost due to this problem. I eliminated the problem by compressing (copying) coarse grids into continuous memory, and rewriting the computation as a function of the compressed grid. On convergence, I copied the final values of the compressed grid back to the original grid. The savings gained from unit stride access of the compressed grid more than paid for the cost of copying. Using compressed grids, the loop from code 1 included in the previous section becomes do j = 1, GZ do i = 1, GZ T1 = CA(i+0, j-1) + CA(i-1, j+0) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) S1 = T1 + T4 - 4 * CA1(i+0, j+0) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 enddo enddo where CA and CA1 are compressed arrays of size GZ. Code 7 traverses a list of objects selecting objects for later processing. The labels of the selected objects are stored in an array. The selection step has unit stride, but the processing steps have irregular stride. A fix is to save the parameters of the selected objects in temporary arrays as they are selected, and pass the temporary arrays to the processing functions. The fix is practical if the same parameters are used in selection as in processing, or if processing comprises a series of distinct steps which use overlapping subsets of the parameters. Both conditions are true for code 7, so I achieved significant improvement by copying parameters to temporary arrays during selection. Data reuse In the previous sections, we optimized for spatial locality. It is also important to optimize for temporal locality. Once read, a datum should be used as much as possible before it is forced from cache. Loop fusion and loop unrolling are two techniques that increase temporal locality. Unfortunately, both techniques increase register pressure—as loop bodies become larger, the number of registers required to hold temporary values grows. Once register spilling occurs, any gains evaporate quickly. For multiprocessors with small register sets or small caches, the sweet spot can be very small. In the ten codes presented here, I found no opportunities for loop fusion and only two opportunities for loop unrolling (codes 1 and 3). In code 1, unrolling the outer and inner loop one iteration increases the number of result values computed by the loop body from 1 to 4, do J = 1, GZ-2, 2 do I = 1, GZ-2, 2 T1 = CA1(i+0, j-1) + CA1(i-1, j+0) T2 = CA1(i+1, j-1) + CA1(i+0, j+0) T3 = CA1(i+0, j+0) + CA1(i-1, j+1) T4 = CA1(i+1, j+0) + CA1(i+0, j+1) T5 = CA1(i+2, j+0) + CA1(i+1, j+1) T6 = CA1(i+1, j+1) + CA1(i+0, j+2) T7 = CA1(i+2, j+1) + CA1(i+1, j+2) S1 = T1 + T4 - 4 * CA1(i+0, j+0) S2 = T2 + T5 - 4 * CA1(i+1, j+0) S3 = T3 + T6 - 4 * CA1(i+0, j+1) S4 = T4 + T7 - 4 * CA1(i+1, j+1) CA(i+0, j+0) = CA1(i+0, j+0) + DD * S1 CA(i+1, j+0) = CA1(i+1, j+0) + DD * S2 CA(i+0, j+1) = CA1(i+0, j+1) + DD * S3 CA(i+1, j+1) = CA1(i+1, j+1) + DD * S4 enddo enddo The loop body executes 12 reads, whereas as the rolled loop shown in the previous section executes 20 reads to compute the same four values. In code 3, two loops are unrolled 8 times and one loop is unrolled 4 times. Here is the before for (k = 0; k < NK[u]; k++) { sum = 0.0; for (y = 0; y < NY; y++) { sum += W[y][u][k] * delta[y]; } backprop[i++]=sum; } and after code for (k = 0; k < KK - 8; k+=8) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (y = 0; y < NY; y++) { sum0 += W[y][0][k+0] * delta[y]; sum1 += W[y][0][k+1] * delta[y]; sum2 += W[y][0][k+2] * delta[y]; sum3 += W[y][0][k+3] * delta[y]; sum4 += W[y][0][k+4] * delta[y]; sum5 += W[y][0][k+5] * delta[y]; sum6 += W[y][0][k+6] * delta[y]; sum7 += W[y][0][k+7] * delta[y]; } backprop[k+0] = sum0; backprop[k+1] = sum1; backprop[k+2] = sum2; backprop[k+3] = sum3; backprop[k+4] = sum4; backprop[k+5] = sum5; backprop[k+6] = sum6; backprop[k+7] = sum7; } for one of the loops unrolled 8 times. Optimizing for temporal locality is the most difficult optimization considered in this paper. The concepts are not difficult, but the sweet spot is small. Identifying where the program can benefit from loop unrolling or loop fusion is not trivial. Moreover, it takes some effort to get it right. Still, educating scientific programmers about temporal locality and teaching them how to optimize for it will pay dividends. Reducing instruction count Execution time is a function of instruction count. Reduce the count and you usually reduce the time. The best solution is to use a more efficient algorithm; that is, an algorithm whose order of complexity is smaller, that converges quicker, or is more accurate. Optimizing source code without changing the algorithm yields smaller, but still significant, gains. This paper considers only the latter because the intent is to study how much better codes can run if written by programmers schooled in basic code optimization techniques. The ten codes studied benefited from three types of "instruction reducing" optimizations. The two most prevalent were hoisting invariant memory and data operations out of inner loops. The third was eliminating unnecessary data copying. The nature of these inefficiencies is language dependent. Memory operations The semantics of C make it difficult for the compiler to determine all the invariant memory operations in a loop. The problem is particularly acute for loops in functions since the compiler may not know the values of the function's parameters at every call site when compiling the function. Most compilers support pragmas to help resolve ambiguities; however, these pragmas are not comprehensive and there is no standard syntax. To guarantee that invariant memory operations are not executed repetitively, the user has little choice but to hoist the operations by hand. The problem is not as severe in Fortran programs because in the absence of equivalence statements, it is a violation of the language's semantics for two names to share memory. Codes 3 and 5 are C programs. In both cases, the compiler did not hoist all invariant memory operations from inner loops. Consider the following loop from code 3 for (y = 0; y < NY; y++) { i = 0; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += delta[y] * I1[i++]; } } } Since dW[y][u] can point to the same memory space as delta for one or more values of y and u, assignment to dW[y][u][k] may change the value of delta[y]. In reality, dW and delta do not overlap in memory, so I rewrote the loop as for (y = 0; y < NY; y++) { i = 0; Dy = delta[y]; for (u = 0; u < NU; u++) { for (k = 0; k < NK[u]; k++) { dW[y][u][k] += Dy * I1[i++]; } } } Failure to hoist invariant memory operations may be due to complex address calculations. If the compiler can not determine that the address calculation is invariant, then it can hoist neither the calculation nor the associated memory operations. As noted above, code 5 uses a macro to address four-dimensional arrays #define MAT4D(a,q,i,j,k) (double *)((a)->data + (q)*(a)->strides[0] + (i)*(a)->strides[3] + (j)*(a)->strides[2] + (k)*(a)->strides[1]) The macro is too complex for the compiler to understand and so, it does not identify any subexpressions as loop invariant. The simplest way to eliminate the address calculation from the innermost loop (over i) is to define a0 = MAT4D(a,q,0,j,k) before the loop and then replace all instances of *MAT4D(a,q,i,j,k) in the loop with a0[i] A similar problem appears in code 6, a Fortran program. The key loop in this program is do n1 = 1, nh nx1 = (n1 - 1) / nz + 1 nz1 = n1 - nz * (nx1 - 1) do n2 = 1, nh nx2 = (n2 - 1) / nz + 1 nz2 = n2 - nz * (nx2 - 1) ndx = nx2 - nx1 ndy = nz2 - nz1 gxx = grn(1,ndx,ndy) gyy = grn(2,ndx,ndy) gxy = grn(3,ndx,ndy) balance(n1,1) = balance(n1,1) + (force(n2,1) * gxx + force(n2,2) * gxy) * h1 balance(n1,2) = balance(n1,2) + (force(n2,1) * gxy + force(n2,2) * gyy)*h1 end do end do The programmer has written this loop well—there are no loop invariant operations with respect to n1 and n2. However, the loop resides within an iterative loop over time and the index calculations are independent with respect to time. Trading space for time, I precomputed the index values prior to the entering the time loop and stored the values in two arrays. I then replaced the index calculations with reads of the arrays. Data operations Ways to reduce data operations can appear in many forms. Implementing a more efficient algorithm produces the biggest gains. The closest I came to an algorithm change was in code 4. This code computes the inner product of K-vectors A(i) and B(j), 0 = i < N, 0 = j < M, for most values of i and j. Since the program computes most of the NM possible inner products, it is more efficient to compute all the inner products in one triply-nested loop rather than one at a time when needed. The savings accrue from reading A(i) once for all B(j) vectors and from loop unrolling. for (i = 0; i < N; i+=8) { for (j = 0; j < M; j++) { sum0 = 0.0; sum1 = 0.0; sum2 = 0.0; sum3 = 0.0; sum4 = 0.0; sum5 = 0.0; sum6 = 0.0; sum7 = 0.0; for (k = 0; k < K; k++) { sum0 += A[i+0][k] * B[j][k]; sum1 += A[i+1][k] * B[j][k]; sum2 += A[i+2][k] * B[j][k]; sum3 += A[i+3][k] * B[j][k]; sum4 += A[i+4][k] * B[j][k]; sum5 += A[i+5][k] * B[j][k]; sum6 += A[i+6][k] * B[j][k]; sum7 += A[i+7][k] * B[j][k]; } C[i+0][j] = sum0; C[i+1][j] = sum1; C[i+2][j] = sum2; C[i+3][j] = sum3; C[i+4][j] = sum4; C[i+5][j] = sum5; C[i+6][j] = sum6; C[i+7][j] = sum7; }} This change requires knowledge of a typical run; i.e., that most inner products are computed. The reasons for the change, however, derive from basic optimization concepts. It is the type of change easily made at development time by a knowledgeable programmer. In code 5, we have the data version of the index optimization in code 6. Here a very expensive computation is a function of the loop indices and so cannot be hoisted out of the loop; however, the computation is invariant with respect to an outer iterative loop over time. We can compute its value for each iteration of the computation loop prior to entering the time loop and save the values in an array. The increase in memory required to store the values is small in comparison to the large savings in time. The main loop in Code 8 is doubly nested. The inner loop includes a series of guarded computations; some are a function of the inner loop index but not the outer loop index while others are a function of the outer loop index but not the inner loop index for (j = 0; j < N; j++) { for (i = 0; i < M; i++) { r = i * hrmax; R = A[j]; temp = (PRM[3] == 0.0) ? 1.0 : pow(r, PRM[3]); high = temp * kcoeff * B[j] * PRM[2] * PRM[4]; low = high * PRM[6] * PRM[6] / (1.0 + pow(PRM[4] * PRM[6], 2.0)); kap = (R > PRM[6]) ? high * R * R / (1.0 + pow(PRM[4]*r, 2.0) : low * pow(R/PRM[6], PRM[5]); < rest of loop omitted > }} Note that the value of temp is invariant to j. Thus, we can hoist the computation for temp out of the loop and save its values in an array. for (i = 0; i < M; i++) { r = i * hrmax; TEMP[i] = pow(r, PRM[3]); } [N.B. – the case for PRM[3] = 0 is omitted and will be reintroduced later.] We now hoist out of the inner loop the computations invariant to i. Since the conditional guarding the value of kap is invariant to i, it behooves us to hoist the computation out of the inner loop, thereby executing the guard once rather than M times. The final version of the code is for (j = 0; j < N; j++) { R = rig[j] / 1000.; tmp1 = kcoeff * par[2] * beta[j] * par[4]; tmp2 = 1.0 + (par[4] * par[4] * par[6] * par[6]); tmp3 = 1.0 + (par[4] * par[4] * R * R); tmp4 = par[6] * par[6] / tmp2; tmp5 = R * R / tmp3; tmp6 = pow(R / par[6], par[5]); if ((par[3] == 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp5; } else if ((par[3] == 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * tmp4 * tmp6; } else if ((par[3] != 0.0) && (R > par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp5; } else if ((par[3] != 0.0) && (R <= par[6])) { for (i = 1; i <= imax1; i++) KAP[i] = tmp1 * TEMP[i] * tmp4 * tmp6; } for (i = 0; i < M; i++) { kap = KAP[i]; r = i * hrmax; < rest of loop omitted > } } Maybe not the prettiest piece of code, but certainly much more efficient than the original loop, Copy operations Several programs unnecessarily copy data from one data structure to another. This problem occurs in both Fortran and C programs, although it manifests itself differently in the two languages. Code 1 declares two arrays—one for old values and one for new values. At the end of each iteration, the array of new values is copied to the array of old values to reset the data structures for the next iteration. This problem occurs in Fortran programs not included in this study and in both Fortran 77 and Fortran 90 code. Introducing pointers to the arrays and swapping pointer values is an obvious way to eliminate the copying; but pointers is not a feature that many Fortran programmers know well or are comfortable using. An easy solution not involving pointers is to extend the dimension of the value array by 1 and use the last dimension to differentiate between arrays at different times. For example, if the data space is N x N, declare the array (N, N, 2). Then store the problem’s initial values in (_, _, 2) and define the scalar names new = 2 and old = 1. At the start of each iteration, swap old and new to reset the arrays. The old–new copy problem did not appear in any C program. In programs that had new and old values, the code swapped pointers to reset data structures. Where unnecessary coping did occur is in structure assignment and parameter passing. Structures in C are handled much like scalars. Assignment causes the data space of the right-hand name to be copied to the data space of the left-hand name. Similarly, when a structure is passed to a function, the data space of the actual parameter is copied to the data space of the formal parameter. If the structure is large and the assignment or function call is in an inner loop, then copying costs can grow quite large. While none of the ten programs considered here manifested this problem, it did occur in programs not included in the study. A simple fix is always to refer to structures via pointers. Optimizing loop structures Since scientific programs spend almost all their time in loops, efficient loops are the key to good performance. Conditionals, function calls, little instruction level parallelism, and large numbers of temporary values make it difficult for the compiler to generate tightly packed, highly efficient code. Conditionals and function calls introduce jumps that disrupt code flow. Users should eliminate or isolate conditionls to their own loops as much as possible. Often logical expressions can be substituted for if-then-else statements. For example, code 2 includes the following snippet MaxDelta = 0.0 do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) if (Delta > MaxDelta) MaxDelta = Delta enddo enddo if (MaxDelta .gt. 0.001) goto 200 Since the only use of MaxDelta is to control the jump to 200 and all that matters is whether or not it is greater than 0.001, I made MaxDelta a boolean and rewrote the snippet as MaxDelta = .false. do J = 1, N do I = 1, M < code omitted > Delta = abs(OldValue ? NewValue) MaxDelta = MaxDelta .or. (Delta .gt. 0.001) enddo enddo if (MaxDelta) goto 200 thereby, eliminating the conditional expression from the inner loop. A microprocessor can execute many instructions per instruction cycle. Typically, it can execute one or more memory, floating point, integer, and jump operations. To be executed simultaneously, the operations must be independent. Thick loops tend to have more instruction level parallelism than thin loops. Moreover, they reduce memory traffice by maximizing data reuse. Loop unrolling and loop fusion are two techniques to increase the size of loop bodies. Several of the codes studied benefitted from loop unrolling, but none benefitted from loop fusion. This observation is not too surpising since it is the general tendency of programmers to write thick loops. As loops become thicker, the number of temporary values grows, increasing register pressure. If registers spill, then memory traffic increases and code flow is disrupted. A thick loop with many temporary values may execute slower than an equivalent series of thin loops. The biggest gain will be achieved if the thick loop can be split into a series of independent loops eliminating the need to write and read temporary arrays. I found such an occasion in code 10 where I split the loop do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do into two disjoint loops do i = 1, n do j = 1, m A24(j,i)= S24(j,i) * T24(j,i) + S25(j,i) * U25(j,i) B24(j,i)= S24(j,i) * T25(j,i) + S25(j,i) * U24(j,i) A25(j,i)= S24(j,i) * C24(j,i) + S25(j,i) * V24(j,i) B25(j,i)= S24(j,i) * U25(j,i) + S25(j,i) * V25(j,i) end do end do do i = 1, n do j = 1, m C24(j,i)= S26(j,i) * T26(j,i) + S27(j,i) * U26(j,i) D24(j,i)= S26(j,i) * T27(j,i) + S27(j,i) * V26(j,i) C25(j,i)= S27(j,i) * S28(j,i) + S26(j,i) * U28(j,i) D25(j,i)= S27(j,i) * T28(j,i) + S26(j,i) * V28(j,i) end do end do Conclusions Over the course of the last year, I have had the opportunity to work with over two dozen academic scientific programmers at leading research universities. Their research interests span a broad range of scientific fields. Except for two programs that relied almost exclusively on library routines (matrix multiply and fast Fourier transform), I was able to improve significantly the single processor performance of all codes. Improvements range from 2x to 15.5x with a simple average of 4.75x. Changes to the source code were at a very high level. I did not use sophisticated techniques or programming tools to discover inefficiencies or effect the changes. Only one code was parallel despite the availability of parallel systems to all developers. Clearly, we have a problem—personal scientific research codes are highly inefficient and not running parallel. The developers are unaware of simple optimization techniques to make programs run faster. They lack education in the art of code optimization and parallel programming. I do not believe we can fix the problem by publishing additional books or training manuals. To date, the developers in questions have not studied the books or manual available, and are unlikely to do so in the future. Short courses are a possible solution, but I believe they are too concentrated to be much use. The general concepts can be taught in a three or four day course, but that is not enough time for students to practice what they learn and acquire the experience to apply and extend the concepts to their codes. Practice is the key to becoming proficient at optimization. I recommend that graduate students be required to take a semester length course in optimization and parallel programming. We would never give someone access to state-of-the-art scientific equipment costing hundreds of thousands of dollars without first requiring them to demonstrate that they know how to use the equipment. Yet the criterion for time on state-of-the-art supercomputers is at most an interesting project. Requestors are never asked to demonstrate that they know how to use the system, or can use the system effectively. A semester course would teach them the required skills. Government agencies that fund academic scientific research pay for most of the computer systems supporting scientific research as well as the development of most personal scientific codes. These agencies should require graduate schools to offer a course in optimization and parallel programming as a requirement for funding. About the Author John Feo received his Ph.D. in Computer Science from The University of Texas at Austin in 1986. After graduate school, Dr. Feo worked at Lawrence Livermore National Laboratory where he was the Group Leader of the Computer Research Group and principal investigator of the Sisal Language Project. In 1997, Dr. Feo joined Tera Computer Company where he was project manager for the MTA, and oversaw the programming and evaluation of the MTA at the San Diego Supercomputer Center. In 2000, Dr. Feo joined Sun Microsystems as an HPC application specialist. He works with university research groups to optimize and parallelize scientific codes. Dr. Feo has published over two dozen research articles in the areas of parallel parallel programming, parallel programming languages, and application performance.

Read the article
Are there alternatives to Sysinternals ADInsight?

- by mmcglynn

I had been using ADInsight from Sysinternals to trace Active Directory calls from my workstation, but the application has failed. Where previously the Active Directory events were traced and logged, now the window remains blank, whether the application is in capture mode or not. I have run as Administrator, rebooted, downloaded a new version; none of those actions has returned the program to a functional state. The Sysinternals forums don't offer much hope, since this tool is known to fail often. Is there tool that has similar functionality? Questions Does the tool fail when run from another workstation with your account? Yes Does it fail from your (and/or) another workstation using someone else's account? Yes Is there anything in the event log of your workstation? No

Read the article

Search Results

Search found 7097 results on 284 pages for 'calls'.

Page 103/284 | < Previous Page | 99 100 101 102 103 104 105 106 107 108 109 110 | Next Page >

- by Gregory Burd

- by Rick Strahl

- by plblum

- by mehfuzh

- by James Michael Hare

- by Michael Stahre

- by Chris Muir

- by Bertrand Le Roy

- by red(at)work

- by Timmy Kokke

- by James Michael Hare

- by cibrax

- by cibrax

- by Steve Michelotti

- by Bill Graziano

- by Justin Jones

- by ToStringTheory

- by Alois Kraus

- by CliveT

- by charlie.mott

- by A.M.K

- by Ralf Westphal

- by rchrd

- by mmcglynn

< Previous Page | 99 100 101 102 103 104 105 106 107 108 109 110 | Next Page >