Search Results

Search found 468 results on 19 pages for 'inconsistent'.

Page 19/19 | < Previous Page | 15 16 17 18 19

Managed Service Architectures Part I

- by barryoreilly

Instead of thinking about service oriented architecture, a concept that is continually defined, redefined, abused and mistreated, perhaps it is time to drop the acronym and consider what we actually need to get the job done. ‘Pure’ SOA involves the modeling of an organisation’s processes, the so called ‘Top Down’ approach, followed by the implementation of these processes as services. Another approach, more commonly seen in the wild, is the bottom up approach. This usually involves services that simply start popping up in the organization, and SOA in this case is often just an attempt to rein in these services. Such projects, although described as SOA projects for a variety of reasons, have clearly little relation to process driven architecture. Much has been written about these two approaches, with many deciding that a hybrid of both methods is needed to succeed with SOA. These hybrid methods are a sensible compromise, but one gets the feeling that there is too much focus on ‘Succeeding with SOA’. Organisations who focus too much on bottom up development, or who waste too much time and money on top down approaches that don’t produce results, are often recommended to attempt an ‘agile’(Erl) or ‘middle-out’ (Microsoft) approach in order to succeed with SOA. The problem with recommending this approach is that, in most cases, succeeding with SOA isn’t the aim of the project. If a project is started with the simple aim of ‘Succeeding with SOA’ then the reasons for the projects existence probably need to be questioned. There are a number of things we can be sure of: · An organisation will have a number of disparate IT systems · Some of these systems will have redundant data and functionality · Integration will give considerable ROI · Integration will already be under way. · Services will already exist in the organisation · These services will be inconsistent in their implementation and in their governance So there are three goals here: 1. Alignment between the business and IT 2. Integration of disparate systems 3. Management of services. 2 and 3 are going to happen, in fact they must happen if any degree of return is expected from the IT department. Ignoring 1 is considered a typical mistake in SOA implementations, as it ignores the business implications. However, the business implication of this approach is the money saved in more efficient IT processes. 2 and 3 are ongoing, and they will continue happening, even if a large project to produce a SOA metamodel is started. The result will then be an unstructured cackle of services, and a metamodel that is already going out of date. So we get stuck in and rebuild our services so that they match the metamodel, with the far reaching consequences that this will have on all our LOB systems are current. Lets imagine that this actually works ( how often do we rip and replace working software because it doesn't fit a certain pattern? Never -that's the point of integration), we will now be working with a metamodel that is out of date, and most likely incomplete if the organisation is large. Accepting that an object can have more than one model over time, with perhaps more than one model being at any given time will help us realise the limitations of the top down model. It is entirely normal , and perhaps necessary, for an organisation to be able to view an entity from different perspectives. So, instead of trying to constantly force these goals in a straight line, why not let them happen in parallel, and manage the changes in each layer. If company A has chosen to model their business processes and create a business architecture, there will be a reason behind this. Often the aim is to make the business more flexible and able to cope with change, through alignment between the business and the IT department. If company B’s IT department recognizes the problem of wild services springing up everywhere, and decides to do something about it, by designing a platform and processes for the introduction of services, is this not a valid approach? With the hybrid approach, it is recommended that company A begin deploying services as quickly as possible. Based on models that are clearly incomplete, and which will therefore change rapidly and often in the near future. Natural business evolution will also mean that the models can be guaranteed to change in the not so near future. To ‘Succeed with SOA’ Company B needs to go back to the drawing board and start modeling processes and objects. So, in effect, we are telling business analysts to start developing code based on a model they are unsure of, and telling programmers to ignore the obvious and growing problems in their IT department and start drawing lines and boxes. Could the problem be that there are two different problem domains? And the whole concept of SOA as it being described by clever salespeople today creates an example of oft dreaded ‘tight coupling’ between these two domains? Could it be that we have taken two large problem areas, and bundled the solution together in order to create a magic bullet? And then convinced ourselves that the bullet actually exists? Company A wants to have a closer relationship between the business and its IT department, in order to become a more flexible organization. Company B wants to decrease the maintenance costs of its IT infrastructure. If both companies focus on succeeding with SOA, then they aren’t focusing on their actual goals. If Company A starts building services from incomplete models, without a gameplan, they will end up in the same situation as company B, with wild services. If company B focuses on modeling, they could easily end up with the same problems as company A. Now we have two companies, who a short while ago had one problem each, that now have two problems each. This has happened because of a focus on ‘Succeeding with SOA’, rather than solving the problem at hand. This is not to suggest that the two problem domains are unrelated, a strategy that encompasses both will obviously be good for the organization. But only if the organization realizes this and can develop such a strategy. This strategy cannot be bought in a box. Anyone who has worked with SOA for a while will be used to analyzing the solutions to a problem and judging the solution’s level of coupling. If we have two applications that each perform separate functions, but need to communicate with each other, we create a integration layer between them, perhaps with a service, but we do all we can to reduce the dependency between the two systems. Using the same approach, we can separate the modeling (business architecture) and the service hosting (technical architecture). The business architecture describes the processes and business objects in the business domain. The technical architecture describes the hosting and management and implementation of services. The glue that binds these together, the integration layer in our analogy, is the service contract, where the operations map the processes to their technical implementation, and the messages map business concepts to software objects in the implementation. If we reduce the coupling between these layers, we should be able to allow developers to develop services, and business analysts to develop models, without the changes rippling through from one side to the other. This would allow company A to carry on modeling, and company B to develop a service platform, each achieving their intended goal, without necessarily creating the problems seen in pure top down or bottom up approaches. Company B could then at a later date map their service infrastructure to a unified model, and company A could carry on modeling, insulating deployed services from changes in the ongoing modeling. How do we do this? The concept of service virtualization has been around for a while, and is instantly realizable in Microsoft’s Managed Services Engine. Here we can create a layer of virtual services, which represent the business analyst’s view, presenting uniform contracts to the outside world. These services can then transform and route messages to the actual service implementations. I like to think of the virtual services with their beautifully modeled interfaces as ‘SOA services’, and the implementations as simple integration ‘adapter’ services providing an interface to a technical implementation. The Managed Services Engine also provides policy based control over services, regardless of where they are deployed, simplifying handling of security, logging, exception handling etc. This solves a big problem. The pressure to deliver services quickly is always there in projects. It is very important to quickly show value when implementing service architectures. There is also pressure to deliver quality, and you can’t easily do both at the same time. This approach allows quick delivery with quality increasing over time, allowing modeling and service development to occur in parallel and independent of each other. The link between business modeling and service implementation is not one that is obvious to many organizations, and requires a certain maturity to realize and drive forward. It is also completely possible that a company can benefit from one without the other, even if this approach is frowned upon today, there are many companies doing so and seeing ROI. Of course there are disadvantages to this. The biggest one being the transformations necessary between the virtual interfaces and the service implementations. Bad choices in developing the services in the service implementation could mean that it is impossible to map the modeled processes to the implementation with redevelopment of the service. In many cases the architect will not have a choice here anyway, as proprietary systems are often delivered with predeveloped services. The alternative is to wait until the model is finished and then build the service according the model. However, if that approach worked we wouldn’t be having this discussion! And even when it does work, natural business evolution will mean that the two concepts (model and implementation) will immediately start to drift away from each other, so coupling them tightly together so that they are forever bound to the model that only applies at the time of the modeling work will not really achieve a great deal. Architecture is all about trade offs, and here a choice has to be made. The choice is between something will initially be of low quality but will work, or something that may well be impossible to achieve in most situations. In conclusion, top-down is a natural approach for business analysts, and bottom-up is a natural approach for developers. Instead of trying to force something on both that neither want, and which has not shown itself to be successful, why not let them get on with their jobs, and let an enterprise architect coordinate the processes?

Read the article
Of transactions and Mongo

- by Nuri Halperin

Originally posted on: http://geekswithblogs.net/nuri/archive/2014/05/20/of-transactions-and-mongo-again.aspxWhat's the first thing you hear about NoSQL databases? That they lose your data? That there's no transactions? No joins? No hope for "real" applications? Well, you *should* be wondering whether a certain of database is the right one for your job. But if you do so, you should be wondering that about "traditional" databases as well! In the spirit of exploration let's take a look at a common challenge: You are a bank. You have customers with accounts. Customer A wants to pay B. You want to allow that only if A can cover the amount being transferred. Let's looks at the problem without any context of any database engine in mind. What would you do? How would you ensure that the amount transfer is done "properly"? Would you prevent a "transaction" from taking place unless A can cover the amount? There are several options: Prevent any change to A's account while the transfer is taking place. That boils down to locking. Apply the change, and allow A's balance to go below zero. Charge person A some interest on the negative balance. Not friendly, but certainly a choice. Don't do either. Options 1 and 2 are difficult to attain in the NoSQL world. Mongo won't save you headaches here either. Option 3 looks a bit harsh. But here's where this can go: ledger. See, and account doesn't need to be represented by a single row in a table of all accounts with only the current balance on it. More often than not, accounting systems use ledgers. And entries in ledgers - as it turns out – don't actually get updated. Once a ledger entry is written, it is not removed or altered. A transaction is represented by an entry in the ledger stating and amount withdrawn from A's account and an entry in the ledger stating an addition of said amount to B's account. For sake of space-saving, that entry in the ledger can happen using one entry. Think {Timestamp, FromAccountId, ToAccountId, Amount}. The implication of the original question – "how do you enforce non-negative balance rule" then boils down to: Insert entry in ledger Run validation of recent entries Insert reverse entry to roll back transaction if validation failed. What is validation? Sum up the transactions that A's account has (all deposits and debits), and ensure the balance is positive. For sake of efficiency, one can roll up transactions and "close the book" on transactions with a pseudo entry stating balance as of midnight or something. This lets you avoid doing math on the fly on too many transactions. You simply run from the latest "approved balance" marker to date. But that's an optimization, and premature optimizations are the root of (some? most?) evil.. Back to some nagging questions though: "But mongo is only eventually consistent!" Well, yes, kind of. It's not actually true that Mongo has not transactions. It would be more descriptive to say that Mongo's transaction scope is a single document in a single collection. A write to a Mongo document happens completely or not at all. So although it is true that you can't update more than one documents "at the same time" under a "transaction" umbrella as an atomic update, it is NOT true that there' is no isolation. So a competition between two concurrent updates is completely coherent and the writes will be serialized. They will not scribble on the same document at the same time. In our case - in choosing a ledger approach - we're not even trying to "update" a document, we're simply adding a document to a collection. So there goes the "no transaction" issue. Now let's turn our attention to consistency. What you should know about mongo is that at any given moment, only on member of a replica set is writable. This means that the writable instance in a set of replicated instances always has "the truth". There could be a replication lag such that a reader going to one of the replicas still sees "old" state of a collection or document. But in our ledger case, things fall nicely into place: Run your validation against the writable instance. It is guaranteed to have a ledger either with (after) or without (before) the ledger entry got written. No funky states. Again, the ledger writing *adds* a document, so there's no inconsistent document state to be had either way. Next, we might worry about data loss. Here, mongo offers several write-concerns. Write-concern in Mongo is a mode that marshals how uptight you want the db engine to be about actually persisting a document write to disk before it reports to the application that it is "done". The most volatile, is to say you don't care. In that case, mongo would just accept your write command and say back "thanks" with no guarantee of persistence. If the server loses power at the wrong moment, it may have said "ok" but actually no written the data to disk. That's kind of bad. Don't do that with data you care about. It may be good for votes on a pole regarding how cute a furry animal is, but not so good for business. There are several other write-concerns varying from flushing the write to the disk of the writable instance, flushing to disk on several members of the replica set, a majority of the replica set or all of the members of a replica set. The former choice is the quickest, as no network coordination is required besides the main writable instance. The others impose extra network and time cost. Depending on your tolerance for latency and read-lag, you will face a choice of what works for you. It's really important to understand that no data loss occurs once a document is flushed to an instance. The record is on disk at that point. From that point on, backup strategies and disaster recovery are your worry, not loss of power to the writable machine. This scenario is not different from a relational database at that point. Where does this leave us? Oh, yes. Eventual consistency. By now, we ensured that the "source of truth" instance has the correct data, persisted and coherent. But because of lag, the app may have gone to the writable instance, performed the update and then gone to a replica and looked at the ledger there before the transaction replicated. Here are 2 options to deal with this. Similar to write concerns, mongo support read preferences. An app may choose to read only from the writable instance. This is not an awesome choice to make for every ready, because it just burdens the one instance, and doesn't make use of the other read-only servers. But this choice can be made on a query by query basis. So for the app that our person A is using, we can have person A issue the transfer command to B, and then if that same app is going to immediately as "are we there yet?" we'll query that same writable instance. But B and anyone else in the world can just chill and read from the read-only instance. They have no basis to expect that the ledger has just been written to. So as far as they know, the transaction hasn't happened until they see it appear later. We can further relax the demand by creating application UI that reacts to a write command with "thank you, we will post it shortly" instead of "thank you, we just did everything and here's the new balance". This is a very powerful thing. UI design for highly scalable systems can't insist that the all databases be locked just to paint an "all done" on screen. People understand. They were trained by many online businesses already that your placing of an order does not mean that your product is already outside your door waiting (yes, I know, large retailers are working on it... but were' not there yet). The second thing we can do, is add some artificial delay to a transaction's visibility on the ledger. The way that works is simply adding some logic such that the query against the ledger never nets a transaction for customers newer than say 15 minutes and who's validation flag is not set. This buys us time 2 ways: Replication can catch up to all instances by then, and validation rules can run and determine if this transaction should be "negated" with a compensating transaction. In case we do need to "roll back" the transaction, the backend system can place the timestamp of the compensating transaction at the exact same time or 1ms after the original one. Effectively, once A or B visits their ledger, both transactions would be visible and the overall balance "as of now" would reflect no change. The 2 transactions (attempted/ reverted) would be visible , since we do actually account for the attempt. Hold on a second. There's a hole in the story: what if several transfers from A to some accounts are registered, and 2 independent validators attempt to compute the balance concurrently? Is there a chance that both would conclude non-sufficient-funds even though rolling back transaction 100 would free up enough for transaction 117 (some random later transaction)? Yes. there is that chance. But the integrity of the business rule is not compromised, since the prime rule is don't dispense money you don't have. To minimize or eliminate this scenario, we can also assign a single validation process per origin account. This may seem non-scalable, but it can easily be done as a "sharded" distribution. Say we have 11 validation threads (or processing nodes etc.). We divide the account number space such that each validator is exclusively responsible for a certain range of account numbers. Sounds cunningly similar to Mongo's sharding strategy, doesn't it? Each validator then works in isolation. More capacity needed? Chop the account space into more chunks. So where are we now with the nagging questions? "No joins": Huh? What are those for? "No transactions": You mean no cross-collection and no cross-document transactions? Granted - but don't always need them either. "No hope for real applications": well... There are more issues and edge cases to slog through, I'm sure. But hopefully this gives you some ideas of how to solve common problems without distributed locking and relational databases. But then again, you can choose relational databases if they suit your problem.

Read the article
Inheritance Mapping Strategies with Entity Framework Code First CTP5: Part 3 – Table per Concrete Type (TPC) and Choosing Strategy Guidelines

- by mortezam

This is the third (and last) post in a series that explains different approaches to map an inheritance hierarchy with EF Code First. I've described these strategies in previous posts: Part 1 – Table per Hierarchy (TPH) Part 2 – Table per Type (TPT)In today’s blog post I am going to discuss Table per Concrete Type (TPC) which completes the inheritance mapping strategies supported by EF Code First. At the end of this post I will provide some guidelines to choose an inheritance strategy mainly based on what we've learned in this series. TPC and Entity Framework in the Past Table per Concrete type is somehow the simplest approach suggested, yet using TPC with EF is one of those concepts that has not been covered very well so far and I've seen in some resources that it was even discouraged. The reason for that is just because Entity Data Model Designer in VS2010 doesn't support TPC (even though the EF runtime does). That basically means if you are following EF's Database-First or Model-First approaches then configuring TPC requires manually writing XML in the EDMX file which is not considered to be a fun practice. Well, no more. You'll see that with Code First, creating TPC is perfectly possible with fluent API just like other strategies and you don't need to avoid TPC due to the lack of designer support as you would probably do in other EF approaches. Table per Concrete Type (TPC)In Table per Concrete type (aka Table per Concrete class) we use exactly one table for each (nonabstract) class. All properties of a class, including inherited properties, can be mapped to columns of this table, as shown in the following figure: As you can see, the SQL schema is not aware of the inheritance; effectively, we’ve mapped two unrelated tables to a more expressive class structure. If the base class was concrete, then an additional table would be needed to hold instances of that class. I have to emphasize that there is no relationship between the database tables, except for the fact that they share some similar columns. TPC Implementation in Code First Just like the TPT implementation, we need to specify a separate table for each of the subclasses. We also need to tell Code First that we want all of the inherited properties to be mapped as part of this table. In CTP5, there is a new helper method on EntityMappingConfiguration class called MapInheritedProperties that exactly does this for us. Here is the complete object model as well as the fluent API to create a TPC mapping: public abstract class BillingDetail { public int BillingDetailId { get; set; } public string Owner { get; set; } public string Number { get; set; } } public class BankAccount : BillingDetail { public string BankName { get; set; } public string Swift { get; set; } } public class CreditCard : BillingDetail { public int CardType { get; set; } public string ExpiryMonth { get; set; } public string ExpiryYear { get; set; } } public class InheritanceMappingContext : DbContext { public DbSet<BillingDetail> BillingDetails { get; set; } protected override void OnModelCreating(ModelBuilder modelBuilder) { modelBuilder.Entity<BankAccount>().Map(m => { m.MapInheritedProperties(); m.ToTable("BankAccounts"); }); modelBuilder.Entity<CreditCard>().Map(m => { m.MapInheritedProperties(); m.ToTable("CreditCards"); }); } } The Importance of EntityMappingConfiguration ClassAs a side note, it worth mentioning that EntityMappingConfiguration class turns out to be a key type for inheritance mapping in Code First. Here is an snapshot of this class: namespace System.Data.Entity.ModelConfiguration.Configuration.Mapping { public class EntityMappingConfiguration<TEntityType> where TEntityType : class { public ValueConditionConfiguration Requires(string discriminator); public void ToTable(string tableName); public void MapInheritedProperties(); } } As you have seen so far, we used its Requires method to customize TPH. We also used its ToTable method to create a TPT and now we are using its MapInheritedProperties along with ToTable method to create our TPC mapping. TPC Configuration is Not Done Yet!We are not quite done with our TPC configuration and there is more into this story even though the fluent API we saw perfectly created a TPC mapping for us in the database. To see why, let's start working with our object model. For example, the following code creates two new objects of BankAccount and CreditCard types and tries to add them to the database: using (var context = new InheritanceMappingContext()) { BankAccount bankAccount = new BankAccount(); CreditCard creditCard = new CreditCard() { CardType = 1 }; context.BillingDetails.Add(bankAccount); context.BillingDetails.Add(creditCard); context.SaveChanges(); } Running this code throws an InvalidOperationException with this message: The changes to the database were committed successfully, but an error occurred while updating the object context. The ObjectContext might be in an inconsistent state. Inner exception message: AcceptChanges cannot continue because the object's key values conflict with another object in the ObjectStateManager. Make sure that the key values are unique before calling AcceptChanges. The reason we got this exception is because DbContext.SaveChanges() internally invokes SaveChanges method of its internal ObjectContext. ObjectContext's SaveChanges method on its turn by default calls AcceptAllChanges after it has performed the database modifications. AcceptAllChanges method merely iterates over all entries in ObjectStateManager and invokes AcceptChanges on each of them. Since the entities are in Added state, AcceptChanges method replaces their temporary EntityKey with a regular EntityKey based on the primary key values (i.e. BillingDetailId) that come back from the database and that's where the problem occurs since both the entities have been assigned the same value for their primary key by the database (i.e. on both BillingDetailId = 1) and the problem is that ObjectStateManager cannot track objects of the same type (i.e. BillingDetail) with the same EntityKey value hence it throws. If you take a closer look at the TPC's SQL schema above, you'll see why the database generated the same values for the primary keys: the BillingDetailId column in both BankAccounts and CreditCards table has been marked as identity. How to Solve The Identity Problem in TPC As you saw, using SQL Server’s int identity columns doesn't work very well together with TPC since there will be duplicate entity keys when inserting in subclasses tables with all having the same identity seed. Therefore, to solve this, either a spread seed (where each table has its own initial seed value) will be needed, or a mechanism other than SQL Server’s int identity should be used. Some other RDBMSes have other mechanisms allowing a sequence (identity) to be shared by multiple tables, and something similar can be achieved with GUID keys in SQL Server. While using GUID keys, or int identity keys with different starting seeds will solve the problem but yet another solution would be to completely switch off identity on the primary key property. As a result, we need to take the responsibility of providing unique keys when inserting records to the database. We will go with this solution since it works regardless of which database engine is used. Switching Off Identity in Code First We can switch off identity simply by placing DatabaseGenerated attribute on the primary key property and pass DatabaseGenerationOption.None to its constructor. DatabaseGenerated attribute is a new data annotation which has been added to System.ComponentModel.DataAnnotations namespace in CTP5: public abstract class BillingDetail { [DatabaseGenerated(DatabaseGenerationOption.None)] public int BillingDetailId { get; set; } public string Owner { get; set; } public string Number { get; set; } } As always, we can achieve the same result by using fluent API, if you prefer that: modelBuilder.Entity<BillingDetail>() .Property(p => p.BillingDetailId) .HasDatabaseGenerationOption(DatabaseGenerationOption.None); Working With The Object Model Our TPC mapping is ready and we can try adding new records to the database. But, like I said, now we need to take care of providing unique keys when creating new objects: using (var context = new InheritanceMappingContext()) { BankAccount bankAccount = new BankAccount() { BillingDetailId = 1 }; CreditCard creditCard = new CreditCard() { BillingDetailId = 2, CardType = 1 }; context.BillingDetails.Add(bankAccount); context.BillingDetails.Add(creditCard); context.SaveChanges(); } Polymorphic Associations with TPC is Problematic The main problem with this approach is that it doesn’t support Polymorphic Associations very well. After all, in the database, associations are represented as foreign key relationships and in TPC, the subclasses are all mapped to different tables so a polymorphic association to their base class (abstract BillingDetail in our example) cannot be represented as a simple foreign key relationship. For example, consider the the domain model we introduced here where User has a polymorphic association with BillingDetail. This would be problematic in our TPC Schema, because if User has a many-to-one relationship with BillingDetail, the Users table would need a single foreign key column, which would have to refer both concrete subclass tables. This isn’t possible with regular foreign key constraints. Schema Evolution with TPC is Complex A further conceptual problem with this mapping strategy is that several different columns, of different tables, share exactly the same semantics. This makes schema evolution more complex. For example, a change to a base class property results in changes to multiple columns. It also makes it much more difficult to implement database integrity constraints that apply to all subclasses. Generated SQLLet's examine SQL output for polymorphic queries in TPC mapping. For example, consider this polymorphic query for all BillingDetails and the resulting SQL statements that being executed in the database: var query = from b in context.BillingDetails select b; Just like the SQL query generated by TPT mapping, the CASE statements that you see in the beginning of the query is merely to ensure columns that are irrelevant for a particular row have NULL values in the returning flattened table. (e.g. BankName for a row that represents a CreditCard type). TPC's SQL Queries are Union Based As you can see in the above screenshot, the first SELECT uses a FROM-clause subquery (which is selected with a red rectangle) to retrieve all instances of BillingDetails from all concrete class tables. The tables are combined with a UNION operator, and a literal (in this case, 0 and 1) is inserted into the intermediate result; (look at the lines highlighted in yellow.) EF reads this to instantiate the correct class given the data from a particular row. A union requires that the queries that are combined, project over the same columns; hence, EF has to pad and fill up nonexistent columns with NULL. This query will really perform well since here we can let the database optimizer find the best execution plan to combine rows from several tables. There is also no Joins involved so it has a better performance than the SQL queries generated by TPT where a Join is required between the base and subclasses tables. Choosing Strategy GuidelinesBefore we get into this discussion, I want to emphasize that there is no one single "best strategy fits all scenarios" exists. As you saw, each of the approaches have their own advantages and drawbacks. Here are some rules of thumb to identify the best strategy in a particular scenario: If you don’t require polymorphic associations or queries, lean toward TPC—in other words, if you never or rarely query for BillingDetails and you have no class that has an association to BillingDetail base class. I recommend TPC (only) for the top level of your class hierarchy, where polymorphism isn’t usually required, and when modification of the base class in the future is unlikely. If you do require polymorphic associations or queries, and subclasses declare relatively few properties (particularly if the main difference between subclasses is in their behavior), lean toward TPH. Your goal is to minimize the number of nullable columns and to convince yourself (and your DBA) that a denormalized schema won’t create problems in the long run. If you do require polymorphic associations or queries, and subclasses declare many properties (subclasses differ mainly by the data they hold), lean toward TPT. Or, depending on the width and depth of your inheritance hierarchy and the possible cost of joins versus unions, use TPC. By default, choose TPH only for simple problems. For more complex cases (or when you’re overruled by a data modeler insisting on the importance of nullability constraints and normalization), you should consider the TPT strategy. But at that point, ask yourself whether it may not be better to remodel inheritance as delegation in the object model (delegation is a way of making composition as powerful for reuse as inheritance). Complex inheritance is often best avoided for all sorts of reasons unrelated to persistence or ORM. EF acts as a buffer between the domain and relational models, but that doesn’t mean you can ignore persistence concerns when designing your classes. SummaryIn this series, we focused on one of the main structural aspect of the object/relational paradigm mismatch which is inheritance and discussed how EF solve this problem as an ORM solution. We learned about the three well-known inheritance mapping strategies and their implementations in EF Code First. Hopefully it gives you a better insight about the mapping of inheritance hierarchies as well as choosing the best strategy for your particular scenario. Happy New Year and Happy Code-Firsting! References ADO.NET team blog Java Persistence with Hibernate book a { color: #5A99FF; } a:visited { color: #5A99FF; } .title { padding-bottom: 5px; font-family: Segoe UI; font-size: 11pt; font-weight: bold; padding-top: 15px; } .code, .typeName { font-family: consolas; } .typeName { color: #2b91af; } .padTop5 { padding-top: 5px; } .padTop10 { padding-top: 10px; } .exception { background-color: #f0f0f0; font-style: italic; padding-bottom: 5px; padding-left: 5px; padding-top: 5px; padding-right: 5px; }

Read the article
The Execute SQL Task

In this article we are going to take you through the Execute SQL Task in SQL Server Integration Services for SQL Server 2005 (although it appies just as well to SQL Server 2008). We will be covering all the essentials that you will need to know to effectively use this task and make it as flexible as possible. The things we will be looking at are as follows: A tour of the Task. The properties of the Task. After looking at these introductory topics we will then get into some examples. The examples will show different types of usage for the task: Returning a single value from a SQL query with two input parameters. Returning a rowset from a SQL query. Executing a stored procedure and retrieveing a rowset, a return value, an output parameter value and passing in an input parameter. Passing in the SQL Statement from a variable. Passing in the SQL Statement from a file. Tour Of The Task Before we can start to use the Execute SQL Task in our packages we are going to need to locate it in the toolbox. Let's do that now. Whilst in the Control Flow section of the package expand your toolbox and locate the Execute SQL Task. Below is how we found ours. Now drag the task onto the designer. As you can see from the following image we have a validation error appear telling us that no connection manager has been assigned to the task. This can be easily remedied by creating a connection manager. There are certain types of connection manager that are compatable with this task so we cannot just create any connection manager and these are detailed in a few graphics time. Double click on the task itself to take a look at the custom user interface provided to us for this task. The task will open on the general tab as shown below. Take a bit of time to have a look around here as throughout this article we will be revisting this page many times. Whilst on the general tab, drop down the combobox next to the ConnectionType property. In here you will see the types of connection manager which this task will accept. As with SQL Server 2000 DTS, SSIS allows you to output values from this task in a number of formats. Have a look at the combobox next to the Resultset property. The major difference here is the ability to output into XML. If you drop down the combobox next to the SQLSourceType property you will see the ways in which you can pass a SQL Statement into the task itself. We will have examples of each of these later on but certainly when we saw these for the first time we were very excited. Next to the SQLStatement property if you click in the empty box next to it you will see ellipses appear. Click on them and you will see the very basic query editor that becomes available to you. Alternatively after you have specified a connection manager for the task you can click on the Build Query button to bring up a completely different query editor. This is slightly inconsistent. Once you've finished looking around the general tab, move on to the next tab which is the parameter mapping tab. We shall, again, be visiting this tab throughout the article but to give you an initial heads up this is where you define the input, output and return values from your task. Note this is not where you specify the resultset. If however you now move on to the ResultSet tab this is where you define what variable will receive the output from your SQL Statement in whatever form that is. Property Expressions are one of the most amazing things to happen in SSIS and they will not be covered here as they deserve a whole article to themselves. Watch out for this as their usefulness will astound you. For a more detailed discussion of what should be the parameter markers in the SQL Statements on the General tab and how to map them to variables on the Parameter Mapping tab see Working with Parameters and Return Codes in the Execute SQL Task. Task Properties There are two places where you can specify the properties for your task. One is in the task UI itself and the other is in the property pane which will appear if you right click on your task and select Properties from the context menu. We will be doing plenty of property setting in the UI later so let's take a moment to have a look at the property pane. Below is a graphic showing our properties pane. Now we shall take you through all the properties and tell you exactly what they mean. A lot of these properties you will see across all tasks as well as the package because of everything's base structure The Container. BypassPrepare Should the statement be prepared before sending to the connection manager destination (True/False) Connection This is simply the name of the connection manager that the task will use. We can get this from the connection manager tray at the bottom of the package. DelayValidation Really interesting property and it tells the task to not validate until it actually executes. A usage for this may be that you are operating on table yet to be created but at runtime you know the table will be there. Description Very simply the description of your Task. Disable Should the task be enabled or not? You can also set this through a context menu by right clicking on the task itself. DisableEventHandlers As a result of events that happen in the task, should the event handlers for the container fire? ExecValueVariable The variable assigned here will get or set the execution value of the task. Expressions Expressions as we mentioned earlier are a really powerful tool in SSIS and this graphic below shows us a small peek of what you can do. We select a property on the left and assign an expression to the value of that property on the right causing the value to be dynamically changed at runtime. One of the most obvious uses of this is that the property value can be built dynamically from within the package allowing you a great deal of flexibility FailPackageOnFailure If this task fails does the package? FailParentOnFailure If this task fails does the parent container? A task can he hosted inside another container i.e. the For Each Loop Container and this would then be the parent. ForcedExecutionValue This property allows you to hard code an execution value for the task. ForcedExecutionValueType What is the datatype of the ForcedExecutionValue? ForceExecutionResult Force the task to return a certain execution result. This could then be used by the workflow constraints. Possible values are None, Success, Failure and Completion. ForceExecutionValue Should we force the execution result? IsolationLevel This is the transaction isolation level of the task. IsStoredProcedure Certain optimisations are made by the task if it knows that the query is a Stored Procedure invocation. The docs say this will always be false unless the connection is an ADO connection. LocaleID Gets or sets the LocaleID of the container. LoggingMode Should we log for this container and what settings should we use? The value choices are UseParentSetting, Enabled and Disabled. MaximumErrorCount How many times can the task fail before we call it a day? Name Very simply the name of the task. ResultSetType How do you want the results of your query returned? The choices are ResultSetType_None, ResultSetType_SingleRow, ResultSetType_Rowset and ResultSetType_XML. SqlStatementSource Your Query/SQL Statement. SqlStatementSourceType The method of specifying the query. Your choices here are DirectInput, FileConnection and Variables TimeOut How long should the task wait to receive results? TransactionOption How should the task handle being asked to join a transaction? Usage Examples As we move through the examples we will only cover in them what we think you must know and what we think you should see. This means that some of the more elementary steps like setting up variables will be covered in the early examples but skipped and simply referred to in later ones. All these examples used the AventureWorks database that comes with SQL Server 2005. Returning a Single Value, Passing in Two Input Parameters So the first thing we are going to do is add some variables to our package. The graphic below shows us those variables having been defined. Here the CountOfEmployees variable will be used as the output from the query and EndDate and StartDate will be used as input parameters. As you can see all these variables have been scoped to the package. Scoping allows us to have domains for variables. Each container has a scope and remember a package is a container as well. Variable values of the parent container can be seen in child containers but cannot be passed back up to the parent from a child. Our following graphic has had a number of changes made. The first of those changes is that we have created and assigned an OLEDB connection manager to this Task ExecuteSQL Task Connection. The next thing is we have made sure that the SQLSourceType property is set to Direct Input as we will be writing in our statement ourselves. We have also specified that only a single row will be returned from this query. The expressions we typed in was: SELECT COUNT(*) AS CountOfEmployees FROM HumanResources.Employee WHERE (HireDate BETWEEN ? AND ?) Moving on now to the Parameter Mapping tab this is where we are going to tell the task about our input paramaters. We Add them to the window specifying their direction and datatype. A quick word here about the structure of the variable name. As you can see SSIS has preceeded the variable with the word user. This is a default namespace for variables but you can create your own. When defining your variables if you look at the variables window title bar you will see some icons. If you hover over the last one on the right you will see it says "Choose Variable Columns". If you click the button you will see a list of checkbox options and one of them is namespace. after checking this you will see now where you can define your own namespace. The next tab, result set, is where we need to get back the value(s) returned from our statement and assign to a variable which in our case is CountOfEmployees so we can use it later perhaps. Because we are only returning a single value then if you remember from earlier we are allowed to assign a name to the resultset but it must be the name of the column (or alias) from the query. A really cool feature of Business Intelligence Studio being hosted by Visual Studio is that we get breakpoint support for free. In our package we set a Breakpoint so we can break the package and have a look in a watch window at the variable values as they appear to our task and what the variable value of our resultset is after the task has done the assignment. Here's that window now. As you can see the count of employess that matched the data range was 2. Returning a Rowset In this example we are going to return a resultset back to a variable after the task has executed not just a single row single value. There are no input parameters required so the variables window is nice and straight forward. One variable of type object. Here is the statement that will form the soure for our Resultset. select p.ProductNumber, p.name, pc.Name as ProductCategoryNameFROM Production.ProductCategory pcJOIN Production.ProductSubCategory pscON pc.ProductCategoryID = psc.ProductCategoryIDJOIN Production.Product pON psc.ProductSubCategoryID = p.ProductSubCategoryID We need to make sure that we have selected Full result set as the ResultSet as shown below on the task's General tab. Because there are no input parameters we can skip the parameter mapping tab and move straight to the Result Set tab. Here we need to Add our variable defined earlier and map it to the result name of 0 (remember we covered this earlier) Once we run the task we can again set a breakpoint and have a look at the values coming back from the task. In the following graphic you can see the result set returned to us as a COM object. We can do some pretty interesting things with this COM object and in later articles that is exactly what we shall be doing. Return Values, Input/Output Parameters and Returning a Rowset from a Stored Procedure This example is pretty much going to give us a taste of everything. We have already covered in the previous example how to specify the ResultSet to be a Full result set so we will not cover it again here. For this example we are going to need 4 variables. One for the return value, one for the input parameter, one for the output parameter and one for the result set. Here is the statement we want to execute. Note how much cleaner it is than if you wanted to do it using the current version of DTS. In the Parameter Mapping tab we are going to Add our variables and specify their direction and datatypes. In the Result Set tab we can now map our final variable to the rowset returned from the stored procedure. It really is as simple as that and we were amazed at how much easier it is than in DTS 2000. Passing in the SQL Statement from a Variable SSIS as we have mentioned is hugely more flexible than its predecessor and one of the things you will notice when moving around the tasks and the adapters is that a lot of them accept a variable as an input for something they need. The ExecuteSQL task is no different. It will allow us to pass in a string variable as the SQL Statement. This variable value could have been set earlier on from inside the package or it could have been populated from outside using a configuration. The ResultSet property is set to single row and we'll show you why in a second when we look at the variables. Note also the SQLSourceType property. Here's the General Tab again. Looking at the variable we have in this package you can see we have only two. One for the return value from the statement and one which is obviously for the statement itself. Again we need to map the Result name to our variable and this can be a named Result Name (The column name or alias returned by the query) and not 0. The expected result into our variable should be the amount of rows in the Person.Contact table and if we look in the watch window we see that it is. Passing in the SQL Statement from a File The final example we are going to show is a really interesting one. We are going to pass in the SQL statement to the task by using a file connection manager. The file itself contains the statement to run. The first thing we are going to need to do is create our file connection mananger to point to our file. Click in the connections tray at the bottom of the designer, right click and choose "New File Connection" As you can see in the graphic below we have chosen to use an existing file and have passed in the name as well. Have a look around at the other "Usage Type" values available whilst you are here. Having set that up we can now see in the connection manager tray our file connection manager sitting alongside our OLE-DB connection we have been using for the rest of these examples. Now we can go back to the familiar General Tab to set up how the task will accept our file connection as the source. All the other properties in this task are set up exactly as we have been doing for other examples depending on the options chosen so we will not cover them again here. We hope you will agree that the Execute SQL Task has changed considerably in this release from its DTS predecessor. It has a lot of options available but once you have configured it a few times you get to learn what needs to go where. We hope you have found this article useful.

Read the article
VS 2012 Code Review – Before Check In OR After Check In?

- by Tarun Arora

“Is Code Review Important and Effective?” There is a consensus across the industry that code review is an effective and practical way to collar code inconsistency and possible defects early in the software development life cycle. Among others some of the advantages of code reviews are, Bugs are found faster Forces developers to write readable code (code that can be read without explanation or introduction!) Optimization methods/tricks/productive programs spread faster Programmers as specialists "evolve" faster It's fun “Code review is systematic examination (often known as peer review) of computer source code. It is intended to find and fix mistakes overlooked in the initial development phase, improving both the overall quality of software and the developers' skills. Reviews are done in various forms such as pair programming, informal walkthroughs, and formal inspections.” Wikipedia No where does the definition mention whether its better to review code before the code has been committed to version control or after the commit has been performed. No matter which side you favour, Visual Studio 2012 allows you to request for a code review both before check in and also request for a review after check in. Let’s weigh the pros and cons of the approaches independently. Code Review Before Check In or Code Review After Check In? Approach 1 – Code Review before Check in Developer completes the code and feels the code quality is appropriate for check in to TFS. The developer raises a code review request to have a second pair of eyes validate if the code abides to the recommended best practices, will not result in any defects due to common coding mistakes and whether any optimizations can be made to improve the code quality. Image 1 – code review before check in Pros Everything that gets committed to source control is reviewed. Minimizes the chances of smelly code making its way into the code base. Decreases the cost of fixing bugs, remember, the earlier you find them, the lesser the pain in fixing them. Cons Development Code Freeze – Since the changes aren’t in the source control yet. Further development can only be done off-line. The changes have not been through a CI build, hard to say whether the code abides to all build quality standards. Inconsistent! Cumbersome to track the actual code review process. Not every change to the code base is worth reviewing, a lot of effort is invested for very little gain. Approach 2 – Code Review after Check in Developer checks in, random code reviews are performed on the checked in code. Image 2 – Code review after check in Pros The code has already passed the CI build and run through any code analysis plug ins you may have running on the build server. Instruct the developer to ensure ZERO fx cop, style cop and static code analysis before check in. Code is cleaner and smell free even before the code review. No Offline development, developers can continue to develop against the source control. Cons Bad code can easily make its way into the code base. Since the review take place much later in the cycle, the cost of fixing issues can prove to be much higher. Approach 3 – Hybrid Approach The community advocates a more hybrid approach, a blend of tooling and human accountability quotient. Image 3 – Hybrid Approach 1. Code review high impact check ins. It is not possible to review everything, by setting up code review check in policies you can end up slowing your team. More over, the code that you are reviewing before check in hasn't even been through a green CI build either. 2. Tooling. Let the tooling work for you. By running static analysis, fx cop, style cop and other plug ins on the build agent, you can identify the real issues that in my opinion can't possibly be identified using human reviews. Configure the tooling to report back top 10 issues every day. Mandate the manual code review of individuals who keep making it to this list of shame more often. 3. During Merge. I would prefer eliminating some of the other code issues during merge from Main branch to the release branch. In a scrum project this is still easier because cheery picking the merges is a possibility and the size of code being reviewed is still limited. Let the tooling work for you, if some one breaks the CI build often, put them on a gated check in build course until you see improvement. If some one appears on the top 10 list of shame generated via the build then ensure that all their code is reviewed till you see improvement. At the end of the day, the goal is to ensure that the code being delivered is top quality. By enforcing a code review before any check in, you force the developer to work offline or stay put till the review is complete. What do the experts say? So I asked a few expects what they thought of “Code Review quality gate before Checking in code?" Terje Sandstrom | Microsoft ALM MVP You mean a review quality gate BEFORE checking in code????? That would mean a lot of code staying either local or in shelvesets, and not even been through a CI build, and a green CI build being the main criteria for going further, f.e. to the review state. I would not like code laying around with no checkin’s. Having a requirement that code is checked in small pieces, 4-8 hours work max, and AT LEAST daily checkins, a manual code review comes second down the lane. I would expect review quality gates to happen before merging back to main, or before merging to release. But that would all be on checked-in code. Branching is absolutely one way to ease the pain. Another way we are using is automatic quality builds, running metrics, coverage, static code analysis. Unfortunately it takes some time, would be great to be on CI’s – but…., so it’s done scheduled every night. Based on this we get, among other stuff, top 10 lists of suspicious code, which is then subjected to reviews. If a person seems to be very popular on these top 10 lists, we subject every check in from that person to a review for a period. That normally helps. None of the clients I have can afford to have every checkin reviewed, so we need to find ways around it. I don’t disagree with the nicety of having all the code reviewed, but I find it hard to find those resources in today’s enterprises. David V. Corbin | Visual Studio ALM Ranger I tend to agree with both sides. I hate having code that is not checked in, but at the same time hate having “bad” code in the repository. I have found that branching is one approach to solving this dilemma. Code is checked into the private/feature branch before the review, but is not merged over to the “official” branch until after the review. I advocate both, depending on circumstance (especially team dynamics) - The “pre-checkin” is usually for elements that may impact the project as a whole. Think of it as another “gate” along with passing unit tests. - The “post-checkin” may very well not be at the changeset level, but correlates to a review at the “user story” level. Again, this depends on team dynamics in play…. Robert MacLean | Microsoft ALM MVP I do not think there is no right answer for the industry as a whole. In short the question is why do you do reviews? Your question implies risk mitigation, so in low risk areas you can get away with it after check in while in high risk you need to do it before check in. An example is those new to a team or juniors need it much earlier (maybe that is before checkin, maybe that is soon after) than seniors who have shipped twenty sprints on the team. Abhimanyu Singhal | Visual Studio ALM Ranger Depends on per scenario basis. We recommend post check-in reviews when: 1. We don't want to block other checks and processes on manual code reviews. Manual reviews take time, and some pieces may not require manual reviews at all. 2. We need to trace all changes and track history. 3. We have a code promotion strategy/process in place. For risk mitigation, post checkin code can be promoted to Accepted branches. Or can be rejected. Pre Checkin Reviews are used when 1. There is a high risk factor associated 2. Reviewers are generally (most of times) have immediate availability. 3. Team does not have strict tracking needs. Simply speaking, no single process fits all scenarios. You need to select what works best for your team/project. Thomas Schissler | Visual Studio ALM Ranger This is an interesting discussion, I’m right now discussing details about executing code reviews with my teams. I see and understand the aspects you brought in, but there is another side as well, I’d like to point out. 1.) If you do reviews per check in this is not very practical as a hard rule because this will disturb the flow of the team very often or it will lead to reduce the checkin frequency of the devs which I would not accept. 2.) If you do later reviews, for example if you review PBIs, it is not easy to find out which code you should review. Either you review all changesets associate with the PBI, but then you might review code which has been changed with a later checkin and the dev maybe has already fixed the issue. Or you review the diff of the latest changeset of the PBI with the first but then you might also review changes of other PBIs. Jakob Leander | Sr. Director, Avanade In my experience, manual code review: 1. Does not get done and at the very least does not get redone after changes (regardless of intentions at start of project) 2. When a project actually do it, they often do not do it right away = errors pile up 3. Requires a lot of time discussing/defining the standard and for the team to learn it However code review is very important since e.g. even small memory leaks in a high volume web solution have big consequences In the last years I have advocated following approach for code review - Architects up front do “at least one best practice example” of each type of component and tell the team. Copy from this one. This should include error handling, logging, security etc. - Dev lead on project continuously browse code to validate that the best practices are used. Especially that patterns etc. are not broken. You can do this formally after each sprint/iteration if you want. Once this is validated it is unlikely to “go bad” even during later code changes Agree with customer to rely on static code analysis from Visual Studio as the one and only coding standard. This has HUUGE benefits - You can easily tweak to reach the level you desire together with customer - It is easy to measure for both developers/management - It is 100% consistent across code base - It gets validated all the time so you never end up getting hammered by a customer review in the end - It is easy to tell the developer that you do not want code back unless it has zero errors = minimize communication You need to track this at least during nightly builds and make sure team sees total # issues. Do not allow #issues it to grow uncontrolled. On the project I run I require code analysis to have run on code before checkin (checkin rule). This means - You have to have clean compile (or CA wont run) so this is extra benefit = very few broken builds - You can change a few of the rules to compile as errors instead of warnings. I often do this for “missing dispose” issues which you REALLY do not want in your app Tip: Place your custom CA rules files as part of solution. That way it works when you do branching etc. (path to CA file is relative in VS) Some may argue that CA is not as good as manual inspection. But since manual inspection in reality suffers from the 3 issues in start it is IMO a MUCH better (and much cheaper) approach from helicopter perspective Tirthankar Dutta | Director, Avanade I think code review should be run both before and after check ins. There are some code metrics that are meant to be run on the entire codebase … Also, especially on multi-site projects, one should strive to architect in a way that lets men manage the framework while boys write the repetitive code… scales very well with the need to review less by containment and imposing architectural restrictions to emphasise the design. Bruno Capuano | Microsoft ALM MVP For code reviews (means peer reviews) in distributed team I use http://www.vsanywhere.com/default.aspx David Jobling | Global Sr. Director, Avanade Peer review is the only way to scale and its a great practice for all in the team to learn to perform and accept. In my experience you soon learn who's code to watch more than others and tune the attention. Mikkel Toudal Kristiansen | Manager, Avanade If you have several branches in your code base, you will need to merge often. This requires manual merging, when a file has been changed in both branches. It offers a good opportunity to actually review to changed code. So my advice is: Merging between branches should be done as often as possible, it should be done by a senior developer, and he/she should perform a full code review of the code being merged. As for detecting architectural smells and code smells creeping into the code base, one really good third party tools exist: Ndepend (http://www.ndepend.com/, for static code analysis of the current state of the code base). You could also consider adding StyleCop to the solution. Jesse Houwing | Visual Studio ALM Ranger I gave a presentation on this subject on the TechDays conference in NL last year. See my presentation and slides here (talk in Dutch, but English presentation): http://blog.jessehouwing.nl/2012/03/did-you-miss-my-techdaysnl-talk-on-code.html I’d like to add a few more points: - Before/After checking is mostly a trust issue. If you have a team that does diligent peer reviews and regularly talk/sit together or peer review, there’s no need to enforce a before-checkin policy. The peer peer-programming and regular feedback during development can take care of most of the review requirements as long as the team isn’t under stress. - Under stress, enforce pre-checkin reviews, it might sound strange, if you’re already under time or budgetary constraints, but it is under such conditions most real issues start to be created or pile up. - Use tools to catch most common errors, Code Analysis/FxCop was already mentioned. HP Fortify, Resharper, Coderush etc can help you there. There are also a lot of 3rd party rules you can add to Code Analysis. I’ve written a few myself (http://fccopcontrib.codeplex.com) and various teams from Microsoft have added their own rules (MSOCAF for SharePoint, WSSF for WCF). For common errors that keep cropping up, see if you can define a rule. It’s much easier. But more importantly make sure you have a good help page explaining *WHY* it's wrong. If you have small feature or developer branches/shelvesets, you might want to review pre-merge. It’s still better to do peer reviews and peer programming, but the most important thing is that bad quality code doesn’t make it into the important branch. So my philosophy: - Use tooling as much as possible. - Make sure the team understands the tooling and the importance of the things it flags. It’s too easy to just click suppress all to ignore the warnings. - Under stress, tighten process, it’s under stress that the problems of late reviews will really surface - Most importantly if you do reviews do them as early as possible, but never later than needed. In other words, pre-checkin/post checking doesn’t really matter, as long as the review is done before the code is released. It’ll just be much more expensive to fix any review outcomes the later you find them. --- I would love to hear what you think!

Read the article
C#/.NET Little Wonders: The Useful But Overlooked Sets

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. Today we will be looking at two set implementations in the System.Collections.Generic namespace: HashSet<T> and SortedSet<T>. Even though most people think of sets as mathematical constructs, they are actually very useful classes that can be used to help make your application more performant if used appropriately. A Background From Math In mathematical terms, a set is an unordered collection of unique items. In other words, the set {2,3,5} is identical to the set {3,5,2}. In addition, the set {2, 2, 4, 1} would be invalid because it would have a duplicate item (2). In addition, you can perform set arithmetic on sets such as: Intersections: The intersection of two sets is the collection of elements common to both. Example: The intersection of {1,2,5} and {2,4,9} is the set {2}. Unions: The union of two sets is the collection of unique items present in either or both set. Example: The union of {1,2,5} and {2,4,9} is {1,2,4,5,9}. Differences: The difference of two sets is the removal of all items from the first set that are common between the sets. Example: The difference of {1,2,5} and {2,4,9} is {1,5}. Supersets: One set is a superset of a second set if it contains all elements that are in the second set. Example: The set {1,2,5} is a superset of {1,5}. Subsets: One set is a subset of a second set if all the elements of that set are contained in the first set. Example: The set {1,5} is a subset of {1,2,5}. If We’re Not Doing Math, Why Do We Care? Now, you may be thinking: why bother with the set classes in C# if you have no need for mathematical set manipulation? The answer is simple: they are extremely efficient ways to determine ownership in a collection. For example, let’s say you are designing an order system that tracks the price of a particular equity, and once it reaches a certain point will trigger an order. Now, since there’s tens of thousands of equities on the markets, you don’t want to track market data for every ticker as that would be a waste of time and processing power for symbols you don’t have orders for. Thus, we just want to subscribe to the stock symbol for an equity order only if it is a symbol we are not already subscribed to. Every time a new order comes in, we will check the list of subscriptions to see if the new order’s stock symbol is in that list. If it is, great, we already have that market data feed! If not, then and only then should we subscribe to the feed for that symbol. So far so good, we have a collection of symbols and we want to see if a symbol is present in that collection and if not, add it. This really is the essence of set processing, but for the sake of comparison, let’s say you do a list instead: 1: // class that handles are order processing service 2: public sealed class OrderProcessor 3: { 4: // contains list of all symbols we are currently subscribed to 5: private readonly List<string> _subscriptions = new List<string>(); 6: 7: ... 8: } Now whenever you are adding a new order, it would look something like: 1: public PlaceOrderResponse PlaceOrder(Order newOrder) 2: { 3: // do some validation, of course... 4: 5: // check to see if already subscribed, if not add a subscription 6: if (!_subscriptions.Contains(newOrder.Symbol)) 7: { 8: // add the symbol to the list 9: _subscriptions.Add(newOrder.Symbol); 10: 11: // do whatever magic is needed to start a subscription for the symbol 12: } 13: 14: // place the order logic! 15: } What’s wrong with this? In short: performance! Finding an item inside a List<T> is a linear - O(n) – operation, which is not a very performant way to find if an item exists in a collection. (I used to teach algorithms and data structures in my spare time at a local university, and when you began talking about big-O notation you could immediately begin to see eyes glossing over as if it was pure, useless theory that would not apply in the real world, but I did and still do believe it is something worth understanding well to make the best choices in computer science). Let’s think about this: a linear operation means that as the number of items increases, the time that it takes to perform the operation tends to increase in a linear fashion. Put crudely, this means if you double the collection size, you might expect the operation to take something like the order of twice as long. Linear operations tend to be bad for performance because they mean that to perform some operation on a collection, you must potentially “visit” every item in the collection. Consider finding an item in a List<T>: if you want to see if the list has an item, you must potentially check every item in the list before you find it or determine it’s not found. Now, we could of course sort our list and then perform a binary search on it, but sorting is typically a linear-logarithmic complexity – O(n * log n) - and could involve temporary storage. So performing a sort after each add would probably add more time. As an alternative, we could use a SortedList<TKey, TValue> which sorts the list on every Add(), but this has a similar level of complexity to move the items and also requires a key and value, and in our case the key is the value. This is why sets tend to be the best choice for this type of processing: they don’t rely on separate keys and values for ordering – so they save space – and they typically don’t care about ordering – so they tend to be extremely performant. The .NET BCL (Base Class Library) has had the HashSet<T> since .NET 3.5, but at that time it did not implement the ISet<T> interface. As of .NET 4.0, HashSet<T> implements ISet<T> and a new set, the SortedSet<T> was added that gives you a set with ordering. HashSet<T> – For Unordered Storage of Sets When used right, HashSet<T> is a beautiful collection, you can think of it as a simplified Dictionary<T,T>. That is, a Dictionary where the TKey and TValue refer to the same object. This is really an oversimplification, but logically it makes sense. I’ve actually seen people code a Dictionary<T,T> where they store the same thing in the key and the value, and that’s just inefficient because of the extra storage to hold both the key and the value. As it’s name implies, the HashSet<T> uses a hashing algorithm to find the items in the set, which means it does take up some additional space, but it has lightning fast lookups! Compare the times below between HashSet<T> and List<T>: Operation HashSet<T> List<T> Add() O(1) O(1) at end O(n) in middle Remove() O(1) O(n) Contains() O(1) O(n) Now, these times are amortized and represent the typical case. In the very worst case, the operations could be linear if they involve a resizing of the collection – but this is true for both the List and HashSet so that’s a less of an issue when comparing the two. The key thing to note is that in the general case, HashSet is constant time for adds, removes, and contains! This means that no matter how large the collection is, it takes roughly the exact same amount of time to find an item or determine if it’s not in the collection. Compare this to the List where almost any add or remove must rearrange potentially all the elements! And to find an item in the list (if unsorted) you must search every item in the List. So as you can see, if you want to create an unordered collection and have very fast lookup and manipulation, the HashSet is a great collection. And since HashSet<T> implements ICollection<T> and IEnumerable<T>, it supports nearly all the same basic operations as the List<T> and can use the System.Linq extension methods as well. All we have to do to switch from a List<T> to a HashSet<T> is change our declaration. Since List and HashSet support many of the same members, chances are we won’t need to change much else. 1: public sealed class OrderProcessor 2: { 3: private readonly HashSet<string> _subscriptions = new HashSet<string>(); 4: 5: // ... 6: 7: public PlaceOrderResponse PlaceOrder(Order newOrder) 8: { 9: // do some validation, of course... 10: 11: // check to see if already subscribed, if not add a subscription 12: if (!_subscriptions.Contains(newOrder.Symbol)) 13: { 14: // add the symbol to the list 15: _subscriptions.Add(newOrder.Symbol); 16: 17: // do whatever magic is needed to start a subscription for the symbol 18: } 19: 20: // place the order logic! 21: } 22: 23: // ... 24: } 25: Notice, we didn’t change any code other than the declaration for _subscriptions to be a HashSet<T>. Thus, we can pick up the performance improvements in this case with minimal code changes. SortedSet<T> – Ordered Storage of Sets Just like HashSet<T> is logically similar to Dictionary<T,T>, the SortedSet<T> is logically similar to the SortedDictionary<T,T>. The SortedSet can be used when you want to do set operations on a collection, but you want to maintain that collection in sorted order. Now, this is not necessarily mathematically relevant, but if your collection needs do include order, this is the set to use. So the SortedSet seems to be implemented as a binary tree (possibly a red-black tree) internally. Since binary trees are dynamic structures and non-contiguous (unlike List and SortedList) this means that inserts and deletes do not involve rearranging elements, or changing the linking of the nodes. There is some overhead in keeping the nodes in order, but it is much smaller than a contiguous storage collection like a List<T>. Let’s compare the three: Operation HashSet<T> SortedSet<T> List<T> Add() O(1) O(log n) O(1) at end O(n) in middle Remove() O(1) O(log n) O(n) Contains() O(1) O(log n) O(n) The MSDN documentation seems to indicate that operations on SortedSet are O(1), but this seems to be inconsistent with its implementation and seems to be a documentation error. There’s actually a separate MSDN document (here) on SortedSet that indicates that it is, in fact, logarithmic in complexity. Let’s put it in layman’s terms: logarithmic means you can double the collection size and typically you only add a single extra “visit” to an item in the collection. Take that in contrast to List<T>’s linear operation where if you double the size of the collection you double the “visits” to items in the collection. This is very good performance! It’s still not as performant as HashSet<T> where it always just visits one item (amortized), but for the addition of sorting this is a good thing. Consider the following table, now this is just illustrative data of the relative complexities, but it’s enough to get the point: Collection Size O(1) Visits O(log n) Visits O(n) Visits 1 1 1 1 10 1 4 10 100 1 7 100 1000 1 10 1000 Notice that the logarithmic – O(log n) – visit count goes up very slowly compare to the linear – O(n) – visit count. This is because since the list is sorted, it can do one check in the middle of the list, determine which half of the collection the data is in, and discard the other half (binary search). So, if you need your set to be sorted, you can use the SortedSet<T> just like the HashSet<T> and gain sorting for a small performance hit, but it’s still faster than a List<T>. Unique Set Operations Now, if you do want to perform more set-like operations, both implementations of ISet<T> support the following, which play back towards the mathematical set operations described before: IntersectWith() – Performs the set intersection of two sets. Modifies the current set so that it only contains elements also in the second set. UnionWith() – Performs a set union of two sets. Modifies the current set so it contains all elements present both in the current set and the second set. ExceptWith() – Performs a set difference of two sets. Modifies the current set so that it removes all elements present in the second set. IsSupersetOf() – Checks if the current set is a superset of the second set. IsSubsetOf() – Checks if the current set is a subset of the second set. For more information on the set operations themselves, see the MSDN description of ISet<T> (here). What Sets Don’t Do Don’t get me wrong, sets are not silver bullets. You don’t really want to use a set when you want separate key to value lookups, that’s what the IDictionary implementations are best for. Also sets don’t store temporal add-order. That is, if you are adding items to the end of a list all the time, your list is ordered in terms of when items were added to it. This is something the sets don’t do naturally (though you could use a SortedSet with an IComparer with a DateTime but that’s overkill) but List<T> can. Also, List<T> allows indexing which is a blazingly fast way to iterate through items in the collection. Iterating over all the items in a List<T> is generally much, much faster than iterating over a set. Summary Sets are an excellent tool for maintaining a lookup table where the item is both the key and the value. In addition, if you have need for the mathematical set operations, the C# sets support those as well. The HashSet<T> is the set of choice if you want the fastest possible lookups but don’t care about order. In contrast the SortedSet<T> will give you a sorted collection at a slight reduction in performance. Technorati Tags: C#,.Net,Little Wonders,BlackRabbitCoder,ISet,HashSet,SortedSet

Read the article
std::basic_stringstream<unsigned char> won't compile with MSVC 10

- by Michael J

I'm trying to get UTF-8 chars to co-exist with ANSI 8-bit chars. My strategy has been to represent utf-8 chars as unsigned char so that appropriate overloads of functions can be used for the two character types. e.g. namespace MyStuff { typedef uchar utf8_t; typedef std::basic_string<utf8_t> U8string; } void SomeFunc(std::string &s); void SomeFunc(std::wstring &s); void SomeFunc(MyStuff::U8string &s); This all works pretty well until I try to use a stringstream. std::basic_ostringstream<MyStuff::utf8_t> ostr; ostr << 1; MSVC Visual C++ Express V10 won't compile this: c:\program files\microsoft visual studio 10.0\vc\include\xlocmon(213): warning C4273: 'id' : inconsistent dll linkage c:\program files\microsoft visual studio 10.0\vc\include\xlocnum(65) : see previous definition of 'public: static std::locale::id std::numpunct<unsigned char>::id' c:\program files\microsoft visual studio 10.0\vc\include\xlocnum(65) : while compiling class template static data member 'std::locale::id std::numpunct<_Elem>::id' with [ _Elem=Tk::utf8_t ] c:\program files\microsoft visual studio 10.0\vc\include\xlocnum(1149) : see reference to function template instantiation 'const _Facet &std::use_facet<std::numpunct<_Elem>>(const std::locale &)' being compiled with [ _Facet=std::numpunct<Tk::utf8_t>, _Elem=Tk::utf8_t ] c:\program files\microsoft visual studio 10.0\vc\include\xlocnum(1143) : while compiling class template member function 'std::ostreambuf_iterator<_Elem,_Traits> std::num_put<_Elem,_OutIt>:: do_put(_OutIt,std::ios_base &,_Elem,std::_Bool) const' with [ _Elem=Tk::utf8_t, _Traits=std::char_traits<Tk::utf8_t>, _OutIt=std::ostreambuf_iterator<Tk::utf8_t,std::char_traits<Tk::utf8_t>> ] c:\program files\microsoft visual studio 10.0\vc\include\ostream(295) : see reference to class template instantiation 'std::num_put<_Elem,_OutIt>' being compiled with [ _Elem=Tk::utf8_t, _OutIt=std::ostreambuf_iterator<Tk::utf8_t,std::char_traits<Tk::utf8_t>> ] c:\program files\microsoft visual studio 10.0\vc\include\ostream(281) : while compiling class template member function 'std::basic_ostream<_Elem,_Traits> & std::basic_ostream<_Elem,_Traits>::operator <<(int)' with [ _Elem=Tk::utf8_t, _Traits=std::char_traits<Tk::utf8_t> ] c:\program files\microsoft visual studio 10.0\vc\include\sstream(526) : see reference to class template instantiation 'std::basic_ostream<_Elem,_Traits>' being compiled with [ _Elem=Tk::utf8_t, _Traits=std::char_traits<Tk::utf8_t> ] c:\users\michael\dvl\tmp\console\console.cpp(23) : see reference to class template instantiation 'std::basic_ostringstream<_Elem,_Traits,_Alloc>' being compiled with [ _Elem=Tk::utf8_t, _Traits=std::char_traits<Tk::utf8_t>, _Alloc=std::allocator<uchar> ] . c:\program files\microsoft visual studio 10.0\vc\include\xlocmon(213): error C2491: 'std::numpunct<_Elem>::id' : definition of dllimport static data member not allowed with [ _Elem=Tk::utf8_t ] Any ideas? ** Edited 19 June 2012 ** OK, I've gotten closer to understanding this, but not how to solve it. As we all know, static class variables get defined twice: once in the class definition and once outside the class definition which establishes storage space. e.g. // in .h file class CFoo { // ... static int x; }; // in .cpp file int CFoo::x = 42; Now in the VC10 headers we get something like this: template<class _Elem> class numpunct : public locale::facet { // ... _CRTIMP2_PURE static locale::id id; // ... } When the header is included in an application, _CRTIMP2_PURE is defined as __declspec(dllimport), which means that the variable is imported from a dll. Now the header also contains the following template<class _Elem> locale::id numpunct<_Elem>::id; Note the absence of the __declspec(dllimport) qualifier. i.e. The class declaration says that the static linkage of the id variable is in the dll, but for the general case, it gets declared outside the dll. For the known cases, there are specialisations. template locale::id numpunct<char>::id; template locale::id numpunct<wchar_t>::id; These are protected by #ifs so that they are only included when building the DLL. They are excluded otherwise. i.e. the char and wchar_t versions of numpunct ARE inside the dll So we have the class definition saying that id's storage is in the DLL, but that is only true for the char and wchar_t specialisations, meaning that my unsigned char version is doomed. :-( The only way forward that I can think of is to create my own specialisation: basically copying it from the header file and fixing it. This raises many issues. Anybody have a better idea?

Read the article
Why Does .Hide()ing and .Show()ing Panels in wxPython Result in the Sizer Changing the Layout?

- by MetaHyperBolic

As referenced in my previous question, I am trying to make something slightly wizard-like in function. I have settled on a single frame with a sizer added to it. I build panels for each of the screens I would like users to see, add them to the frame's sizer, then switch between panels by .Hide()ing one panel, then calling a custom .ShowYourself() on the next panel. Obviously, I would like the buttons to remain in the same place as the user progresses through the process. I have linked together two panels in an infinite loop by their "Back" and "Next" buttons so you can see what is going on. The first panel looks great; tom10's code worked on that level, as it eschewed my initial, over-fancy attempt with borders flying every which way. And then the second panel seems to have shrunk down to the bare minimum. As we return to the first panel, the shrinkage has occurred here as well. Why does it look fine on the first panel, but not after I return there? Why is calling .Fit() necessary if I do not want a 10 pixel by 10 pixel wad of grey? And if it is necessary, why does .Fit() give inconsistent results? This infinite loop seems to characterize my experience with this: I fix the layout on a panel, only to find that switching ruins the layout for other panels. I fix that problem, by using sizer_h.Add(self.panel1, 0) instead of sizer_h.Add(self.panel1, 1, wx.EXPAND), and now my layouts are off again. So far, my "solution" is to add a mastersizer.SetMinSize((475, 592)) to each panel's master sizer (commented out in the code below). This is a cruddy solution because 1) I have had to find the numbers that work by trial and error (-5 pixels for the width, -28 pixels for the height). 2) I don't understand why the underlying issue still happens. What's the correct, non-ugly solution? Instead of adding all of the panels to the frame's sizer at once, should switching panels involve .Detach()ing that panel from the frame's sizer and then .Add()ing the next panel to the frame's sizer? Is there a .JustMakeThisFillThePanel() method hiding somewhere I have missed in both the wxWidgets and the wxPython documents online? I'm obviously missing something in my mental model of layout. Here's a TinyURL link, if I can't manage to embed the . Minimalist code pasted below. import wx import sys class My_App(wx.App): def OnInit(self): self.frame = My_Frame(None) self.frame.Show() self.SetTopWindow(self.frame) return True def OnExit(self): print 'Dying ...' class My_Frame(wx.Frame): def __init__(self, image, parent=None,id=-1, title='Generic Title', pos=wx.DefaultPosition, style=wx.CAPTION | wx.STAY_ON_TOP): size = (480, 620) wx.Frame.__init__(self, parent, id, 'Program Title', pos, size, style) sizer_h = wx.BoxSizer(wx.HORIZONTAL) self.panel0 = User_Interaction0(self) sizer_h.Add(self.panel0, 1, wx.EXPAND) self.panel1 = User_Interaction1(self) sizer_h.Add(self.panel1, 1, wx.EXPAND) self.SetSizer(sizer_h) self.panel0.ShowYourself() def ShutDown(self): self.Destroy() class User_Interaction0(wx.Panel): def __init__(self, parent, id=-1): wx.Panel.__init__(self, parent, id) # master sizer for the whole panel mastersizer = wx.BoxSizer(wx.VERTICAL) #mastersizer.SetMinSize((475, 592)) mastersizer.AddSpacer(15) # build the top row txtHeader = wx.StaticText(self, -1, 'Welcome to This Boring\nProgram', (0, 0)) font = wx.Font(16, wx.DEFAULT, wx.NORMAL, wx.BOLD) txtHeader.SetFont(font) txtOutOf = wx.StaticText(self, -1, '1 out of 7', (0, 0)) rowtopsizer = wx.BoxSizer(wx.HORIZONTAL) rowtopsizer.Add(txtHeader, 3, wx.ALIGN_LEFT) rowtopsizer.Add((0,0), 1) rowtopsizer.Add(txtOutOf, 0, wx.ALIGN_RIGHT) mastersizer.Add(rowtopsizer, 0, flag=wx.EXPAND | wx.LEFT | wx.RIGHT, border=15) # build the middle row text = 'PANEL 0\n\n' text = text + 'This could be a giant blob of explanatory text.\n' txtBasic = wx.StaticText(self, -1, text) font = wx.Font(11, wx.DEFAULT, wx.NORMAL, wx.NORMAL) txtBasic.SetFont(font) mastersizer.Add(txtBasic, 1, flag=wx.EXPAND | wx.LEFT | wx.RIGHT, border=15) # build the bottom row btnBack = wx.Button(self, -1, 'Back') self.Bind(wx.EVT_BUTTON, self.OnBack, id=btnBack.GetId()) btnNext = wx.Button(self, -1, 'Next') self.Bind(wx.EVT_BUTTON, self.OnNext, id=btnNext.GetId()) btnCancelExit = wx.Button(self, -1, 'Cancel and Exit') self.Bind(wx.EVT_BUTTON, self.OnCancelAndExit, id=btnCancelExit.GetId()) rowbottomsizer = wx.BoxSizer(wx.HORIZONTAL) rowbottomsizer.Add(btnBack, 0, wx.ALIGN_LEFT) rowbottomsizer.AddSpacer(5) rowbottomsizer.Add(btnNext, 0) rowbottomsizer.AddSpacer(5) rowbottomsizer.AddStretchSpacer(1) rowbottomsizer.Add(btnCancelExit, 0, wx.ALIGN_RIGHT) mastersizer.Add(rowbottomsizer, flag=wx.EXPAND | wx.LEFT | wx.RIGHT, border=15) # finish master sizer mastersizer.AddSpacer(15) self.SetSizer(mastersizer) self.Raise() self.SetPosition((0,0)) self.Fit() self.Hide() def ShowYourself(self): self.Raise() self.SetPosition((0,0)) self.Fit() self.Show() def OnBack(self, event): self.Hide() self.GetParent().panel1.ShowYourself() def OnNext(self, event): self.Hide() self.GetParent().panel1.ShowYourself() def OnCancelAndExit(self, event): self.GetParent().ShutDown() class User_Interaction1(wx.Panel): def __init__(self, parent, id=-1): wx.Panel.__init__(self, parent, id) # master sizer for the whole panel mastersizer = wx.BoxSizer(wx.VERTICAL) #mastersizer.SetMinSize((475, 592)) mastersizer.AddSpacer(15) # build the top row txtHeader = wx.StaticText(self, -1, 'Read about This Boring\nProgram', (0, 0)) font = wx.Font(16, wx.DEFAULT, wx.NORMAL, wx.BOLD) txtHeader.SetFont(font) txtOutOf = wx.StaticText(self, -1, '2 out of 7', (0, 0)) rowtopsizer = wx.BoxSizer(wx.HORIZONTAL) rowtopsizer.Add(txtHeader, 3, wx.ALIGN_LEFT) rowtopsizer.Add((0,0), 1) rowtopsizer.Add(txtOutOf, 0, wx.ALIGN_RIGHT) mastersizer.Add(rowtopsizer, 0, flag=wx.EXPAND | wx.LEFT | wx.RIGHT, border=15) # build the middle row text = 'PANEL 1\n\n' text = text + 'This could be a giant blob of boring text.\n' txtBasic = wx.StaticText(self, -1, text) font = wx.Font(11, wx.DEFAULT, wx.NORMAL, wx.NORMAL) txtBasic.SetFont(font) mastersizer.Add(txtBasic, 1, flag=wx.EXPAND | wx.LEFT | wx.RIGHT, border=15) # build the bottom row btnBack = wx.Button(self, -1, 'Back') self.Bind(wx.EVT_BUTTON, self.OnBack, id=btnBack.GetId()) btnNext = wx.Button(self, -1, 'Next') self.Bind(wx.EVT_BUTTON, self.OnNext, id=btnNext.GetId()) btnCancelExit = wx.Button(self, -1, 'Cancel and Exit') self.Bind(wx.EVT_BUTTON, self.OnCancelAndExit, id=btnCancelExit.GetId()) rowbottomsizer = wx.BoxSizer(wx.HORIZONTAL) rowbottomsizer.Add(btnBack, 0, wx.ALIGN_LEFT) rowbottomsizer.AddSpacer(5) rowbottomsizer.Add(btnNext, 0) rowbottomsizer.AddSpacer(5) rowbottomsizer.AddStretchSpacer(1) rowbottomsizer.Add(btnCancelExit, 0, wx.ALIGN_RIGHT) mastersizer.Add(rowbottomsizer, flag=wx.EXPAND | wx.LEFT | wx.RIGHT, border=15) # finish master sizer mastersizer.AddSpacer(15) self.SetSizer(mastersizer) self.Raise() self.SetPosition((0,0)) self.Fit() self.Hide() def ShowYourself(self): self.Raise() self.SetPosition((0,0)) self.Fit() self.Show() def OnBack(self, event): self.Hide() self.GetParent().panel0.ShowYourself() def OnNext(self, event): self.Hide() self.GetParent().panel0.ShowYourself() def OnCancelAndExit(self, event): self.GetParent().ShutDown() def main(): app = My_App(redirect = False) app.MainLoop() if __name__ == '__main__': main()

Read the article
Incorrectly formatted html inconsistencies between DOM and what's displayed in firefox plugin

- by deadalnix

I'm currently developing a firefox plugin. This plugin has to handle very crappy website that is really incorrectly formatted. I cannot modify these websites, so I have to handle them. I reduced the bug I'm facing to a short sample of html (if this appellation is appropriate for an horror like this) : <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html> <head> <title>Some title.</title>  <div style="visability:hidden;"> <a href="//example.com"> </a> </div>  <meta name="description" content="Homepage of Company.com, Company's corporate Web site" /> <meta name="keywords" content="Company, Company & Co., Inc., blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla" /> <meta http-equiv="Content-Language" content="en-US" /> <meta http-equiv="content-type" content="text/html; charset=utf-8"/> </head> <body class="homePage"> <div class="globalWrapper"><a href="/page.html">My gorgeous link !</a></div> </body> </html> When opening the webpage, « My gorgeous link ! » if displayed and clickable. However, when I'm exploring the DOM with Javascript into my plugin, everything behaves (DOM exploration and innerHTML property) like the code was this one : <html> <head> <title>Some title.</title>  </head><body><div style="visability:hidden;"> <a href="//example.com"> </a> </div>  <meta name="description" content="Homepage of Company.com, Company's corporate Web site"> <meta name="keywords" content="Company, Company & Co., Inc., blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla, blablabla"> <meta http-equiv="Content-Language" content="en-US"> </body> </html> So, when exploring the DOM within the plugin, the document is somehow fixed by firefox. But this fixed DOM is inconsistent with what is in the webpage. Thus, my plugin doesn't behave as expected. I'm really puzzled with that issue. The problem exists in both firefox 3.6 and firefox 4 (didn't tested firefox 5 yet). For example, reducing the meta, will fix the issue. Where does this discrepancy come from ? How can I handle it ? EDIT: With the answer I get, I think I should be a little more precise. I do know what firefow is doing when modifying the webpage in the second code snippet. The problem is the following one : « In the fixed DOM that I get into my plugin, the gorgeous link doesn't appear anywhere, but this link is actually visible on the webpage, and works. So the DOM I'm manipulating, and the DOM in the webpage are different - they are fixed in a different manner. » . So where does the difference come in the fixing behaviour, and how can I handle that, or, in other terms, how can I be aware, in my plugin, of the existance of the gorgeous link ?

Read the article
EF4 generates invalid script

- by Jaxidian

When I right-click in a .EDMX file and click Generate Database From Model, the resulting script is obviously wrong because of the table names. What it generates is the following script. Note the table names in the DROP TABLE part versus the CREATE TABLE part. Why is this inconsistent? This is obviously not a reusable script. What I created was an Entity named "Address" and an Entity named "Company", etc (all singular). The EntitySet names are pluralized. The "Pluralize New Objects" boolean does not change this either. So what's the deal? For what it's worth, I originally generated the EDMX by pointing it to a database that had tables with non-pluralized names and now that I've made some changes, I want to go back the other way. I'd like to have the option to go back and forth as neither the db-first nor the model-first model is ideal in all scenarios, and I have the control to ensure that there will be no merging issues from multiple people going both ways at the same time. -- -------------------------------------------------- -- Dropping existing FOREIGN KEY constraints -- NOTE: if the constraint does not exist, an ignorable error will be reported. -- -------------------------------------------------- ALTER TABLE [Address] DROP CONSTRAINT [FK_Address_StateID-State_ID]; GO ALTER TABLE [Company] DROP CONSTRAINT [FK_Company_AddressID-Address_ID]; GO ALTER TABLE [Employee] DROP CONSTRAINT [FK_Employee_BossEmployeeID-Employee_ID]; GO ALTER TABLE [Employee] DROP CONSTRAINT [FK_Employee_CompanyID-Company_ID]; GO ALTER TABLE [Employee] DROP CONSTRAINT [FK_Employee_PersonID-Person_ID]; GO ALTER TABLE [Person] DROP CONSTRAINT [FK_Person_AddressID-Address_ID]; GO -- -------------------------------------------------- -- Dropping existing tables -- NOTE: if the table does not exist, an ignorable error will be reported. -- -------------------------------------------------- DROP TABLE [Address]; GO DROP TABLE [Company]; GO DROP TABLE [Employee]; GO DROP TABLE [Person]; GO DROP TABLE [State]; GO -- -------------------------------------------------- -- Creating all tables -- -------------------------------------------------- -- Creating table 'Addresses' CREATE TABLE [Addresses] ( [ID] int IDENTITY(1,1) NOT NULL, [StreetAddress] nvarchar(100) NOT NULL, [City] nvarchar(100) NOT NULL, [StateID] int NOT NULL, [Zip] nvarchar(10) NOT NULL ); GO -- Creating table 'Companies' CREATE TABLE [Companies] ( [ID] int IDENTITY(1,1) NOT NULL, [Name] nvarchar(100) NOT NULL, [AddressID] int NOT NULL ); GO -- Creating table 'People' CREATE TABLE [People] ( [ID] int IDENTITY(1,1) NOT NULL, [FirstName] nvarchar(100) NOT NULL, [LastName] nvarchar(100) NOT NULL, [AddressID] int NOT NULL ); GO -- Creating table 'States' CREATE TABLE [States] ( [ID] int IDENTITY(1,1) NOT NULL, [Name] nvarchar(100) NOT NULL, [Abbreviation] nvarchar(2) NOT NULL ); GO -- Creating table 'Employees' CREATE TABLE [Employees] ( [ID] int IDENTITY(1,1) NOT NULL, [PersonID] int NOT NULL, [CompanyID] int NOT NULL, [Position] nvarchar(100) NOT NULL, [BossEmployeeID] int NULL ); GO -- -------------------------------------------------- -- Creating all PRIMARY KEY constraints -- -------------------------------------------------- -- Creating primary key on [ID] in table 'Addresses' ALTER TABLE [Addresses] ADD CONSTRAINT [PK_Addresses] PRIMARY KEY ([ID] ); GO -- Creating primary key on [ID] in table 'Companies' ALTER TABLE [Companies] ADD CONSTRAINT [PK_Companies] PRIMARY KEY ([ID] ); GO -- Creating primary key on [ID] in table 'People' ALTER TABLE [People] ADD CONSTRAINT [PK_People] PRIMARY KEY ([ID] ); GO -- Creating primary key on [ID] in table 'States' ALTER TABLE [States] ADD CONSTRAINT [PK_States] PRIMARY KEY ([ID] ); GO -- Creating primary key on [ID] in table 'Employees' ALTER TABLE [Employees] ADD CONSTRAINT [PK_Employees] PRIMARY KEY ([ID] ); GO -- -------------------------------------------------- -- Creating all FOREIGN KEY constraints -- -------------------------------------------------- -- Creating foreign key on [StateID] in table 'Addresses' ALTER TABLE [Addresses] ADD CONSTRAINT [FK_Address_StateID_State_ID] FOREIGN KEY ([StateID]) REFERENCES [States] ([ID]) ON DELETE NO ACTION ON UPDATE NO ACTION; -- Creating non-clustered index for FOREIGN KEY 'FK_Address_StateID_State_ID' CREATE INDEX [IX_FK_Address_StateID_State_ID] ON [Addresses] ([StateID]); GO -- Creating foreign key on [AddressID] in table 'Companies' ALTER TABLE [Companies] ADD CONSTRAINT [FK_Company_AddressID_Address_ID] FOREIGN KEY ([AddressID]) REFERENCES [Addresses] ([ID]) ON DELETE NO ACTION ON UPDATE NO ACTION; -- Creating non-clustered index for FOREIGN KEY 'FK_Company_AddressID_Address_ID' CREATE INDEX [IX_FK_Company_AddressID_Address_ID] ON [Companies] ([AddressID]); GO -- Creating foreign key on [AddressID] in table 'People' ALTER TABLE [People] ADD CONSTRAINT [FK_Person_AddressID_Address_ID] FOREIGN KEY ([AddressID]) REFERENCES [Addresses] ([ID]) ON DELETE NO ACTION ON UPDATE NO ACTION; -- Creating non-clustered index for FOREIGN KEY 'FK_Person_AddressID_Address_ID' CREATE INDEX [IX_FK_Person_AddressID_Address_ID] ON [People] ([AddressID]); GO -- Creating foreign key on [CompanyID] in table 'Employees' ALTER TABLE [Employees] ADD CONSTRAINT [FK_Employee_CompanyID_Company_ID] FOREIGN KEY ([CompanyID]) REFERENCES [Companies] ([ID]) ON DELETE NO ACTION ON UPDATE NO ACTION; -- Creating non-clustered index for FOREIGN KEY 'FK_Employee_CompanyID_Company_ID' CREATE INDEX [IX_FK_Employee_CompanyID_Company_ID] ON [Employees] ([CompanyID]); GO -- Creating foreign key on [BossEmployeeID] in table 'Employees' ALTER TABLE [Employees] ADD CONSTRAINT [FK_Employee_BossEmployeeID_Employee_ID] FOREIGN KEY ([BossEmployeeID]) REFERENCES [Employees] ([ID]) ON DELETE NO ACTION ON UPDATE NO ACTION; -- Creating non-clustered index for FOREIGN KEY 'FK_Employee_BossEmployeeID_Employee_ID' CREATE INDEX [IX_FK_Employee_BossEmployeeID_Employee_ID] ON [Employees] ([BossEmployeeID]); GO -- Creating foreign key on [PersonID] in table 'Employees' ALTER TABLE [Employees] ADD CONSTRAINT [FK_Employee_PersonID_Person_ID] FOREIGN KEY ([PersonID]) REFERENCES [People] ([ID]) ON DELETE NO ACTION ON UPDATE NO ACTION; -- Creating non-clustered index for FOREIGN KEY 'FK_Employee_PersonID_Person_ID' CREATE INDEX [IX_FK_Employee_PersonID_Person_ID] ON [Employees] ([PersonID]); GO -- -------------------------------------------------- -- Script has ended -- --------------------------------------------------

Read the article
PyML 0.7.2 - How to prevent accuracy from dropping after storing/loading a classifier?

- by Michael Aaron Safyan

This is a followup from "Save PyML.classifiers.multi.OneAgainstRest(SVM()) object?". The solution to that question was close, but not quite right, (the SparseDataSet is broken, so attempting to save/load with that dataset container type will fail, no matter what. Also, PyML is inconsistent in terms of whether labels should be numbers or strings... it turns out that the oneAgainstRest function is actually not good enough, because the labels need to be strings and simultaneously convertible to floats, because there are places where it is assumed to be a string and elsewhere converted to float) and so after a great deal of hacking and such I was finally able to figure out a way to save and load my multi-class classifier without it blowing up with an error.... however, although it is no longer giving me an error message, it is still not quite right as the accuracy of the classifier drops significantly when it is saved and then reloaded (so I'm still missing a piece of the puzzle). I am currently using the following custom mutli-class classifier for training, saving, and loading: class SVM(object): def __init__(self,features_or_filename,labels=None,kernel=None): if isinstance(features_or_filename,str): filename=features_or_filename; if labels!=None: raise ValueError,"Labels must be None if loading from a file."; with open(os.path.join(filename,"uniquelabels.list"),"rb") as uniquelabelsfile: self.uniquelabels=sorted(list(set(pickle.load(uniquelabelsfile)))); self.labeltoindex={}; for idx,label in enumerate(self.uniquelabels): self.labeltoindex[label]=idx; self.classifiers=[]; for classidx, classname in enumerate(self.uniquelabels): self.classifiers.append(PyML.classifiers.svm.loadSVM(os.path.join(filename,str(classname)+".pyml.svm"),datasetClass = PyML.VectorDataSet)); else: features=features_or_filename; if labels==None: raise ValueError,"Labels must not be None when training."; self.uniquelabels=sorted(list(set(labels))); self.labeltoindex={}; for idx,label in enumerate(self.uniquelabels): self.labeltoindex[label]=idx; points = [[float(xij) for xij in xi] for xi in features]; self.classifiers=[PyML.SVM(kernel) for label in self.uniquelabels]; for i in xrange(len(self.uniquelabels)): currentlabel=self.uniquelabels[i]; currentlabels=['+1' if k==currentlabel else '-1' for k in labels]; currentdataset=PyML.VectorDataSet(points,L=currentlabels,positiveClass='+1'); self.classifiers[i].train(currentdataset,saveSpace=False); def accuracy(self,pts,labels): logger=logging.getLogger("ml"); correct=0; total=0; classindexes=[self.labeltoindex[label] for label in labels]; h=self.hypotheses(pts); for idx in xrange(len(pts)): if h[idx]==classindexes[idx]: logger.info("RIGHT: Actual \"%s\" == Predicted \"%s\"" %(self.uniquelabels[ classindexes[idx] ], self.uniquelabels[ h[idx] ])); correct+=1; else: logger.info("WRONG: Actual \"%s\" != Predicted \"%s\"" %(self.uniquelabels[ classindexes[idx] ], self.uniquelabels[ h[idx] ])) total+=1; return float(correct)/float(total); def prediction(self,pt): h=self.hypothesis(pt); if h!=None: return self.uniquelabels[h]; return h; def predictions(self,pts): h=self.hypotheses(self,pts); return [self.uniquelabels[x] if x!=None else None for x in h]; def hypothesis(self,pt): bestvalue=None; bestclass=None; dataset=PyML.VectorDataSet([pt]); for classidx, classifier in enumerate(self.classifiers): val=classifier.decisionFunc(dataset,0); if (bestvalue==None) or (val>bestvalue): bestvalue=val; bestclass=classidx; return bestclass; def hypotheses(self,pts): bestvalues=[None for pt in pts]; bestclasses=[None for pt in pts]; dataset=PyML.VectorDataSet(pts); for classidx, classifier in enumerate(self.classifiers): for ptidx in xrange(len(pts)): val=classifier.decisionFunc(dataset,ptidx); if (bestvalues[ptidx]==None) or (val>bestvalues[ptidx]): bestvalues[ptidx]=val; bestclasses[ptidx]=classidx; return bestclasses; def save(self,filename): if not os.path.exists(filename): os.makedirs(filename); with open(os.path.join(filename,"uniquelabels.list"),"wb") as uniquelabelsfile: pickle.dump(self.uniquelabels,uniquelabelsfile,pickle.HIGHEST_PROTOCOL); for classidx, classname in enumerate(self.uniquelabels): self.classifiers[classidx].save(os.path.join(filename,str(classname)+".pyml.svm")); I am using the latest version of PyML (0.7.2, although PyML.__version__ is 0.7.0). When I construct the classifier with a training dataset, the reported accuracy is ~0.87. When I then save it and reload it, the accuracy is less than 0.001. So, there is something here that I am clearly not persisting correctly, although what that may be is completely non-obvious to me. Would you happen to know what that is?

Read the article
PyML 0.7.2 - How to prevent accuracy from dropping after stroing/loading a classifier?

- by Michael Aaron Safyan

This is a followup from "Save PyML.classifiers.multi.OneAgainstRest(SVM()) object?". The solution to that question was close, but not quite right, (the SparseDataSet is broken, so attempting to save/load with that dataset container type will fail, no matter what. Also, PyML is inconsistent in terms of whether labels should be numbers or strings... it turns out that the oneAgainstRest function is actually not good enough, because the labels need to be strings and simultaneously convertible to floats, because there are places where it is assumed to be a string and elsewhere converted to float) and so after a great deal of hacking and such I was finally able to figure out a way to save and load my multi-class classifier without it blowing up with an error.... however, although it is no longer giving me an error message, it is still not quite right as the accuracy of the classifier drops significantly when it is saved and then reloaded (so I'm still missing a piece of the puzzle). I am currently using the following custom mutli-class classifier for training, saving, and loading: class SVM(object): def __init__(self,features_or_filename,labels=None,kernel=None): if isinstance(features_or_filename,str): filename=features_or_filename; if labels!=None: raise ValueError,"Labels must be None if loading from a file."; with open(os.path.join(filename,"uniquelabels.list"),"rb") as uniquelabelsfile: self.uniquelabels=sorted(list(set(pickle.load(uniquelabelsfile)))); self.labeltoindex={}; for idx,label in enumerate(self.uniquelabels): self.labeltoindex[label]=idx; self.classifiers=[]; for classidx, classname in enumerate(self.uniquelabels): self.classifiers.append(PyML.classifiers.svm.loadSVM(os.path.join(filename,str(classname)+".pyml.svm"),datasetClass = PyML.VectorDataSet)); else: features=features_or_filename; if labels==None: raise ValueError,"Labels must not be None when training."; self.uniquelabels=sorted(list(set(labels))); self.labeltoindex={}; for idx,label in enumerate(self.uniquelabels): self.labeltoindex[label]=idx; points = [[float(xij) for xij in xi] for xi in features]; self.classifiers=[PyML.SVM(kernel) for label in self.uniquelabels]; for i in xrange(len(self.uniquelabels)): currentlabel=self.uniquelabels[i]; currentlabels=['+1' if k==currentlabel else '-1' for k in labels]; currentdataset=PyML.VectorDataSet(points,L=currentlabels,positiveClass='+1'); self.classifiers[i].train(currentdataset,saveSpace=False); def accuracy(self,pts,labels): logger=logging.getLogger("ml"); correct=0; total=0; classindexes=[self.labeltoindex[label] for label in labels]; h=self.hypotheses(pts); for idx in xrange(len(pts)): if h[idx]==classindexes[idx]: logger.info("RIGHT: Actual \"%s\" == Predicted \"%s\"" %(self.uniquelabels[ classindexes[idx] ], self.uniquelabels[ h[idx] ])); correct+=1; else: logger.info("WRONG: Actual \"%s\" != Predicted \"%s\"" %(self.uniquelabels[ classindexes[idx] ], self.uniquelabels[ h[idx] ])) total+=1; return float(correct)/float(total); def prediction(self,pt): h=self.hypothesis(pt); if h!=None: return self.uniquelabels[h]; return h; def predictions(self,pts): h=self.hypotheses(self,pts); return [self.uniquelabels[x] if x!=None else None for x in h]; def hypothesis(self,pt): bestvalue=None; bestclass=None; dataset=PyML.VectorDataSet([pt]); for classidx, classifier in enumerate(self.classifiers): val=classifier.decisionFunc(dataset,0); if (bestvalue==None) or (val>bestvalue): bestvalue=val; bestclass=classidx; return bestclass; def hypotheses(self,pts): bestvalues=[None for pt in pts]; bestclasses=[None for pt in pts]; dataset=PyML.VectorDataSet(pts); for classidx, classifier in enumerate(self.classifiers): for ptidx in xrange(len(pts)): val=classifier.decisionFunc(dataset,ptidx); if (bestvalues[ptidx]==None) or (val>bestvalues[ptidx]): bestvalues[ptidx]=val; bestclasses[ptidx]=classidx; return bestclasses; def save(self,filename): if not os.path.exists(filename): os.makedirs(filename); with open(os.path.join(filename,"uniquelabels.list"),"wb") as uniquelabelsfile: pickle.dump(self.uniquelabels,uniquelabelsfile,pickle.HIGHEST_PROTOCOL); for classidx, classname in enumerate(self.uniquelabels): self.classifiers[classidx].save(os.path.join(filename,str(classname)+".pyml.svm")); I am using the latest version of PyML (0.7.2, although PyML.__version__ is 0.7.0). When I construct the classifier with a training dataset, the reported accuracy is ~0.87. When I then save it and reload it, the accuracy is less than 0.001. So, there is something here that I am clearly not persisting correctly, although what that may be is completely non-obvious to me. Would you happen to know what that is?

Read the article
Error 0x800f0922 installing .NET 3.5 on Windows 8

- by Benjamin Nolan

I'm trying to install .NET 3.5 on my Windows 8 box and it keeps throwing Error 0x800f0922 at me. From what I've read on answers.microsoft.com and StackOverflow I gather the easiest way to fix this is to perform a system refresh, however this will remove all software I've installed from discs. I've just moved house, so I'd rather not do that as I don't know where all the installation media actually are for a lot of my software, so if possible I'd prefer to track down where the problem is actually occurring. (Also, I have a LOT of software installed. It'd take me a long time to reinstall it all, and I unfortunately haven't got that time.) The on-demand error screen sends me to KB2734782 (can't link it as I'm <10 rep), which doesn't help much. When I run this DISM line from the StackOverflow post: Dism.exe /online /enable-feature /featurename:NetFX3 /All /Source:C:\Windows\WinSxS /LimitAccess I get the following output on the terminal: Microsoft Windows [Version 6.2.9200] (c) 2012 Microsoft Corporation. All rights reserved. C:\Windows\system32>Dism.exe /online /enable-feature /featurename:NetFX3 /All /Source:C:\Windows\WinSxS /LimitAccess Deployment Image Servicing and Management tool Version: 6.2.9200.16384 Image Version: 6.2.9200.16384 Enabling feature(s) [==========================100.0%==========================] Error: 0x800f0922 DISM failed. No operation was performed. For more information, review the log file. The DISM log file can be found at C:\Windows\Logs\DISM\dism.log C:\Windows\system32> Incidentally, it jumps straight from 0 to 100% and then sits on that line for about 5 minutes before the error line occurs. dism.log contains the following lines around that time: (Link to full logs is at bottom of post) 2013-07-02 00:56:58, Info DISM DISM.EXE: Succesfully registered commands for the provider: Edition Manager. 2013-07-02 00:56:58, Info DISM DISM Provider Store: PID=5768 TID=5780 Getting Provider DISM Package Manager - CDISMProviderStore::GetProvider 2013-07-02 00:56:58, Info DISM DISM Provider Store: PID=5768 TID=5780 Provider has previously been initialized. Returning the existing instance. - CDISMProviderStore::Internal_GetProvider 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Processing the top level command token(enable-feature). - CPackageManagerCLIHandler::Private_ValidateCmdLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Attempting to route to appropriate command handler. - CPackageManagerCLIHandler::ExecuteCmdLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Routing the command... - CPackageManagerCLIHandler::ExecuteCmdLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered the option "featurename" with value "NetFX3" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered an unknown option "featurename" with value "NetFX3" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered the option "source" with value "C:\Windows\WinSxS" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:58, Info DISM DISM Package Manager: PID=5768 TID=5780 Encountered an unknown option "source" with value "C:\Windows\WinSxS" - CPackageManagerCLIHandler::Private_GetPackagesFromCommandLine 2013-07-02 00:56:59, Info DISM DISM Package Manager: PID=5768 TID=5780 Initiating Changes on Package with values: 5, 7 - CDISMPackage::Internal_ChangePackageState 2013-07-02 00:56:59, Info DISM DISM Package Manager: PID=5768 TID=5780 CBS session options=0x20100! - CDISMPackageManager::Internal_Finalize 2013-07-02 01:00:27, Info DISM DISM Package Manager: PID=5768 TID=2420 Error in operation: (null) (CBS HRESULT=0x800f0922) - CCbsConUIHandler::Error 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed finalizing changes. - CDISMPackageManager::Internal_Finalize(hr:0x800f0922) 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed processing package changes with session options - CDISMPackageManager::ProcessChangesWithOptions(hr:0x800f0922) 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed ProcessChanges. - CPackageManagerCLIHandler::Private_ProcessFeatureChange(hr:0x800f0922) 2013-07-02 01:00:27, Error DISM DISM Package Manager: PID=5768 TID=5780 Failed while processing command enable-feature. - CPackageManagerCLIHandler::ExecuteCmdLine(hr:0x800f0922) 2013-07-02 01:00:27, Info DISM DISM Package Manager: PID=5768 TID=5780 Further logs for online package and feature related operations can be found at %WINDIR%\logs\CBS\cbs.log - CPackageManagerCLIHandler::ExecuteCmdLine 2013-07-02 01:00:27, Error DISM DISM.EXE: DISM Package Manager processed the command line but failed. HRESULT=800F0922 cbs.log has the following chunks around then which could be relevant: 2013-07-02 00:55:06, Info CBS Exec: This is a PSF Package. Job has been saved and we are returning to client. 2013-07-02 00:55:06, Info CSI 0000042d@2013/7/1:23:55:06.203 CSI Transaction @0xe2f5e59500 destroyed 2013-07-02 00:55:06, Info CBS Exec: DPX job state saved for one or more packages, aborting the staging and install of execution. 2013-07-02 00:55:06, Info CSI 0000042e@2013/7/1:23:55:06.207 CSI Transaction @0xe2f5e58480 destroyed 2013-07-02 00:55:06, Info CBS Perf: Stage chain complete. 2013-07-02 00:55:06, Info CBS Failed to stage execution chain. [HRESULT = 0x800f0816 - CBS_E_DPX_JOB_STATE_SAVED] 2013-07-02 00:55:06, Info CBS Failed to process single phase execution. [HRESULT = 0x800f0816 - CBS_E_DPX_JOB_STATE_SAVED] 2013-07-02 00:55:06, Info CBS WER: Failure is not worth reporting [HRESULT = 0x800f0816 - CBS_E_DPX_JOB_STATE_SAVED] 2013-07-02 00:55:06, Info CBS Reboot mark cleared and further down: 2013-07-02 00:59:19, Info CSI 000004e6 Begin executing advanced installer phase 38 (0x00000026) index 253 (0x00000000000000fd) (sequence 289) Old component: [l:0]"" New component: [ml:306{153},l:304{152}]"NetFx35CDF-CDF_GenericCommands, Culture=neutral, Version=6.2.9200.16384, PublicKeyToken=31bf3856ad364e35, ProcessorArchitecture=x86, versionScope=NonSxS" Install mode: install Installer ID: {81a34a10-4256-436a-89d6-794b97ca407c} Installer name: [15]"Generic Command" 2013-07-02 00:59:19, Info CSI 000004e7 Performing 1 operations; 1 are not lock/unlock and follow: (0) LockComponentPath (10): flags: 0 comp: {l:16 b:19fc6600b776ce01c91f0000fc07a816} pathid: {l:16 b:19fc6600b776ce01ca1f0000fc07a816} path: [l:214{107}]"\SystemRoot\WinSxS\x86_netfx35cdf-cdf_genericcommands_31bf3856ad364e35_6.2.9200.16384_none_0cec490be12fb858" pid: 7fc starttime: 130171962799582915 (0x01ce76b5e2626ec3) 2013-07-02 00:59:19, Info CSI 000004e8 Performing 1 operations; 1 are not lock/unlock and follow: (0) LockComponentPath (10): flags: 0 comp: {l:16 b:27236700b776ce01cb1f0000fc07a816} pathid: {l:16 b:27236700b776ce01cc1f0000fc07a816} path: [l:210{105}]"\SystemRoot\WinSxS\x86_netfx35cdf-csd_cdf_installer_31bf3856ad364e35_6.2.9200.16384_none_55072425fd5c3716" pid: 7fc starttime: 130171962799582915 (0x01ce76b5e2626ec3) 2013-07-02 00:59:19, Info CSI 000004e9 Calling generic command executable (sequence 1): [122]"C:\Windows\WinSxS\x86_netfx35cdf-csd_cdf_installer_31bf3856ad364e35_6.2.9200.16384_none_55072425fd5c3716\WFServicesReg.exe" CmdLine: [139]""C:\Windows\WinSxS\x86_netfx35cdf-csd_cdf_installer_31bf3856ad364e35_6.2.9200.16384_none_55072425fd5c3716\WFServicesReg.exe" /c /b /v /m /i" 2013-07-02 00:59:20, Info CSI 000004ea Performing 1 operations; 1 are not lock/unlock and follow: (0) LockComponentPath (10): flags: 0 comp: {l:16 b:bd790401b776ce01cd1f0000fc07a816} pathid: {l:16 b:bd790401b776ce01ce1f0000fc07a816} path: [l:234{117}]"\SystemRoot\WinSxS\x86_microsoft.windows.s..ation.badcomponents_31bf3856ad364e35_6.2.9200.16384_none_353ccb4c94858655" pid: 7fc starttime: 130171962799582915 (0x01ce76b5e2626ec3) 2013-07-02 00:59:20, Info CSI 000004eb Creating NT transaction (seq 27), objectname [6]"(null)" 2013-07-02 00:59:20, Info CSI 000004ec Created NT transaction (seq 27) result 0x00000000, handle @0x24b8 2013-07-02 00:59:20, Info CSI 000004ed@2013/7/1:23:59:20.933 Beginning NT transaction commit... 2013-07-02 00:59:22, Info CSI 000004ee@2013/7/1:23:59:22.065 CSI perf trace: CSIPERF:TXCOMMIT;1387723 2013-07-02 00:59:22, Error CSI 000004ef (F) Done with generic command 1; CreateProcess returned 0, CPAW returned S_OK Process exit code 255 (0x000000ff) resulted in success? FALSE Process output: [l:28479 [4096]"DDSet_Entry: WFServicesReg.exe DDSet_Status: CFxInstaller::CopyConfigFilesToTemp is64bit=0 DDSet_Status: CFileHelper::CopyConfigFilesToTempLocation DDSet_Status: CFxInstaller::SetupBaseComponents isInstall=1 DDSet_Status: CFxInstaller::SetupBaseComponents Calling SetupExtensions. isInstall=1 (0x000000FF -- The extended attributes are inconsistent. ??) And a bit further down: 2013-07-02 00:59:22, Error [0x018007] CSI 000004f0 (F) Failed execution of queue item Installer: Generic Command ({81a34a10-4256-436a-89d6-794b97ca407c}) with HRESULT HRESULT_FROM_WIN32(14109). Failure will not be ignored: A rollback will be initiated after all the operations in the installer queue are completed; installer is reliable (2)[gle=0x80004005] [...snip...] 2013-07-02 00:59:22, Info CBS Not able to add pending.xml.bad to Windows Error Report. [HRESULT = 0x80070002 - ERROR_FILE_NOT_FOUND] 2013-07-02 00:59:28, Info CSI 000004f1@2013/7/1:23:59:28.467 CSI Advanced installer perf trace: CSIPERF:AIDONE;{81a34a10-4256-436a-89d6-794b97ca407c};NetFx35CDF-CDF_GenericCommands, Version = 6.2.9200.16384, pA = PROCESSOR_ARCHITECTURE_INTEL (0), Culture neutral, VersionScope = 1 nonSxS, PublicKeyToken = {l:8 b:31bf3856ad364e35}, Type neutral, TypeName neutral, PublicKey neutral;10609242us 2013-07-02 00:59:28, Info CSI 000004f2 End executing advanced installer (sequence 289) Completion status: HRESULT_FROM_WIN32(ERROR_ADVANCED_INSTALLER_FAILED) [...snip...] 2013-07-02 01:00:26, Info CBS Exec: Cancelled pending transactions after rollback. [HRESULT = 0x00000000 - S_OK] 2013-07-02 01:00:26, Error CBS Exec: An error occurred while committing the transaction, the transaction could not be rolled back. [HRESULT = 0x800f0922 - CBS_E_INSTALLERS_FAILED] The full DISM and CBS logs are at http://ben.mu/files/dotnet35_dism_cbs.zip as the CBS log is nearly 167MB uncompressed. o.o dism.log gives the timeframe of where its errors occur--00:56:20ish to 01:00:22. Does anyone have any ideas what's actually causing the installation to fail, and if so how I can fix it? Please don't just say "Refresh the OS". :)

Read the article
Network Logon Issues with Group Policy and Network

- by bobloki

I am gravely in need of your help and assistance. We have a problem with our logon and startup to our Windows 7 Enterprise system. We have more than 3000 Windows Desktops situated in roughly 20+ buildings around campus. Almost every computer on campus has the problem that I will be describing. I have spent over one month peering over etl files from Windows Performance Analyzer (A great product) and hundreds of thousands of event logs. I come to you today humbled that I could not figure this out. The problem as simply put our logon times are extremely long. An average first time logon is roughly 2-10 minutes depending on the software installed. All computers are Windows 7, the oldest computers being 5 years old. Startup times on various computers range from good (1-2 minutes) to very bad (5-60). Our second time logons range from 30 seconds to 4 minutes. We have a gigabit connection between each computer on the network. We have 5 domain controllers which also double as our DNS servers. Initial testing led us to believe that this was a software problem. So I spent a few days testing machines only to find inconsistent results from the etl files from xperfview. Each subset of computers on campus had a different subset of software issues, none seeming to interfere with logon just startup. So I started looking at our group policy and located some very interesting event ID’s. Group Policy 1129: The processing of Group Policy failed because of lack of network connectivity to a domain controller. Group Policy 1055: The processing of Group Policy failed. Windows could not resolve the computer name. This could be caused by one of more of the following: a) Name Resolution failure on the current domain controller. b) Active Directory Replication Latency (an account created on another domain controller has not replicated to the current domain controller). NETLOGON 5719 : This computer was not able to set up a secure session with a domain controller in domain OURDOMAIN due to the following: There are currently no logon servers available to service the logon request. This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator. E1kexpress 27: Intel®82567LM-3 Gigabit Network Connection – Network link is disconnected. NetBT 4300 – The driver could not be created. WMI 10 - Event filter with query "SELECT * FROM __InstanceModificationEvent WITHIN 60 WHERE TargetInstance ISA "Win32_Processor" AND TargetInstance.LoadPercentage 99" could not be reactivated in namespace "//./root/CIMV2" because of error 0x80041003. Events cannot be delivered through this filter until the problem is corrected. More or less with timestamps it becomes apparent that the network maybe the issue. 1:25:57 - Group Policy is trying to discover the domain controller information 1:25:57 - The network link has been disconnected 1:25:58 - The processing of Group Policy failed because of lack of network connectivity to a domain controller. This may be a transient condition. A success message would be generated once the machine gets connected to the domain controller and Group Policy has successfully processed. If you do not see a success message for several hours, then contact your administrator. 1:25:58 - Making LDAP calls to connect and bind to active directory. DC1.ourdomain.edu 1:25:58 - Call failed after 0 milliseconds. 1:25:58 - Forcing rediscovery of domain controller details. 1:25:58 - Group policy failed to discover the domain controller in 1030 milliseconds 1:25:58 - Periodic policy processing failed for computer OURDOMAIN\%name%$ in 1 seconds. 1:25:59 - A network link has been established at 1Gbps at full duplex 1:26:00 - The network link has been disconnected 1:26:02 - NtpClient was unable to set a domain peer to use as a time source because of discovery error. NtpClient will try again in 3473457 minutes and DOUBLE THE REATTEMPT INTERVAL thereafter. 1:26:05 - A network link has been established at 1Gbps at full duplex 1:26:08 - Name resolution for the name %Name% timed out after none of the configured DNS servers responded. 1:26:10 – The TCP/IP NetBIOS Helper service entered the running state. 1:26:11 - The time provider NtpClient is currently receiving valid time data at dc4.ourdomain.edu 1:26:14 – User Logon Notification for Customer Experience Improvement Program 1:26:15 - Group Policy received the notification Logon from Winlogon for session 1. 1:26:15 - Making LDAP calls to connect and bind to Active Directory. dc4.ourdomain.edu 1:26:18 - The LDAP call to connect and bind to Active Directory completed. dc4. ourdomain.edu. The call completed in 2309 milliseconds. 1:26:18 - Group Policy successfully discovered the Domain Controller in 2918 milliseconds. 1:26:18 - Computer details: Computer role : 2 Network name : (Blank) 1:26:18 - The LDAP call to connect and bind to Active Directory completed. dc4.ourdomain.edu. The call completed in 2309 milliseconds. 1:26:18 - Group Policy successfully discovered the Domain Controller in 2918 milliseconds. 1:26:19 - The WinHTTP Web Proxy Auto-Discovery Service service entered the running state. 1:26:46 - The Network Connections service entered the running state. 1:27:10 – Retrieved account information 1:27:10 – The system call to get account information completed. 1:27:10 - Starting policy processing due to network state change for computer OURDOMAIN\%name%$ 1:27:10 – Network state change detected 1:27:10 - Making system call to get account information. 1:27:11 - Making LDAP calls to connect and bind to Active Directory. dc4.ourdomain.edu 1:27:13 - Computer details: Computer role : 2 Network name : ourdomain.edu (Now not blank) 1:27:13 - Group Policy successfully discovered the Domain Controller in 2886 milliseconds. 1:27:13 - The LDAP call to connect and bind to Active Directory completed. dc4.ourdomain.edu The call completed in 2371 milliseconds. 1:27:15 - Estimated network bandwidth on one of the connections: 0 kbps. 1:27:15 - Estimated network bandwidth on one of the connections: 8545 kbps. 1:27:15 - A fast link was detected. The Estimated bandwidth is 8545 kbps. The slow link threshold is 500 kbps. 1:27:17 – Powershell - Engine state is changed from Available to Stopped. 1:27:20 - Completed Group Policy Local Users and Groups Extension Processing in 4539 milliseconds. 1:27:25 - Completed Group Policy Scheduled Tasks Extension Processing in 5210 milliseconds. 1:27:27 - Completed Group Policy Registry Extension Processing in 1529 milliseconds. 1:27:27 - Completed policy processing due to network state change for computer OURDOMAIN\%name%$ in 16 seconds. 1:27:27 – The Group Policy settings for the computer were processed successfully. There were no changes detected since the last successful processing of Group Policy. Any help would be appreciated. Please ask for any relevant information and it will be provided as soon as possible.

Read the article
Network Logon Issues with Group Policy and Network

- by bobloki

I am gravely in need of your help and assistance. We have a problem with our logon and startup to our Windows 7 Enterprise system. We have more than 3000 Windows Desktops situated in roughly 20+ buildings around campus. Almost every computer on campus has the problem that I will be describing. I have spent over one month peering over etl files from Windows Performance Analyzer (A great product) and hundreds of thousands of event logs. I come to you today humbled that I could not figure this out. The problem as simply put our logon times are extremely long. An average first time logon is roughly 2-10 minutes depending on the software installed. All computers are Windows 7, the oldest computers being 5 years old. Startup times on various computers range from good (1-2 minutes) to very bad (5-60). Our second time logons range from 30 seconds to 4 minutes. We have a gigabit connection between each computer on the network. We have 5 domain controllers which also double as our DNS servers. Initial testing led us to believe that this was a software problem. So I spent a few days testing machines only to find inconsistent results from the etl files from xperfview. Each subset of computers on campus had a different subset of software issues, none seeming to interfere with logon just startup. So I started looking at our group policy and located some very interesting event ID’s. Group Policy 1129: The processing of Group Policy failed because of lack of network connectivity to a domain controller. Group Policy 1055: The processing of Group Policy failed. Windows could not resolve the computer name. This could be caused by one of more of the following: a) Name Resolution failure on the current domain controller. b) Active Directory Replication Latency (an account created on another domain controller has not replicated to the current domain controller). NETLOGON 5719 : This computer was not able to set up a secure session with a domain controller in domain OURDOMAIN due to the following: There are currently no logon servers available to service the logon request. This may lead to authentication problems. Make sure that this computer is connected to the network. If the problem persists, please contact your domain administrator. E1kexpress 27: Intel®82567LM-3 Gigabit Network Connection – Network link is disconnected. NetBT 4300 – The driver could not be created. WMI 10 - Event filter with query "SELECT * FROM __InstanceModificationEvent WITHIN 60 WHERE TargetInstance ISA "Win32_Processor" AND TargetInstance.LoadPercentage 99" could not be reactivated in namespace "//./root/CIMV2" because of error 0x80041003. Events cannot be delivered through this filter until the problem is corrected. More or less with timestamps it becomes apparent that the network maybe the issue. 1:25:57 - Group Policy is trying to discover the domain controller information 1:25:57 - The network link has been disconnected 1:25:58 - The processing of Group Policy failed because of lack of network connectivity to a domain controller. This may be a transient condition. A success message would be generated once the machine gets connected to the domain controller and Group Policy has successfully processed. If you do not see a success message for several hours, then contact your administrator. 1:25:58 - Making LDAP calls to connect and bind to active directory. DC1.ourdomain.edu 1:25:58 - Call failed after 0 milliseconds. 1:25:58 - Forcing rediscovery of domain controller details. 1:25:58 - Group policy failed to discover the domain controller in 1030 milliseconds 1:25:58 - Periodic policy processing failed for computer OURDOMAIN\%name%$ in 1 seconds. 1:25:59 - A network link has been established at 1Gbps at full duplex 1:26:00 - The network link has been disconnected 1:26:02 - NtpClient was unable to set a domain peer to use as a time source because of discovery error. NtpClient will try again in 3473457 minutes and DOUBLE THE REATTEMPT INTERVAL thereafter. 1:26:05 - A network link has been established at 1Gbps at full duplex 1:26:08 - Name resolution for the name %Name% timed out after none of the configured DNS servers responded. 1:26:10 – The TCP/IP NetBIOS Helper service entered the running state. 1:26:11 - The time provider NtpClient is currently receiving valid time data at dc4.ourdomain.edu 1:26:14 – User Logon Notification for Customer Experience Improvement Program 1:26:15 - Group Policy received the notification Logon from Winlogon for session 1. 1:26:15 - Making LDAP calls to connect and bind to Active Directory. dc4.ourdomain.edu 1:26:18 - The LDAP call to connect and bind to Active Directory completed. dc4. ourdomain.edu. The call completed in 2309 milliseconds. 1:26:18 - Group Policy successfully discovered the Domain Controller in 2918 milliseconds. 1:26:18 - Computer details: Computer role : 2 Network name : (Blank) 1:26:18 - The LDAP call to connect and bind to Active Directory completed. dc4.ourdomain.edu. The call completed in 2309 milliseconds. 1:26:18 - Group Policy successfully discovered the Domain Controller in 2918 milliseconds. 1:26:19 - The WinHTTP Web Proxy Auto-Discovery Service service entered the running state. 1:26:46 - The Network Connections service entered the running state. 1:27:10 – Retrieved account information 1:27:10 – The system call to get account information completed. 1:27:10 - Starting policy processing due to network state change for computer OURDOMAIN\%name%$ 1:27:10 – Network state change detected 1:27:10 - Making system call to get account information. 1:27:11 - Making LDAP calls to connect and bind to Active Directory. dc4.ourdomain.edu 1:27:13 - Computer details: Computer role : 2 Network name : ourdomain.edu (Now not blank) 1:27:13 - Group Policy successfully discovered the Domain Controller in 2886 milliseconds. 1:27:13 - The LDAP call to connect and bind to Active Directory completed. dc4.ourdomain.edu The call completed in 2371 milliseconds. 1:27:15 - Estimated network bandwidth on one of the connections: 0 kbps. 1:27:15 - Estimated network bandwidth on one of the connections: 8545 kbps. 1:27:15 - A fast link was detected. The Estimated bandwidth is 8545 kbps. The slow link threshold is 500 kbps. 1:27:17 – Powershell - Engine state is changed from Available to Stopped. 1:27:20 - Completed Group Policy Local Users and Groups Extension Processing in 4539 milliseconds. 1:27:25 - Completed Group Policy Scheduled Tasks Extension Processing in 5210 milliseconds. 1:27:27 - Completed Group Policy Registry Extension Processing in 1529 milliseconds. 1:27:27 - Completed policy processing due to network state change for computer OURDOMAIN\%name%$ in 16 seconds. 1:27:27 – The Group Policy settings for the computer were processed successfully. There were no changes detected since the last successful processing of Group Policy. Any help would be appreciated. Please ask for any relevant information and it will be provided as soon as possible.

Read the article
C#/.NET Little Wonders: The ConcurrentDictionary

- by James Michael Hare

Once again we consider some of the lesser known classes and keywords of C#. In this series of posts, we will discuss how the concurrent collections have been developed to help alleviate these multi-threading concerns. Last week’s post began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>. Today's post discusses the ConcurrentDictionary<T> (originally I had intended to discuss ConcurrentBag this week as well, but ConcurrentDictionary had enough information to create a very full post on its own!). Finally next week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see the index here. Recap As you'll recall from the previous post, the original collections were object-based containers that accomplished synchronization through a Synchronized member. While these were convenient because you didn't have to worry about writing your own synchronization logic, they were a bit too finely grained and if you needed to perform multiple operations under one lock, the automatic synchronization didn't buy much. With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization. This cuts both ways in that you have a lot more control as a developer over when and how fine-grained you want to synchronize, but on the other hand if you just want simple synchronization it creates more work. With .NET 4.0, we get the best of both worlds in generic collections. A new breed of collections was born called the concurrent collections in the System.Collections.Concurrent namespace. These amazing collections are fine-tuned to have best overall performance for situations requiring concurrent access. They are not meant to replace the generic collections, but to simply be an alternative to creating your own locking mechanisms. Among those concurrent collections were the ConcurrentStack<T> and ConcurrentQueue<T> which provide classic LIFO and FIFO collections with a concurrent twist. As we saw, some of the traditional methods that required calls to be made in a certain order (like checking for not IsEmpty before calling Pop()) were replaced in favor of an umbrella operation that combined both under one lock (like TryPop()). Now, let's take a look at the next in our series of concurrent collections!For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this wonderful whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentDictionary – the fully thread-safe dictionary The ConcurrentDictionary<TKey,TValue> is the thread-safe counterpart to the generic Dictionary<TKey, TValue> collection. Obviously, both are designed for quick – O(1) – lookups of data based on a key. If you think of algorithms where you need lightning fast lookups of data and don’t care whether the data is maintained in any particular ordering or not, the unsorted dictionaries are generally the best way to go. Note: as a side note, there are sorted implementations of IDictionary, namely SortedDictionary and SortedList which are stored as an ordered tree and a ordered list respectively. While these are not as fast as the non-sorted dictionaries – they are O(log2 n) – they are a great combination of both speed and ordering -- and still greatly outperform a linear search. Now, once again keep in mind that if all you need to do is load a collection once and then allow multi-threaded reading you do not need any locking. Examples of this tend to be situations where you load a lookup or translation table once at program start, then keep it in memory for read-only reference. In such cases locking is completely non-productive. However, most of the time when we need a concurrent dictionary we are interleaving both reads and updates. This is where the ConcurrentDictionary really shines! It achieves its thread-safety with no common lock to improve efficiency. It actually uses a series of locks to provide concurrent updates, and has lockless reads! This means that the ConcurrentDictionary gets even more efficient the higher the ratio of reads-to-writes you have. ConcurrentDictionary and Dictionary differences For the most part, the ConcurrentDictionary<TKey,TValue> behaves like it’s Dictionary<TKey,TValue> counterpart with a few differences. Some notable examples of which are: Add() does not exist in the concurrent dictionary. This means you must use TryAdd(), AddOrUpdate(), or GetOrAdd(). It also means that you can’t use a collection initializer with the concurrent dictionary. TryAdd() replaced Add() to attempt atomic, safe adds. Because Add() only succeeds if the item doesn’t already exist, we need an atomic operation to check if the item exists, and if not add it while still under an atomic lock. TryUpdate() was added to attempt atomic, safe updates. If we want to update an item, we must make sure it exists first and that the original value is what we expected it to be. If all these are true, we can update the item under one atomic step. TryRemove() was added to attempt atomic, safe removes. To safely attempt to remove a value we need to see if the key exists first, this checks for existence and removes under an atomic lock. AddOrUpdate() was added to attempt an thread-safe “upsert”. There are many times where you want to insert into a dictionary if the key doesn’t exist, or update the value if it does. This allows you to make a thread-safe add-or-update. GetOrAdd() was added to attempt an thread-safe query/insert. Sometimes, you want to query for whether an item exists in the cache, and if it doesn’t insert a starting value for it. This allows you to get the value if it exists and insert if not. Count, Keys, Values properties take a snapshot of the dictionary. Accessing these properties may interfere with add and update performance and should be used with caution. ToArray() returns a static snapshot of the dictionary. That is, the dictionary is locked, and then copied to an array as a O(n) operation. GetEnumerator() is thread-safe and efficient, but allows dirty reads. Because reads require no locking, you can safely iterate over the contents of the dictionary. The only downside is that, depending on timing, you may get dirty reads. Dirty reads during iteration The last point on GetEnumerator() bears some explanation. Picture a scenario in which you call GetEnumerator() (or iterate using a foreach, etc.) and then, during that iteration the dictionary gets updated. This may not sound like a big deal, but it can lead to inconsistent results if used incorrectly. The problem is that items you already iterated over that are updated a split second after don’t show the update, but items that you iterate over that were updated a split second before do show the update. Thus you may get a combination of items that are “stale” because you iterated before the update, and “fresh” because they were updated after GetEnumerator() but before the iteration reached them. Let’s illustrate with an example, let’s say you load up a concurrent dictionary like this: 1: // load up a dictionary. 2: var dictionary = new ConcurrentDictionary<string, int>(); 3: 4: dictionary["A"] = 1; 5: dictionary["B"] = 2; 6: dictionary["C"] = 3; 7: dictionary["D"] = 4; 8: dictionary["E"] = 5; 9: dictionary["F"] = 6; Then you have one task (using the wonderful TPL!) to iterate using dirty reads: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); And one task to attempt updates in a separate thread (probably): 1: // attempt updates in a separate thread 2: var updateTask = new Task(() => 3: { 4: // iterates, and updates the value by one 5: foreach (var pair in dictionary) 6: { 7: dictionary[pair.Key] = pair.Value + 1; 8: } 9: }); Now that we’ve done this, we can fire up both tasks and wait for them to complete: 1: // start both tasks 2: updateTask.Start(); 3: iterationTask.Start(); 4: 5: // wait for both to complete. 6: Task.WaitAll(updateTask, iterationTask); Now, if I you didn’t know about the dirty reads, you may have expected to see the iteration before the updates (such as A:1, B:2, C:3, D:4, E:5, F:6). However, because the reads are dirty, we will quite possibly get a combination of some updated, some original. My own run netted this result: 1: F:6 2: E:6 3: D:5 4: C:4 5: B:3 6: A:2 Note that, of course, iteration is not in order because ConcurrentDictionary, like Dictionary, is unordered. Also note that both E and F show the value 6. This is because the output task reached F before the update, but the updates for the rest of the items occurred before their output (probably because console output is very slow, comparatively). If we want to always guarantee that we will get a consistent snapshot to iterate over (that is, at the point we ask for it we see precisely what is in the dictionary and no subsequent updates during iteration), we should iterate over a call to ToArray() instead: 1: // attempt iteration in a separate thread 2: var iterationTask = new Task(() => 3: { 4: // iterates using a dirty read 5: foreach (var pair in dictionary.ToArray()) 6: { 7: Console.WriteLine(pair.Key + ":" + pair.Value); 8: } 9: }); The atomic Try…() methods As you can imagine TryAdd() and TryRemove() have few surprises. Both first check the existence of the item to determine if it can be added or removed based on whether or not the key currently exists in the dictionary: 1: // try add attempts an add and returns false if it already exists 2: if (dictionary.TryAdd("G", 7)) 3: Console.WriteLine("G did not exist, now inserted with 7"); 4: else 5: Console.WriteLine("G already existed, insert failed."); TryRemove() also has the virtue of returning the value portion of the removed entry matching the given key: 1: // attempt to remove the value, if it exists it is removed and the original is returned 2: int removedValue; 3: if (dictionary.TryRemove("C", out removedValue)) 4: Console.WriteLine("Removed C and its value was " + removedValue); 5: else 6: Console.WriteLine("C did not exist, remove failed."); Now TryUpdate() is an interesting creature. You might think from it’s name that TryUpdate() first checks for an item’s existence, and then updates if the item exists, otherwise it returns false. Well, note quite... It turns out when you call TryUpdate() on a concurrent dictionary, you pass it not only the new value you want it to have, but also the value you expected it to have before the update. If the item exists in the dictionary, and it has the value you expected, it will update it to the new value atomically and return true. If the item is not in the dictionary or does not have the value you expected, it is not modified and false is returned. 1: // attempt to update the value, if it exists and if it has the expected original value 2: if (dictionary.TryUpdate("G", 42, 7)) 3: Console.WriteLine("G existed and was 7, now it's 42."); 4: else 5: Console.WriteLine("G either didn't exist, or wasn't 7."); The composite Add methods The ConcurrentDictionary also has composite add methods that can be used to perform updates and gets, with an add if the item is not existing at the time of the update or get. The first of these, AddOrUpdate(), allows you to add a new item to the dictionary if it doesn’t exist, or update the existing item if it does. For example, let’s say you are creating a dictionary of counts of stock ticker symbols you’ve subscribed to from a market data feed: 1: public sealed class SubscriptionManager 2: { 3: private readonly ConcurrentDictionary<string, int> _subscriptions = new ConcurrentDictionary<string, int>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public void AddSubscription(string tickerKey) 7: { 8: // add a new subscription with count of 1, or update existing count by 1 if exists 9: var resultCount = _subscriptions.AddOrUpdate(tickerKey, 1, (symbol, count) => count + 1); 10: 11: // now check the result to see if we just incremented the count, or inserted first count 12: if (resultCount == 1) 13: { 14: // subscribe to symbol... 15: } 16: } 17: } Notice the update value factory Func delegate. If the key does not exist in the dictionary, the add value is used (in this case 1 representing the first subscription for this symbol), but if the key already exists, it passes the key and current value to the update delegate which computes the new value to be stored in the dictionary. The return result of this operation is the value used (in our case: 1 if added, existing value + 1 if updated). Likewise, the GetOrAdd() allows you to attempt to retrieve a value from the dictionary, and if the value does not currently exist in the dictionary it will insert a value. This can be handy in cases where perhaps you wish to cache data, and thus you would query the cache to see if the item exists, and if it doesn’t you would put the item into the cache for the first time: 1: public sealed class PriceCache 2: { 3: private readonly ConcurrentDictionary<string, double> _cache = new ConcurrentDictionary<string, double>(); 4: 5: // adds a new subscription, or increments the count of the existing one. 6: public double QueryPrice(string tickerKey) 7: { 8: // check for the price in the cache, if it doesn't exist it will call the delegate to create value. 9: return _cache.GetOrAdd(tickerKey, symbol => GetCurrentPrice(symbol)); 10: } 11: 12: private double GetCurrentPrice(string tickerKey) 13: { 14: // do code to calculate actual true price. 15: } 16: } There are other variations of these two methods which vary whether a value is provided or a factory delegate, but otherwise they work much the same. Oddities with the composite Add methods The AddOrUpdate() and GetOrAdd() methods are totally thread-safe, on this you may rely, but they are not atomic. It is important to note that the methods that use delegates execute those delegates outside of the lock. This was done intentionally so that a user delegate (of which the ConcurrentDictionary has no control of course) does not take too long and lock out other threads. This is not necessarily an issue, per se, but it is something you must consider in your design. The main thing to consider is that your delegate may get called to generate an item, but that item may not be the one returned! Consider this scenario: A calls GetOrAdd and sees that the key does not currently exist, so it calls the delegate. Now thread B also calls GetOrAdd and also sees that the key does not currently exist, and for whatever reason in this race condition it’s delegate completes first and it adds its new value to the dictionary. Now A is done and goes to get the lock, and now sees that the item now exists. In this case even though it called the delegate to create the item, it will pitch it because an item arrived between the time it attempted to create one and it attempted to add it. Let’s illustrate, assume this totally contrived example program which has a dictionary of char to int. And in this dictionary we want to store a char and it’s ordinal (that is, A = 1, B = 2, etc). So for our value generator, we will simply increment the previous value in a thread-safe way (perhaps using Interlocked): 1: public static class Program 2: { 3: private static int _nextNumber = 0; 4: 5: // the holder of the char to ordinal 6: private static ConcurrentDictionary<char, int> _dictionary 7: = new ConcurrentDictionary<char, int>(); 8: 9: // get the next id value 10: public static int NextId 11: { 12: get { return Interlocked.Increment(ref _nextNumber); } 13: } Then, we add a method that will perform our insert: 1: public static void Inserter() 2: { 3: for (int i = 0; i < 26; i++) 4: { 5: _dictionary.GetOrAdd((char)('A' + i), key => NextId); 6: } 7: } Finally, we run our test by starting two tasks to do this work and get the results… 1: public static void Main() 2: { 3: // 3 tasks attempting to get/insert 4: var tasks = new List<Task> 5: { 6: new Task(Inserter), 7: new Task(Inserter) 8: }; 9: 10: tasks.ForEach(t => t.Start()); 11: Task.WaitAll(tasks.ToArray()); 12: 13: foreach (var pair in _dictionary.OrderBy(p => p.Key)) 14: { 15: Console.WriteLine(pair.Key + ":" + pair.Value); 16: } 17: } If you run this with only one task, you get the expected A:1, B:2, ..., Z:26. But running this in parallel you will get something a bit more complex. My run netted these results: 1: A:1 2: B:3 3: C:4 4: D:5 5: E:6 6: F:7 7: G:8 8: H:9 9: I:10 10: J:11 11: K:12 12: L:13 13: M:14 14: N:15 15: O:16 16: P:17 17: Q:18 18: R:19 19: S:20 20: T:21 21: U:22 22: V:23 23: W:24 24: X:25 25: Y:26 26: Z:27 Notice that B is 3? This is most likely because both threads attempted to call GetOrAdd() at roughly the same time and both saw that B did not exist, thus they both called the generator and one thread got back 2 and the other got back 3. However, only one of those threads can get the lock at a time for the actual insert, and thus the one that generated the 3 won and the 3 was inserted and the 2 got discarded. This is why on these methods your factory delegates should be careful not to have any logic that would be unsafe if the value they generate will be pitched in favor of another item generated at roughly the same time. As such, it is probably a good idea to keep those generators as stateless as possible. Summary The ConcurrentDictionary is a very efficient and thread-safe version of the Dictionary generic collection. It has all the benefits of type-safety that it’s generic collection counterpart does, and in addition is extremely efficient especially when there are more reads than writes concurrently. Tweet Technorati Tags: C#, .NET, Concurrent Collections, Collections, Little Wonders, Black Rabbit Coder,James Michael Hare

Read the article
Microsoft Introduces WebMatrix

- by Rick Strahl

originally published in CoDe Magazine Editorial Microsoft recently released the first CTP of a new development environment called WebMatrix, which along with some of its supporting technologies are squarely aimed at making the Microsoft Web Platform more approachable for first-time developers and hobbyists. But in the process, it also provides some updated technologies that can make life easier for existing .NET developers. Let’s face it: ASP.NET development isn’t exactly trivial unless you already have a fair bit of familiarity with sophisticated development practices. Stick a non-developer in front of Visual Studio .NET or even the Visual Web Developer Express edition and it’s not likely that the person in front of the screen will be very productive or feel inspired. Yet other technologies like PHP and even classic ASP did provide the ability for non-developers and hobbyists to become reasonably proficient in creating basic web content quickly and efficiently. WebMatrix appears to be Microsoft’s attempt to bring back some of that simplicity with a number of technologies and tools. The key is to provide a friendly and fully self-contained development environment that provides all the tools needed to build an application in one place, as well as tools that allow publishing of content and databases easily to the web server. WebMatrix is made up of several components and technologies: IIS Developer Express IIS Developer Express is a new, self-contained development web server that is fully compatible with IIS 7.5 and based on the same codebase that IIS 7.5 uses. This new development server replaces the much less compatible Cassini web server that’s been used in Visual Studio and the Express editions. IIS Express addresses a few shortcomings of the Cassini server such as the inability to serve custom ISAPI extensions (i.e., things like PHP or ASP classic for example), as well as not supporting advanced authentication. IIS Developer Express provides most of the IIS 7.5 feature set providing much better compatibility between development and live deployment scenarios. SQL Server Compact 4.0 Database access is a key component for most web-driven applications, but on the Microsoft stack this has mostly meant you have to use SQL Server or SQL Server Express. SQL Server Compact is not new-it’s been around for a few years, but it’s been severely hobbled in the past by terrible tool support and the inability to support more than a single connection in Microsoft’s attempt to avoid losing SQL Server licensing. The new release of SQL Server Compact 4.0 supports multiple connections and you can run it in ASP.NET web applications simply by installing an assembly into the bin folder of the web application. In effect, you don’t have to install a special system configuration to run SQL Compact as it is a drop-in database engine: Copy the small assembly into your BIN folder (or from the GAC if installed fully), create a connection string against a local file-based database file, and then start firing SQL requests. Additionally WebMatrix includes nice tools to edit the database tables and files, along with tools to easily upsize (and hopefully downsize in the future) to full SQL Server. This is a big win, pending compatibility and performance limits. In my simple testing the data engine performed well enough for small data sets. This is not only useful for web applications, but also for desktop applications for which a fully installed SQL engine like SQL Server would be overkill. Having a local data store in those applications that can potentially be accessed by multiple users is a welcome feature. ASP.NET Razor View Engine What? Yet another native ASP.NET view engine? We already have Web Forms and various different flavors of using that view engine with Web Forms and MVC. Do we really need another? Microsoft thinks so, and Razor is an implementation of a lightweight, script-only view engine. Unlike the Web Forms view engine, Razor works only with inline code, snippets, and markup; therefore, it is more in line with current thinking of what a view engine should represent. There’s no support for a “page model” or any of the other Web Forms features of the full-page framework, but just a lightweight scripting engine that works with plain markup plus embedded expressions and code. The markup syntax for Razor is geared for minimal typing, plus some progressive detection of where a script block/expression starts and ends. This results in a much leaner syntax than the typical ASP.NET Web Forms alligator (<% %>) tags. Razor uses the @ sign plus standard C# (or Visual Basic) block syntax to delineate code snippets and expressions. Here’s a very simple example of what Razor markup looks like along with some comment annotations: <!DOCTYPE html> <html> <head> <title></title> </head> <body> <h1>Razor Test</h1>  @DateTime.Now <hr />  @DateTime.Now.ToString("T")  @{ List<string> names = new List<string>(); names.Add("Rick"); names.Add("Markus"); names.Add("Claudio"); names.Add("Kevin"); }  <ul> @foreach(string name in names){ <li>@name</li> } </ul>  @if(true) {  <text> true </text>; } else {  <text> false </text>; } </body> </html> Like the Web Forms view engine, Razor parses pages into code, and then executes that run-time compiled code. Effectively a “page” becomes a code file with markup becoming literal text written into the Response stream, code snippets becoming raw code, and expressions being written out with Response.Write(). The code generated from Razor doesn’t look much different from similar Web Forms code that only uses script tags; so although the syntax may look different, the operational model is fairly similar to the Web Forms engine minus the overhead of the large Page object model. However, there are differences: -Razor pages are based on a new base class, Microsoft.WebPages.WebPage, which is hosted in the Microsoft.WebPages assembly that houses all the Razor engine parsing and processing logic. Browsing through the assembly (in the generated ASP.NET Temporary Files folder or GAC) will give you a good idea of the functionality that Razor provides. If you look closely, a lot of the feature set matches ASP.NET MVC’s view implementation as well as many of the helper classes found in MVC. It’s not hard to guess the motivation for this sort of view engine: For beginning developers the simple markup syntax is easier to work with, although you obviously still need to have some understanding of the .NET Framework in order to create dynamic content. The syntax is easier to read and grok and much shorter to type than ASP.NET alligator tags (<% %>) and also easier to understand aesthetically what’s happening in the markup code. Razor also is a better fit for Microsoft’s vision of ASP.NET MVC: It’s a new view engine without the baggage of Web Forms attached to it. The engine is more lightweight since it doesn’t carry all the features and object model of Web Forms with it and it can be instantiated directly outside of the HTTP environment, which has been rather tricky to do for the Web Forms view engine. Having a standalone script parser is a huge win for other applications as well – it makes it much easier to create script or meta driven output generators for many types of applications from code/screen generators, to simple form letters to data merging applications with user customizability. For me personally this is very useful side effect and who knows maybe Microsoft will actually standardize they’re scripting engines (die T4 die!) on this engine. Razor also better fits the “view-based” approach where the view is supposed to be mostly a visual representation that doesn’t hold much, if any, code. While you can still use code, the code you do write has to be self-contained. Overall I wouldn’t be surprised if Razor will become the new standard view engine for MVC in the future – and in fact there have been announcements recently that Razor will become the default script engine in ASP.NET MVC 3.0. Razor can also be used in existing Web Forms and MVC applications, although that’s not working currently unless you manually configure the script mappings and add the appropriate assemblies. It’s possible to do it, but it’s probably better to wait until Microsoft releases official support for Razor scripts in Visual Studio. Once that happens, you can simply drop .cshtml and .vbhtml pages into an existing ASP.NET project and they will work side by side with classic ASP.NET pages. WebMatrix Development Environment To tie all of these three technologies together, Microsoft is shipping WebMatrix with an integrated development environment. An integrated gallery manager makes it easy to download and load existing projects, and then extend them with custom functionality. It seems to be a prominent goal to provide community-oriented content that can act as a starting point, be it via a custom templates or a complete standard application. The IDE includes a project manager that works with a single project and provides an integrated IDE/editor for editing the .cshtml and .vbhtml pages. A run button allows you to quickly run pages in the project manager in a variety of browsers. There’s no debugging support for code at this time. Note that Razor pages don’t require explicit compilation, so making a change, saving, and then refreshing your page in the browser is all that’s needed to see changes while testing an application locally. It’s essentially using the auto-compiling Web Project that was introduced with .NET 2.0. All code is compiled during run time into dynamically created assemblies in the ASP.NET temp folder. WebMatrix also has PHP Editing support with syntax highlighting. You can load various PHP-based applications from the WebMatrix Web Gallery directly into the IDE. Most of the Web Gallery applications are ready to install and run without further configuration, with Wizards taking you through installation of tools, dependencies, and configuration of the database as needed. WebMatrix leverages the Web Platform installer to pull the pieces down from websites in a tight integration of tools that worked nicely for the four or five applications I tried this out on. Click a couple of check boxes and fill in a few simple configuration options and you end up with a running application that’s ready to be customized. Nice! You can easily deploy completed applications via WebDeploy (to an IIS server) or FTP directly from within the development environment. The deploy tool also can handle automatically uploading and installing the database and all related assemblies required, making deployment a simple one-click install step. Simplified Database Access The IDE contains a database editor that can edit SQL Compact and SQL Server databases. There is also a Database helper class that facilitates database access by providing easy-to-use, high-level query execution and iteration methods: @{ var db = Database.OpenFile("FirstApp.sdf"); string sql = "select * from customers where Id > @0"; } <ul> @foreach(var row in db.Query(sql,1)){ <li>@row.FirstName @row.LastName</li> } </ul> The query function takes a SQL statement plus any number of positional (@0,@1 etc.) SQL parameters by simple values. The result is returned as a collection of rows which in turn have a row object with dynamic properties for each of the columns giving easy (though untyped) access to each of the fields. Likewise Execute and ExecuteNonQuery allow execution of more complex queries using similar parameter passing schemes. Note these queries use string-based queries rather than LINQ or Entity Framework’s strongly typed LINQ queries. While this may seem like a step back, it’s also in line with the expectations of non .NET script developers who are quite used to writing and using SQL strings in code rather than using OR/M frameworks. The only question is why was something not included from the beginning in .NET and Microsoft made developers build custom implementations of these basic building blocks. The implementation looks a lot like a DataTable-style data access mechanism, but to be fair, this is a common approach in scripting languages. This type of syntax that uses simple, static, data object methods to perform simple data tasks with one line of code are common in scripting languages and are a good match for folks working in PHP/Python, etc. Seems like Microsoft has taken great advantage of .NET 4.0’s dynamic typing to provide this sort of interface for row iteration where each row has properties for each field. FWIW, all the examples demonstrate using local SQL Compact files - I was unable to get a SQL Server connection string to work with the Database class (the connection string wasn’t accepted). However, since the code in the page is still plain old .NET, you can easily use standard ADO.NET code or even LINQ or Entity Framework models that are created outside of WebMatrix in separate assemblies as required. The good the bad the obnoxious - It’s still .NET The beauty (or curse depending on how you look at it :)) of Razor and the compilation model is that, behind it all, it’s still .NET. Although the syntax may look foreign, it’s still all .NET behind the scenes. You can easily access existing tools, helpers, and utilities simply by adding them to the project as references or to the bin folder. Razor automatically recognizes any assembly reference from assemblies in the bin folder. In the default configuration, Microsoft provides a host of helper functions in a Microsoft.WebPages assembly (check it out in the ASP.NET temp folder for your application), which includes a host of HTML Helpers. If you’ve used ASP.NET MVC before, a lot of the helpers should look familiar. Documentation at the moment is sketchy-there’s a very rough API reference you can check out here: http://www.asp.net/webmatrix/tutorials/asp-net-web-pages-api-reference Who needs WebMatrix? Uhm… good Question Clearly Microsoft is trying hard to create an environment with WebMatrix that is easy to use for newbie developers. The goal seems to be simplicity in providing a minimal development environment and an easy-to-use script engine/language that makes it easy to get started with. There’s also some focus on community features that can be used as starting points, such as Web Gallery applications and templates. The community features in particular are very nice and something that would be nice to eventually see in Visual Studio as well. The question is whether this is too little too late. Developers who have been clamoring for a simpler development environment on the .NET stack have mostly left for other simpler platforms like PHP or Python which are catering to the down and dirty developer. Microsoft will be hard pressed to win those folks-and other hardcore PHP developers-back. Regardless of how much you dress up a script engine fronted by the .NET Framework, it’s still the .NET Framework and all the complexity that drives it. While .NET is a fine solution in its breadth and features once you get a basic handle on the core features, the bar of entry to being productive with the .NET Framework is still pretty high. The MVC style helpers Microsoft provides are a good step in the right direction, but I suspect it’s not enough to shield new developers from having to delve much deeper into the Framework to get even basic applications built. Razor and its helpers is trying to make .NET more accessible but the reality is that in order to do useful stuff that goes beyond the handful of simple helpers you still are going to have to write some C# or VB or other .NET code. If the target is a hobby/amateur/non-programmer the learning curve isn’t made any easier by WebMatrix it’s just been shifted a tad bit further along in your development endeavor when you run out of canned components that are supplied either by Microsoft or the community. The database helpers are interesting and actually I’ve heard a lot of discussion from various developers who’ve been resisting .NET for a really long time perking up at the prospect of easier data access in .NET than the ridiculous amount of code it takes to do even simple data access with raw ADO.NET. It seems sad that such a simple concept and implementation should trigger this sort of response (especially since it’s practically trivial to create helpers like these or pick them up from countless libraries available), but there it is. It also shows that there are plenty of developers out there who are more interested in ‘getting stuff done’ easily than necessarily following the latest and greatest practices which are overkill for many development scenarios. Sometimes it seems that all of .NET is focused on the big life changing issues of development, rather than the bread and butter scenarios that many developers are interested in to get their work accomplished. And that in the end may be WebMatrix’s main raison d'être: To bring some focus back at Microsoft that simpler and more high level solutions are actually needed to appeal to the non-high end developers as well as providing the necessary tools for the high end developers who want to follow the latest and greatest trends. The current version of WebMatrix hits many sweet spots, but it also feels like it has a long way to go before it really can be a tool that a beginning developer or an accomplished developer can feel comfortable with. Although there are some really good ideas in the environment (like the gallery for downloading apps and components) which would be a great addition for Visual Studio as well, the rest of the development environment just feels like crippleware with required functionality missing especially debugging and Intellisense, but also general editor support. It’s not clear whether these are because the product is still in an early alpha release or whether it’s simply designed that way to be a really limited development environment. While simple can be good, nobody wants to feel left out when it comes to necessary tool support and WebMatrix just has that left out feeling to it. If anything WebMatrix’s technology pieces (which are really independent of the WebMatrix product) are what are interesting to developers in general. The compact IIS implementation is a nice improvement for development scenarios and SQL Compact 4.0 seems to address a lot of concerns that people have had and have complained about for some time with previous SQL Compact implementations. By far the most interesting and useful technology though seems to be the Razor view engine for its light weight implementation and it’s decoupling from the ASP.NET/HTTP pipeline to provide a standalone scripting/view engine that is pluggable. The first winner of this is going to be ASP.NET MVC which can now have a cleaner view model that isn’t inconsistent due to the baggage of non-implemented WebForms features that don’t work in MVC. But I expect that Razor will end up in many other applications as a scripting and code generation engine eventually. Visual Studio integration for Razor is currently missing, but is promised for a later release. The ASP.NET MVC team has already mentioned that Razor will eventually become the default MVC view engine, which will guarantee continued growth and development of this tool along those lines. And the Razor engine and support tools actually inherit many of the features that MVC pioneered, so there’s some synergy flowing both ways between Razor and MVC. As an existing ASP.NET developer who’s already familiar with Visual Studio and ASP.NET development, the WebMatrix IDE doesn’t give you anything that you want. The tools provided are minimal and provide nothing that you can’t get in Visual Studio today, except the minimal Razor syntax highlighting, so there’s little need to take a step back. With Visual Studio integration coming later there’s little reason to look at WebMatrix for tooling. It’s good to see that Microsoft is giving some thought about the ease of use of .NET as a platform For so many years, we’ve been piling on more and more new features without trying to take a step back and see how complicated the development/configuration/deployment process has become. Sometimes it’s good to take a step - or several steps - back and take another look and realize just how far we’ve come. WebMatrix is one of those reminders and one that likely will result in some positive changes on the platform as a whole. © Rick Strahl, West Wind Technologies, 2005-2010Posted in ASP.NET IIS7

Read the article
How do I get my dependenices inject using @Configurable in conjunction with readResolve()

- by bmatthews68

The framework I am developing for my application relies very heavily on dynamically generated domain objects. I recently started using Spring WebFlow and now need to be able to serialize my domain objects that will be kept in flow scope. I have done a bit of research and figured out that I can use writeReplace() and readResolve(). The only catch is that I need to look-up a factory in the Spring context. I tried to use @Configurable(preConstruction = true) in conjunction with the BeanFactoryAware marker interface. But beanFactory is always null when I try to use it in my createEntity() method. Neither the default constructor nor the setBeanFactory() injector are called. Has anybody tried this or something similar? I have included relevant class below. Thanks in advance, Brian /* * Copyright 2008 Brian Thomas Matthews Limited. * All rights reserved, worldwide. * * This software and all information contained herein is the property of * Brian Thomas Matthews Limited. Any dissemination, disclosure, use, or * reproduction of this material for any reason inconsistent with the * express purpose for which it has been disclosed is strictly forbidden. */ package com.btmatthews.dmf.domain.impl.cglib; import java.io.InvalidObjectException; import java.io.ObjectStreamException; import java.io.Serializable; import java.lang.reflect.InvocationTargetException; import java.util.HashMap; import java.util.Map; import org.apache.commons.beanutils.PropertyUtils; import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.springframework.beans.factory.BeanFactory; import org.springframework.beans.factory.BeanFactoryAware; import org.springframework.beans.factory.annotation.Configurable; import org.springframework.util.StringUtils; import com.btmatthews.dmf.domain.IEntity; import com.btmatthews.dmf.domain.IEntityFactory; import com.btmatthews.dmf.domain.IEntityID; import com.btmatthews.dmf.spring.IEntityDefinitionBean; /** * This class represents the serialized form of a domain object implemented * using CGLib. The readResolve() method recreates the actual domain object * after it has been deserialized into Serializable. You must define * <spring-configured/> in the application context. * * @param <S> * The interface that defines the properties of the base domain * object. * @param <T> * The interface that defines the properties of the derived domain * object. * @author <a href="mailto:[email protected]">Brian Matthews</a> * @version 1.0 */ @Configurable(preConstruction = true) public final class SerializedCGLibEntity<S extends IEntity<S>, T extends S> implements Serializable, BeanFactoryAware { /** * Used for logging. */ private static final Logger LOG = LoggerFactory .getLogger(SerializedCGLibEntity.class); /** * The serialization version number. */ private static final long serialVersionUID = 3830830321957878319L; /** * The application context. Note this is not serialized. */ private transient BeanFactory beanFactory; /** * The domain object name. */ private String entityName; /** * The domain object identifier. */ private IEntityID<S> entityId; /** * The domain object version number. */ private long entityVersion; /** * The attributes of the domain object. */ private HashMap<?, ?> entityAttributes; /** * The default constructor. */ public SerializedCGLibEntity() { SerializedCGLibEntity.LOG .debug("Initializing with default constructor"); } /** * Initialise with the attributes to be serialised. * * @param name * The entity name. * @param id * The domain object identifier. * @param version * The entity version. * @param attributes * The entity attributes. */ public SerializedCGLibEntity(final String name, final IEntityID<S> id, final long version, final HashMap<?, ?> attributes) { SerializedCGLibEntity.LOG .debug("Initializing with parameterized constructor"); this.entityName = name; this.entityId = id; this.entityVersion = version; this.entityAttributes = attributes; } /** * Inject the bean factory. * * @param factory * The bean factory. */ public void setBeanFactory(final BeanFactory factory) { SerializedCGLibEntity.LOG.debug("Injected bean factory"); this.beanFactory = factory; } /** * Called after deserialisation. The corresponding entity factory is * retrieved from the bean application context and BeanUtils methods are * used to initialise the object. * * @return The initialised domain object. * @throws ObjectStreamException * If there was a problem creating or initialising the domain * object. */ public Object readResolve() throws ObjectStreamException { SerializedCGLibEntity.LOG.debug("Transforming deserialized object"); final T entity = this.createEntity(); entity.setId(this.entityId); try { PropertyUtils.setSimpleProperty(entity, "version", this.entityVersion); for (Map.Entry<?, ?> entry : this.entityAttributes.entrySet()) { PropertyUtils.setSimpleProperty(entity, entry.getKey() .toString(), entry.getValue()); } } catch (IllegalAccessException e) { throw new InvalidObjectException(e.getMessage()); } catch (InvocationTargetException e) { throw new InvalidObjectException(e.getMessage()); } catch (NoSuchMethodException e) { throw new InvalidObjectException(e.getMessage()); } return entity; } /** * Lookup the entity factory in the application context and create an * instance of the entity. The entity factory is located by getting the * entity definition bean and using the factory registered with it or * getting the entity factory. The name used for the definition bean lookup * is ${entityName}Definition while ${entityName} is used for the factory * lookup. * * @return The domain object instance. * @throws ObjectStreamException * If the entity definition bean or entity factory were not * available. */ @SuppressWarnings("unchecked") private T createEntity() throws ObjectStreamException { SerializedCGLibEntity.LOG.debug("Getting domain object factory"); // Try to use the entity definition bean final IEntityDefinitionBean<S, T> entityDefinition = (IEntityDefinitionBean<S, T>)this.beanFactory .getBean(StringUtils.uncapitalize(this.entityName) + "Definition", IEntityDefinitionBean.class); if (entityDefinition != null) { final IEntityFactory<S, T> entityFactory = entityDefinition .getFactory(); if (entityFactory != null) { SerializedCGLibEntity.LOG .debug("Domain object factory obtained via enity definition bean"); return entityFactory.create(); } } // Try to use the entity factory final IEntityFactory<S, T> entityFactory = (IEntityFactory<S, T>)this.beanFactory .getBean(StringUtils.uncapitalize(this.entityName) + "Factory", IEntityFactory.class); if (entityFactory != null) { SerializedCGLibEntity.LOG .debug("Domain object factory obtained via direct look-up"); return entityFactory.create(); } // Neither worked! SerializedCGLibEntity.LOG.warn("Cannot find domain object factory"); throw new InvalidObjectException( "No entity definition or factory found for " + this.entityName); } }

Read the article

Search Results

Search found 468 results on 19 pages for 'inconsistent'.

Page 19/19 | < Previous Page | 15 16 17 18 19

- by barryoreilly

- by Nuri Halperin

- by mortezam

- by Tarun Arora

- by James Michael Hare

- by Michael J

- by MetaHyperBolic

- by deadalnix

- by Jaxidian

- by Michael Aaron Safyan

- by Michael Aaron Safyan

- by Benjamin Nolan

- by bobloki

- by bobloki

- by James Michael Hare

- by Rick Strahl

- by bmatthews68

< Previous Page | 15 16 17 18 19