Search Results

Search found 20883 results on 836 pages for 'wont say'.

Page 146/836 | < Previous Page | 142 143 144 145 146 147 148 149 150 151 152 153 | Next Page >

how to keep display tick rate steady when using continuous collision detection?

- by nas Ns

(I've just found about this forum). I hope it is ok to repost my question again here. I posted this question at stackoverflow, but it looks like I might get better help here. Here is the question: I've implemented basic particles motion simulation with continuous collision detection. But there is small issue in display. Assume simple case of circles moving inside square. All elastic collisions. no firction. All motion is constant speed. No forces are involved, no gravity. So when a particle is moving, it is always moving at constant speed (in between collisions) What I do now is this: Let the simulation time step be 1 second (for example). This is the time step simulation is advanced before displaying the new state (unless there is a collision sooner than this). At start of each time step, time for the next collision between any particles or a particle with a wall is determined. Call this the TOC time; let’s say TOC was .5 seconds in this case. Since TOC is smaller than the standard time step, then the system is moved by TOC and the new system is displayed so that the new display shows any collisions as just taking place (say 2 circles just touched each other’s, or a circle just touched a wall) Next, the collision(s) are resolved (i.e. speeds updated, changed directions etc..). A new step is started. The same thing happens. Now assume there is no collision detected within the next 1 second (those 2 circles above will not be in collision any more, even though they are still touching, due to their speeds showing they are moving apart now), Hence, simulation time is advanced now by the full one second, the standard time step, and particles are moved on the screen using 1 second simulation time and new display is shown. You see what has just happened: One frame ran for .5 seconds, but the next frame runs for 1 second, may be the 3rd frame is displayed after 2 seconds, may be the 4th frame is displayed after 2.8 seconds (because TOC was .8 seconds then) and so on. What happens is that the motion of a particle on the screen appears to speed up or slow down, even though it is moving at constant speed and was not even involved in a collision. i.e. Looking at one particle on its own, I see it suddenly speeding up or slowing down, becuase another particle had hit a wall. This is because the display tick is not uniform. i.e. the frame rate update is changing, giving the false illusion that a particle is moving at non-constant speed while in fact it is moving at constant speed. The motion on the screen is not smooth, since the screen is not updating at constant rate. I am not able to figure how to fix this. If I want to show 2 particles at the moment of the collision, I must draw the screen at different times. Drawing the screen always at the same tick interval, results in seeing 2 particles before the collision, and then after the collision, and not just when they colliding, which looked bad when I tried it. So, how do real games handle this issue? How to display things in order to show collisions when it happen, yet keep the display tick constant? These 2 requirements seem to contradict each other’s.

Read the article
The long road to bug-free software

- by Tony Davis

The past decade has seen a burgeoning interest in functional programming languages such as Haskell or, in the Microsoft world, F#. Though still on the periphery of mainstream programming, functional programming concepts are gradually seeping into the imperative C# language (for example, Lambda expressions have their root in functional programming). One of the more interesting concepts from functional programming languages is the use of formal methods, the lofty ideal behind which is bug-free software. The idea is that we write a specification that describes exactly how our function (say) should behave. We then prove that our function conforms to it, and in doing so have proved beyond any doubt that it is free from bugs. All programmers already use one form of specification, specifically their programming language's type system. If a value has a specific type then, in a type-safe language, the compiler guarantees that value cannot be an instance of a different type. Many extensions to existing type systems, such as generics in Java and .NET, extend the range of programs that can be type-checked. Unfortunately, type systems can only prevent some bugs. To take a classic problem of retrieving an index value from an array, since the type system doesn't specify the length of the array, the compiler has no way of knowing that a request for the "value of index 4" from an array of only two elements is "unsafe". We restore safety via exception handling, but the ideal type system will prevent us from doing anything that is unsafe in the first place and this is where we start to borrow ideas from a language such as Haskell, with its concept of "dependent types". If the type of an array includes its length, we can ensure that any index accesses into the array are valid. The problem is that we now need to carry around the length of arrays and the values of indices throughout our code so that it can be type-checked. In general, writing the specification to prove a positive property, even for a problem very amenable to specification, such as a simple sorting algorithm, turns out to be very hard and the specification will be different for every program. Extend this to writing a specification for, say, Microsoft Word and we can see that the specification would end up being no simpler, and therefore no less buggy, than the implementation. Fortunately, it is easier to write a specification that proves that a program doesn't have certain, specific and undesirable properties, such as infinite loops or accesses to the wrong bit of memory. If we can write the specifications to prove that a program is immune to such problems, we could reuse them in many places. The problem is the lack of specification "provers" that can do this without a lot of manual intervention (i.e. hints from the programmer). All this might feel a very long way off, but computing power and our understanding of the theory of "provers" advances quickly, and Microsoft is doing some of it already. Via their Terminator research project they have started to prove that their device drivers will always terminate, and in so doing have suddenly eliminated a vast range of possible bugs. This is a huge step forward from saying, "we've tested it lots and it seems fine". What do you think? What might be good targets for specification and verification? SQL could be one: the cost of a bug in SQL Server is quite high given how many important systems rely on it, so there's a good incentive to eliminate bugs, even at high initial cost. [Many thanks to Mike Williamson for guidance and useful conversations during the writing of this piece] Cheers, Tony.

Read the article
Confused about modifying the sprint backlog during a sprint

- by Maltiriel

I've been reading a lot about scrum lately, and I've found what seem to me to be conflicting information about whether or not it's ok to change the sprint backlog during a sprint. The Wikipedia article on scrum says it's not ok, and various other articles say this as well. Also my Software Development professor taught the same thing during an overview of scrum. However, I read Scrum and XP from the Trenches and that describes a section for unplanned items on the taskboard. So then I looked up the Scrum Guide and it says that during the sprint "No changes are made that would affect the Sprint Goal" and in the discussion of the Sprint Goal "If the work turns out to be different than the Development Team expected, then they collaborate with the Product Owner to negotiate the scope of Sprint Backlog within the Sprint." It goes on to say in the discussion of the Sprint Backlog: The Sprint Backlog is a plan with enough detail that changes in progress can be understood in the Daily Scrum. The Development Team modifies Sprint Backlog throughout the Sprint, and the Sprint Backlog emerges during the Sprint. This emergence occurs as the Development Team works through the plan and learns more about the work needed to achieve the Sprint Goal. As new work is required, the Development Team adds it to the Sprint Backlog. As work is performed or completed, the estimated remaining work is updated. When elements of the plan are deemed unnecessary, they are removed. Only the Development Team can change its Sprint Backlog during a Sprint. The Sprint Backlog is a highly visible, real-time picture of the work that the Development Team plans to accomplish during the Sprint, and it belongs solely to the Development Team. So at this point I'm altogether confused. Thinking about it, it makes more sense to me to take the second approach. The individual, specific items in the backlog don't seem to me to be the most important thing, but rather the sprint goal, so not changing the sprint goal but being able to change the backlog makes sense. For instance if both the product owner and the team thought they were on the same page about a story, but as the sprint progressed they figured out there was a misunderstanding, it seems like it makes sense to change the tasks that make up that story accordingly. Or if there was some story or task that was forgotten about, but is required to reach the sprint goal, I would think it would be best to add the story or task to the backlog during the sprint. However, there are a lot of people who seem quite adamant that any change to the sprint backlog is not ok. Am I misunderstanding that position somehow? Are those folks defining the sprint backlog differently somehow? My understanding of the sprint backlog is that it consists of both the stories and the tasks they're broken down into. Anyway I would really appreciate input on this issue. I'm trying to figure out both what the idealistic scrum approach is to changing the sprint backlog during a sprint, and whether people who use scrum successfully for development allow changing the sprint backlog during a sprint.

Read the article
When will EBS 12.2 be released?

- by Steven Chan (Oracle Development)

The most frequently asked question at OpenWorld this year was, "When will EBS 12.2 be released?" Sadly, Oracle's communication policies prohibit us from speculating about release dates for unreleased software. We are not permitted to give estimates, rough timelines, guesses, or anything else that remotely resembles specific guidance on release dates. You can monitor My Oracle Support and this blog for updates on EBS 12.2. I'll post them here as soon as they're available. I'm embedding an old favourite from 2007 in its entirety here, since it applies equally to new releases as well as certifications. "Loose Lips Sink Ships" (March 20, 2007)If I were to sort emails in my inbox into groups, the biggest -- by far -- would be the one for emails that start with, "When will _____ be certified with the E-Business Suite?" I answer these dutifully but know that my replies can sometimes be maddening, for two reasons: technical uncertainty, and Oracle's rules for such communications. On the Spiral Model of CertificationsTechnology stack certifications tend to be highly iterative in nature. As a result, statements about certification dates tend to be accurate only when made in hindsight. Laypeople are horrified to hear this, but it's the ugly truth. Uncertainty is simply inherent to the process. I've become inured to it over the years, but it might come as a surprise to you that it can take many cycles to get fully-released software to work together. Take this scenario: We test a particular combination of Component A and B. If we encounter a problem, say, with Component A, we log a bug. We receive a new version of Component A. The process iterates again. The reality is this: until a certification is completed and released, there's no accurate way of telling how many iterations are yet to come. This is true regardless of the number of iterations that have already been completed. Our Lips Are SealedGenerally, people understand that things are subject to change, so the second reason I can't say anything specific is actually much more important than the first. "Loose lips might sink ships" was coined in World War II in an effort to remind people that careless talk can have serious consequences. Curiously, this applies to Oracle's communications about upcoming features, configurations, and releases, too. As a publicly traded company, we have very strict policies that prohibit us from linking specific releases to specific dates. If you've ever listened to an earnings call with analysts, you'll often hear them asking, "Can you add a little more color to that statement?" For certifications, color is usually the only thing that I have. Sometimes I can provide a bit more information about the technical nature of the certification in question, such as expected footprints or version levels. I can occasionally share technical issues that we've found, too, to convey the degree of risk or complexity involved in the certification. Aside from that, there's little additional information about specific dates, date ranges, or even speculation about dates that I can provide... that is, without having one of those uncomfortable conversations with Oracle Legal. So, as much as it pains me to do so, when it comes to dates, I'm always forced to conclude with a generic reply that blandly states one of the following: We're working on that certification right now That certification is in the pipeline but hasn't been started yet We don't have plans for that certification Don't Shoot the MessengerThankfully, I've developed a thick skin over the years -- which is a good thing, considering the colorful and energetic responses I've received over the years after answering these questions. However, on behalf of my Oracle colleagues who are faced with these questions every day in the field, I urge you to remember that they're required to follow these same corporate rules about date disclosures. It never hurts to ask, but don't be too disappointed if we can't provide you with a detailed answer. The Go-Go's had it right, after all. Related Articles Webcast Replay Available: Technical Preview of EBS 12.2 Online Patching

Read the article
Does *every* project benefit from written specifications?

- by nikie

I know this is holy war territory, so please read the question to the end before answering. There are many cases where written specifications make a lot of sense. For example, if you're a contractor and you want to get paid, you need written specs. If you're working in a team with 20 persons, you need written specs. If you're writing a programming language compiler or interpreter (and it's not perl), you'll usually write a formal specification. I don't doubt that there are many more cases where written specifications are a really good idea. I just think that there are cases where there's so little benefit in written specs, that it doesn't outweigh the costs of writing and maintaining them. EDIT: The close votes say that "it is difficult to say what is asked here", so let me clarify: The usefulness of written, detailed specifications is often claimed like a dogma. (If you want examples, look at the comments.) But I don't see the use of them for the kind of development I'm doing. So what is asked here is: How would written specifications help me? Background information: I work for a small company that's developing vertical market software. If our product is easier to use and has better performance than the competition, it sells. If it's harder to use, even if it behaves 100% as the specification says, it doesn't sell. So there are no "external forces" for having written specs. The advantage would have to be somewhere in the development process. Now, I can see how frozen specifications would make a developer's life easier. But we'll never have frozen specs. If we see in the middle of development that feature X is not intuitive to use the way it's specified, then we can only choose between changing the specification or developing a product that won't sell. You'll probably ask by now: How do you know when you're done? Well, we're continually improving our product. The competition does the same. So (hopefully) we're never done. We keep improving the software, and when we reach a point when the benefits of the improvements we've added since the last release outweigh the costs of an update, we create a new release that is then tested, localized, documented and deployed. This also means that there's rarely any schedule pressure. Nobody has to do overtime to make a deadline. If the feature isn't done by the time we want to release the next version, it'll simply go into the next version. The next question might be: How do your developers know what they're supposed to implement? The answer is: They have a lot of domain knowledge. They know the customers business well enough, so a high-level description of the feature (or even just the problem that the customer needs solved) is enough to implement it. If it's not clear, the developer creates a few fake screens to get feedback from marketing/management or customers, but this is nowhere near the level of detail of actual specifications. This might be inefficient for larger teams, but for a small team with low turnover it works quite well. It has the additional benefit that the developer in question often comes up with a better solution than the person writing the specs might have. This question is already getting very long, but let me address one last point: Testing. Like I said in the beginning, if our software behaves 100% like the spec says, it still can be crap. In fact, if it's so unintuitive that you need a spec to know how to test it, it probably is crap. It makes sense to have fixed, written tests for some core functionality and for regression bugs, but again, this is nowhere near a full written spec of how the software should behave when. The main test is: hand the software to a user who doesn't know it yet and tell him to use the new feature X. If she can figure out how to use it and it works, it works.

Read the article
Why is x=x++ undefined?

- by ugoren

It's undefined because the it modifies x twice between sequence points. The standard says it's undefined, therefore it's undefined. That much I know. But why? My understanding is that forbidding this allows compilers to optimize better. This could have made sense when C was invented, but now seems like a weak argument. If we were to reinvent C today, would we do it this way, or can it be done better? Or maybe there's a deeper problem, that makes it hard to define consistent rules for such expressions, so it's best to forbid them? So suppose we were to reinvent C today. I'd like to suggest simple rules for expressions such as x=x++, which seem to me to work better than the existing rules. I'd like to get your opinion on the suggested rules compared to the existing ones, or other suggestions. Suggested Rules: Between sequence points, order of evaluation is unspecified. Side effects take place immediately. There's no undefined behavior involved. Expressions evaluate to this value or that, but surely won't format your hard disk (strangely, I've never seen an implementation where x=x++ formats the hard disk). Example Expressions x=x++ - Well defined, doesn't change x. First, x is incremented (immediately when x++ is evaluated), then it's old value is stored in x. x++ + ++x - Increments x twice, evaluates to 2*x+2. Though either side may be evaluated first, the result is either x + (x+2) (left side first) or (x+1) + (x+1) (right side first). x = x + (x=3) - Unspecified, x set to either x+3 or 6. If the right side is evaluated first, it's x+3. It's also possible that x=3 is evaluated first, so it's 3+3. In either case, the x=3 assignment happens immediately when x=3 is evaluated, so the value stored is overwritten by the other assignment. x+=(x=3) - Well defined, sets x to 6. You could argue that this is just shorthand for the expression above. But I'd say that += must be executed after x=3, and not in two parts (read x, evaluate x=3, add and store new value). What's the Advantage? Some comments raised this good point. It's not that I'm after the pleasure of using x=x++ in my code. It's a strange and misleading expression. What I want is to be able to understand complicated expressions. Normally, a complicated expression is no more than the sum of its parts. If you understand the parts and the operators combining them, you can understand the whole. C's current behavior seems to deviate from this principle. One assignment plus another assignment suddenly doesn't make two assignments. Today, when I look at x=x++, I can't say what it does. With my suggested rules, I can, by simply examining its components and their relations.

Read the article
Generic Adjacency List Graph implementation

- by DmainEvent

I am trying to come up with a decent Adjacency List graph implementation so I can start tooling around with all kinds of graph problems and algorithms like traveling salesman and other problems... But I can't seem to come up with a decent implementation. This is probably because I am trying to dust the cobwebs off my data structures class. But what I have so far... and this is implemented in Java... is basically an edgeNode class that has a generic type and a weight-in the event the graph is indeed weighted. public class edgeNode<E> { private E y; private int weight; //... getters and setters as well as constructors... } I have a graph class that has a list of edges a value for the number of Vertices and and an int value for edges as well as a boolean value for whether or not it is directed. The brings up my first question, if the graph is indeed directed, shouldn't I have a value in my edgeNode class? Or would I just need to add another vertices to my LinkedList? That would imply that a directed graph is 2X as big as an undirected graph wouldn't it? public class graph { private List<edgeNode<?>> edges; private int nVertices; private int nEdges; private boolean directed; //... getters and setters as well as constructors... } Finally does anybody have a standard way of initializing there graph? I was thinking of reading in a pipe-delimited file but that is so 1997. public graph GenereateGraph(boolean directed, String file){ List<edgeNode<?>> edges; graph g; try{ int count = 0; String line; FileReader input = new FileReader("C:\\Users\\derekww\\Documents\\JavaEE Projects\\graphFile"); BufferedReader bufRead = new BufferedReader(input); line = bufRead.readLine(); count++; edges = new ArrayList<edgeNode<?>>(); while(line != null){ line = bufRead.readLine(); Object edgeInfo = line.split("|")[0]; int weight = Integer.parseInt(line.split("|")[1]); edgeNode<String> e = new edgeNode<String>((String) edges.add(e); } return g; } catch(Exception e){ return null; } } I guess when I am adding edges if boolean is true I would be adding a second edge. So far, this all depends on the file I write. So if I wrote a file with the following Vertices and weights... Buffalo | 18 br Pittsburgh | 20 br New York | 15 br D.C | 45 br I would obviously load them into my list of edges, but how can I represent one vertices connected to the other... so on... I would need the opposite vertices? Say I was representing Highways connected to each city weighted and un-directed (each edge is bi-directional with weights in some fictional distance unit)... Would my implementation be the best way to do that? I found this tutorial online Graph Tutorial that has a connector object. This appears to me be a collection of vertices pointing to each other. So you would have A and B each with there weights and so on, and you would add this to a list and this list of connectors to your graph... That strikes me as somewhat cumbersome and a little dismissive of the adjacency list concept? Am I wrong and that is a novel solution? This is all inspired by steve skiena's Algorithm Design Manual. Which I have to say is pretty good so far. Thanks for any help you can provide.

Read the article
Can someone explain the true landscape of Rails vs PHP deployment, particularly within the context of Reseller-based web hosting (e.g., Hostgator)?

- by rcd

Currently, I have a reseller account with the company HostGator. I design websites, which up until now have occasionally been wrapped in Wordpress CMSs and the like (PHP applications). I then sell hosting (of the site I've designed) to the client, which is pretty simple, in that I can simply click a button and add a new shared hosting account/site with whatever settings I want. Furthermore, I then utilize WHMCS to automate billing and account management. It's a nice package and pretty simple. I pay something like $25 a month, and can sell a hundred accounts under this (because my clients bandwidth requirements are low). Now I am finding the need to develop more customized applications, including a minimalist CMS and several proprietary things. I soon anticipate developing these apps for clients as well. Thus, I've spent the past few months learning Rails, and it's coming along well now. The thing that has nagged at me all along, though, is the deployment issue. I can't wrap my brain around it. It seems like all of the popular options (Heroku, etc) have nice automation with git and are set up in the "Rails Way". I get that (sort of). But it's terribly expensive... a single dyno, a helper, and the cheapest database (which they say is mainly suitable for testing) that isn't limited to 5MB runs $51. This is for ONE app!!! Throw in a "production" DB and you're over $200. This is like... the same prices as getting a server somewhere, right? Meanwhile, going back to what I guess is a "traditional" hosting environment with Hostgator, their server only has Ruby 1.8.7 and Rails 2.3.5... No Rails 3. AND, no Passenger (not that I really understand the difference in CGI or mod_rails or whatever, but they say Passenger is the simplest). So I'm to understand that if I build an app in Rails 3, it won't run at all on this host? But damn, I already have these accounts under my reseller account there, all running static html and/or PHP stuff, right? So what now? How do I get all of this under one simple (and affordable) roof? Forgive my ignorance, but I just don't get it. Managing a VPS is cool and all, but entails learning server admin stuff and security... And it's expensive. I get that a shared and/or reseller "server-based" (forgive the terminology) may be inadequate for large-scale apps that use a lot of bandwidth... But what about for those of us who are building real (but small and low bandwidth) apps (with Rails) and who want to deploy them simply, cheaply, using the same conceptual approach as PHP? Even after learning all of this Ruby and Rails stuff for months, I'm questioning whether it's worth it when it comes to deployment. I want to build a small app, upload it to my home directory on a shared server account, and just make it run. Why should that be so hard? Am I just choosing the wrong language/framework? Forgive my ignorance in the subject; these questions are not rhetorical; just trying to learn here. So: 1) I'd appreciate if someone could give me a good rundown of how to understand deployment in Rails vs. PHP. 2) I'd appreciate if someone could address my issue with running a hosting/web business around reseller hosting (Hostgator) while also being able to host Rails apps. Can it be done? And how can a company like Hostgator completely ignore what's current in Rails/Ruby? Thanks.

Read the article
The long road to bug-free software

- by Tony Davis

The past decade has seen a burgeoning interest in functional programming languages such as Haskell or, in the Microsoft world, F#. Though still on the periphery of mainstream programming, functional programming concepts are gradually seeping into the imperative C# language (for example, Lambda expressions have their root in functional programming). One of the more interesting concepts from functional programming languages is the use of formal methods, the lofty ideal behind which is bug-free software. The idea is that we write a specification that describes exactly how our function (say) should behave. We then prove that our function conforms to it, and in doing so have proved beyond any doubt that it is free from bugs. All programmers already use one form of specification, specifically their programming language's type system. If a value has a specific type then, in a type-safe language, the compiler guarantees that value cannot be an instance of a different type. Many extensions to existing type systems, such as generics in Java and .NET, extend the range of programs that can be type-checked. Unfortunately, type systems can only prevent some bugs. To take a classic problem of retrieving an index value from an array, since the type system doesn't specify the length of the array, the compiler has no way of knowing that a request for the "value of index 4" from an array of only two elements is "unsafe". We restore safety via exception handling, but the ideal type system will prevent us from doing anything that is unsafe in the first place and this is where we start to borrow ideas from a language such as Haskell, with its concept of "dependent types". If the type of an array includes its length, we can ensure that any index accesses into the array are valid. The problem is that we now need to carry around the length of arrays and the values of indices throughout our code so that it can be type-checked. In general, writing the specification to prove a positive property, even for a problem very amenable to specification, such as a simple sorting algorithm, turns out to be very hard and the specification will be different for every program. Extend this to writing a specification for, say, Microsoft Word and we can see that the specification would end up being no simpler, and therefore no less buggy, than the implementation. Fortunately, it is easier to write a specification that proves that a program doesn't have certain, specific and undesirable properties, such as infinite loops or accesses to the wrong bit of memory. If we can write the specifications to prove that a program is immune to such problems, we could reuse them in many places. The problem is the lack of specification "provers" that can do this without a lot of manual intervention (i.e. hints from the programmer). All this might feel a very long way off, but computing power and our understanding of the theory of "provers" advances quickly, and Microsoft is doing some of it already. Via their Terminator research project they have started to prove that their device drivers will always terminate, and in so doing have suddenly eliminated a vast range of possible bugs. This is a huge step forward from saying, "we've tested it lots and it seems fine". What do you think? What might be good targets for specification and verification? SQL could be one: the cost of a bug in SQL Server is quite high given how many important systems rely on it, so there's a good incentive to eliminate bugs, even at high initial cost. [Many thanks to Mike Williamson for guidance and useful conversations during the writing of this piece] Cheers, Tony.

Read the article
How do you go from a so so programmer to a great one? [closed]

- by Cervo

How do you go from being an okay programmer to being able to write maintainable clean code? For example David Hansson was writing Basecamp when in the process he created Rails as part of writing Basecamp in a clean/maintainable way. But how do you know when there is value in a side project like that? I have a bachelors in computer science, and I am about to get a masters and I will say that colleges teach you to write code to solve problems, not neatly or anything. Basically you think of a problem, come up with a solution, and write it down...not necessarily the most maintainable way in the world. Also my first job was in a startup, and now my third is in a small team in a large company where the attitude was/is get it done yesterday (also most of my jobs are mainly database development with SQL with a few ASP.NET web pages/.NET apps on the side). So of course cut/paste is more favored than making things more cleanly. And they would rather have something yesterday even if you have to rewrite it next month rather than to have something in a week that lasts for a year. Also spaghetti code turns up all over the place, and it takes very smart people to write/understand/maintain spaghetti code...However it would be better to do things so simple/clean that even a caveman/woman could do maintenance. Also I get very bored/unmotivated having to go modify the same things cut/pasted in a few locations. Is this the type of skill that you need to learn by working with a serious software organization that has an emphasis on maintenance and maybe even an architect who designs a system architecture and reviews code? Could you really learn it by volunteering on an open source project (it seems to me that a full time programmer job is way more practice than a few hours a week on an open source project)? Is there some course where you can learn this? I can attest that graduate school and undergraduate school do not really emphasize clean software at all. They just teach the structures/algorithms and then send you off into the world to solve problems. Overall I think the first thing is learning to write clean/maintainable code within the bounds of the project in order to become a good programmer. Then the next thing is learning when you need to do a side project (like a framework) to make things more maintainable/clean even while you still deliver things for the deadline in order to become a great programmer. For example, you are making an SQL report and someone gives you 100 calculations for individual columns. At what point does it make sense to construct a domain specific language to encode the rules in simply and then generate all the SQL as opposed to cut/pasting the query from the table a bunch of times and then adjusting each query to do the appropriate calculations. This is the type of thing I would say a great programmer would know. He/she would maybe even know ways to avoid the domain specific language and to still do all the calculations without creating an unmaintainable mess or a ton of repetitive code to cut/paste everywhere.

Read the article
Reading graph inputs for a programming puzzle and then solving it

- by Vrashabh

I just took a programming competition question and I absolutely bombed it. I had trouble right at the beginning itself from reading the input set. The question was basically a variant of this puzzle http://codercharts.com/puzzle/evacuation-plan but also had an hour component in the first line(say 3 hours after start of evacuation). It reads like this This puzzle is a tribute to all the people who suffered from the earthquake in Japan. The goal of this puzzle is, given a network of road and locations, to determine the maximum number of people that can be evacuated. The people must be evacuated from evacuation points to rescue points. The list of road and the number of people they can carry per hour is provided. Input Specifications Your program must accept one and only one command line argument: the input file. The input file is formatted as follows: the first line contains 4 integers n r s t n is the number of locations (each location is given by a number from 0 to n-1) r is the number of roads s is the number of locations to be evacuated from (evacuation points) t is the number of locations where people must be evacuated to (rescue points) the second line contains s integers giving the locations of the evacuation points the third line contains t integers giving the locations of the rescue points the r following lines contain to the road definitions. Each road is defined by 3 integers l1 l2 width where l1 and l2 are the locations connected by the road (roads are one-way) and width is the number of people per hour that can fit on the road Now look at the sample input set 5 5 1 2 3 0 3 4 0 1 10 0 2 5 1 2 4 1 3 5 2 4 10 The 3 in the first line is the additional component and is defined as the number of hours since the resuce has started which is 3 in this case. Now my solution was to use Dijisktras algorithm to find the shortest path between each of the rescue and evac nodes. Now my problem started with how to read the input set. I read the first line in python and stored the values in variables. But then I did not know how to store the values of the distance between the nodes and what DS to use and how to input it to say a standard implementation of dijikstras algorithm. So my question is two fold 1.) How do I take the input of such problems? - I have faced this problem in quite a few competitions recently and I hope I can get a simple code snippet or an explanation in java or python to read the data input set in such a way that I can input it as a graph to graph algorithms like dijikstra and floyd/warshall. Also a solution to the above problem would also help. 2.) How to solve this puzzle? My algorithm was: Find shortest path between evac points (in the above example it is 14 from 0 to 3) Multiply it by number of hours to get maximal number of saves Also the answer given for the variant for the input set was 24 which I dont understand. Can someone explain that also. UPDATE: I get how the answer is 14 in the given problem link - it seems to be just the shortest path between node 0 and 3. But with the 3 hour component how is the answer 24 UPDATE I get how it is 24 - its a complete graph traversal at every hour and this is how I solve it Hour 1 Node 0 to Node 1 - 10 people Node 0 to Node 2- 5 people TotalRescueCount=0 Node 1=10 Node 2= 5 Hour 2 Node 1 to Node 3 = 5(Rescued) Node 2 to Node 4 = 5(Rescued) Node 0 to Node 1 = 10 Node 0 to Node 2 = 5 Node 1 to Node 2 = 4 TotalRescueCount = 10 Node 1 = 10 Node 2= 5+4 = 9 Hour 3 Node 1 to Node 3 = 5(Rescued) Node 2 to Node 4 = 5+4 = 9(Rescued) TotalRescueCount = 9+5+10 = 24 It hard enough for this case , for multiple evac and rescue points how in the world would I write a pgm for this ?

Read the article
How to export SSIS to Microsoft Excel without additional software?

- by Dr. Zim

This question is long winded because I have been updating the question over a very long time trying to get SSIS to properly export Excel data. I managed to solve this issue, although not correctly. Aside from someone providing a correct answer, the solution listed in this question is not terrible. The only answer I found was to create a single row named range wide enough for my columns. In the named range put sample data and hide it. SSIS appends the data and reads metadata from the single row (that is close enough for it to drop stuff in it). The data takes the format of the hidden single row. This allows headers, etc. WOW what a pain in the butt. It will take over 450 days of exports to recover the time lost. However, I still love SSIS and will continue to use it because it is still way better than Filemaker LOL. My next attempt will be doing the same thing in the report server. Original question notes: If you are in Sql Server Integrations Services designer and want to export data to an Excel file starting on something other than the first line, lets say the forth line, how do you specify this? I tried going in to the Excel Destination of the Data Flow, changed the AccessMode to OpenRowSet from Variable, then set the variable to "YPlatters$A4:I20000" This fails saying it cannot find the sheet. The sheet is called YPlatters. I thought you could specify (Sheet$)(Starting Cell):(Ending Cell)? Update Apparently in Excel you can select a set of cells and name them with the name box. This allows you to select the name instead of the sheet without the $ dollar sign. Oddly enough, whatever the range you specify, it appends the data to the next row after the range. Oddly, as you add data, it increases the named selection's row count. Another odd thing is the data takes the format of the last line of the range specified. My header rows are bold. If I specify a range that ends with the header row, the data appends to the row below, and makes all the entries bold. if you specify one row lower, it puts a blank line between the header row and the data, but the data is not bold. Another update No matter what I try, SSIS samples the "first row" of the file and sets the metadata according to what it finds. However, if you have sample data that has a value of zero but is formatted as the first row, it treats that column as text and inserts numeric values with a single quote in front ('123.34). I also tried headers that do not reflect the data types of the columns. I tried changing the metadata of the Excel destination, but it always changes it back when I run the project, then fails saying it will truncate data. If I tell it to ignore errors, it imports everything except that column. Several days of several hours a piece later... Another update I tried every combination. A mostly working example is to create the named range starting with the column headers. Format your column headers as you want the data to look as the data takes on this format. In my example, these exist from A4 to E4, which is my defined range. SSIS appends to the row after the defined range, so defining A4 to E68 appends the rows starting at A69. You define the Connection as having the first row contains the field names. It takes on the metadata of the header row, oddly, not the second row, and it guesses at the data type, not the formatted data type of the column, i.e., headers are text, so all my metadata is text. If your headers are bold, so is all of your data. I even tried making a sample data row without success... I don't think anyone actually uses Excel with the default MS SSIS export. If you could define the "insert range" (A5 to E5) with no header row and format those columns (currency, not bold, etc.) without it skipping a row in Excel, this would be very helpful. From what I gather, noone uses SSIS to export Excel without a third party connection manager. Any ideas on how to set this up properly so that data is formatted correctly, i.e., the metadata read from Excel is proper to the real data, and formatting inherits from the first row of data, not the headers in Excel? One last update (July 17, 2009) I got this to work very well. One thing I added to Excel was the IMEX=1 in the Excel connection string: "Excel 8.0;HDR=Yes;IMEX=1". This forces Excel (I think) to look at all rows to see what kind of data is in it. Generally, this does not drop information, say for instance if you have a zip code then about 9 rows down you have a zip+4, Excel without this blanks that field entirely without error. With IMEX=1, it recognizes that Zip is actually a character field instead of numeric. And of course, one more update (August 27, 2009) The IMEX=1 will succeed importing data with missing contents in the first 8 rows, but it will fail exporting data where no data exists. So, have it on your import connection string, but not your export Excel connection string. I have to say, after so much fiddling, it works pretty well.

Read the article
Changing CSS on the fly in a UIWebView on iPhone

- by Shaggy Frog

Let's say I'm developing an iPhone app that is a catalogue of cars. The user will choose a car from a list, and I will present a detail view for the car, which will describe things like top speed. The detail view will essentially be a UIWebView that is loading an existing HTML file. Different users will live in different parts of the world, so they will like to see the top speed for the car in whatever units are appropriate for their locale. Let's say there are two such units: SI (km/h) and conventional (mph). Let's also say the user will be able to change the display units by hitting a button on the screen; when that happens, the detail screen should switch to show the relevant units. So far, here's what I've done to try and solve this. The HTML might look something like this: <?xml version="1.0" encoding="UTF-8"?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US" lang="en-US"> <head> <title>Some Car</title> <link rel="stylesheet" media="screen" type="text/css" href="persistent.css" /> <link rel="alternate stylesheet" media="screen" type="text/css" href="si.css" title="si" /> <link rel="alternate stylesheet" media="screen" type="text/css" href="conventional.css" title="conventional" /> <script type="text/javascript" src="switch.js"></script> </head> <body> <h1>Some Car</h1> <div id="si"> <h2>Top Speed: 160 km/h</h2> </div> <div id="conventional"> <h2>Top Speed: 100 mph</h2> </div> </body> The peristent stylesheet, persistent.css: #si { display:none; } #conventional { display:none; } The first alternate stylesheet, si.css: #si { display:inline; } #conventional { display:none; } And the second alternate stylesheet, conventional.css: #si { display:none; } #conventional { display:inline; } Based on a tutorial at A List Apart, my switch.js looks something like this: function disableStyleSheet(title) { var i, a; for (i = 0; (a = document.getElementsByTagName("link")[i]); i++) { if ((a.getAttribute("rel").indexOf("alt") != -1) && (a.getAttribute("title") == title)) { a.disabled = true; } } } function enableStyleSheet(title) { var i, a; for (i = 0; (a = document.getElementsByTagName("link")[i]); i++) { if ((a.getAttribute("rel").indexOf("alt") != -1) && (a.getAttribute("title") == title)) { a.disabled = false; } } } function switchToSiStyleSheet() { disableStyleSheet("conventional"); enableStyleSheet("si"); } function switchToConventionalStyleSheet() { disableStyleSheet("si"); enableStyleSheet("conventional"); } My button action handler looks something like this: - (void)notesButtonAction:(id)sender { static BOOL isUsingSi = YES; if (isUsingSi) { NSString* command = [[NSString alloc] initWithString:@"switchToSiStyleSheet();"]; [self.webView stringByEvaluatingJavaScriptFromString:command]; [command release]; } else { NSString* command = [[NSString alloc] initWithFormat:@"switchToConventionalStyleSheet();"]; [self.webView stringByEvaluatingJavaScriptFromString:command]; [command release]; } isUsingSi = !isUsingSi; } Here's the first problem. The first time the button is hit, the UIWebView doesn't change. The second time it's hit, it looks like the conventional style sheet is loaded. The third time, it switches to the SI style sheet; the fourth time, back to the conventional, and so on. So, basically, only that first button press doesn't seem to do anything. Here's the second problem. I'm not sure how to switch to the correct style sheet upon initial load of the UIWebView. I tried this: - (void)webViewDidFinishLoad:(UIWebView *)webView { NSString* command = [[NSString alloc] initWithString:@"switchToSiStyleSheet();"]; [self.webView stringByEvaluatingJavaScriptFromString:command]; [command release]; } But, like the first button hit, it doesn't seem to do anything. Can anyone help me with these two problems?

Read the article
Laissez les bon temps rouler! (Microsoft BI Conference 2010)

- by smisner

"Laissez les bons temps rouler" is a Cajun phrase that I heard frequently when I lived in New Orleans in the mid-1990s. It means "Let the good times roll!" and encapsulates a feeling of happy expectation. As I met with many of my peers and new acquaintances at the Microsoft BI Conference last week, this phrase kept running through my mind as people spoke about their plans in their respective businesses, the benefits and opportunities that the recent releases in the BI stack are providing, and their expectations about the future of the BI stack. Notwithstanding some jabs here and there to point out the platform is neither perfect now nor will be anytime soon (along with admissions that the competitors are also not perfect), and notwithstanding several missteps by the event organizers (which I don't care to enumerate), the overarching mood at the conference was positive. It was a refreshing change from the doom and gloom hovering over several conferences that I attended in 2009. Although many people expect economic hardships to continue over the coming year or so, everyone I know in the BI field is busier than ever and expects to stay busy for quite a while. Self-Service BI Self-service was definitely a theme of the BI conference. In the keynote, Ted Kummert opened with a look back to a fairy tale vision of self-service BI that he told in 2008. At that time, the fairy tale future was a time when "every end user was able to use BI technologies within their job in order to move forward more effectively" and transitioned to the present time in which SQL Server 2008 R2, Office 2010, and SharePoint 2010 are available to deliver managed self-service BI. This set of technologies is presumably poised to address the needs of the 80% of users that Kummert said do not use BI today. He proceeded to outline a series of activities that users ought to be able to do themselves--from simple changes to a report like formatting or an addtional data visualization to integration of an additional data source. The keynote then continued with a series of demonstrations of both current and future technology in support of self-service BI. Some highlights that interested me: PowerPivot, of course, is the flagship product for self-service BI in the Microsoft BI stack. In the TechEd keynote, which was open to the BI conference attendees, Amir Netz (twitter) impressed the audience by demonstrating interactivity with a workbook containing 100 million rows. He upped the ante at the BI keynote with his demonstration of a future-state PowerPivot workbook containing over 2 billion records. It's important to note that this volume of data is being processed by a server engine, and not in the PowerPivot client engine. (Yes, I think it's impressive, but none of my clients are typically wrangling with 2 billion records at a time. Maybe they're thinking too small. This ability to work quickly with large data sets has greater implications for BI solutions than for self-service BI, in my opinion.) Amir also demonstrated KPIs for the future PowerPivot, which appeared to be easier to implement than in any other Microsoft product that supports KPIs, apart from simple KPIs in SharePoint. (My initial reaction is that we have one more place to build KPIs. Great. It's confusing enough. I haven't seen how well those KPIs integrate with other BI tools, which will be important for adoption.) One more PowerPivot feature that Amir showed was a graphical display of the lineage for calculations. (This is hugely practical, especially if you build up calculations incrementally. You can more easily follow the logic from calculation to calculation. Furthermore, if you need to make a change to one calculation, you can assess the impact on other calculations.) Another product demonstration will be available within the next 30 days--Pivot for Reporting Services. If you haven't seen this technology yet, check it out at www.getpivot.com. (It definitely has a wow factor, but I'm skeptical about its practicality. However, I'm looking forward to trying it out with data that I understand.) Michael Tejedor (twitter) demonstrated a feature that I think is really interesting and not emphasized nearly enough--overshadowed by PowerPivot, no doubt. That feature is the Microsoft Business Intelligence Indexing Connector, which enables search of the content of Excel workbooks and Reporting Services reports. (This capability existed in MOSS 2007, but was more cumbersome to implement. The search results in SharePoint 2010 are not only cooler, but more useful by describing whether the content is found in a table or a chart, for example.) This may yet be the dawning of the age of self-service BI - a phrase I've heard repeated from time to time over the last decade - but I think BI professionals are likely to stay busy for a long while, and need not start looking for a new line of work. Kummert repeatedly referenced strategic BI solutions in contrast to self-service BI to emphasize that self-service BI is not a replacement for the services that BI professionals provide. After all, self-service BI does not appear magically on user desktops (or whatever device they want to use). A supporting infrastructure is necessary, and grows in complexity in proportion to the need to simplify BI for users. It's one thing to hear the party line touted by Microsoft employees at the BI keynote, but it's another to hear from the people who are responsible for implementing and supporting it within an organization. Rob Collie (blog | twitter), Kasper de Jonge (blog | twitter), Vidas Matelis (site | twitter), and I were invited to join Andrew Brust (blog | twitter) as he led a Birds of a Feather session at TechEd entitled "PowerPivot: Is It the BI Deal-Changer for Developers and IT Pros?" I would single out the prevailing concern in this session as the issue of control. On one side of this issue were those who were concerned that they would lose control once PowerPivot is implemented. On the other side were those who believed that data should be freely accessible to users in PowerPivot, and even acknowledgment that users would get the data they want even if it meant they would have to manually enter into a workbook to have it ready for analysis. For another viewpoint on how PowerPivot played out at the conference, see Rob Collie's observations. Collaborative BI I have been intrigued by the notion of collaborative BI for a very long time. Before I discovered BI, I was a Lotus Notes developer and later a manager of developers, working in a software company that enabled collaboration in the legal industry. Not only did I help create collaborative systems for our clients, I created a complete project management from the ground up to collaboratively manage our custom development work. In that case, collaboration involved my team, my client contacts, and me. I was also able to produce my own BI from that system as well, but didn't know that's what I was doing at the time. Only in recent years has SharePoint begun to catch up with the capabilities that I had with Lotus Notes more than a decade ago. Eventually, I had the opportunity at that job to formally investigate BI as another product offering for our software, and the rest - as they say - is history. I built my first data warehouse with Scott Cameron (who has also ventured into the authoring world by writing Analysis Services 2008 Step by Step and was at the BI Conference last week where I got to reminisce with him for a bit) and that began a career that I never imagined at the time. Fast forward to 2010, and I'm still lauding the virtues of collaborative BI, if only the tools will catch up to my vision! Thus, I was anxious to see what Donald Farmer (blog | twitter) and Rita Sallam of Gartner had to say on the subject in their session "Collaborative Decision Making." As I suspected, the tools aren't quite there yet, but the vendors are moving in the right direction. One thing I liked about this session was a non-Microsoft perspective of the state of the industry with regard to collaborative BI. In addition, this session included a better demonstration of SharePoint collaborative BI capabilities than appeared in the BI keynote. Check out the video in the link to the session to see the demonstration. One of the use cases that was demonstrated was linking from information to a person, because, as Donald put it, "People don't trust data, they trust people." The Microsoft BI Stack in General A question I hear all the time from students when I'm teaching is how to know what tools to use when there is overlap between products in the BI stack. I've never taken the time to codify my thoughts on the subject, but saw that my friend Dan Bulos provided good insight on this topic from a variety of perspectives in his session, "So Many BI Tools, So Little Time." I thought one of his best points was that ideally you should be able to design in your tool of choice, and then deploy to your tool of choice. Unfortunately, the ideal is yet to become real across the platform. The closest we come is with the RDL in Reporting Services which can be produced from two different tools (Report Builder or Business Intelligence Development Studio's Report Designer), manually, or by a third-party or custom application. I have touted the idea for years (and publicly said so about 5 years ago) that eventually more products would be RDL producers or consumers, but we aren't there yet. Maybe in another 5 years. Another interesting session that covered the BI stack against a backdrop of competitive products was delivered by Andrew Brust. Andrew did a marvelous job of consolidating a lot of information in a way that clearly communicated how various vendors' offerings compared to the Microsoft BI stack. He also made a particularly compelling argument about how the existence of an ecosystem around the Microsoft BI stack provided innovation and opportunities lacking for other vendors. Check out his presentation, "How Does the Microsoft BI Stack...Stack Up?" Expo Hall I had planned to spend more time in the Expo Hall to see who was doing new things with the BI stack, but didn't manage to get very far. Each time I set out on an exploratory mission, I got caught up in some fascinating conversations with one or more of my peers. I find interacting with people that I meet at conferences just as important as attending sessions to learn something new. There were a couple of items that really caught me eye, however, that I'll share here. Pragmatic Works. Whether you develop SSIS packages, build SSAS cubes, or author SSRS reports (or all of the above), you really must take a look at BI Documenter. Brian Knight (twitter) walked me through the key features, and I must say I was impressed. Once you've seen what this product can do, you won't want to document your BI projects any other way. You can download a free single-user database edition, or choose from more feature-rich standard or professional editions. Microsoft Press ebooks. I also stopped by the O'Reilly Media booth to meet some folks that one of my acquisitions editors at Microsoft Press recommended. In case you haven't heard, Microsoft Press has partnered with O'Reilly Media for distribution and publishing. Apart from my interest in learning more about O'Reilly Media as an author, an advertisement in their booth caught me eye which I think is a really great move. When you buy Microsoft Press ebooks through the O'Reilly web site, you can receive it in any (or all) of the following formats where possible: PDF, epub, .mobi for Kindle and .apk for Android. You also have lifetime DRM-free access to the ebooks. As someone who is an avid collector of books, I fnd myself running out of room for storage. In addition, I travel a lot, and it's hard to lug my reference library with me. Today's e-reader options make the move to digital books a more viable way to grow my library. Having a variety of formats means I am not limited to a single device, and lifetime access means I don't have to worry about keeping track of where I've stored my files. Because the e-books are DRM-free, I can copy and paste when I'm compiling notes, and I can print pages when necessary. That's a winning combination in my mind! Overall, I was pleased with the BI conference. There were many more sessions that I couldn't attend, either because the room was full when I got there or there were multiple sessions running concurrently that I wanted to see. Fortunately, many of the sessions are accessible for viewing online at http://www.msteched.com/2010/NorthAmerica along with the TechEd sessions. You can spot the BI sessions by the yellow skyline on the title slide of the presentation as shown below. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
C#/.NET Little Wonders: The Joy of Anonymous Types

- by James Michael Hare

Once again, in this series of posts I look at the parts of the .NET Framework that may seem trivial, but can help improve your code by making it easier to write and maintain. The index of all my past little wonders posts can be found here. In the .NET 3 Framework, Microsoft introduced the concept of anonymous types, which provide a way to create a quick, compiler-generated types at the point of instantiation. These may seem trivial, but are very handy for concisely creating lightweight, strongly-typed objects containing only read-only properties that can be used within a given scope. Creating an Anonymous Type In short, an anonymous type is a reference type that derives directly from object and is defined by its set of properties base on their names, number, types, and order given at initialization. In addition to just holding these properties, it is also given appropriate overridden implementations for Equals() and GetHashCode() that take into account all of the properties to correctly perform property comparisons and hashing. Also overridden is an implementation of ToString() which makes it easy to display the contents of an anonymous type instance in a fairly concise manner. To construct an anonymous type instance, you use basically the same initialization syntax as with a regular type. So, for example, if we wanted to create an anonymous type to represent a particular point, we could do this: 1: var point = new { X = 13, Y = 7 }; Note the similarity between anonymous type initialization and regular initialization. The main difference is that the compiler generates the type name and the properties (as readonly) based on the names and order provided, and inferring their types from the expressions they are assigned to. It is key to remember that all of those factors (number, names, types, order of properties) determine the anonymous type. This is important, because while these two instances share the same anonymous type: 1: // same names, types, and order 2: var point1 = new { X = 13, Y = 7 }; 3: var point2 = new { X = 5, Y = 0 }; These similar ones do not: 1: var point3 = new { Y = 3, X = 5 }; // different order 2: var point4 = new { X = 3, Y = 5.0 }; // different type for Y 3: var point5 = new {MyX = 3, MyY = 5 }; // different names 4: var point6 = new { X = 1, Y = 2, Z = 3 }; // different count Limitations on Property Initialization Expressions The expression for a property in an anonymous type initialization cannot be null (though it can evaluate to null) or an anonymous function. For example, the following are illegal: 1: // Null can't be used directly. Null reference of what type? 2: var cantUseNull = new { Value = null }; 3: 4: // Anonymous methods cannot be used. 5: var cantUseAnonymousFxn = new { Value = () => Console.WriteLine(“Can’t.”) }; Note that the restriction on null is just that you can’t use it directly as the expression, because otherwise how would it be able to determine the type? You can, however, use it indirectly assigning a null expression such as a typed variable with the value null, or by casting null to a specific type: 1: string str = null; 2: var fineIndirectly = new { Value = str }; 3: var fineCast = new { Value = (string)null }; All of the examples above name the properties explicitly, but you can also implicitly name properties if they are being set from a property, field, or variable. In these cases, when a field, property, or variable is used alone, and you don’t specify a property name assigned to it, the new property will have the same name. For example: 1: int variable = 42; 2: 3: // creates two properties named varriable and Now 4: var implicitProperties = new { variable, DateTime.Now }; Is the same type as: 1: var explicitProperties = new { variable = variable, Now = DateTime.Now }; But this only works if you are using an existing field, variable, or property directly as the expression. If you use a more complex expression then the name cannot be inferred: 1: // can't infer the name variable from variable * 2, must name explicitly 2: var wontWork = new { variable * 2, DateTime.Now }; In the example above, since we typed variable * 2, it is no longer just a variable and thus we would have to assign the property a name explicitly. ToString() on Anonymous Types One of the more trivial overrides that an anonymous type provides you is a ToString() method that prints the value of the anonymous type instance in much the same format as it was initialized (except actual values instead of expressions as appropriate of course). For example, if you had: 1: var point = new { X = 13, Y = 42 }; And then print it out: 1: Console.WriteLine(point.ToString()); You will get: 1: { X = 13, Y = 42 } While this isn’t necessarily the most stunning feature of anonymous types, it can be handy for debugging or logging values in a fairly easy to read format. Comparing Anonymous Type Instances Because anonymous types automatically create appropriate overrides of Equals() and GetHashCode() based on the underlying properties, we can reliably compare two instances or get hash codes. For example, if we had the following 3 points: 1: var point1 = new { X = 1, Y = 2 }; 2: var point2 = new { X = 1, Y = 2 }; 3: var point3 = new { Y = 2, X = 1 }; If we compare point1 and point2 we’ll see that Equals() returns true because they overridden version of Equals() sees that the types are the same (same number, names, types, and order of properties) and that the values are the same. In addition, because all equal objects should have the same hash code, we’ll see that the hash codes evaluate to the same as well: 1: // true, same type, same values 2: Console.WriteLine(point1.Equals(point2)); 3: 4: // true, equal anonymous type instances always have same hash code 5: Console.WriteLine(point1.GetHashCode() == point2.GetHashCode()); However, if we compare point2 and point3 we get false. Even though the names, types, and values of the properties are the same, the order is not, thus they are two different types and cannot be compared (and thus return false). And, since they are not equal objects (even though they have the same value) there is a good chance their hash codes are different as well (though not guaranteed): 1: // false, different types 2: Console.WriteLine(point2.Equals(point3)); 3: 4: // quite possibly false (was false on my machine) 5: Console.WriteLine(point2.GetHashCode() == point3.GetHashCode()); Using Anonymous Types Now that we’ve created instances of anonymous types, let’s actually use them. The property names (whether implicit or explicit) are used to access the individual properties of the anonymous type. The main thing, once again, to keep in mind is that the properties are readonly, so you cannot assign the properties a new value (note: this does not mean that instances referred to by a property are immutable – for more information check out C#/.NET Fundamentals: Returning Data Immutably in a Mutable World). Thus, if we have the following anonymous type instance: 1: var point = new { X = 13, Y = 42 }; We can get the properties as you’d expect: 1: Console.WriteLine(“The point is: ({0},{1})”, point.X, point.Y); But we cannot alter the property values: 1: // compiler error, properties are readonly 2: point.X = 99; Further, since the anonymous type name is only known by the compiler, there is no easy way to pass anonymous type instances outside of a given scope. The only real choices are to pass them as object or dynamic. But really that is not the intention of using anonymous types. If you find yourself needing to pass an anonymous type outside of a given scope, you should really consider making a POCO (Plain Old CLR Type – i.e. a class that contains just properties to hold data with little/no business logic) instead. Given that, why use them at all? Couldn’t you always just create a POCO to represent every anonymous type you needed? Sure you could, but then you might litter your solution with many small POCO classes that have very localized uses. It turns out this is the key to when to use anonymous types to your advantage: when you just need a lightweight type in a local context to store intermediate results, consider an anonymous type – but when that result is more long-lived and used outside of the current scope, consider a POCO instead. So what do we mean by intermediate results in a local context? Well, a classic example would be filtering down results from a LINQ expression. For example, let’s say we had a List<Transaction>, where Transaction is defined something like: 1: public class Transaction 2: { 3: public string UserId { get; set; } 4: public DateTime At { get; set; } 5: public decimal Amount { get; set; } 6: // … 7: } And let’s say we had this data in our List<Transaction>: 1: var transactions = new List<Transaction> 2: { 3: new Transaction { UserId = "Jim", At = DateTime.Now, Amount = 2200.00m }, 4: new Transaction { UserId = "Jim", At = DateTime.Now, Amount = -1100.00m }, 5: new Transaction { UserId = "Jim", At = DateTime.Now.AddDays(-1), Amount = 900.00m }, 6: new Transaction { UserId = "John", At = DateTime.Now.AddDays(-2), Amount = 300.00m }, 7: new Transaction { UserId = "John", At = DateTime.Now, Amount = -10.00m }, 8: new Transaction { UserId = "Jane", At = DateTime.Now, Amount = 200.00m }, 9: new Transaction { UserId = "Jane", At = DateTime.Now, Amount = -50.00m }, 10: new Transaction { UserId = "Jaime", At = DateTime.Now.AddDays(-3), Amount = -100.00m }, 11: new Transaction { UserId = "Jaime", At = DateTime.Now.AddDays(-3), Amount = 300.00m }, 12: }; So let’s say we wanted to get the transactions for each day for each user. That is, for each day we’d want to see the transactions each user performed. We could do this very simply with a nice LINQ expression, without the need of creating any POCOs: 1: // group the transactions based on an anonymous type with properties UserId and Date: 2: byUserAndDay = transactions 3: .GroupBy(tx => new { tx.UserId, tx.At.Date }) 4: .OrderBy(grp => grp.Key.Date) 5: .ThenBy(grp => grp.Key.UserId); Now, those of you who have attempted to use custom classes as a grouping type before (such as GroupBy(), Distinct(), etc.) may have discovered the hard way that LINQ gets a lot of its speed by utilizing not on Equals(), but also GetHashCode() on the type you are grouping by. Thus, when you use custom types for these purposes, you generally end up having to write custom Equals() and GetHashCode() implementations or you won’t get the results you were expecting (the default implementations of Equals() and GetHashCode() are reference equality and reference identity based respectively). As we said before, it turns out that anonymous types already do these critical overrides for you. This makes them even more convenient to use! Instead of creating a small POCO to handle this grouping, and then having to implement a custom Equals() and GetHashCode() every time, we can just take advantage of the fact that anonymous types automatically override these methods with appropriate implementations that take into account the values of all of the properties. Now, we can look at our results: 1: foreach (var group in byUserAndDay) 2: { 3: // the group’s Key is an instance of our anonymous type 4: Console.WriteLine("{0} on {1:MM/dd/yyyy} did:", group.Key.UserId, group.Key.Date); 5: 6: // each grouping contains a sequence of the items. 7: foreach (var tx in group) 8: { 9: Console.WriteLine("\t{0}", tx.Amount); 10: } 11: } And see: 1: Jaime on 06/18/2012 did: 2: -100.00 3: 300.00 4: 5: John on 06/19/2012 did: 6: 300.00 7: 8: Jim on 06/20/2012 did: 9: 900.00 10: 11: Jane on 06/21/2012 did: 12: 200.00 13: -50.00 14: 15: Jim on 06/21/2012 did: 16: 2200.00 17: -1100.00 18: 19: John on 06/21/2012 did: 20: -10.00 Again, sure we could have just built a POCO to do this, given it an appropriate Equals() and GetHashCode() method, but that would have bloated our code with so many extra lines and been more difficult to maintain if the properties change. Summary Anonymous types are one of those Little Wonders of the .NET language that are perfect at exactly that time when you need a temporary type to hold a set of properties together for an intermediate result. While they are not very useful beyond the scope in which they are defined, they are excellent in LINQ expressions as a way to create and us intermediary values for further expressions and analysis. Anonymous types are defined by the compiler based on the number, type, names, and order of properties created, and they automatically implement appropriate Equals() and GetHashCode() overrides (as well as ToString()) which makes them ideal for LINQ expressions where you need to create a set of properties to group, evaluate, etc. Technorati Tags: C#,CSharp,.NET,Little Wonders,Anonymous Types,LINQ

Read the article
You should NOT be writing jQuery in SharePoint if…

- by Mark Rackley

Yes… another one of these posts. What can I say? I’m a pot stirrer.. a rabble rouser *rabble rabble* jQuery in SharePoint seems to be a fairly polarizing issue with one side thinking it is the most awesome thing since Princess Leia as the slave girl in Return of the Jedi and the other half thinking it is the worst idea since Mannequin 2: On the Move. The correct answer is OF COURSE “it depends”. But what are those deciding factors that make jQuery an awesome fit or leave a bad taste in your mouth? Let’s see if I can drive the discussion here with some polarizing comments of my own… I know some of you are getting ready to leave your comments even now before reading the rest of the blog, which is great! Iron sharpens iron… These discussions hopefully open us up to understanding the entire process better and think about things in a different way. You should not be writing jQuery in SharePoint if you are not a developer… Let’s start off with my most polarizing and rant filled portion of the blog post. If you don’t know what you are doing or you don’t have a background that helps you understand the implications of what you are writing then you should not be writing jQuery in SharePoint! I truly believe that one of the biggest reasons for the jQuery haters is because of all the bad jQuery out there. If you don’t know what you are doing you can do some NASTY things! One of the best stories I’ve heard about this is from my good friend John Ferringer (@ferringer). John tells this story during our Mythbusters session we do together. One of his clients was undergoing a Denial of Service attack and they couldn’t figure out what was going on! After much searching they found that some genius jQuery developer wrote some code for an image rotator, but did not take into account what happens when there are no images to load! The code just kept hitting the servers over and over and over again which prevented anything else from getting done! Now, I’m NOT saying that I have not done the same sort of thing in the past or am immune from such mistakes. My point is that if you don’t know what you are doing, there are very REAL consequences that can have a major impact on your organization AND they will be hard to track down. Think how happy your boss will be after you copy and pasted some jQuery from a blog without understanding what it does, it brings down the farm, AND it takes them 3 days to track it back to you. :/ Good times will not be had. Like it or not JavaScript/jQuery is a programming language. While you .NET people sit on your high horses because your code is compiled and “runs faster” (also debatable), the rest of us will be actually getting work done and delivering solutions while you are trying to figure out why your widget won’t deploy. I can pick at that scab because I write .NET code too and speak from experience. I can do both, and do both well. So, I am not speaking from ignorance here. In JavaScript/jQuery you have variables, loops, conditionals, functions, arrays, events, and built in methods. If you are not a developer you just aren’t going to take advantage of all of that and use it correctly. Ahhh.. but there is hope! There is a lot of jQuery resources out there to help you learn and learn well! There are many experts on the subject that will gladly tell you when you are smoking crack. I just this minute saw a tweet from @cquick with a link to: “jQuery Fundamentals”. I just glanced through it and this may be a great primer for you aspiring jQuery devs. Take advantage of all the resources and become a developer! Hey, it will look awesome on your resume right? You should not be writing jQuery in SharePoint if it depends too much on client resources for a good user experience I’ve said it once and I’ll say it over and over until you understand. jQuery is executed on the client’s computer. Got it? If you are looping through hundreds of rows of data, searching through an enormous DOM, or performing many calculations it is going to take some time! AND if your user happens to be sitting on some old PC somewhere that they picked up at a garage sale their experience will be that much worse! If you can’t give the user a good experience they will not use the site. So, if jQuery is causing the user to have a bad experience, don’t use it. I sometimes go as far to say that you should NOT go to jQuery as a first option for external facing web sites because you have ZERO control over what the end user’s computer will be. You just can’t guarantee an awesome user experience all of the time. Ahhh… but you have no choice? (where have I heard that before?). Well… if you really have no choice, here are some tips to help improve the experience: Avoid screen scraping This is not 1999 and SharePoint is not an old green screen from a mainframe… so why are you treating it like it is? Screen scraping is time consuming and client intensive. Take advantage of tools like SPServices to do your data retrieval when possible. Fine tune your DOM searches A lot of time can be eaten up just searching the DOM and ignoring table rows that you don’t need. Write better jQuery to only loop through tables rows that you need, or only access specific elements you need. Take advantage of Element ID’s to return the one element you are looking for instead of looping through all the DOM over and over again. Write better jQuery Remember this is development. Think about how you can write cleaner, faster jQuery. This directly relates to the previous point of improving your DOM searches, but also when using arrays, variables and loops. Do you REALLY need to loop through that array 3 times? How can you knock it down to 2 times or even 1? When you have lots of calculations and data that you are manipulating every operation adds up. Think about how you can streamline it. Back in the old days before RAM was abundant, Cores were plentiful and dinosaurs roamed the earth, us developers had to take performance into account in everything we did. It’s a lost art that really needs to be used here. You should not be writing jQuery in SharePoint if you are sending a lot of data over the wire… Developer: “Awesome… you can easily call SharePoint’s web services to retrieve and write data using SPServices!” Administrator: “Crap! you can easily call SharePoint’s web services to retrieve and write data using SPServices!” SPServices may indeed be the best thing that happened to SharePoint since the invention of SharePoint Saturdays by Godfather Lotter… BUT you HAVE to use it wisely! (I REFUSE to make the Spiderman reference). If you do not know what you are doing your code will bring back EVERY field and EVERY row from a list and push that over the internet with all that lovely XML wrapped around it. That can be a HUGE amount of data and will GREATLY impact performance! Calling several web service methods at the same time can cause the same problem and can negatively impact your SharePoint servers. These problems, thankfully, are not difficult to rectify if you are careful: Limit list data retrieved Use CAML to reduce the number of rows returned and limit the fields returned using ViewFields. You should definitely be doing this regardless. If you aren’t I hope your admin thumps you upside the head. Batch large list updates You may or may not have noticed that if you try to do large updates (hundreds of rows) that the performance is either completely abysmal or it fails over half the time. You can greatly improve performance and avoid timeouts by breaking up your updates into several smaller updates. I don’t know if there is a magic number for best performance, it really depends on how much data you are sending back more than the number of rows. However, I have found that 200 rows generally works well. Play around and find the right number for your situation. Delay Web Service calls when possible One of the cool things about jQuery and SPServices is that you can delay queries to the server until they are actually needed instead of doing them all at once. This can lead to performance improvements over DataViewWebParts and even .NET code in the right situations. So, don’t load the data until it’s needed. In some instances you may not need to retrieve the data at all, so why retrieve it ALL the time? You should not be writing jQuery in SharePoint if there is a better solution… jQuery is NOT the silver bullet in SharePoint, it is not the answer to every question, it is just another tool in the developers toolkit. I urge all developers to know what options exist out there and choose the right one! Sometimes it will be jQuery, sometimes it will be .NET, sometimes it will be XSL, and sometimes it will be some other choice… So, when is there a better solution to jQuery? When you can’t get away from performance problems Sometimes jQuery will just give you horrible performance regardless of what you do because of unavoidable obstacles. In these situations you are going to have to figure out an alternative. Can I do it with a DVWP or do I have to crack open Visual Studio? When you need to do something that jQuery can’t do There are lots of things you can’t do in jQuery like elevate privileges, event handlers, workflows, or interact with back end systems that have no web service interface. It just can’t do everything. When it can be done faster and more efficiently another way Why are you spending time to write jQuery to do a DataViewWebPart that would take 5 minutes? Or why are you trying to implement complicated logic that would be simple to do in .NET? If your answer is that you don’t have the option, okay. BUT if you do have the option don’t reinvent the wheel! Take advantage of the other tools. The answer is not always jQuery… sorry… the kool-aid tastes good, but sweet tea is pretty awesome too. You should not be using jQuery in SharePoint if you are a moron… Let’s finish up the blog on a high note… Yes.. it’s true, I sometimes type things just to get a reaction… guess this section title might be a good example, but it feels good sometimes just to type the words that a lot of us think… So.. don’t be that guy! Another good buddy of mine that works for Microsoft told me. “I loved jQuery in SharePoint…. until I had to support it.”. He went on to explain that some user was making several web service calls on a page using jQuery and then was calling Microsoft and COMPLAINING because the page took so long to load… DUH! What do you expect to happen when you are pushing that much data over the wire and are making that many web service calls at once!! It’s one thing to write that kind of code and accept it’s just going to take a while, it’s COMPLETELY another issue to do that and then complain when it’s not lightning fast! Someone’s gene pool needs some chlorine. So, I think this is a nice summary of the blog… DON’T be that guy… don’t be a moron. How can you stop yourself from being a moron? Ah.. glad you asked, here are some tips: Think Is jQuery the right solution to my problem? Is there a better approach? What are the implications and pitfalls of using jQuery in this situation? Search What are others doing? Does someone have a better solution? Is there a third party library that does the same thing I need? Plan Write good jQuery. Limit calculations and data sent over the wire and don’t reinvent the wheel when possible. Test Okay, it works well on your machine. Try it on others ESPECIALLY if this is for an external site. Test with empty data. Test with hundreds of rows of data. Test as many scenarios as possible. Monitor those server resources to see the impact there as well. Ask the experts As smart as you are, there are people smarter than you. Even the experts talk to each other to make sure they aren't doing something stupid. And for the MOST part they are pretty nice guys. Marc Anderson and Christophe Humbert are two guys who regularly keep me in line. Make sure you aren’t doing something stupid. Repeat So, when you think you have the best solution possible, repeat the steps above just to be safe. Conclusion jQuery is an awesome tool and has come in handy on many occasions. I’m even teaching a 1/2 day SharePoint & jQuery workshop at the upcoming SPTechCon in Boston if you want to berate me in person. However, it’s only as awesome as the developer behind the keyboard. It IS development and has its pitfalls. Knowledge and experience are invaluable to giving the user the best experience possible. Let’s face it, in the end, no matter our opinions, prejudices, or ego providing our clients, customers, and users with the best solution possible is what counts. Period… end of sentence…

Read the article
SQL Server – Undelete a Table and Restore a Single Table from Backup

- by Mladen Prajdic

This post is part of the monthly community event called T-SQL Tuesday started by Adam Machanic (blog|twitter) and hosted by someone else each month. This month the host is Sankar Reddy (blog|twitter) and the topic is Misconceptions in SQL Server. You can follow posts for this theme on Twitter by looking at #TSQL2sDay hashtag. Let me start by saying: This code is a crazy hack that is to never be used unless you really, really have to. Really! And I don’t think there’s a time when you would really have to use it for real. Because it’s a hack there are number of things that can go wrong so play with it knowing that. I’ve managed to totally corrupt one database. :) Oh… and for those saying: yeah yeah.. you have a single table in a file group and you’re restoring that, I say “nay nay” to you. As we all know SQL Server can’t do single table restores from backup. This is kind of a obvious thing due to different relational integrity (RI) concerns. Since we have to maintain that we have to restore all tables represented in a RI graph. For this exercise i say BAH! to those concerns. Note that this method “works” only for simple tables that don’t have LOB and off rows data. The code can be expanded to include those but I’ve tried to leave things “simple”. Note that for this to work our table needs to be relatively static data-wise. This doesn’t work for OLTP table. Products are a perfect example of static data. They don’t change much between backups, pretty much everything depends on them and their table is one of those tables that are relatively easy to accidentally delete everything from. This only works if the database is in Full or Bulk-Logged recovery mode for tables where the contents have been deleted or truncated but NOT when a table was dropped. Everything we’ll talk about has to be done before the data pages are reused for other purposes. After deletion or truncation the pages are marked as reusable so you have to act fast. The best thing probably is to put the database into single user mode ASAP while you’re performing this procedure and return it to multi user after you’re done. How do we do it? We will be using an undocumented but known DBCC commands: DBCC PAGE, an undocumented function sys.fn_dblog and a little known DATABASE RESTORE PAGE option. All tests will be on a copy of Production.Product table in AdventureWorks database called Production.Product1 because the original table has FK constraints that prevent us from truncating it for testing. -- create a duplicate table. This doesn't preserve indexes!SELECT *INTO AdventureWorks.Production.Product1FROM AdventureWorks.Production.Product After we run this code take a full back to perform further testing. First let’s see what the difference between DELETE and TRUNCATE is when it comes to logging. With DELETE every row deletion is logged in the transaction log. With TRUNCATE only whole data page deallocations are logged in the transaction log. Getting deleted data pages is simple. All we have to look for is row delete entry in the sys.fn_dblog output. But getting data pages that were truncated from the transaction log presents a bit of an interesting problem. I will not go into depths of IAM(Index Allocation Map) and PFS (Page Free Space) pages but suffice to say that every IAM page has intervals that tell us which data pages are allocated for a table and which aren’t. If we deep dive into the sys.fn_dblog output we can see that once you truncate a table all the pages in all the intervals are deallocated and this is shown in the PFS page transaction log entry as deallocation of pages. For every 8 pages in the same extent there is one PFS page row in the transaction log. This row holds information about all 8 pages in CSV format which means we can get to this data with some parsing. A great help for parsing this stuff is Peter Debetta’s handy function dbo.HexStrToVarBin that converts hexadecimal string into a varbinary value that can be easily converted to integer tus giving us a readable page number. The shortened (columns removed) sys.fn_dblog output for a PFS page with CSV data for 1 extent (8 data pages) looks like this: -- [Page ID] is displayed in hex format. -- To convert it to readable int we'll use dbo.HexStrToVarBin function found at -- http://sqlblog.com/blogs/peter_debetta/archive/2007/03/09/t-sql-convert-hex-string-to-varbinary.aspx -- This function must be installed in the master databaseSELECT Context, AllocUnitName, [Page ID], DescriptionFROM sys.fn_dblog(NULL, NULL)WHERE [Current LSN] = '00000031:00000a46:007d' The pages at the end marked with 0x00—> are pages that are allocated in the extent but are not part of a table. We can inspect the raw content of each data page with a DBCC PAGE command: -- we need this trace flag to redirect output to the query window.DBCC TRACEON (3604); -- WITH TABLERESULTS gives us data in table format instead of message format-- we use format option 3 because it's the easiest to read and manipulate further onDBCC PAGE (AdventureWorks, 1, 613, 3) WITH TABLERESULTS Since the DBACC PAGE output can be quite extensive I won’t put it here. You can see an example of it in the link at the beginning of this section. Getting deleted data back When we run a delete statement every row to be deleted is marked as a ghost record. A background process periodically cleans up those rows. A huge misconception is that the data is actually removed. It’s not. Only the pointers to the rows are removed while the data itself is still on the data page. We just can’t access it with normal means. To get those pointers back we need to restore every deleted page using the RESTORE PAGE option mentioned above. This restore must be done from a full backup, followed by any differential and log backups that you may have. This is necessary to bring the pages up to the same point in time as the rest of the data. However the restore doesn’t magically connect the restored page back to the original table. It simply replaces the current page with the one from the backup. After the restore we use the DBCC PAGE to read data directly from all data pages and insert that data into a temporary table. To finish the RESTORE PAGE procedure we finally have to take a tail log backup (simple backup of the transaction log) and restore it back. We can now insert data from the temporary table to our original table by hand. Getting truncated data back When we run a truncate the truncated data pages aren’t touched at all. Even the pointers to rows stay unchanged. Because of this getting data back from truncated table is simple. we just have to find out which pages belonged to our table and use DBCC PAGE to read data off of them. No restore is necessary. Turns out that the problems we had with finding the data pages is alleviated by not having to do a RESTORE PAGE procedure. Stop stalling… show me The Code! This is the code for getting back deleted and truncated data back. It’s commented in all the right places so don’t be afraid to take a closer look. Make sure you have a full backup before trying this out. Also I suggest that the last step of backing and restoring the tail log is performed by hand. USE masterGOIF OBJECT_ID('dbo.HexStrToVarBin') IS NULL RAISERROR ('No dbo.HexStrToVarBin installed. Go to http://sqlblog.com/blogs/peter_debetta/archive/2007/03/09/t-sql-convert-hex-string-to-varbinary.aspx and install it in master database' , 18, 1) SET NOCOUNT ONBEGIN TRY DECLARE @dbName VARCHAR(1000), @schemaName VARCHAR(1000), @tableName VARCHAR(1000), @fullBackupName VARCHAR(1000), @undeletedTableName VARCHAR(1000), @sql VARCHAR(MAX), @tableWasTruncated bit; /* THE FIRST LINE ARE OUR INPUT PARAMETERS In this case we're trying to recover Production.Product1 table in AdventureWorks database. My full backup of AdventureWorks database is at e:\AW.bak */ SELECT @dbName = 'AdventureWorks', @schemaName = 'Production', @tableName = 'Product1', @fullBackupName = 'e:\AW.bak', @undeletedTableName = '##' + @tableName + '_Undeleted', @tableWasTruncated = 0, -- copy the structure from original table to a temp table that we'll fill with restored data @sql = 'IF OBJECT_ID(''tempdb..' + @undeletedTableName + ''') IS NOT NULL DROP TABLE ' + @undeletedTableName + ' SELECT *' + ' INTO ' + @undeletedTableName + ' FROM [' + @dbName + '].[' + @schemaName + '].[' + @tableName + ']' + ' WHERE 1 = 0' EXEC (@sql) IF OBJECT_ID('tempdb..#PagesToRestore') IS NOT NULL DROP TABLE #PagesToRestore /* FIND DATA PAGES WE NEED TO RESTORE*/ CREATE TABLE #PagesToRestore ([ID] INT IDENTITY(1,1), [FileID] INT, [PageID] INT, [SQLtoExec] VARCHAR(1000)) -- DBCC PACE statement to run later RAISERROR ('Looking for deleted pages...', 10, 1) -- use T-LOG direct read to get deleted data pages INSERT INTO #PagesToRestore([FileID], [PageID], [SQLtoExec]) EXEC('USE [' + @dbName + '];SELECT FileID, PageID, ''DBCC TRACEON (3604); DBCC PAGE ([' + @dbName + '], '' + FileID + '', '' + PageID + '', 3) WITH TABLERESULTS'' as SQLToExecFROM (SELECT DISTINCT LEFT([Page ID], 4) AS FileID, CONVERT(VARCHAR(100), ' + 'CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING([Page ID], 6, 20)))) AS PageIDFROM sys.fn_dblog(NULL, NULL)WHERE AllocUnitName LIKE ''%' + @schemaName + '.' + @tableName + '%'' ' + 'AND Context IN (''LCX_MARK_AS_GHOST'', ''LCX_HEAP'') AND Operation in (''LOP_DELETE_ROWS''))t');SELECT *FROM #PagesToRestore -- if upper EXEC returns 0 rows it means the table was truncated so find truncated pages IF (SELECT COUNT(*) FROM #PagesToRestore) = 0 BEGIN RAISERROR ('No deleted pages found. Looking for truncated pages...', 10, 1) -- use T-LOG read to get truncated data pages INSERT INTO #PagesToRestore([FileID], [PageID], [SQLtoExec]) -- dark magic happens here -- because truncation simply deallocates pages we have to find out which pages were deallocated. -- we can find this out by looking at the PFS page row's Description column. -- for every deallocated extent the Description has a CSV of 8 pages in that extent. -- then it's just a matter of parsing it. -- we also remove the pages in the extent that weren't allocated to the table itself -- marked with '0x00-->00' EXEC ('USE [' + @dbName + '];DECLARE @truncatedPages TABLE(DeallocatedPages VARCHAR(8000), IsMultipleDeallocs BIT);INSERT INTO @truncatedPagesSELECT REPLACE(REPLACE(Description, ''Deallocated '', ''Y''), ''0x00-->00 '', ''N'') + '';'' AS DeallocatedPages, CHARINDEX('';'', Description) AS IsMultipleDeallocsFROM (SELECT DISTINCT LEFT([Page ID], 4) AS FileID, CONVERT(VARCHAR(100), CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING([Page ID], 6, 20)))) AS PageID, DescriptionFROM sys.fn_dblog(NULL, NULL)WHERE Context IN (''LCX_PFS'') AND Description LIKE ''Deallocated%'' AND AllocUnitName LIKE ''%' + @schemaName + '.' + @tableName + '%'') t;SELECT FileID, PageID , ''DBCC TRACEON (3604); DBCC PAGE ([' + @dbName + '], '' + FileID + '', '' + PageID + '', 3) WITH TABLERESULTS'' as SQLToExecFROM (SELECT LEFT(PageAndFile, 1) as WasPageAllocatedToTable , SUBSTRING(PageAndFile, 2, CHARINDEX('':'', PageAndFile) - 2 ) as FileID , CONVERT(VARCHAR(100), CONVERT(INT, master.dbo.HexStrToVarBin(SUBSTRING(PageAndFile, CHARINDEX('':'', PageAndFile) + 1, LEN(PageAndFile))))) as PageIDFROM ( SELECT SUBSTRING(DeallocatedPages, delimPosStart, delimPosEnd - delimPosStart) as PageAndFile, IsMultipleDeallocs FROM ( SELECT *, CHARINDEX('';'', DeallocatedPages)*(N-1) + 1 AS delimPosStart, CHARINDEX('';'', DeallocatedPages)*N AS delimPosEnd FROM @truncatedPages t1 CROSS APPLY (SELECT TOP (case when t1.IsMultipleDeallocs = 1 then 8 else 1 end) ROW_NUMBER() OVER(ORDER BY number) as N FROM master..spt_values) t2 )t)t)tWHERE WasPageAllocatedToTable = ''Y''') SELECT @tableWasTruncated = 1 END DECLARE @lastID INT, @pagesCount INT SELECT @lastID = 1, @pagesCount = COUNT(*) FROM #PagesToRestore SELECT @sql = 'Number of pages to restore: ' + CONVERT(VARCHAR(10), @pagesCount) IF @pagesCount = 0 RAISERROR ('No data pages to restore.', 18, 1) ELSE RAISERROR (@sql, 10, 1) -- If the table was truncated we'll read the data directly from data pages without restoring from backup IF @tableWasTruncated = 0 BEGIN -- RESTORE DATA PAGES FROM FULL BACKUP IN BATCHES OF 200 WHILE @lastID <= @pagesCount BEGIN -- create CSV string of pages to restore SELECT @sql = STUFF((SELECT ',' + CONVERT(VARCHAR(100), FileID) + ':' + CONVERT(VARCHAR(100), PageID) FROM #PagesToRestore WHERE ID BETWEEN @lastID AND @lastID + 200 ORDER BY ID FOR XML PATH('')), 1, 1, '') SELECT @sql = 'RESTORE DATABASE [' + @dbName + '] PAGE = ''' + @sql + ''' FROM DISK = ''' + @fullBackupName + '''' RAISERROR ('Starting RESTORE command:' , 10, 1) WITH NOWAIT; RAISERROR (@sql , 10, 1) WITH NOWAIT; EXEC(@sql); RAISERROR ('Restore DONE' , 10, 1) WITH NOWAIT; SELECT @lastID = @lastID + 200 END /* If you have any differential or transaction log backups you should restore them here to bring the previously restored data pages up to date */ END DECLARE @dbccSinglePage TABLE ( [ParentObject] NVARCHAR(500), [Object] NVARCHAR(500), [Field] NVARCHAR(500), [VALUE] NVARCHAR(MAX) ) DECLARE @cols NVARCHAR(MAX), @paramDefinition NVARCHAR(500), @SQLtoExec VARCHAR(1000), @FileID VARCHAR(100), @PageID VARCHAR(100), @i INT = 1 -- Get deleted table columns from information_schema view -- Need sp_executeSQL because database name can't be passed in as variable SELECT @cols = 'select @cols = STUFF((SELECT '', ['' + COLUMN_NAME + '']''FROM ' + @dbName + '.INFORMATION_SCHEMA.COLUMNSWHERE TABLE_NAME = ''' + @tableName + ''' AND TABLE_SCHEMA = ''' + @schemaName + '''ORDER BY ORDINAL_POSITIONFOR XML PATH('''')), 1, 2, '''')', @paramDefinition = N'@cols nvarchar(max) OUTPUT' EXECUTE sp_executesql @cols, @paramDefinition, @cols = @cols OUTPUT -- Loop through all the restored data pages, -- read data from them and insert them into temp table -- which you can then insert into the orignial deleted table DECLARE dbccPageCursor CURSOR GLOBAL FORWARD_ONLY FOR SELECT [FileID], [PageID], [SQLtoExec] FROM #PagesToRestore ORDER BY [FileID], [PageID] OPEN dbccPageCursor; FETCH NEXT FROM dbccPageCursor INTO @FileID, @PageID, @SQLtoExec; WHILE @@FETCH_STATUS = 0 BEGIN RAISERROR ('---------------------------------------------', 10, 1) WITH NOWAIT; SELECT @sql = 'Loop iteration: ' + CONVERT(VARCHAR(10), @i); RAISERROR (@sql, 10, 1) WITH NOWAIT; SELECT @sql = 'Running: ' + @SQLtoExec RAISERROR (@sql, 10, 1) WITH NOWAIT; -- if something goes wrong with DBCC execution or data gathering, skip it but print error BEGIN TRY INSERT INTO @dbccSinglePage EXEC (@SQLtoExec) -- make the data insert magic happen here IF (SELECT CONVERT(BIGINT, [VALUE]) FROM @dbccSinglePage WHERE [Field] LIKE '%Metadata: ObjectId%') = OBJECT_ID('['+@dbName+'].['+@schemaName +'].['+@tableName+']') BEGIN DELETE @dbccSinglePage WHERE NOT ([ParentObject] LIKE 'Slot % Offset %' AND [Object] LIKE 'Slot % Column %') SELECT @sql = 'USE tempdb; ' + 'IF (OBJECTPROPERTY(object_id(''' + @undeletedTableName + '''), ''TableHasIdentity'') = 1) ' + 'SET IDENTITY_INSERT ' + @undeletedTableName + ' ON; ' + 'INSERT INTO ' + @undeletedTableName + '(' + @cols + ') ' + STUFF((SELECT ' UNION ALL SELECT ' + STUFF((SELECT ', ' + CASE WHEN VALUE = '[NULL]' THEN 'NULL' ELSE '''' + [VALUE] + '''' END FROM ( -- the unicorn help here to correctly set ordinal numbers of columns in a data page -- it's turning STRING order into INT order (1,10,11,2,21 into 1,2,..10,11...21) SELECT [ParentObject], [Object], Field, VALUE, RIGHT('00000' + O1, 6) AS ParentObjectOrder, RIGHT('00000' + REVERSE(LEFT(O2, CHARINDEX(' ', O2)-1)), 6) AS ObjectOrder FROM ( SELECT [ParentObject], [Object], Field, VALUE, REPLACE(LEFT([ParentObject], CHARINDEX('Offset', [ParentObject])-1), 'Slot ', '') AS O1, REVERSE(LEFT([Object], CHARINDEX('Offset ', [Object])-2)) AS O2 FROM @dbccSinglePage WHERE t.ParentObject = ParentObject )t)t ORDER BY ParentObjectOrder, ObjectOrder FOR XML PATH('')), 1, 2, '') FROM @dbccSinglePage t GROUP BY ParentObject FOR XML PATH('') ), 1, 11, '') + ';' RAISERROR (@sql, 10, 1) WITH NOWAIT; EXEC (@sql) END END TRY BEGIN CATCH SELECT @sql = 'ERROR!!!' + CHAR(10) + CHAR(13) + 'ErrorNumber: ' + ERROR_NUMBER() + '; ErrorMessage' + ERROR_MESSAGE() + CHAR(10) + CHAR(13) + 'FileID: ' + @FileID + '; PageID: ' + @PageID RAISERROR (@sql, 10, 1) WITH NOWAIT; END CATCH DELETE @dbccSinglePage SELECT @sql = 'Pages left to process: ' + CONVERT(VARCHAR(10), @pagesCount - @i) + CHAR(10) + CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10) + CHAR(13), @i = @i+1 RAISERROR (@sql, 10, 1) WITH NOWAIT; FETCH NEXT FROM dbccPageCursor INTO @FileID, @PageID, @SQLtoExec; END CLOSE dbccPageCursor; DEALLOCATE dbccPageCursor; EXEC ('SELECT ''' + @undeletedTableName + ''' as TableName; SELECT * FROM ' + @undeletedTableName)END TRYBEGIN CATCH SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage IF CURSOR_STATUS ('global', 'dbccPageCursor') >= 0 BEGIN CLOSE dbccPageCursor; DEALLOCATE dbccPageCursor; ENDEND CATCH-- if the table was deleted we need to finish the restore page sequenceIF @tableWasTruncated = 0BEGIN -- take a log tail backup and then restore it to complete page restore process DECLARE @currentDate VARCHAR(30) SELECT @currentDate = CONVERT(VARCHAR(30), GETDATE(), 112) RAISERROR ('Starting Log Tail backup to c:\Temp ...', 10, 1) WITH NOWAIT; PRINT ('BACKUP LOG [' + @dbName + '] TO DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') EXEC ('BACKUP LOG [' + @dbName + '] TO DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') RAISERROR ('Log Tail backup done.', 10, 1) WITH NOWAIT; RAISERROR ('Starting Log Tail restore from c:\Temp ...', 10, 1) WITH NOWAIT; PRINT ('RESTORE LOG [' + @dbName + '] FROM DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') EXEC ('RESTORE LOG [' + @dbName + '] FROM DISK = ''c:\Temp\' + @dbName + '_TailLogBackup_' + @currentDate + '.trn''') RAISERROR ('Log Tail restore done.', 10, 1) WITH NOWAIT;END-- The last step is manual. Insert data from our temporary table to the original deleted table The misconception here is that you can do a single table restore properly in SQL Server. You can't. But with little experimentation you can get pretty close to it. One way to possible remove a dependency on a backup to retrieve deleted pages is to quickly run a similar script to the upper one that gets data directly from data pages while the rows are still marked as ghost records. It could be done if we could beat the ghost record cleanup task.

Read the article
C#/.NET Little Wonders: ConcurrentBag and BlockingCollection

- by James Michael Hare

In the first week of concurrent collections, began with a general introduction and discussed the ConcurrentStack<T> and ConcurrentQueue<T>. The last post discussed the ConcurrentDictionary<T> . Finally this week, we shall close with a discussion of the ConcurrentBag<T> and BlockingCollection<T>. For more of the "Little Wonders" posts, see C#/.NET Little Wonders: A Redux. Recap As you'll recall from the previous posts, the original collections were object-based containers that accomplished synchronization through a Synchronized member. With the advent of .NET 2.0, the original collections were succeeded by the generic collections which are fully type-safe, but eschew automatic synchronization. With .NET 4.0, a new breed of collections was born in the System.Collections.Concurrent namespace. Of these, the final concurrent collection we will examine is the ConcurrentBag and a very useful wrapper class called the BlockingCollection. For some excellent information on the performance of the concurrent collections and how they perform compared to a traditional brute-force locking strategy, see this informative whitepaper by the Microsoft Parallel Computing Platform team here. ConcurrentBag<T> – Thread-safe unordered collection. Unlike the other concurrent collections, the ConcurrentBag<T> has no non-concurrent counterpart in the .NET collections libraries. Items can be added and removed from a bag just like any other collection, but unlike the other collections, the items are not maintained in any order. This makes the bag handy for those cases when all you care about is that the data be consumed eventually, without regard for order of consumption or even fairness – that is, it’s possible new items could be consumed before older items given the right circumstances for a period of time. So why would you ever want a container that can be unfair? Well, to look at it another way, you can use a ConcurrentQueue and get the fairness, but it comes at a cost in that the ordering rules and synchronization required to maintain that ordering can affect scalability a bit. Thus sometimes the bag is great when you want the fastest way to get the next item to process, and don’t care what item it is or how long its been waiting. The way that the ConcurrentBag works is to take advantage of the new ThreadLocal<T> type (new in System.Threading for .NET 4.0) so that each thread using the bag has a list local to just that thread. This means that adding or removing to a thread-local list requires very low synchronization. The problem comes in where a thread goes to consume an item but it’s local list is empty. In this case the bag performs “work-stealing” where it will rob an item from another thread that has items in its list. This requires a higher level of synchronization which adds a bit of overhead to the take operation. So, as you can imagine, this makes the ConcurrentBag good for situations where each thread both produces and consumes items from the bag, but it would be less-than-idea in situations where some threads are dedicated producers and the other threads are dedicated consumers because the work-stealing synchronization would outweigh the thread-local optimization for a thread taking its own items. Like the other concurrent collections, there are some curiosities to keep in mind: IsEmpty(), Count, ToArray(), and GetEnumerator() lock collection Each of these needs to take a snapshot of whole bag to determine if empty, thus they tend to be more expensive and cause Add() and Take() operations to block. ToArray() and GetEnumerator() are static snapshots Because it is based on a snapshot, will not show subsequent updates after snapshot. Add() is lightweight Since adding to the thread-local list, there is very little overhead on Add. TryTake() is lightweight if items in thread-local list As long as items are in the thread-local list, TryTake() is very lightweight, much more so than ConcurrentStack() and ConcurrentQueue(), however if the local thread list is empty, it must steal work from another thread, which is more expensive. Remember, a bag is not ideal for all situations, it is mainly ideal for situations where a process consumes an item and either decomposes it into more items to be processed, or handles the item partially and places it back to be processed again until some point when it will complete. The main point is that the bag works best when each thread both takes and adds items. For example, we could create a totally contrived example where perhaps we want to see the largest power of a number before it crosses a certain threshold. Yes, obviously we could easily do this with a log function, but bare with me while I use this contrived example for simplicity. So let’s say we have a work function that will take a Tuple out of a bag, this Tuple will contain two ints. The first int is the original number, and the second int is the last multiple of that number. So we could load our bag with the initial values (let’s say we want to know the last multiple of each of 2, 3, 5, and 7 under 100. 1: var bag = new ConcurrentBag<Tuple<int, int>> 2: { 3: Tuple.Create(2, 1), 4: Tuple.Create(3, 1), 5: Tuple.Create(5, 1), 6: Tuple.Create(7, 1) 7: }; Then we can create a method that given the bag, will take out an item, apply the multiplier again, 1: public static void FindHighestPowerUnder(ConcurrentBag<Tuple<int,int>> bag, int threshold) 2: { 3: Tuple<int,int> pair; 4: 5: // while there are items to take, this will prefer local first, then steal if no local 6: while (bag.TryTake(out pair)) 7: { 8: // look at next power 9: var result = Math.Pow(pair.Item1, pair.Item2 + 1); 10: 11: if (result < threshold) 12: { 13: // if smaller than threshold bump power by 1 14: bag.Add(Tuple.Create(pair.Item1, pair.Item2 + 1)); 15: } 16: else 17: { 18: // otherwise, we're done 19: Console.WriteLine("Highest power of {0} under {3} is {0}^{1} = {2}.", 20: pair.Item1, pair.Item2, Math.Pow(pair.Item1, pair.Item2), threshold); 21: } 22: } 23: } Now that we have this, we can load up this method as an Action into our Tasks and run it: 1: // create array of tasks, start all, wait for all 2: var tasks = new[] 3: { 4: new Task(() => FindHighestPowerUnder(bag, 100)), 5: new Task(() => FindHighestPowerUnder(bag, 100)), 6: }; 7: 8: Array.ForEach(tasks, t => t.Start()); 9: 10: Task.WaitAll(tasks); Totally contrived, I know, but keep in mind the main point! When you have a thread or task that operates on an item, and then puts it back for further consumption – or decomposes an item into further sub-items to be processed – you should consider a ConcurrentBag as the thread-local lists will allow for quick processing. However, if you need ordering or if your processes are dedicated producers or consumers, this collection is not ideal. As with anything, you should performance test as your mileage will vary depending on your situation! BlockingCollection<T> – A producers & consumers pattern collection The BlockingCollection<T> can be treated like a collection in its own right, but in reality it adds a producers and consumers paradigm to any collection that implements the interface IProducerConsumerCollection<T>. If you don’t specify one at the time of construction, it will use a ConcurrentQueue<T> as its underlying store. If you don’t want to use the ConcurrentQueue, the ConcurrentStack and ConcurrentBag also implement the interface (though ConcurrentDictionary does not). In addition, you are of course free to create your own implementation of the interface. So, for those who don’t remember the producers and consumers classical computer-science problem, the gist of it is that you have one (or more) processes that are creating items (producers) and one (or more) processes that are consuming these items (consumers). Now, the crux of the problem is that there is a bin (queue) where the produced items are placed, and typically that bin has a limited size. Thus if a producer creates an item, but there is no space to store it, it must wait until an item is consumed. Also if a consumer goes to consume an item and none exists, it must wait until an item is produced. The BlockingCollection makes it trivial to implement any standard producers/consumers process set by providing that “bin” where the items can be produced into and consumed from with the appropriate blocking operations. In addition, you can specify whether the bin should have a limited size or can be (theoretically) unbounded, and you can specify timeouts on the blocking operations. As far as your choice of “bin”, for the most part the ConcurrentQueue is the right choice because it is fairly light and maximizes fairness by ordering items so that they are consumed in the same order they are produced. You can use the concurrent bag or stack, of course, but your ordering would be random-ish in the case of the former and LIFO in the case of the latter. So let’s look at some of the methods of note in BlockingCollection: BoundedCapacity returns capacity of the “bin” If the bin is unbounded, the capacity is int.MaxValue. Count returns an internally-kept count of items This makes it O(1), but if you modify underlying collection directly (not recommended) it is unreliable. CompleteAdding() is used to cut off further adds. This sets IsAddingCompleted and begins to wind down consumers once empty. IsAddingCompleted is true when producers are “done”. Once you are done producing, should complete the add process to alert consumers. IsCompleted is true when producers are “done” and “bin” is empty. Once you mark the producers done, and all items removed, this will be true. Add() is a blocking add to collection. If bin is full, will wait till space frees up Take() is a blocking remove from collection. If bin is empty, will wait until item is produced or adding is completed. GetConsumingEnumerable() is used to iterate and consume items. Unlike the standard enumerator, this one consumes the items instead of iteration. TryAdd() attempts add but does not block completely If adding would block, returns false instead, can specify TimeSpan to wait before stopping. TryTake() attempts to take but does not block completely Like TryAdd(), if taking would block, returns false instead, can specify TimeSpan to wait. Note the use of CompleteAdding() to signal the BlockingCollection that nothing else should be added. This means that any attempts to TryAdd() or Add() after marked completed will throw an InvalidOperationException. In addition, once adding is complete you can still continue to TryTake() and Take() until the bin is empty, and then Take() will throw the InvalidOperationException and TryTake() will return false. So let’s create a simple program to try this out. Let’s say that you have one process that will be producing items, but a slower consumer process that handles them. This gives us a chance to peek inside what happens when the bin is bounded (by default, the bin is NOT bounded). 1: var bin = new BlockingCollection<int>(5); Now, we create a method to produce items: 1: public static void ProduceItems(BlockingCollection<int> bin, int numToProduce) 2: { 3: for (int i = 0; i < numToProduce; i++) 4: { 5: // try for 10 ms to add an item 6: while (!bin.TryAdd(i, TimeSpan.FromMilliseconds(10))) 7: { 8: Console.WriteLine("Bin is full, retrying..."); 9: } 10: } 11: 12: // once done producing, call CompleteAdding() 13: Console.WriteLine("Adding is completed."); 14: bin.CompleteAdding(); 15: } And one to consume them: 1: public static void ConsumeItems(BlockingCollection<int> bin) 2: { 3: // This will only be true if CompleteAdding() was called AND the bin is empty. 4: while (!bin.IsCompleted) 5: { 6: int item; 7: 8: if (!bin.TryTake(out item, TimeSpan.FromMilliseconds(10))) 9: { 10: Console.WriteLine("Bin is empty, retrying..."); 11: } 12: else 13: { 14: Console.WriteLine("Consuming item {0}.", item); 15: Thread.Sleep(TimeSpan.FromMilliseconds(20)); 16: } 17: } 18: } Then we can fire them off: 1: // create one producer and two consumers 2: var tasks = new[] 3: { 4: new Task(() => ProduceItems(bin, 20)), 5: new Task(() => ConsumeItems(bin)), 6: new Task(() => ConsumeItems(bin)), 7: }; 8: 9: Array.ForEach(tasks, t => t.Start()); 10: 11: Task.WaitAll(tasks); Notice that the producer is faster than the consumer, thus it should be hitting a full bin often and displaying the message after it times out on TryAdd(). 1: Consuming item 0. 2: Consuming item 1. 3: Bin is full, retrying... 4: Bin is full, retrying... 5: Consuming item 3. 6: Consuming item 2. 7: Bin is full, retrying... 8: Consuming item 4. 9: Consuming item 5. 10: Bin is full, retrying... 11: Consuming item 6. 12: Consuming item 7. 13: Bin is full, retrying... 14: Consuming item 8. 15: Consuming item 9. 16: Bin is full, retrying... 17: Consuming item 10. 18: Consuming item 11. 19: Bin is full, retrying... 20: Consuming item 12. 21: Consuming item 13. 22: Bin is full, retrying... 23: Bin is full, retrying... 24: Consuming item 14. 25: Adding is completed. 26: Consuming item 15. 27: Consuming item 16. 28: Consuming item 17. 29: Consuming item 19. 30: Consuming item 18. Also notice that once CompleteAdding() is called and the bin is empty, the IsCompleted property returns true, and the consumers will exit. Summary The ConcurrentBag is an interesting collection that can be used to optimize concurrency scenarios where tasks or threads both produce and consume items. In this way, it will choose to consume its own work if available, and then steal if not. However, in situations where you want fair consumption or ordering, or in situations where the producers and consumers are distinct processes, the bag is not optimal. The BlockingCollection is a great wrapper around all of the concurrent queue, stack, and bag that allows you to add producer and consumer semantics easily including waiting when the bin is full or empty. That’s the end of my dive into the concurrent collections. I’d also strongly recommend, once again, you read this excellent Microsoft white paper that goes into much greater detail on the efficiencies you can gain using these collections judiciously (here). Tweet Technorati Tags: C#,.NET,Concurrent Collections,Little Wonders

Read the article
What's up with OCFS2?

- by wcoekaer

On Linux there are many filesystem choices and even from Oracle we provide a number of filesystems, all with their own advantages and use cases. Customers often confuse ACFS with OCFS or OCFS2 which then causes assumptions to be made such as one replacing the other etc... I thought it would be good to write up a summary of how OCFS2 got to where it is, what we're up to still, how it is different from other options and how this really is a cool native Linux cluster filesystem that we worked on for many years and is still widely used. Work on a cluster filesystem at Oracle started many years ago, in the early 2000's when the Oracle Database Cluster development team wrote a cluster filesystem for Windows that was primarily focused on providing an alternative to raw disk devices and help customers with the deployment of Oracle Real Application Cluster (RAC). Oracle RAC is a cluster technology that lets us make a cluster of Oracle Database servers look like one big database. The RDBMS runs on many nodes and they all work on the same data. It's a Shared Disk database design. There are many advantages doing this but I will not go into detail as that is not the purpose of my write up. Suffice it to say that Oracle RAC expects all the database data to be visible in a consistent, coherent way, across all the nodes in the cluster. To do that, there were/are a few options : 1) use raw disk devices that are shared, through SCSI, FC, or iSCSI 2) use a network filesystem (NFS) 3) use a cluster filesystem(CFS) which basically gives you a filesystem that's coherent across all nodes using shared disks. It is sort of (but not quite) combining option 1 and 2 except that you don't do network access to the files, the files are effectively locally visible as if it was a local filesystem. So OCFS (Oracle Cluster FileSystem) on Windows was born. Since Linux was becoming a very important and popular platform, we decided that we would also make this available on Linux and thus the porting of OCFS/Windows started. The first version of OCFS was really primarily focused on replacing the use of Raw devices with a simple filesystem that lets you create files and provide direct IO to these files to get basically native raw disk performance. The filesystem was not designed to be fully POSIX compliant and it did not have any where near good/decent performance for regular file create/delete/access operations. Cache coherency was easy since it was basically always direct IO down to the disk device and this ensured that any time one issues a write() command it would go directly down to the disk, and not return until the write() was completed. Same for read() any sort of read from a datafile would be a read() operation that went all the way to disk and return. We did not cache any data when it came down to Oracle data files. So while OCFS worked well for that, since it did not have much of a normal filesystem feel, it was not something that could be submitted to the kernel mail list for inclusion into Linux as another native linux filesystem (setting aside the Windows porting code ...) it did its job well, it was very easy to configure, node membership was simple, locking was disk based (so very slow but it existed), you could create regular files and do regular filesystem operations to a certain extend but anything that was not database data file related was just not very useful in general. Logfiles ok, standard filesystem use, not so much. Up to this point, all the work was done, at Oracle, by Oracle developers. Once OCFS (1) was out for a while and there was a lot of use in the database RAC world, many customers wanted to do more and were asking for features that you'd expect in a normal native filesystem, a real "general purposes cluster filesystem". So the team sat down and basically started from scratch to implement what's now known as OCFS2 (Oracle Cluster FileSystem release 2). Some basic criteria were : Design it with a real Distributed Lock Manager and use the network for lock negotiation instead of the disk Make it a Linux native filesystem instead of a native shim layer and a portable core Support standard Posix compliancy and be fully cache coherent with all operations Support all the filesystem features Linux offers (ACL, extended Attributes, quotas, sparse files,...) Be modern, support large files, 32/64bit, journaling, data ordered journaling, endian neutral, we can mount on both endian /cross architecture,.. Needless to say, this was a huge development effort that took many years to complete. A few big milestones happened along the way... OCFS2 was development in the open, we did not have a private tree that we worked on without external code review from the Linux Filesystem maintainers, great folks like Christopher Hellwig reviewed the code regularly to make sure we were not doing anything out of line, we submitted the code for review on lkml a number of times to see if we were getting close for it to be included into the mainline kernel. Using this development model is standard practice for anyone that wants to write code that goes into the kernel and having any chance of doing so without a complete rewrite or.. shall I say flamefest when submitted. It saved us a tremendous amount of time by not having to re-fit code for it to be in a Linus acceptable state. Some other filesystems that were trying to get into the kernel that didn't follow an open development model had a lot harder time and a lot harsher criticism. March 2006, when Linus released 2.6.16, OCFS2 officially became part of the mainline kernel, it was accepted a little earlier in the release candidates but in 2.6.16. OCFS2 became officially part of the mainline Linux kernel tree as one of the many filesystems. It was the first cluster filesystem to make it into the kernel tree. Our hope was that it would then end up getting picked up by the distribution vendors to make it easy for everyone to have access to a CFS. Today the source code for OCFS2 is approximately 85000 lines of code. We made OCFS2 production with full support for customers that ran Oracle database on Linux, no extra or separate support contract needed. OCFS2 1.0.0 started being built for RHEL4 for x86, x86-64, ppc, s390x and ia64. For RHEL5 starting with OCFS2 1.2. SuSE was very interested in high availability and clustering and decided to build and include OCFS2 with SLES9 for their customers and was, next to Oracle, the main contributor to the filesystem for both new features and bug fixes. Source code was always available even prior to inclusion into mainline and as of 2.6.16, source code was just part of a Linux kernel download from kernel.org, which it still is, today. So the latest OCFS2 code is always the upstream mainline Linux kernel. OCFS2 is the cluster filesystem used in Oracle VM 2 and Oracle VM 3 as the virtual disk repository filesystem. Since the filesystem is in the Linux kernel it's released under the GPL v2 The release model has always been that new feature development happened in the mainline kernel and we then built consistent, well tested, snapshots that had versions, 1.2, 1.4, 1.6, 1.8. But these releases were effectively just snapshots in time that were tested for stability and release quality. OCFS2 is very easy to use, there's a simple text file that contains the node information (hostname, node number, cluster name) and a file that contains the cluster heartbeat timeouts. It is very small, and very efficient. As Sunil Mushran wrote in the manual : OCFS2 is an efficient, easily configured, quickly installed, fully integrated and compatible, feature-rich, architecture and endian neutral, cache coherent, ordered data journaling, POSIX-compliant, shared disk cluster file system. Here is a list of some of the important features that are included : Variable Block and Cluster sizes Supports block sizes ranging from 512 bytes to 4 KB and cluster sizes ranging from 4 KB to 1 MB (increments in power of 2). Extent-based Allocations Tracks the allocated space in ranges of clusters making it especially efficient for storing very large files. Optimized Allocations Supports sparse files, inline-data, unwritten extents, hole punching and allocation reservation for higher performance and efficient storage. File Cloning/snapshots REFLINK is a feature which introduces copy-on-write clones of files in a cluster coherent way. Indexed Directories Allows efficient access to millions of objects in a directory. Metadata Checksums Detects silent corruption in inodes and directories. Extended Attributes Supports attaching an unlimited number of name:value pairs to the file system objects like regular files, directories, symbolic links, etc. Advanced Security Supports POSIX ACLs and SELinux in addition to the traditional file access permission model. Quotas Supports user and group quotas. Journaling Supports both ordered and writeback data journaling modes to provide file system consistency in the event of power failure or system crash. Endian and Architecture neutral Supports a cluster of nodes with mixed architectures. Allows concurrent mounts on nodes running 32-bit and 64-bit, little-endian (x86, x86_64, ia64) and big-endian (ppc64) architectures. In-built Cluster-stack with DLM Includes an easy to configure, in-kernel cluster-stack with a distributed lock manager. Buffered, Direct, Asynchronous, Splice and Memory Mapped I/Os Supports all modes of I/Os for maximum flexibility and performance. Comprehensive Tools Support Provides a familiar EXT3-style tool-set that uses similar parameters for ease-of-use. The filesystem was distributed for Linux distributions in separate RPM form and this had to be built for every single kernel errata release or every updated kernel provided by the vendor. We provided builds from Oracle for Oracle Linux and all kernels released by Oracle and for Red Hat Enterprise Linux. SuSE provided the modules directly for every kernel they shipped. With the introduction of the Unbreakable Enterprise Kernel for Oracle Linux and our interest in reducing the overhead of building filesystem modules for every minor release, we decide to make OCFS2 available as part of UEK. There was no more need for separate kernel modules, everything was built-in and a kernel upgrade automatically updated the filesystem, as it should. UEK allowed us to not having to backport new upstream filesystem code into an older kernel version, backporting features into older versions introduces risk and requires extra testing because the code is basically partially rewritten. The UEK model works really well for continuing to provide OCFS2 without that extra overhead. Because the RHEL kernel did not contain OCFS2 as a kernel module (it is in the source tree but it is not built by the vendor in kernel module form) we stopped adding the extra packages to Oracle Linux and its RHEL compatible kernel and for RHEL. Oracle Linux customers/users obviously get OCFS2 included as part of the Unbreakable Enterprise Kernel, SuSE customers get it by SuSE distributed with SLES and Red Hat can decide to distribute OCFS2 to their customers if they chose to as it's just a matter of compiling the module and making it available. OCFS2 today, in the mainline kernel is pretty much feature complete in terms of integration with every filesystem feature Linux offers and it is still actively maintained with Joel Becker being the primary maintainer. Since we use OCFS2 as part of Oracle VM, we continue to look at interesting new functionality to add, REFLINK was a good example, and as such we continue to enhance the filesystem where it makes sense. Bugfixes and any sort of code that goes into the mainline Linux kernel that affects filesystems, automatically also modifies OCFS2 so it's in kernel, actively maintained but not a lot of new development happening at this time. We continue to fully support OCFS2 as part of Oracle Linux and the Unbreakable Enterprise Kernel and other vendors make their own decisions on support as it's really a Linux cluster filesystem now more than something that we provide to customers. It really just is part of Linux like EXT3 or BTRFS etc, the OS distribution vendors decide. Do not confuse OCFS2 with ACFS (ASM cluster Filesystem) also known as Oracle Cloud Filesystem. ACFS is a filesystem that's provided by Oracle on various OS platforms and really integrates into Oracle ASM (Automatic Storage Management). It's a very powerful Cluster Filesystem but it's not distributed as part of the Operating System, it's distributed with the Oracle Database product and installs with and lives inside Oracle ASM. ACFS obviously is fully supported on Linux (Oracle Linux, Red Hat Enterprise Linux) but OCFS2 independently as a native Linux filesystem is also, and continues to also be supported. ACFS is very much tied into the Oracle RDBMS, OCFS2 is just a standard native Linux filesystem with no ties into Oracle products. Customers running the Oracle database and ASM really should consider using ACFS as it also provides storage/clustered volume management. Customers wanting to use a simple, easy to use generic Linux cluster filesystem should consider using OCFS2. To learn more about OCFS2 in detail, you can find good documentation on http://oss.oracle.com/projects/ocfs2 in the Documentation area, or get the latest mainline kernel from http://kernel.org and read the source. One final, unrelated note - since I am not always able to publicly answer or respond to comments, I do not want to selectively publish comments from readers. Sometimes I forget to publish comments, sometime I publish them and sometimes I would publish them but if for some reason I cannot publicly comment on them, it becomes a very one-sided stream. So for now I am going to not publish comments from anyone, to be fair to all sides. You are always welcome to email me and I will do my best to respond to technical questions, questions about strategy or direction are sometimes not possible to answer for obvious reasons.

Read the article
I never really understood: what is CGI?

- by claws

CGI is a Comman Gateway Interface. As the name says, it is a "common" gateway interface for everything. It is so trivial and naive from the name. I feel that I understood this and I felt this every time I encountered this word. But frankly, I didn't. I'm still confused. I am a PHP programmer. I did lot of web development. user (client) request for page --- webserver(-embedded PHP interpreter) ---- Server side(PHP) Script --- MySQL Server. Now say my PHP Script can fetch results from MySQL Server && MATLAB Server && Some other server. So, now PHP Script is the CGI? because its interface for the between webserver & All other servers? I don't know. Sometimes they call CGI, a technology & othertimes they call CGI a program or someother server. What exactly is CGI? Whats the big deal with /cgi-bin/*.cgi? Whats up with this? I don't know what is this cgi-bin directory on the server for. I don't know why they have *.cgi extensions. Why does Perl always comes in the way. CGI & Perl (language). I also don't know whats up with these two. Almost all the time I keep hearing these two in combination "CGI & Perl". This book is another great example CGI Programming with Perl Why not "CGI Programming with PHP/JSP/ASP". I never saw such things. CGI Programming in C this confuses me a lot. in C?? Seriously?? I don't know what to say. I"m just confused. "in C"?? This changes everything. Program needs to be compiled and executed. This entirely changes my view of web programming. When do I compile? How does the program gets executed (because it will be a machine code, so it must execute as a independent process). How does it communicate with the web server? IPC? and interfacing with all the servers (in my example MATLAB & MySQL) using socket programming? I'm lost!! They say that CGI is depreciated. Its no more in use. Is it so? What is its latest update? Once, I ran into a situation where I had to give HTTP PUT request access to web server (Apache HTTPD). Its a long back. So, as far as I remember this is what I did: Edited the configuration file of Apache HTTPD to tell webserver to pass all HTTP PUT requests to some put.php ( I had to write this PHP script) Implement put.php to handle the request (save the file to the location mentioned) People said that I wrote a CGI Script. Seriously, I didn't have clue what they were talking about. Did I really write CGI Script? I hope you understood what my confusion is. (Because I myself don't know where I'm confused). I request you guys to keep your answer as simple as possible. I really can't understand any fancy technical terminology. At least not in this case. EDIT: I found this amazing tutorial "CGI Programming Is Simple!" - CGI Tutorial Which explains the concepts in simplest possible way. I've only have one complaint about this tutorial. Just to make what ever he explained complete he should have shown the C code he used for generating response for those GET / POST requests. I've also added link to this tutorial to Wikipedia's article : http://en.wikipedia.org/wiki/Common_Gateway_Interface

Read the article
Laissez les bon temps rouler! (Microsoft BI Conference 2010)

- by smisner

Laissez les bons temps rouler" is a Cajun phrase that I heard frequently when I lived in New Orleans in the mid-1990s. It means "Let the good times roll!" and encapsulates a feeling of happy expectation. As I met with many of my peers and new acquaintances at the Microsoft BI Conference last week, this phrase kept running through my mind as people spoke about their plans in their respective businesses, the benefits and opportunities that the recent releases in the BI stack are providing, and their expectations about the future of the BI stack.Notwithstanding some jabs here and there to point out the platform is neither perfect now nor will be anytime soon (along with admissions that the competitors are also not perfect), and notwithstanding several missteps by the event organizers (which I don't care to enumerate), the overarching mood at the conference was positive. It was a refreshing change from the doom and gloom hovering over several conferences that I attended in 2009. Although many people expect economic hardships to continue over the coming year or so, everyone I know in the BI field is busier than ever and expects to stay busy for quite a while.Self-Service BISelf-service was definitely a theme of the BI conference. In the keynote, Ted Kummert opened with a look back to a fairy tale vision of self-service BI that he told in 2008. At that time, the fairy tale future was a time when "every end user was able to use BI technologies within their job in order to move forward more effectively" and transitioned to the present time in which SQL Server 2008 R2, Office 2010, and SharePoint 2010 are available to deliver managed self-service BI.This set of technologies is presumably poised to address the needs of the 80% of users that Kummert said do not use BI today. He proceeded to outline a series of activities that users ought to be able to do themselves--from simple changes to a report like formatting or an addtional data visualization to integration of an additional data source. The keynote then continued with a series of demonstrations of both current and future technology in support of self-service BI. Some highlights that interested me:PowerPivot, of course, is the flagship product for self-service BI in the Microsoft BI stack. In the TechEd keynote, which was open to the BI conference attendees, Amir Netz (twitter) impressed the audience by demonstrating interactivity with a workbook containing 100 million rows. He upped the ante at the BI keynote with his demonstration of a future-state PowerPivot workbook containing over 2 billion records. It's important to note that this volume of data is being processed by a server engine, and not in the PowerPivot client engine. (Yes, I think it's impressive, but none of my clients are typically wrangling with 2 billion records at a time. Maybe they're thinking too small. This ability to work quickly with large data sets has greater implications for BI solutions than for self-service BI, in my opinion.)Amir also demonstrated KPIs for the future PowerPivot, which appeared to be easier to implement than in any other Microsoft product that supports KPIs, apart from simple KPIs in SharePoint. (My initial reaction is that we have one more place to build KPIs. Great. It's confusing enough. I haven't seen how well those KPIs integrate with other BI tools, which will be important for adoption.)One more PowerPivot feature that Amir showed was a graphical display of the lineage for calculations. (This is hugely practical, especially if you build up calculations incrementally. You can more easily follow the logic from calculation to calculation. Furthermore, if you need to make a change to one calculation, you can assess the impact on other calculations.)Another product demonstration will be available within the next 30 days--Pivot for Reporting Services. If you haven't seen this technology yet, check it out at www.getpivot.com. (It definitely has a wow factor, but I'm skeptical about its practicality. However, I'm looking forward to trying it out with data that I understand.)Michael Tejedor (twitter) demonstrated a feature that I think is really interesting and not emphasized nearly enough--overshadowed by PowerPivot, no doubt. That feature is the Microsoft Business Intelligence Indexing Connector, which enables search of the content of Excel workbooks and Reporting Services reports. (This capability existed in MOSS 2007, but was more cumbersome to implement. The search results in SharePoint 2010 are not only cooler, but more useful by describing whether the content is found in a table or a chart, for example.)This may yet be the dawning of the age of self-service BI - a phrase I've heard repeated from time to time over the last decade - but I think BI professionals are likely to stay busy for a long while, and need not start looking for a new line of work. Kummert repeatedly referenced strategic BI solutions in contrast to self-service BI to emphasize that self-service BI is not a replacement for the services that BI professionals provide. After all, self-service BI does not appear magically on user desktops (or whatever device they want to use). A supporting infrastructure is necessary, and grows in complexity in proportion to the need to simplify BI for users.It's one thing to hear the party line touted by Microsoft employees at the BI keynote, but it's another to hear from the people who are responsible for implementing and supporting it within an organization. Rob Collie (blog | twitter), Kasper de Jonge (blog | twitter), Vidas Matelis (site | twitter), and I were invited to join Andrew Brust (blog | twitter) as he led a Birds of a Feather session at TechEd entitled "PowerPivot: Is It the BI Deal-Changer for Developers and IT Pros?" I would single out the prevailing concern in this session as the issue of control. On one side of this issue were those who were concerned that they would lose control once PowerPivot is implemented. On the other side were those who believed that data should be freely accessible to users in PowerPivot, and even acknowledgment that users would get the data they want even if it meant they would have to manually enter into a workbook to have it ready for analysis. For another viewpoint on how PowerPivot played out at the conference, see Rob Collie's observations.Collaborative BII have been intrigued by the notion of collaborative BI for a very long time. Before I discovered BI, I was a Lotus Notes developer and later a manager of developers, working in a software company that enabled collaboration in the legal industry. Not only did I help create collaborative systems for our clients, I created a complete project management from the ground up to collaboratively manage our custom development work. In that case, collaboration involved my team, my client contacts, and me. I was also able to produce my own BI from that system as well, but didn't know that's what I was doing at the time. Only in recent years has SharePoint begun to catch up with the capabilities that I had with Lotus Notes more than a decade ago. Eventually, I had the opportunity at that job to formally investigate BI as another product offering for our software, and the rest - as they say - is history. I built my first data warehouse with Scott Cameron (who has also ventured into the authoring world by writing Analysis Services 2008 Step by Step and was at the BI Conference last week where I got to reminisce with him for a bit) and that began a career that I never imagined at the time.Fast forward to 2010, and I'm still lauding the virtues of collaborative BI, if only the tools will catch up to my vision! Thus, I was anxious to see what Donald Farmer (blog | twitter) and Rita Sallam of Gartner had to say on the subject in their session "Collaborative Decision Making." As I suspected, the tools aren't quite there yet, but the vendors are moving in the right direction. One thing I liked about this session was a non-Microsoft perspective of the state of the industry with regard to collaborative BI. In addition, this session included a better demonstration of SharePoint collaborative BI capabilities than appeared in the BI keynote. Check out the video in the link to the session to see the demonstration. One of the use cases that was demonstrated was linking from information to a person, because, as Donald put it, "People don't trust data, they trust people."The Microsoft BI Stack in GeneralA question I hear all the time from students when I'm teaching is how to know what tools to use when there is overlap between products in the BI stack. I've never taken the time to codify my thoughts on the subject, but saw that my friend Dan Bulos provided good insight on this topic from a variety of perspectives in his session, "So Many BI Tools, So Little Time." I thought one of his best points was that ideally you should be able to design in your tool of choice, and then deploy to your tool of choice. Unfortunately, the ideal is yet to become real across the platform. The closest we come is with the RDL in Reporting Services which can be produced from two different tools (Report Builder or Business Intelligence Development Studio's Report Designer), manually, or by a third-party or custom application. I have touted the idea for years (and publicly said so about 5 years ago) that eventually more products would be RDL producers or consumers, but we aren't there yet. Maybe in another 5 years.Another interesting session that covered the BI stack against a backdrop of competitive products was delivered by Andrew Brust. Andrew did a marvelous job of consolidating a lot of information in a way that clearly communicated how various vendors' offerings compared to the Microsoft BI stack. He also made a particularly compelling argument about how the existence of an ecosystem around the Microsoft BI stack provided innovation and opportunities lacking for other vendors. Check out his presentation, "How Does the Microsoft BI Stack...Stack Up?"Expo HallI had planned to spend more time in the Expo Hall to see who was doing new things with the BI stack, but didn't manage to get very far. Each time I set out on an exploratory mission, I got caught up in some fascinating conversations with one or more of my peers. I find interacting with people that I meet at conferences just as important as attending sessions to learn something new. There were a couple of items that really caught me eye, however, that I'll share here.Pragmatic Works. Whether you develop SSIS packages, build SSAS cubes, or author SSRS reports (or all of the above), you really must take a look at BI Documenter. Brian Knight (twitter) walked me through the key features, and I must say I was impressed. Once you've seen what this product can do, you won't want to document your BI projects any other way. You can download a free single-user database edition, or choose from more feature-rich standard or professional editions.Microsoft Press ebooks. I also stopped by the O'Reilly Media booth to meet some folks that one of my acquisitions editors at Microsoft Press recommended. In case you haven't heard, Microsoft Press has partnered with O'Reilly Media for distribution and publishing. Apart from my interest in learning more about O'Reilly Media as an author, an advertisement in their booth caught me eye which I think is a really great move. When you buy Microsoft Press ebooks through the O'Reilly web site, you can receive it in any (or all) of the following formats where possible: PDF, epub, .mobi for Kindle and .apk for Android. You also have lifetime DRM-free access to the ebooks. As someone who is an avid collector of books, I fnd myself running out of room for storage. In addition, I travel a lot, and it's hard to lug my reference library with me. Today's e-reader options make the move to digital books a more viable way to grow my library. Having a variety of formats means I am not limited to a single device, and lifetime access means I don't have to worry about keeping track of where I've stored my files. Because the e-books are DRM-free, I can copy and paste when I'm compiling notes, and I can print pages when necessary. That's a winning combination in my mind!Overall, I was pleased with the BI conference. There were many more sessions that I couldn't attend, either because the room was full when I got there or there were multiple sessions running concurrently that I wanted to see. Fortunately, many of the sessions are accessible for viewing online at http://www.msteched.com/2010/NorthAmerica along with the TechEd sessions. You can spot the BI sessions by the yellow skyline on the title slide of the presentation as shown below. Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
From HttpRuntime.Cache to Windows Azure Caching (Preview)

- by Jeff

I don’t know about you, but the announcement of Windows Azure Caching (Preview) (yes, the parentheses are apparently part of the interim name) made me a lot more excited about using Azure. Why? Because one of the great performance tricks of any Web app is to cache frequently used data in memory, so it doesn’t have to hit the database, a service, or whatever. When you run your Web app on one box, HttpRuntime.Cache is a sweet and stupid-simple solution. Somewhere in the data fetching pieces of your app, you can see if an object is available in cache, and return that instead of hitting the data store. I did this quite a bit in POP Forums, and it dramatically cuts down on the database chatter. The problem is that it falls apart if you run the app on many servers, in a Web farm, where one server may initiate a change to that data, and the others will have no knowledge of the change, making it stale. Of course, if you have the infrastructure to do so, you can use something like memcached or AppFabric to do a distributed cache, and achieve the caching flavor you desire. You could do the same thing in Azure before, but it would cost more because you’d need to pay for another role or VM or something to host the cache. Now, you can use a portion of the memory from each instance of a Web role to act as that cache, with no additional cost. That’s huge. So if you’re using a percentage of memory that comes out to 100 MB, and you have three instances running, that’s 300 MB available for caching. For the uninitiated, a Web role in Azure is essentially a VM that runs a Web app (worker roles are the same idea, only without the IIS part). You can spin up many instances of the role, and traffic is load balanced to the various instances. It’s like adding or removing servers to a Web farm all willy-nilly and at your discretion, and it’s what the cloud is all about. I’d say it’s my favorite thing about Windows Azure. The slightly annoying thing about developing for a Web role in Azure is that the local emulator that’s launched by Visual Studio is a little on the slow side. If you’re used to using the built-in Web server, you’re used to building and then alt-tabbing to your browser and refreshing a page. If you’re just changing an MVC view, you’re not even doing the building part. Spinning up the simulated Azure environment is too slow for this, but ideally you want to code your app to use this fantastic distributed cache mechanism. So first off, here’s the link to the page showing how to code using the caching feature. If you’re used to using HttpRuntime.Cache, this should be pretty familiar to you. Let’s say that you want to use the Azure cache preview when you’re running in Azure, but HttpRuntime.Cache if you’re running local, or in a regular IIS server environment. Through the magic of dependency injection, we can get there pretty quickly. First, design an interface to handle the cache insertion, fetching and removal. Mine looks like this: public interface ICacheProvider { void Add(string key, object item, int duration); T Get<T>(string key) where T : class; void Remove(string key); } Now we’ll create two implementations of this interface… one for Azure cache, one for HttpRuntime: public class AzureCacheProvider : ICacheProvider { public AzureCacheProvider() { _cache = new DataCache("default"); // in Microsoft.ApplicationServer.Caching, see how-to } private readonly DataCache _cache; public void Add(string key, object item, int duration) { _cache.Add(key, item, new TimeSpan(0, 0, 0, 0, duration)); } public T Get<T>(string key) where T : class { return _cache.Get(key) as T; } public void Remove(string key) { _cache.Remove(key); } } public class LocalCacheProvider : ICacheProvider { public LocalCacheProvider() { _cache = HttpRuntime.Cache; } private readonly System.Web.Caching.Cache _cache; public void Add(string key, object item, int duration) { _cache.Insert(key, item, null, DateTime.UtcNow.AddMilliseconds(duration), System.Web.Caching.Cache.NoSlidingExpiration); } public T Get<T>(string key) where T : class { return _cache[key] as T; } public void Remove(string key) { _cache.Remove(key); } } Feel free to expand these to use whatever cache features you want. I’m not going to go over dependency injection here, but I assume that if you’re using ASP.NET MVC, you’re using it. Somewhere in your app, you set up the DI container that resolves interfaces to concrete implementations (Ninject call is a “kernel” instead of a container). For this example, I’ll show you how StructureMap does it. It uses a convention based scheme, where if you need to get an instance of IFoo, it looks for a class named Foo. You can also do this mapping explicitly. The initialization of the container looks something like this: ObjectFactory.Initialize(x => { x.Scan(scan => { scan.AssembliesFromApplicationBaseDirectory(); scan.WithDefaultConventions(); }); if (Microsoft.WindowsAzure.ServiceRuntime.RoleEnvironment.IsAvailable) x.For<ICacheProvider>().Use<AzureCacheProvider>(); else x.For<ICacheProvider>().Use<LocalCacheProvider>(); }); If you use Ninject or Windsor or something else, that’s OK. Conceptually they’re all about the same. The important part is the conditional statement that checks to see if the app is running in Azure. If it is, it maps ICacheProvider to AzureCacheProvider, otherwise it maps to LocalCacheProvider. Now when a request comes into your MVC app, and the chain of dependency resolution occurs, you can see to it that the right caching code is called. A typical design may have a call stack that goes: Controller –> BusinessLogicClass –> Repository. Let’s say your repository class looks like this: public class MyRepo : IMyRepo { public MyRepo(ICacheProvider cacheProvider) { _context = new MyDataContext(); _cache = cacheProvider; } private readonly MyDataContext _context; private readonly ICacheProvider _cache; public SomeType Get(int someTypeID) { var key = "somename-" + someTypeID; var cachedObject = _cache.Get<SomeType>(key); if (cachedObject != null) { _context.SomeTypes.Attach(cachedObject); return cachedObject; } var someType = _context.SomeTypes.SingleOrDefault(p => p.SomeTypeID == someTypeID); _cache.Add(key, someType, 60000); return someType; } ... // more stuff to update, delete or whatever, being sure to remove // from cache when you do so When the DI container gets an instance of the repo, it passes an instance of ICacheProvider to the constructor, which in this case will be whatever implementation was specified when the container was initialized. The Get method first tries to hit the cache, and of course doesn’t care what the underlying implementation is, Azure, HttpRuntime, or otherwise. If it finds the object, it returns it right then. If not, it hits the database (this example is using Entity Framework), and inserts the object into the cache before returning it. The important thing not pictured here is that other methods in the repo class will construct the key for the cached object, in this case “somename-“ plus the ID of the object, and then remove it from cache, in any method that alters or deletes the object. That way, no matter what instance of the role is processing the request, it won’t find the object if it has been made stale, that is, updated or outright deleted, forcing it to attempt to hit the database. So is this good technique? Well, sort of. It depends on how you use it, and what your testing looks like around it. Because of differences in behavior and execution of the two caching providers, for example, you could see some strange errors. For example, I immediately got an error indicating there was no parameterless constructor for an MVC controller, because the DI resolver failed to create instances for the dependencies it had. In reality, the NuGet packaged DI resolver for StructureMap was eating an exception thrown by the Azure components that said my configuration, outlined in that how-to article, was wrong. That error wouldn’t occur when using the HttpRuntime. That’s something a lot of people debate about using different components like that, and how you configure them. I kinda hate XML config files, and like the idea of the code-based approach above, but you should be darn sure that your unit and integration testing can account for the differences.

Read the article
How to get DCOMperm.exe from Microsoft

- by Doltknuckle

So I've been trying to solve the "The application-specific permission settings do not grant Local Activation permission" problem and everything I've been reading says I need to get "DCOMperm.exe". There are plenty of links to usage and download links that point to non-MS sources. I'd like to get this direct from Microsoft, but I can't find it there. Some people say that it's part of an SDK, but I'm not sure which one. Does anyone have any experience getting this exe?

Read the article
Cisco Linksys SLM2024/2048 firmware update

- by pplrppl

The initial firmware is 1.0.1 the new firmware is 2.0.0.8 http://www.cisco.com/en/US/docs/switches/lan/csbss/slm2024/release/notes/SLM2024-SLM2048_Release_Note.pdf Release notes say "Supports new GUI Style" and mention the default user name and pw. Does anyone have any experience with the new firmware? Is it Better, Worse, or totally cosmetic with no functional change?

Read the article
WSS3.0 to SharePoint Foundation 2010 upgrade

- by niklassaers

Hi guys, I've got a Windows 2003 x86 server with a small WSS3.0 installation. Now we've bought a new server with Windows Server 2008 x64 and installed SharePoint Foundation 2010 on it. I wish to transfer the few lists and their view from the old WSS3 to SP2010. How should I do this? All the migration websites I've read talk about preparing, but no-one really say how to migrate from one x86 2007 server to a x64 2010 server. Cheers Nik

Read the article

Search Results

Search found 20883 results on 836 pages for 'wont say'.

Page 146/836 | < Previous Page | 142 143 144 145 146 147 148 149 150 151 152 153 | Next Page >

- by nas Ns

- by Tony Davis

- by Maltiriel

- by Steven Chan (Oracle Development)

- by nikie

- by ugoren

- by DmainEvent

- by rcd

- by Tony Davis

- by Cervo

- by Vrashabh

- by Dr. Zim

- by Shaggy Frog

- by smisner

- by James Michael Hare

- by Mark Rackley

- by Mladen Prajdic

- by James Michael Hare

- by wcoekaer

- by claws

- by smisner

- by Jeff

- by Doltknuckle

- by pplrppl

- by niklassaers

< Previous Page | 142 143 144 145 146 147 148 149 150 151 152 153 | Next Page >