Search Results

Search found 20433 results on 818 pages for 'marketing always wins'.

Page 115/818 | < Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122  | Next Page >

  • The Incremental Architect&rsquo;s Napkin - #5 - Design functions for extensibility and readability

    - by Ralf Westphal
    Originally posted on: http://geekswithblogs.net/theArchitectsNapkin/archive/2014/08/24/the-incremental-architectrsquos-napkin---5---design-functions-for.aspx The functionality of programs is entered via Entry Points. So what we´re talking about when designing software is a bunch of functions handling the requests represented by and flowing in through those Entry Points. Designing software thus consists of at least three phases: Analyzing the requirements to find the Entry Points and their signatures Designing the functionality to be executed when those Entry Points get triggered Implementing the functionality according to the design aka coding I presume, you´re familiar with phase 1 in some way. And I guess you´re proficient in implementing functionality in some programming language. But in my experience developers in general are not experienced in going through an explicit phase 2. “Designing functionality? What´s that supposed to mean?” you might already have thought. Here´s my definition: To design functionality (or functional design for short) means thinking about… well, functions. You find a solution for what´s supposed to happen when an Entry Point gets triggered in terms of functions. A conceptual solution that is, because those functions only exist in your head (or on paper) during this phase. But you may have guess that, because it´s “design” not “coding”. And here is, what functional design is not: It´s not about logic. Logic is expressions (e.g. +, -, && etc.) and control statements (e.g. if, switch, for, while etc.). Also I consider calling external APIs as logic. It´s equally basic. It´s what code needs to do in order to deliver some functionality or quality. Logic is what´s doing that needs to be done by software. Transformations are either done through expressions or API-calls. And then there is alternative control flow depending on the result of some expression. Basically it´s just jumps in Assembler, sometimes to go forward (if, switch), sometimes to go backward (for, while, do). But calling your own function is not logic. It´s not necessary to produce any outcome. Functionality is not enhanced by adding functions (subroutine calls) to your code. Nor is quality increased by adding functions. No performance gain, no higher scalability etc. through functions. Functions are not relevant to functionality. Strange, isn´t it. What they are important for is security of investment. By introducing functions into our code we can become more productive (re-use) and can increase evolvability (higher unterstandability, easier to keep code consistent). That´s no small feat, however. Evolvable code can hardly be overestimated. That´s why to me functional design is so important. It´s at the core of software development. To sum this up: Functional design is on a level of abstraction above (!) logical design or algorithmic design. Functional design is only done until you get to a point where each function is so simple you are very confident you can easily code it. Functional design an logical design (which mostly is coding, but can also be done using pseudo code or flow charts) are complementary. Software needs both. If you start coding right away you end up in a tangled mess very quickly. Then you need back out through refactoring. Functional design on the other hand is bloodless without actual code. It´s just a theory with no experiments to prove it. But how to do functional design? An example of functional design Let´s assume a program to de-duplicate strings. The user enters a number of strings separated by commas, e.g. a, b, a, c, d, b, e, c, a. And the program is supposed to clear this list of all doubles, e.g. a, b, c, d, e. There is only one Entry Point to this program: the user triggers the de-duplication by starting the program with the string list on the command line C:\>deduplicate "a, b, a, c, d, b, e, c, a" a, b, c, d, e …or by clicking on a GUI button. This leads to the Entry Point function to get called. It´s the program´s main function in case of the batch version or a button click event handler in the GUI version. That´s the physical Entry Point so to speak. It´s inevitable. What then happens is a three step process: Transform the input data from the user into a request. Call the request handler. Transform the output of the request handler into a tangible result for the user. Or to phrase it a bit more generally: Accept input. Transform input into output. Present output. This does not mean any of these steps requires a lot of effort. Maybe it´s just one line of code to accomplish it. Nevertheless it´s a distinct step in doing the processing behind an Entry Point. Call it an aspect or a responsibility - and you will realize it most likely deserves a function of its own to satisfy the Single Responsibility Principle (SRP). Interestingly the above list of steps is already functional design. There is no logic, but nevertheless the solution is described - albeit on a higher level of abstraction than you might have done yourself. But it´s still on a meta-level. The application to the domain at hand is easy, though: Accept string list from command line De-duplicate Present de-duplicated strings on standard output And this concrete list of processing steps can easily be transformed into code:static void Main(string[] args) { var input = Accept_string_list(args); var output = Deduplicate(input); Present_deduplicated_string_list(output); } Instead of a big problem there are three much smaller problems now. If you think each of those is trivial to implement, then go for it. You can stop the functional design at this point. But maybe, just maybe, you´re not so sure how to go about with the de-duplication for example. Then just implement what´s easy right now, e.g.private static string Accept_string_list(string[] args) { return args[0]; } private static void Present_deduplicated_string_list( string[] output) { var line = string.Join(", ", output); Console.WriteLine(line); } Accept_string_list() contains logic in the form of an API-call. Present_deduplicated_string_list() contains logic in the form of an expression and an API-call. And then repeat the functional design for the remaining processing step. What´s left is the domain logic: de-duplicating a list of strings. How should that be done? Without any logic at our disposal during functional design you´re left with just functions. So which functions could make up the de-duplication? Here´s a suggestion: De-duplicate Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Processing step 2 obviously was the core of the solution. That´s where real creativity was needed. That´s the core of the domain. But now after this refinement the implementation of each step is easy again:private static string[] Parse_string_list(string input) { return input.Split(',') .Select(s => s.Trim()) .ToArray(); } private static Dictionary<string,object> Compile_unique_strings(string[] strings) { return strings.Aggregate( new Dictionary<string, object>(), (agg, s) => { agg[s] = null; return agg; }); } private static string[] Serialize_unique_strings( Dictionary<string,object> dict) { return dict.Keys.ToArray(); } With these three additional functions Main() now looks like this:static void Main(string[] args) { var input = Accept_string_list(args); var strings = Parse_string_list(input); var dict = Compile_unique_strings(strings); var output = Serialize_unique_strings(dict); Present_deduplicated_string_list(output); } I think that´s very understandable code: just read it from top to bottom and you know how the solution to the problem works. It´s a mirror image of the initial design: Accept string list from command line Parse the input string into a true list of strings. Register each string in a dictionary/map/set. That way duplicates get cast away. Transform the data structure into a list of unique strings. Present de-duplicated strings on standard output You can even re-generate the design by just looking at the code. Code and functional design thus are always in sync - if you follow some simple rules. But about that later. And as a bonus: all the functions making up the process are small - which means easy to understand, too. So much for an initial concrete example. Now it´s time for some theory. Because there is method to this madness ;-) The above has only scratched the surface. Introducing Flow Design Functional design starts with a given function, the Entry Point. Its goal is to describe the behavior of the program when the Entry Point is triggered using a process, not an algorithm. An algorithm consists of logic, a process on the other hand consists just of steps or stages. Each processing step transforms input into output or a side effect. Also it might access resources, e.g. a printer, a database, or just memory. Processing steps thus can rely on state of some sort. This is different from Functional Programming, where functions are supposed to not be stateful and not cause side effects.[1] In its simplest form a process can be written as a bullet point list of steps, e.g. Get data from user Output result to user Transform data Parse data Map result for output Such a compilation of steps - possibly on different levels of abstraction - often is the first artifact of functional design. It can be generated by a team in an initial design brainstorming. Next comes ordering the steps. What should happen first, what next etc.? Get data from user Parse data Transform data Map result for output Output result to user That´s great for a start into functional design. It´s better than starting to code right away on a given function using TDD. Please get me right: TDD is a valuable practice. But it can be unnecessarily hard if the scope of a functionn is too large. But how do you know beforehand without investing some thinking? And how to do this thinking in a systematic fashion? My recommendation: For any given function you´re supposed to implement first do a functional design. Then, once you´re confident you know the processing steps - which are pretty small - refine and code them using TDD. You´ll see that´s much, much easier - and leads to cleaner code right away. For more information on this approach I call “Informed TDD” read my book of the same title. Thinking before coding is smart. And writing down the solution as a bunch of functions possibly is the simplest thing you can do, I´d say. It´s more according to the KISS (Keep It Simple, Stupid) principle than returning constants or other trivial stuff TDD development often is started with. So far so good. A simple ordered list of processing steps will do to start with functional design. As shown in the above example such steps can easily be translated into functions. Moving from design to coding thus is simple. However, such a list does not scale. Processing is not always that simple to be captured in a list. And then the list is just text. Again. Like code. That means the design is lacking visuality. Textual representations need more parsing by your brain than visual representations. Plus they are limited in their “dimensionality”: text just has one dimension, it´s sequential. Alternatives and parallelism are hard to encode in text. In addition the functional design using numbered lists lacks data. It´s not visible what´s the input, output, and state of the processing steps. That´s why functional design should be done using a lightweight visual notation. No tool is necessary to draw such designs. Use pen and paper; a flipchart, a whiteboard, or even a napkin is sufficient. Visualizing processes The building block of the functional design notation is a functional unit. I mostly draw it like this: Something is done, it´s clear what goes in, it´s clear what comes out, and it´s clear what the processing step requires in terms of state or hardware. Whenever input flows into a functional unit it gets processed and output is produced and/or a side effect occurs. Flowing data is the driver of something happening. That´s why I call this approach to functional design Flow Design. It´s about data flow instead of control flow. Control flow like in algorithms is of no concern to functional design. Thinking about control flow simply is too low level. Once you start with control flow you easily get bogged down by tons of details. That´s what you want to avoid during design. Design is supposed to be quick, broad brush, abstract. It should give overview. But what about all the details? As Robert C. Martin rightly said: “Programming is abot detail”. Detail is a matter of code. Once you start coding the processing steps you designed you can worry about all the detail you want. Functional design does not eliminate all the nitty gritty. It just postpones tackling them. To me that´s also an example of the SRP. Function design has the responsibility to come up with a solution to a problem posed by a single function (Entry Point). And later coding has the responsibility to implement the solution down to the last detail (i.e. statement, API-call). TDD unfortunately mixes both responsibilities. It´s just coding - and thereby trying to find detailed implementations (green phase) plus getting the design right (refactoring). To me that´s one reason why TDD has failed to deliver on its promise for many developers. Using functional units as building blocks of functional design processes can be depicted very easily. Here´s the initial process for the example problem: For each processing step draw a functional unit and label it. Choose a verb or an “action phrase” as a label, not a noun. Functional design is about activities, not state or structure. Then make the output of an upstream step the input of a downstream step. Finally think about the data that should flow between the functional units. Write the data above the arrows connecting the functional units in the direction of the data flow. Enclose the data description in brackets. That way you can clearly see if all flows have already been specified. Empty brackets mean “no data is flowing”, but nevertheless a signal is sent. A name like “list” or “strings” in brackets describes the data content. Use lower case labels for that purpose. A name starting with an upper case letter like “String” or “Customer” on the other hand signifies a data type. If you like, you also can combine descriptions with data types by separating them with a colon, e.g. (list:string) or (strings:string[]). But these are just suggestions from my practice with Flow Design. You can do it differently, if you like. Just be sure to be consistent. Flows wired-up in this manner I call one-dimensional (1D). Each functional unit just has one input and/or one output. A functional unit without an output is possible. It´s like a black hole sucking up input without producing any output. Instead it produces side effects. A functional unit without an input, though, does make much sense. When should it start to work? What´s the trigger? That´s why in the above process even the first processing step has an input. If you like, view such 1D-flows as pipelines. Data is flowing through them from left to right. But as you can see, it´s not always the same data. It get´s transformed along its passage: (args) becomes a (list) which is turned into (strings). The Principle of Mutual Oblivion A very characteristic trait of flows put together from function units is: no functional units knows another one. They are all completely independent of each other. Functional units don´t know where their input is coming from (or even when it´s gonna arrive). They just specify a range of values they can process. And they promise a certain behavior upon input arriving. Also they don´t know where their output is going. They just produce it in their own time independent of other functional units. That means at least conceptually all functional units work in parallel. Functional units don´t know their “deployment context”. They now nothing about the overall flow they are place in. They are just consuming input from some upstream, and producing output for some downstream. That makes functional units very easy to test. At least as long as they don´t depend on state or resources. I call this the Principle of Mutual Oblivion (PoMO). Functional units are oblivious of others as well as an overall context/purpose. They are just parts of a whole focused on a single responsibility. How the whole is built, how a larger goal is achieved, is of no concern to the single functional units. By building software in such a manner, functional design interestingly follows nature. Nature´s building blocks for organisms also follow the PoMO. The cells forming your body do not know each other. Take a nerve cell “controlling” a muscle cell for example:[2] The nerve cell does not know anything about muscle cells, let alone the specific muscel cell it is “attached to”. Likewise the muscle cell does not know anything about nerve cells, let a lone a specific nerve cell “attached to” it. Saying “the nerve cell is controlling the muscle cell” thus only makes sense when viewing both from the outside. “Control” is a concept of the whole, not of its parts. Control is created by wiring-up parts in a certain way. Both cells are mutually oblivious. Both just follow a contract. One produces Acetylcholine (ACh) as output, the other consumes ACh as input. Where the ACh is going, where it´s coming from neither cell cares about. Million years of evolution have led to this kind of division of labor. And million years of evolution have produced organism designs (DNA) which lead to the production of these different cell types (and many others) and also to their co-location. The result: the overall behavior of an organism. How and why this happened in nature is a mystery. For our software, though, it´s clear: functional and quality requirements needs to be fulfilled. So we as developers have to become “intelligent designers” of “software cells” which we put together to form a “software organism” which responds in satisfying ways to triggers from it´s environment. My bet is: If nature gets complex organisms working by following the PoMO, who are we to not apply this recipe for success to our much simpler “machines”? So my rule is: Wherever there is functionality to be delivered, because there is a clear Entry Point into software, design the functionality like nature would do it. Build it from mutually oblivious functional units. That´s what Flow Design is about. In that way it´s even universal, I´d say. Its notation can also be applied to biology: Never mind labeling the functional units with nouns. That´s ok in Flow Design. You´ll do that occassionally for functional units on a higher level of abstraction or when their purpose is close to hardware. Getting a cockroach to roam your bedroom takes 1,000,000 nerve cells (neurons). Getting the de-duplication program to do its job just takes 5 “software cells” (functional units). Both, though, follow the same basic principle. Translating functional units into code Moving from functional design to code is no rocket science. In fact it´s straightforward. There are two simple rules: Translate an input port to a function. Translate an output port either to a return statement in that function or to a function pointer visible to that function. The simplest translation of a functional unit is a function. That´s what you saw in the above example. Functions are mutually oblivious. That why Functional Programming likes them so much. It makes them composable. Which is the reason, nature works according to the PoMO. Let´s be clear about one thing: There is no dependency injection in nature. For all of an organism´s complexity no DI container is used. Behavior is the result of smooth cooperation between mutually oblivious building blocks. Functions will often be the adequate translation for the functional units in your designs. But not always. Take for example the case, where a processing step should not always produce an output. Maybe the purpose is to filter input. Here the functional unit consumes words and produces words. But it does not pass along every word flowing in. Some words are swallowed. Think of a spell checker. It probably should not check acronyms for correctness. There are too many of them. Or words with no more than two letters. Such words are called “stop words”. In the above picture the optionality of the output is signified by the astrisk outside the brackets. It means: Any number of (word) data items can flow from the functional unit for each input data item. It might be none or one or even more. This I call a stream of data. Such behavior cannot be translated into a function where output is generated with return. Because a function always needs to return a value. So the output port is translated into a function pointer or continuation which gets passed to the subroutine when called:[3]void filter_stop_words( string word, Action<string> onNoStopWord) { if (...check if not a stop word...) onNoStopWord(word); } If you want to be nitpicky you might call such a function pointer parameter an injection. And technically you´re right. Conceptually, though, it´s not an injection. Because the subroutine is not functionally dependent on the continuation. Firstly continuations are procedures, i.e. subroutines without a return type. Remember: Flow Design is about unidirectional data flow. Secondly the name of the formal parameter is chosen in a way as to not assume anything about downstream processing steps. onNoStopWord describes a situation (or event) within the functional unit only. Translating output ports into function pointers helps keeping functional units mutually oblivious in cases where output is optional or produced asynchronically. Either pass the function pointer to the function upon call. Or make it global by putting it on the encompassing class. Then it´s called an event. In C# that´s even an explicit feature.class Filter { public void filter_stop_words( string word) { if (...check if not a stop word...) onNoStopWord(word); } public event Action<string> onNoStopWord; } When to use a continuation and when to use an event dependens on how a functional unit is used in flows and how it´s packed together with others into classes. You´ll see examples further down the Flow Design road. Another example of 1D functional design Let´s see Flow Design once more in action using the visual notation. How about the famous word wrap kata? Robert C. Martin has posted a much cited solution including an extensive reasoning behind his TDD approach. So maybe you want to compare it to Flow Design. The function signature given is:string WordWrap(string text, int maxLineLength) {...} That´s not an Entry Point since we don´t see an application with an environment and users. Nevertheless it´s a function which is supposed to provide a certain functionality. The text passed in has to be reformatted. The input is a single line of arbitrary length consisting of words separated by spaces. The output should consist of one or more lines of a maximum length specified. If a word is longer than a the maximum line length it can be split in multiple parts each fitting in a line. Flow Design Let´s start by brainstorming the process to accomplish the feat of reformatting the text. What´s needed? Words need to be assembled into lines Words need to be extracted from the input text The resulting lines need to be assembled into the output text Words too long to fit in a line need to be split Does sound about right? I guess so. And it shows a kind of priority. Long words are a special case. So maybe there is a hint for an incremental design here. First let´s tackle “average words” (words not longer than a line). Here´s the Flow Design for this increment: The the first three bullet points turned into functional units with explicit data added. As the signature requires a text is transformed into another text. See the input of the first functional unit and the output of the last functional unit. In between no text flows, but words and lines. That´s good to see because thereby the domain is clearly represented in the design. The requirements are talking about words and lines and here they are. But note the asterisk! It´s not outside the brackets but inside. That means it´s not a stream of words or lines, but lists or sequences. For each text a sequence of words is output. For each sequence of words a sequence of lines is produced. The asterisk is used to abstract from the concrete implementation. Like with streams. Whether the list of words gets implemented as an array or an IEnumerable is not important during design. It´s an implementation detail. Does any processing step require further refinement? I don´t think so. They all look pretty “atomic” to me. And if not… I can always backtrack and refine a process step using functional design later once I´ve gained more insight into a sub-problem. Implementation The implementation is straightforward as you can imagine. The processing steps can all be translated into functions. Each can be tested easily and separately. Each has a focused responsibility. And the process flow becomes just a sequence of function calls: Easy to understand. It clearly states how word wrapping works - on a high level of abstraction. And it´s easy to evolve as you´ll see. Flow Design - Increment 2 So far only texts consisting of “average words” are wrapped correctly. Words not fitting in a line will result in lines too long. Wrapping long words is a feature of the requested functionality. Whether it´s there or not makes a difference to the user. To quickly get feedback I decided to first implement a solution without this feature. But now it´s time to add it to deliver the full scope. Fortunately Flow Design automatically leads to code following the Open Closed Principle (OCP). It´s easy to extend it - instead of changing well tested code. How´s that possible? Flow Design allows for extension of functionality by inserting functional units into the flow. That way existing functional units need not be changed. The data flow arrow between functional units is a natural extension point. No need to resort to the Strategy Pattern. No need to think ahead where extions might need to be made in the future. I just “phase in” the remaining processing step: Since neither Extract words nor Reformat know of their environment neither needs to be touched due to the “detour”. The new processing step accepts the output of the existing upstream step and produces data compatible with the existing downstream step. Implementation - Increment 2 A trivial implementation checking the assumption if this works does not do anything to split long words. The input is just passed on: Note how clean WordWrap() stays. The solution is easy to understand. A developer looking at this code sometime in the future, when a new feature needs to be build in, quickly sees how long words are dealt with. Compare this to Robert C. Martin´s solution:[4] How does this solution handle long words? Long words are not even part of the domain language present in the code. At least I need considerable time to understand the approach. Admittedly the Flow Design solution with the full implementation of long word splitting is longer than Robert C. Martin´s. At least it seems. Because his solution does not cover all the “word wrap situations” the Flow Design solution handles. Some lines would need to be added to be on par, I guess. But even then… Is a difference in LOC that important as long as it´s in the same ball park? I value understandability and openness for extension higher than saving on the last line of code. Simplicity is not just less code, it´s also clarity in design. But don´t take my word for it. Try Flow Design on larger problems and compare for yourself. What´s the easier, more straightforward way to clean code? And keep in mind: You ain´t seen all yet ;-) There´s more to Flow Design than described in this chapter. In closing I hope I was able to give you a impression of functional design that makes you hungry for more. To me it´s an inevitable step in software development. Jumping from requirements to code does not scale. And it leads to dirty code all to quickly. Some thought should be invested first. Where there is a clear Entry Point visible, it´s functionality should be designed using data flows. Because with data flows abstraction is possible. For more background on why that´s necessary read my blog article here. For now let me point out to you - if you haven´t already noticed - that Flow Design is a general purpose declarative language. It´s “programming by intention” (Shalloway et al.). Just write down how you think the solution should work on a high level of abstraction. This breaks down a large problem in smaller problems. And by following the PoMO the solutions to those smaller problems are independent of each other. So they are easy to test. Or you could even think about getting them implemented in parallel by different team members. Flow Design not only increases evolvability, but also helps becoming more productive. All team members can participate in functional design. This goes beyon collective code ownership. We´re talking collective design/architecture ownership. Because with Flow Design there is a common visual language to talk about functional design - which is the foundation for all other design activities.   PS: If you like what you read, consider getting my ebook “The Incremental Architekt´s Napkin”. It´s where I compile all the articles in this series for easier reading. I like the strictness of Function Programming - but I also find it quite hard to live by. And it certainly is not what millions of programmers are used to. Also to me it seems, the real world is full of state and side effects. So why give them such a bad image? That´s why functional design takes a more pragmatic approach. State and side effects are ok for processing steps - but be sure to follow the SRP. Don´t put too much of it into a single processing step. ? Image taken from www.physioweb.org ? My code samples are written in C#. C# sports typed function pointers called delegates. Action is such a function pointer type matching functions with signature void someName(T t). Other languages provide similar ways to work with functions as first class citizens - even Java now in version 8. I trust you find a way to map this detail of my translation to your favorite programming language. I know it works for Java, C++, Ruby, JavaScript, Python, Go. And if you´re using a Functional Programming language it´s of course a no brainer. ? Taken from his blog post “The Craftsman 62, The Dark Path”. ?

    Read the article

  • Is trailing slash automagically added on click of home page URL in browser?

    - by Question Overflow
    I am asking this because whenever I mouseover a link to a home page (e.g. http://www.example.com), I notice that a trailing slash is always added (as observed on the status bar of the browser) whether the home page link contains a href attribute that ends with a slash or not. But whenever I am on the home page, the URL on display will not have a trailing slash. I tried entering a slash to the URL in the URL bar. And with Firebug enabled, I notice that the site always return a 200 OK status. An article here discussing this states that having a slash at the end will avoid a 301 redirection. But I am not seeing any redirection, even on this page. Could this be a browser feature that is appending the slash?

    Read the article

  • How to find and fix performance problems in ORM powered applications

    - by FransBouma
    Once in a while we get requests about how to fix performance problems with our framework. As it comes down to following the same steps and looking into the same things every single time, I decided to write a blogpost about it instead, so more people can learn from this and solve performance problems in their O/R mapper powered applications. In some parts it's focused on LLBLGen Pro but it's also usable for other O/R mapping frameworks, as the vast majority of performance problems in O/R mapper powered applications are not specific for a certain O/R mapper framework. Too often, the developer looks at the wrong part of the application, trying to fix what isn't a problem in that part, and getting frustrated that 'things are so slow with <insert your favorite framework X here>'. I'm in the O/R mapper business for a long time now (almost 10 years, full time) and as it's a small world, we O/R mapper developers know almost all tricks to pull off by now: we all know what to do to make task ABC faster and what compromises (because there are almost always compromises) to deal with if we decide to make ABC faster that way. Some O/R mapper frameworks are faster in X, others in Y, but you can be sure the difference is mainly a result of a compromise some developers are willing to deal with and others aren't. That's why the O/R mapper frameworks on the market today are different in many ways, even though they all fetch and save entities from and to a database. I'm not suggesting there's no room for improvement in today's O/R mapper frameworks, there always is, but it's not a matter of 'the slowness of the application is caused by the O/R mapper' anymore. Perhaps query generation can be optimized a bit here, row materialization can be optimized a bit there, but it's mainly coming down to milliseconds. Still worth it if you're a framework developer, but it's not much compared to the time spend inside databases and in user code: if a complete fetch takes 40ms or 50ms (from call to entity object collection), it won't make a difference for your application as that 10ms difference won't be noticed. That's why it's very important to find the real locations of the problems so developers can fix them properly and don't get frustrated because their quest to get a fast, performing application failed. Performance tuning basics and rules Finding and fixing performance problems in any application is a strict procedure with four prescribed steps: isolate, analyze, interpret and fix, in that order. It's key that you don't skip a step nor make assumptions: these steps help you find the reason of a problem which seems to be there, and how to fix it or leave it as-is. Skipping a step, or when you assume things will be bad/slow without doing analysis will lead to the path of premature optimization and won't actually solve your problems, only create new ones. The most important rule of finding and fixing performance problems in software is that you have to understand what 'performance problem' actually means. Most developers will say "when a piece of software / code is slow, you have a performance problem". But is that actually the case? If I write a Linq query which will aggregate, group and sort 5 million rows from several tables to produce a resultset of 10 rows, it might take more than a couple of milliseconds before that resultset is ready to be consumed by other logic. If I solely look at the Linq query, the code consuming the resultset of the 10 rows and then look at the time it takes to complete the whole procedure, it will appear to me to be slow: all that time taken to produce and consume 10 rows? But if you look closer, if you analyze and interpret the situation, you'll see it does a tremendous amount of work, and in that light it might even be extremely fast. With every performance problem you encounter, always do realize that what you're trying to solve is perhaps not a technical problem at all, but a perception problem. The second most important rule you have to understand is based on the old saying "Penny wise, Pound Foolish": the part which takes e.g. 5% of the total time T for a given task isn't worth optimizing if you have another part which takes a much larger part of the total time T for that same given task. Optimizing parts which are relatively insignificant for the total time taken is not going to bring you better results overall, even if you totally optimize that part away. This is the core reason why analysis of the complete set of application parts which participate in a given task is key to being successful in solving performance problems: No analysis -> no problem -> no solution. One warning up front: hunting for performance will always include making compromises. Fast software can be made maintainable, but if you want to squeeze as much performance out of your software, you will inevitably be faced with the dilemma of compromising one or more from the group {readability, maintainability, features} for the extra performance you think you'll gain. It's then up to you to decide whether it's worth it. In almost all cases it's not. The reason for this is simple: the vast majority of performance problems can be solved by implementing the proper algorithms, the ones with proven Big O-characteristics so you know the performance you'll get plus you know the algorithm will work. The time taken by the algorithm implementing code is inevitable: you already implemented the best algorithm. You might find some optimizations on the technical level but in general these are minor. Let's look at the four steps to see how they guide us through the quest to find and fix performance problems. Isolate The first thing you need to do is to isolate the areas in your application which are assumed to be slow. For example, if your application is a web application and a given page is taking several seconds or even minutes to load, it's a good candidate to check out. It's important to start with the isolate step because it allows you to focus on a single code path per area with a clear begin and end and ignore the rest. The rest of the steps are taken per identified problematic area. Keep in mind that isolation focuses on tasks in an application, not code snippets. A task is something that's started in your application by either another task or the user, or another program, and has a beginning and an end. You can see a task as a piece of functionality offered by your application.  Analyze Once you've determined the problem areas, you have to perform analysis on the code paths of each area, to see where the performance problems occur and which areas are not the problem. This is a multi-layered effort: an application which uses an O/R mapper typically consists of multiple parts: there's likely some kind of interface (web, webservice, windows etc.), a part which controls the interface and business logic, the O/R mapper part and the RDBMS, all connected with either a network or inter-process connections provided by the OS or other means. Each of these parts, including the connectivity plumbing, eat up a part of the total time it takes to complete a task, e.g. load a webpage with all orders of a given customer X. To understand which parts participate in the task / area we're investigating and how much they contribute to the total time taken to complete the task, analysis of each participating task is essential. Start with the code you wrote which starts the task, analyze the code and track the path it follows through your application. What does the code do along the way, verify whether it's correct or not. Analyze whether you have implemented the right algorithms in your code for this particular area. Remember we're looking at one area at a time, which means we're ignoring all other code paths, just the code path of the current problematic area, from begin to end and back. Don't dig in and start optimizing at the code level just yet. We're just analyzing. If your analysis reveals big architectural stupidity, it's perhaps a good idea to rethink the architecture at this point. For the rest, we're analyzing which means we collect data about what could be wrong, for each participating part of the complete application. Reviewing the code you wrote is a good tool to get deeper understanding of what is going on for a given task but ultimately it lacks precision and overview what really happens: humans aren't good code interpreters, computers are. We therefore need to utilize tools to get deeper understanding about which parts contribute how much time to the total task, triggered by which other parts and for example how many times are they called. There are two different kind of tools which are necessary: .NET profilers and O/R mapper / RDBMS profilers. .NET profiling .NET profilers (e.g. dotTrace by JetBrains or Ants by Red Gate software) show exactly which pieces of code are called, how many times they're called, and the time it took to run that piece of code, at the method level and sometimes even at the line level. The .NET profilers are essential tools for understanding whether the time taken to complete a given task / area in your application is consumed by .NET code, where exactly in your code, the path to that code, how many times that code was called by other code and thus reveals where hotspots are located: the areas where a solution can be found. Importantly, they also reveal which areas can be left alone: remember our penny wise pound foolish saying: if a profiler reveals that a group of methods are fast, or don't contribute much to the total time taken for a given task, ignore them. Even if the code in them is perhaps complex and looks like a candidate for optimization: you can work all day on that, it won't matter.  As we're focusing on a single area of the application, it's best to start profiling right before you actually activate the task/area. Most .NET profilers support this by starting the application without starting the profiling procedure just yet. You navigate to the particular part which is slow, start profiling in the profiler, in your application you perform the actions which are considered slow, and afterwards you get a snapshot in the profiler. The snapshot contains the data collected by the profiler during the slow action, so most data is produced by code in the area to investigate. This is important, because it allows you to stay focused on a single area. O/R mapper and RDBMS profiling .NET profilers give you a good insight in the .NET side of things, but not in the RDBMS side of the application. As this article is about O/R mapper powered applications, we're also looking at databases, and the software making it possible to consume the database in your application: the O/R mapper. To understand which parts of the O/R mapper and database participate how much to the total time taken for task T, we need different tools. There are two kind of tools focusing on O/R mappers and database performance profiling: O/R mapper profilers and RDBMS profilers. For O/R mapper profilers, you can look at LLBLGen Prof by hibernating rhinos or the Linq to Sql/LLBLGen Pro profiler by Huagati. Hibernating rhinos also have profilers for other O/R mappers like NHibernate (NHProf) and Entity Framework (EFProf) and work the same as LLBLGen Prof. For RDBMS profilers, you have to look whether the RDBMS vendor has a profiler. For example for SQL Server, the profiler is shipped with SQL Server, for Oracle it's build into the RDBMS, however there are also 3rd party tools. Which tool you're using isn't really important, what's important is that you get insight in which queries are executed during the task / area we're currently focused on and how long they took. Here, the O/R mapper profilers have an advantage as they collect the time it took to execute the query from the application's perspective so they also collect the time it took to transport data across the network. This is important because a query which returns a massive resultset or a resultset with large blob/clob/ntext/image fields takes more time to get transported across the network than a small resultset and a database profiler doesn't take this into account most of the time. Another tool to use in this case, which is more low level and not all O/R mappers support it (though LLBLGen Pro and NHibernate as well do) is tracing: most O/R mappers offer some form of tracing or logging system which you can use to collect the SQL generated and executed and often also other activity behind the scenes. While tracing can produce a tremendous amount of data in some cases, it also gives insight in what's going on. Interpret After we've completed the analysis step it's time to look at the data we've collected. We've done code reviews to see whether we've done anything stupid and which parts actually take place and if the proper algorithms have been implemented. We've done .NET profiling to see which parts are choke points and how much time they contribute to the total time taken to complete the task we're investigating. We've performed O/R mapper profiling and RDBMS profiling to see which queries were executed during the task, how many queries were generated and executed and how long they took to complete, including network transportation. All this data reveals two things: which parts are big contributors to the total time taken and which parts are irrelevant. Both aspects are very important. The parts which are irrelevant (i.e. don't contribute significantly to the total time taken) can be ignored from now on, we won't look at them. The parts which contribute a lot to the total time taken are important to look at. We now have to first look at the .NET profiler results, to see whether the time taken is consumed in our own code, in .NET framework code, in the O/R mapper itself or somewhere else. For example if most of the time is consumed by DbCommand.ExecuteReader, the time it took to complete the task is depending on the time the data is fetched from the database. If there was just 1 query executed, according to tracing or O/R mapper profilers / RDBMS profilers, check whether that query is optimal, uses indexes or has to deal with a lot of data. Interpret means that you follow the path from begin to end through the data collected and determine where, along the path, the most time is contributed. It also means that you have to check whether this was expected or is totally unexpected. My previous example of the 10 row resultset of a query which groups millions of rows will likely reveal that a long time is spend inside the database and almost no time is spend in the .NET code, meaning the RDBMS part contributes the most to the total time taken, the rest is compared to that time, irrelevant. Considering the vastness of the source data set, it's expected this will take some time. However, does it need tweaking? Perhaps all possible tweaks are already in place. In the interpret step you then have to decide that further action in this area is necessary or not, based on what the analysis results show: if the analysis results were unexpected and in the area where the most time is contributed to the total time taken is room for improvement, action should be taken. If not, you can only accept the situation and move on. In all cases, document your decision together with the analysis you've done. If you decide that the perceived performance problem is actually expected due to the nature of the task performed, it's essential that in the future when someone else looks at the application and starts asking questions you can answer them properly and new analysis is only necessary if situations changed. Fix After interpreting the analysis results you've concluded that some areas need adjustment. This is the fix step: you're actively correcting the performance problem with proper action targeted at the real cause. In many cases related to O/R mapper powered applications it means you'll use different features of the O/R mapper to achieve the same goal, or apply optimizations at the RDBMS level. It could also mean you apply caching inside your application (compromise memory consumption over performance) to avoid unnecessary re-querying data and re-consuming the results. After applying a change, it's key you re-do the analysis and interpretation steps: compare the results and expectations with what you had before, to see whether your actions had any effect or whether it moved the problem to a different part of the application. Don't fall into the trap to do partly analysis: do the full analysis again: .NET profiling and O/R mapper / RDBMS profiling. It might very well be that the changes you've made make one part faster but another part significantly slower, in such a way that the overall problem hasn't changed at all. Performance tuning is dealing with compromises and making choices: to use one feature over the other, to accept a higher memory footprint, to go away from the strict-OO path and execute queries directly onto the RDBMS, these are choices and compromises which will cross your path if you want to fix performance problems with respect to O/R mappers or data-access and databases in general. In most cases it's not a big issue: alternatives are often good choices too and the compromises aren't that hard to deal with. What is important is that you document why you made a choice, a compromise: which analysis data, which interpretation led you to the choice made. This is key for good maintainability in the years to come. Most common performance problems with O/R mappers Below is an incomplete list of common performance problems related to data-access / O/R mappers / RDBMS code. It will help you with fixing the hotspots you found in the interpretation step. SELECT N+1: (Lazy-loading specific). Lazy loading triggered performance bottlenecks. Consider a list of Orders bound to a grid. You have a Field mapped onto a related field in Order, Customer.CompanyName. Showing this column in the grid will make the grid fetch (indirectly) for each row the Customer row. This means you'll get for the single list not 1 query (for the orders) but 1+(the number of orders shown) queries. To solve this: use eager loading using a prefetch path to fetch the customers with the orders. SELECT N+1 is easy to spot with an O/R mapper profiler or RDBMS profiler: if you see a lot of identical queries executed at once, you have this problem. Prefetch paths using many path nodes or sorting, or limiting. Eager loading problem. Prefetch paths can help with performance, but as 1 query is fetched per node, it can be the number of data fetched in a child node is bigger than you think. Also consider that data in every node is merged on the client within the parent. This is fast, but it also can take some time if you fetch massive amounts of entities. If you keep fetches small, you can use tuning parameters like the ParameterizedPrefetchPathThreshold setting to get more optimal queries. Deep inheritance hierarchies of type Target Per Entity/Type. If you use inheritance of type Target per Entity / Type (each type in the inheritance hierarchy is mapped onto its own table/view), fetches will join subtype- and supertype tables in many cases, which can lead to a lot of performance problems if the hierarchy has many types. With this problem, keep inheritance to a minimum if possible, or switch to a hierarchy of type Target Per Hierarchy, which means all entities in the inheritance hierarchy are mapped onto the same table/view. Of course this has its own set of drawbacks, but it's a compromise you might want to take. Fetching massive amounts of data by fetching large lists of entities. LLBLGen Pro supports paging (and limiting the # of rows returned), which is often key to process through large sets of data. Use paging on the RDBMS if possible (so a query is executed which returns only the rows in the page requested). When using paging in a web application, be sure that you switch server-side paging on on the datasourcecontrol used. In this case, paging on the grid alone is not enough: this can lead to fetching a lot of data which is then loaded into the grid and paged there. Keep note that analyzing queries for paging could lead to the false assumption that paging doesn't occur, e.g. when the query contains a field of type ntext/image/clob/blob and DISTINCT can't be applied while it should have (e.g. due to a join): the datareader will do DISTINCT filtering on the client. this is a little slower but it does perform paging functionality on the data-reader so it won't fetch all rows even if the query suggests it does. Fetch massive amounts of data because blob/clob/ntext/image fields aren't excluded. LLBLGen Pro supports field exclusion for queries. You can exclude fields (also in prefetch paths) per query to avoid fetching all fields of an entity, e.g. when you don't need them for the logic consuming the resultset. Excluding fields can greatly reduce the amount of time spend on data-transport across the network. Use this optimization if you see that there's a big difference between query execution time on the RDBMS and the time reported by the .NET profiler for the ExecuteReader method call. Doing client-side aggregates/scalar calculations by consuming a lot of data. If possible, try to formulate a scalar query or group by query using the projection system or GetScalar functionality of LLBLGen Pro to do data consumption on the RDBMS server. It's far more efficient to process data on the RDBMS server than to first load it all in memory, then traverse the data in-memory to calculate a value. Using .ToList() constructs inside linq queries. It might be you use .ToList() somewhere in a Linq query which makes the query be run partially in-memory. Example: var q = from c in metaData.Customers.ToList() where c.Country=="Norway" select c; This will actually fetch all customers in-memory and do an in-memory filtering, as the linq query is defined on an IEnumerable<T>, and not on the IQueryable<T>. Linq is nice, but it can often be a bit unclear where some parts of a Linq query might run. Fetching all entities to delete into memory first. To delete a set of entities it's rather inefficient to first fetch them all into memory and then delete them one by one. It's more efficient to execute a DELETE FROM ... WHERE query on the database directly to delete the entities in one go. LLBLGen Pro supports this feature, and so do some other O/R mappers. It's not always possible to do this operation in the context of an O/R mapper however: if an O/R mapper relies on a cache, these kind of operations are likely not supported because they make it impossible to track whether an entity is actually removed from the DB and thus can be removed from the cache. Fetching all entities to update with an expression into memory first. Similar to the previous point: it is more efficient to update a set of entities directly with a single UPDATE query using an expression instead of fetching the entities into memory first and then updating the entities in a loop, and afterwards saving them. It might however be a compromise you don't want to take as it is working around the idea of having an object graph in memory which is manipulated and instead makes the code fully aware there's a RDBMS somewhere. Conclusion Performance tuning is almost always about compromises and making choices. It's also about knowing where to look and how the systems in play behave and should behave. The four steps I provided should help you stay focused on the real problem and lead you towards the solution. Knowing how to optimally use the systems participating in your own code (.NET framework, O/R mapper, RDBMS, network/services) is key for success as well as knowing what's going on inside the application you built. I hope you'll find this guide useful in tracking down performance problems and dealing with them in a useful way.  

    Read the article

  • NINE Questions with Michelle Juett

    - by NINEQuestions
    Michelle Juett is one of the more interesting people I know, even though we’ve never met face to face. She’s part artist, part techie and all cool. We “met” via my good buddy George Clingerman and have plotting to take over the world, errr… I mean “collaborating” ever since. If you happen to live in the Seattle area, you can catch her and her work at Sakura Con on April 2-4, 2010 and various other gamer and art cons throughout the year. You can also find her on Twitter as @Shelldragon. Now that you know a little bit, I’ll let her tell you the rest of the story in these NINE Questions: 1. Where are you from? I was born in Clearwater, Florida. I like to tell people I'm from the Bermuda Triangle, it just makes explaining myself so much easier. My family moved to Washington when I was 5 and I've been in the Pacific Northwest ever since. We like to QQ about the rain but we really love the green trees and clean water. 2. What do you do? I fight evil by moonlight and win love by daylight.. or something like that.  I’ve been in quality assurance for games during the day since January 2008 and an artist for life. I currently work in QA for a really awesome game company in Bellevue.  At home, I work on personal digital art, making game assets as well as other random freelance projects as they pop up. 3. How did you get to where you are now? I'm still not where I want to be but I'm getting closer. The biggest piece of advice I can give is to work hard and never settle for the minimum required. I tend to overwork myself but I've never regretted it. You can want something really bad but if you aren't willing to work for it, then you can't expect it to just happen. I've always drawn and had an unhealthy love for video games that I was told I’d grow out of.  I knew I would not ‘grow out’ of games and that real adults make them and I could too. After I graduated, in searching for jobs, I discovered game testing. I figured this would be a good way to get my foot in the door and start networking. I’ve worked with consoles, websites and now, PC games.  I stuck with my journey, although it has been a rocky one, daylighting as a tester and moonlighting as an artist. I'm still on that journey but I wouldn't have it any other way. Test has given me a perspective that is difficult, if not impossible, to obtain any other way. It gives an unconditional respect for other hard working testers and an insight into creative problem solving. 4. So video game testing probably sounds WAY cooler than the reality. What's it like? What's a given day for you? Game testers don't get a lot of respect because of their stigmas and the fact most people don't actually know what we do.  People hear about the opening and closing disc trays all day. Many places do treat their testers like numbers. It all depends on where you work and how awesome your company is. I've had to deal with a lot of bad work situations to get to a really good one. QA exists to ensure the game is as flawless and enjoyable as it can be by the time it has to leave the nest and go out into the world. This includes everything obvious: “can I beat the level and save the princess?” to the more obscure: ‘What happens when I lose internet connection while trying to save right before falling into a pit to my death while holding the jump key then my cat pulls out my memory card and hides it in her litter box?” On the dev side, for developers, testers can be very scary people. Especially when the test team is not in house and you can’t see each other’s faces.  I've seen both sides. We don't mean to hurt your feelings. We really DO love you and want your game to be the best it can be! It can be some serious tough love. 5. You are also an accomplished artist. Got any major projects right now you'd like to talk about? LOL, I don't know if I’d say I'm an accomplished artist just yet. I’m still a long way from where I want to be. I figure that’s what makes you grow though: the desire to never stop improving. I like QA but I want to be a full time artist. I was lucky enough to register for a table at Sakura Con in the 11 second window that the tables sold out. As such, I’ll be selling my wares in the Artist Alley April 2-4th. Part of preparing for this is actually making the art to be sold there. Anime is a fun pass time but I don’t draw a whole lot of it so I’m making up for lost time. As I seem to enjoy burying myself in work, I’m an art lead for a secret project that’s so secret I might be killed tonight for even mentioning it. I also take on various freelance projects and do what I can to help out indie games. I discovered the XNA community a year and a half ago and developed a love for Indies when I was writing a weekly newsletter on XBLA news. I’m a little late to the party but I find myself in a unique position where I am an artist and also have technical skills in games. While not programmer myself, I have a lot of game sense and experience. I hope to make some awesome happen. Lastly, I have an ongoing web comic Shell’s Angels) that tends to get neglected when I get busy. I still love drawing comics and keep a little book with me to sketch down ideas as they pop into my head. I may pick it back up again as a larger project sometime in the future. 6. Can you talk about any of the other freelance projects you're doing or are you sworn to secrecy on those too? We wouldn't want a team of game developer ninjas to take you out or anything. All my projects are currently 2d. I have personal projects such as the ongoing comic as well as a graphic novel I've been picking at here and there. My main focus until April is Sakura Con, Sakura Con, Sakura Con.  I see it as a great way to get exposure and convention experience. I found out I love conventions a couple years ago and I want to get more involved in them. 7. As an artist, what is your weapon of choice? What do you use to get most of your stuff done? I am a Photoshop Hero and I have the hoodie to prove it. (http://www.pennyarcademerch.com/pah090011.html) I've dabbled in other paint programs but I always gravitate back to Photoshop. She is my one true love. I'd like to learn programs like Flash or Anime Studio when I get a bit more time because of their animation abilities. I've worked on frame by frame animation forever but I would love to learn 2d rigging. Still, nothing can compare to a simple sketchpad and a pencil. I always have one on me in case I come across or think of something interesting and can't get to a computer. If the Courier ever comes to exist it will be an ideal weapon for me. 8. You did some videos too, depicting the art creation process. What was the motivation behind those? The creative process is just as important as the final product, if not more so.  I've always loved watching speed paint videos and wanted to try it out myself. Turns out it's a lot of work and time but it's definitely fun to go back and rewatch them. Art isn't always the end result and is more often the process itself. 9. Got any interesting tattoos? Designed any for yourself or other people? Not yet, but not for lack of desire. I've toiled over what and where for years. Last year, I finally decided the back of my shoulders would be the place. Like anything permanent, I want it to have meaning. I thought of somehow incorporating games but I couldn't find something I felt would stand the test of time even with all the classic sprite games. I'm very picky so we'll see if I can get something solid decided. Come see me at Sakura Con April 2 -4!!!

    Read the article

  • What are the strategies to become a good open source developer?

    - by u3050
    I always hear that involving with open source projects is good for career and the more (good) open source you release, the closer you will be to getting your dream job before you've even had an interview.I am expert in Java and I am trying to become fluent in Scala. I always think about getting involved in open source development in Java/Scala but the following confusions stopping me to do so. How/where do I start in open source development projects in GitHub etc? Or How to find active/busy open source development projects? How to find an area where improvement is required or enhancement required in such projects? It looks too complex in the first analysis or its pretty hard to find such opportunities. What are the common strategies to follow if I want to become hobbyist/free time open source developer?People who have experience in open source development please share your learnings/expertise from scratch.Thanks in advance.

    Read the article

  • Unit testing best practices for a unit testing newbie

    - by wilhil
    In recent years, I have only written small components for people in larger projects or small tools. I have never written a unit test and it always seems like learning how to write them and actually making one takes a lot longer than simply firing up the program and testing for real. I am just about to start a fairly large scale project that could take a few months to complete and whilst I will try to test elements as I write them (like always), I am wondering if unit testing could save me time. I was just wondering if anyone could give good advice: Should I be looking at unit testing at the start of the project and possibly adopt a TDD approach. Should I just write tests as I go along, after each section is complete. Should I complete the project and then write unit tests at the end.

    Read the article

  • How to get over “Did I lock the door?” syndrome

    - by Boonei
    I am person who always asks myself  ”Did I lock the house door?”,  And I do ask that question when I have almost reached office. I don’t have a bad memory or I am not a “forget it all after a min person”. Infact I have a fantastic memory of things. This problem has been haunting me for a very long time. My wife used to always have a angry face after we had get down from the car. Because after we have walked for about 20 yards I would run back to the car to check if I had locked the car, you see this problem exists for all locked objects. This happens everyday all round the year. Now a days I don’t have the problem ! I did not get the solution from any doctor or any book that that talks about my inner mind. It was a practical advice given by my aunt….. When I told her that I had this problem, she smiled and said its very very easy to get around this. I was stunned. The solution she gave me was simple. After I had locked the door, should hold the lock and look at it for 5 sec and say to myself   “I have locked the door”. Believe me it works like a charm. The reason why it works is my aunt goes to explain, that your mind always thinks twice of important things that we do on our daily life and raises doubts after sometime. The only way to stop is it by looking at it, holding it and telling yourself that its ok and its done. This holds good for all the things that you generally doubt like, did I turn off the AC?, did I turn off the lights in the house when I left?. Just look at it for 5 sec, hold it tell yourself its done. You will not look back. Image credit [Håkan Dahlström]   This article titled,How to get over “Did I lock the door?” syndrome, was originally published at Tech Dreams. Grab our rss feed or fan us on Facebook to get updates from us.

    Read the article

  • So…is it a Seek or a Scan?

    - by Paul White
    You’re probably most familiar with the terms ‘Seek’ and ‘Scan’ from the graphical plans produced by SQL Server Management Studio (SSMS).  The image to the left shows the most common ones, with the three types of scan at the top, followed by four types of seek.  You might look to the SSMS tool-tip descriptions to explain the differences between them: Not hugely helpful are they?  Both mention scans and ranges (nothing about seeks) and the Index Seek description implies that it will not scan the index entirely (which isn’t necessarily true). Recall also yesterday’s post where we saw two Clustered Index Seek operations doing very different things.  The first Seek performed 63 single-row seeking operations; and the second performed a ‘Range Scan’ (more on those later in this post).  I hope you agree that those were two very different operations, and perhaps you are wondering why there aren’t different graphical plan icons for Range Scans and Seeks?  I have often wondered about that, and the first person to mention it after yesterday’s post was Erin Stellato (twitter | blog): Before we go on to make sense of all this, let’s look at another example of how SQL Server confusingly mixes the terms ‘Scan’ and ‘Seek’ in different contexts.  The diagram below shows a very simple heap table with two columns, one of which is the non-clustered Primary Key, and the other has a non-unique non-clustered index defined on it.  The right hand side of the diagram shows a simple query, it’s associated query plan, and a couple of extracts from the SSMS tool-tip and Properties windows. Notice the ‘scan direction’ entry in the Properties window snippet.  Is this a seek or a scan?  The different references to Scans and Seeks are even more pronounced in the XML plan output that the graphical plan is based on.  This fragment is what lies behind the single Index Seek icon shown above: You’ll find the same confusing references to Seeks and Scans throughout the product and its documentation. Making Sense of Seeks Let’s forget all about scans for a moment, and think purely about seeks.  Loosely speaking, a seek is the process of navigating an index B-tree to find a particular index record, most often at the leaf level.  A seek starts at the root and navigates down through the levels of the index to find the point of interest: Singleton Lookups The simplest sort of seek predicate performs this traversal to find (at most) a single record.  This is the case when we search for a single value using a unique index and an equality predicate.  It should be readily apparent that this type of search will either find one record, or none at all.  This operation is known as a singleton lookup.  Given the example table from before, the following query is an example of a singleton lookup seek: Sadly, there’s nothing in the graphical plan or XML output to show that this is a singleton lookup – you have to infer it from the fact that this is a single-value equality seek on a unique index.  The other common examples of a singleton lookup are bookmark lookups – both the RID and Key Lookup forms are singleton lookups (an RID lookup finds a single record in a heap from the unique row locator, and a Key Lookup does much the same thing on a clustered table).  If you happen to run your query with STATISTICS IO ON, you will notice that ‘Scan Count’ is always zero for a singleton lookup. Range Scans The other type of seek predicate is a ‘seek plus range scan’, which I will refer to simply as a range scan.  The seek operation makes an initial descent into the index structure to find the first leaf row that qualifies, and then performs a range scan (either backwards or forwards in the index) until it reaches the end of the scan range. The ability of a range scan to proceed in either direction comes about because index pages at the same level are connected by a doubly-linked list – each page has a pointer to the previous page (in logical key order) as well as a pointer to the following page.  The doubly-linked list is represented by the green and red dotted arrows in the index diagram presented earlier.  One subtle (but important) point is that the notion of a ‘forward’ or ‘backward’ scan applies to the logical key order defined when the index was built.  In the present case, the non-clustered primary key index was created as follows: CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col ASC) ) ; Notice that the primary key index specifies an ascending sort order for the single key column.  This means that a forward scan of the index will retrieve keys in ascending order, while a backward scan would retrieve keys in descending key order.  If the index had been created instead on key_col DESC, a forward scan would retrieve keys in descending order, and a backward scan would return keys in ascending order. A range scan seek predicate may have a Start condition, an End condition, or both.  Where one is missing, the scan starts (or ends) at one extreme end of the index, depending on the scan direction.  Some examples might help clarify that: the following diagram shows four queries, each of which performs a single seek against a column holding every integer from 1 to 100 inclusive.  The results from each query are shown in the blue columns, and relevant attributes from the Properties window appear on the right: Query 1 specifies that all key_col values less than 5 should be returned in ascending order.  The query plan achieves this by seeking to the start of the index leaf (there is no explicit starting value) and scanning forward until the End condition (key_col < 5) is no longer satisfied (SQL Server knows it can stop looking as soon as it finds a key_col value that isn’t less than 5 because all later index entries are guaranteed to sort higher). Query 2 asks for key_col values greater than 95, in descending order.  SQL Server returns these results by seeking to the end of the index, and scanning backwards (in descending key order) until it comes across a row that isn’t greater than 95.  Sharp-eyed readers may notice that the end-of-scan condition is shown as a Start range value.  This is a bug in the XML show plan which bubbles up to the Properties window – when a backward scan is performed, the roles of the Start and End values are reversed, but the plan does not reflect that.  Oh well. Query 3 looks for key_col values that are greater than or equal to 10, and less than 15, in ascending order.  This time, SQL Server seeks to the first index record that matches the Start condition (key_col >= 10) and then scans forward through the leaf pages until the End condition (key_col < 15) is no longer met. Query 4 performs much the same sort of operation as Query 3, but requests the output in descending order.  Again, we have to mentally reverse the Start and End conditions because of the bug, but otherwise the process is the same as always: SQL Server finds the highest-sorting record that meets the condition ‘key_col < 25’ and scans backward until ‘key_col >= 20’ is no longer true. One final point to note: seek operations always have the Ordered: True attribute.  This means that the operator always produces rows in a sorted order, either ascending or descending depending on how the index was defined, and whether the scan part of the operation is forward or backward.  You cannot rely on this sort order in your queries of course (you must always specify an ORDER BY clause if order is important) but SQL Server can make use of the sort order internally.  In the four queries above, the query optimizer was able to avoid an explicit Sort operator to honour the ORDER BY clause, for example. Multiple Seek Predicates As we saw yesterday, a single index seek plan operator can contain one or more seek predicates.  These seek predicates can either be all singleton seeks or all range scans – SQL Server does not mix them.  For example, you might expect the following query to contain two seek predicates, a singleton seek to find the single record in the unique index where key_col = 10, and a range scan to find the key_col values between 15 and 20: SELECT key_col FROM dbo.Example WHERE key_col = 10 OR key_col BETWEEN 15 AND 20 ORDER BY key_col ASC ; In fact, SQL Server transforms the singleton seek (key_col = 10) to the equivalent range scan, Start:[key_col >= 10], End:[key_col <= 10].  This allows both range scans to be evaluated by a single seek operator.  To be clear, this query results in two range scans: one from 10 to 10, and one from 15 to 20. Final Thoughts That’s it for today – tomorrow we’ll look at monitoring singleton lookups and range scans, and I’ll show you a seek on a heap table. Yes, a seek.  On a heap.  Not an index! If you would like to run the queries in this post for yourself, there’s a script below.  Thanks for reading! IF OBJECT_ID(N'dbo.Example', N'U') IS NOT NULL BEGIN DROP TABLE dbo.Example; END ; -- Test table is a heap -- Non-clustered primary key on 'key_col' CREATE TABLE dbo.Example ( key_col INTEGER NOT NULL, data INTEGER NOT NULL, CONSTRAINT [PK dbo.Example key_col] PRIMARY KEY NONCLUSTERED (key_col) ) ; -- Non-unique non-clustered index on the 'data' column CREATE NONCLUSTERED INDEX [IX dbo.Example data] ON dbo.Example (data) ; -- Add 100 rows INSERT dbo.Example WITH (TABLOCKX) ( key_col, data ) SELECT key_col = V.number, data = V.number FROM master.dbo.spt_values AS V WHERE V.[type] = N'P' AND V.number BETWEEN 1 AND 100 ; -- ================ -- Singleton lookup -- ================ ; -- Single value equality seek in a unique index -- Scan count = 0 when STATISTIS IO is ON -- Check the XML SHOWPLAN SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 32 ; -- =========== -- Range Scans -- =========== ; -- Query 1 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col <= 5 ORDER BY E.key_col ASC ; -- Query 2 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col > 95 ORDER BY E.key_col DESC ; -- Query 3 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 10 AND E.key_col < 15 ORDER BY E.key_col ASC ; -- Query 4 SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col >= 20 AND E.key_col < 25 ORDER BY E.key_col DESC ; -- Final query (singleton + range = 2 range scans) SELECT E.key_col FROM dbo.Example AS E WHERE E.key_col = 10 OR E.key_col BETWEEN 15 AND 20 ORDER BY E.key_col ASC ; -- === TIDY UP === DROP TABLE dbo.Example; © 2011 Paul White email: [email protected] twitter: @SQL_Kiwi

    Read the article

  • Language Club – Battle of the Dynamic Languages

    - by Ben Griswold
    After dedicating the last eight weeks to learning Ruby, it’s time to move onto another language.  I really dig Ruby.  I really enjoy its dynamism and expressiveness and always-openness and it’s been the highlight of our coding club for me so far. But that’s just my take on the language.  I know a lot of coders who’s stomachs turn with the mere thought of Ruby.  They say it’s Ruby’s openness which has them feeling uneasy.  I’d say “write a bunch of tests and get over it,” but I figure there must be more to it than always open classes and possible method collisions. Yes, there’s something else to it alright. The folks who didn’t fall head over heals for Ruby are already in love with Python.  You might remember that Python was the first language we tackled in our coding club.  My time with Python was okay but it didn’t feel as natural to me as Ruby.  But let’s say we started with Ruby and then moved onto Python.  Would I see Python in a different light right now.  Might I even prefer Python over Ruby?  I suppose it’s possible but it’s pretty tough to test that theory – unless we visit Python for a second time. That’s right. The language club is going to focus on Python again and in my attempt to learn Python – yet again – in the open, I’ll be posting my solutions here just as I did for Ruby.  We don’t always have second chances so I going about this relearning with two primary goals in mind:  First, I’m going to use IronPython and the IronPython tools which provide a Python code editor, a file-based project system, and an interactive Python interpreter, all inside Visual Studio 2010.  As a note, the IronPython tools are now part of the main IronPython installer which is Version 2.7 Alpha 1 (not the latest stable version, 2.6.1) and I’d be crazy not to use them.  Second, I’d like to make sure I’m still learning Python without a complete MS skew so I’m going to run my code through Eclipse using the PyDev plugin as well.  Heck, I might use IDLE too. I already have this setup on my machine so it’s no big deal. Okay, that’s it for now.  I worked on the first ten Euler problems last night and the solutions will be posted shortly. Wish me luck.

    Read the article

  • Cisco ASA (Client VPN) to LAN - through second VPN to second LAN

    - by user50855
    We have 2 site that is linked by an IPSEC VPN to remote Cisco ASAs: Site 1 1.5Mb T1 Connection Cisco(1) 2841 Site 2 1.5Mb T1 Connection Cisco 2841 In addition: Site 1 has a 2nd WAN 3Mb bonded T1 Connection Cisco 5510 that connects to same LAN as Cisco(1) 2841. Basically, Remote Access (VPN) users connecting through Cisco ASA 5510 needs access to a service at the end of Site 2. This is due to the way the service is sold - Cisco 2841 routers are not under our management and it is setup to allow connection from local LAN VLAN 1 IP address 10.20.0.0/24. My idea is to have all traffic from Remote Users through Cisco ASA destined for Site 2 to go via the VPN between Site 1 and Site 2. The end result being all traffic that hits Site 2 has come via Site 1. I'm struggling to find a great deal of information on how this is setup. So, firstly, can anyone confirm that what I'm trying to achieve is possible? Secondly, can anyone help me to correct the configuration bellow or point me in the direction of an example of such a configuration? Many Thanks. interface Ethernet0/0 nameif outside security-level 0 ip address 7.7.7.19 255.255.255.240 interface Ethernet0/1 nameif inside security-level 100 ip address 10.20.0.249 255.255.255.0 object-group network group-inside-vpnclient description All inside networks accessible to vpn clients network-object 10.20.0.0 255.255.255.0 network-object 10.20.1.0 255.255.255.0 object-group network group-adp-network description ADP IP Address or network accessible to vpn clients network-object 207.207.207.173 255.255.255.255 access-list outside_access_in extended permit icmp any any echo-reply access-list outside_access_in extended permit icmp any any source-quench access-list outside_access_in extended permit icmp any any unreachable access-list outside_access_in extended permit icmp any any time-exceeded access-list outside_access_in extended permit tcp any host 7.7.7.20 eq smtp access-list outside_access_in extended permit tcp any host 7.7.7.20 eq https access-list outside_access_in extended permit tcp any host 7.7.7.20 eq pop3 access-list outside_access_in extended permit tcp any host 7.7.7.20 eq www access-list outside_access_in extended permit tcp any host 7.7.7.21 eq www access-list outside_access_in extended permit tcp any host 7.7.7.21 eq https access-list outside_access_in extended permit tcp any host 7.7.7.21 eq 5721 access-list acl-vpnclient extended permit ip object-group group-inside-vpnclient any access-list acl-vpnclient extended permit ip object-group group-inside-vpnclient object-group group-adp-network access-list acl-vpnclient extended permit ip object-group group-adp-network object-group group-inside-vpnclient access-list PinesFLVPNTunnel_splitTunnelAcl standard permit 10.20.0.0 255.255.255.0 access-list inside_nat0_outbound_1 extended permit ip 10.20.0.0 255.255.255.0 10.20.1.0 255.255.255.0 access-list inside_nat0_outbound_1 extended permit ip 10.20.0.0 255.255.255.0 host 207.207.207.173 access-list inside_nat0_outbound_1 extended permit ip 10.20.1.0 255.255.255.0 host 207.207.207.173 ip local pool VPNPool 10.20.1.100-10.20.1.200 mask 255.255.255.0 route outside 0.0.0.0 0.0.0.0 7.7.7.17 1 route inside 207.207.207.173 255.255.255.255 10.20.0.3 1 crypto ipsec transform-set ESP-3DES-SHA esp-3des esp-sha-hmac crypto ipsec security-association lifetime seconds 28800 crypto ipsec security-association lifetime kilobytes 4608000 crypto dynamic-map outside_dyn_map 20 set transform-set ESP-3DES-SHA crypto dynamic-map outside_dyn_map 20 set security-association lifetime seconds 288000 crypto dynamic-map outside_dyn_map 20 set security-association lifetime kilobytes 4608000 crypto dynamic-map outside_dyn_map 20 set reverse-route crypto map outside_map 20 ipsec-isakmp dynamic outside_dyn_map crypto map outside_map interface outside crypto map outside_dyn_map 20 match address acl-vpnclient crypto map outside_dyn_map 20 set security-association lifetime seconds 28800 crypto map outside_dyn_map 20 set security-association lifetime kilobytes 4608000 crypto isakmp identity address crypto isakmp enable outside crypto isakmp policy 20 authentication pre-share encryption 3des hash sha group 2 lifetime 86400 group-policy YeahRightflVPNTunnel internal group-policy YeahRightflVPNTunnel attributes wins-server value 10.20.0.9 dns-server value 10.20.0.9 vpn-tunnel-protocol IPSec password-storage disable pfs disable split-tunnel-policy tunnelspecified split-tunnel-network-list value acl-vpnclient default-domain value YeahRight.com group-policy YeahRightFLVPNTunnel internal group-policy YeahRightFLVPNTunnel attributes wins-server value 10.20.0.9 dns-server value 10.20.0.9 10.20.0.7 vpn-tunnel-protocol IPSec split-tunnel-policy tunnelspecified split-tunnel-network-list value YeahRightFLVPNTunnel_splitTunnelAcl default-domain value yeahright.com tunnel-group YeahRightFLVPN type remote-access tunnel-group YeahRightFLVPN general-attributes address-pool VPNPool tunnel-group YeahRightFLVPNTunnel type remote-access tunnel-group YeahRightFLVPNTunnel general-attributes address-pool VPNPool authentication-server-group WinRadius default-group-policy YeahRightFLVPNTunnel tunnel-group YeahRightFLVPNTunnel ipsec-attributes pre-shared-key *

    Read the article

  • How to Automate your Database Documentation

    - by Jonathan Hickford
    In my previous post, “Automating Deployments with SQL Compare command line” I looked at how teams can automate the deployment and post deployment validation of SQL Server databases using the command line versions of Red Gate tools. In this post I’m looking at another use for the command line tools, namely using them to generate up-to-date documentation with every database change. There are many reasons why up-to-date documentation is valuable. For example when somebody new has to work on or administer a database for the first time, or when a new database comes into service. Having database documentation reduces the risks of making incorrect decisions when making changes. Documentation is very useful to business intelligence analysts when writing reports, for example in SSRS. There are a couple of great examples talking about why up to date documentation is valuable on this site:  Database Documentation – Lands of Trolls: Why and How? and Database Documentation Using SQL Doc. The short answer is that it can save you time and reduce risk when you need that most! SQL Doc is a fast simple tool that automatically generates database documentation. It can create documents in HTML, Word or pdf files. The documentation contains information about object definitions and dependencies, along with any other information you want to associate with each object. The SQL Doc GUI, which is included in Red Gate’s SQL Developer Bundle and SQL Toolbelt, allows you to add additional notes to objects, and customise which objects are shown in the docs.  These settings can be saved as a .sqldoc project file. The SQL Doc command line can use this project file to automatically update the documentation every time the database is changed, ensuring that documentation that is always up to date. The simplest way to keep documentation up to date is probably to use a scheduled task to run a script every day. However if you have a source controlled database, or are using a Continuous Integration (CI) server or a build server, it may make more sense to use that instead. If  you’re using SQL Source Control or SSDT Database Projects to help version control your database, you can automatically update the documentation after each change is made to the source control repository that contains your database. To get this automation in place,  you can use the functionality of a Continuous Integration (CI) server, which can trigger commands to run when a source control repository has changed. A CI server will also capture and save the documentation that is created as an artifact, so you can always find the exact documentation for a specific version of the database. This forms an always up to date data dictionary. If you don’t already have a CI server in place there are several you can use, such as the free open source Jenkins or the free starter editions of TeamCity. I won’t cover setting these up in this article, but there is information about using CI servers for automating database tasks on the Red Gate Database Delivery webpage. You may be interested in Red Gate’s SQL CI utility (part of the SQL Automation Pack) which is an easy way to update a database with the latest changes from source control. The PowerShell example below shows how to create the documentation from a database. That database might be your integration database or a shared development database that is always up to date with the latest changes. $serverName = "server\instance" $databaseName = "databaseName" # If you want to document multiple databases use a comma separated list $userName = "username" $password = "password" # Path to SQLDoc.exe $SQLDocPath = "C:\Program Files (x86)\Red Gate\SQL Doc 3\SQLDoc.exe" $arguments = @( "/server:$($serverName)", "/database:$($databaseName)", "/username:$($userName)", "/password:$($password)", "/filetype:html", "/outputfolder:.", # "/project:$args[0]", # If you already have a .sqldoc project file you can pass it as an argument to this script. Values in the project will be overridden with any options set on the command line "/name:$databaseName Report", "/copyrightauthor:$([Environment]::UserName)" ) write-host $arguments & $SQLDocPath $arguments There are several options you can set on the command line to vary how your documentation is created. For example, you can document multiple databases or exclude certain types of objects. In the example above, we set the name of the report to match the database name, and use the current Windows user as the documentation author. For more examples of how you can customise the report from the command line please see the SQL Doc command line documentation If you already have a .sqldoc project file, or wish to further customise the report by including or excluding specific objects, you can use this project on the command line. Any settings you specify on the command line will override the defaults in the project. For details of what you can customise in the project please see the SQL Doc project documentation. In the example above, the line to use a project is commented out, but you can uncomment this line and then pass a path to a .sqldoc project file as an argument to this script.  Conclusion Keeping documentation about your databases up to date is very easy to set up using SQL Doc and PowerShell. By using a CI server to run this process you can trigger the documentation to be run on every change to a source controlled database, and keep historic documentation available. If you are considering more advanced database automation, e.g. database unit testing, change script generation, deploying to large numbers of targets and backup/verification, please email me at [email protected] for further script samples or if you have any questions.

    Read the article

  • On Turning 30&hellip;

    - by MOSSLover
    I know I am not a wise old sage like some people in the community.  I just turned 30 however I feel like all my years looking back have changed me.  My collective experiences and thoughts have given me a different perspective on life recently.  Seven months ago my head was in a gutter and since then a lot of things have happened.  I was always the weird kid in the corner reading Star Trek books.  When I was in elementary school I thought that kids would throw me birthday parties out of pity because I was the poor kid who everyone hated.  I am no longer that person.  I realized that during the worst possible period between my 29th and 30th year when I hit rock bottom.  You all know the insane story as I’ve told it two billion times over.  Honestly it was the best thing that ever happened to me in my life time, because many things would not have happened.  My friends came through for me at every given moment people from all over were checking up on me all over the world.  I fell and I landed on a bunch of people it was awesome.  I landed on family and friends who I thought I was never close enough to talk about these things.  They helped me realize I had a ton of unfulfilled dreams.  I got to move to New York City one of the greatest cities in the universe.  I got to do whatever I wanted without judgment from anyone.  I got to meet some great people at a few meetup groups in the past few months.  I got to meet an awesome person that I have been dating for 3 months.  I am trying to run for the 8 billionth time and keep up with it.  I got to go to Europe and next week for the first time New Orleans.  I got renewed for MVP for 2012.  I am grateful for all the people and things in my life.  I understand that sometimes when things seem bad you can always seek friends and family.  They will always help me.  I have to learn to lean on people sometimes just how they occasionally lean on me.  That is the biggest thing I have learned from the decade of 20 to 30.  I hope that 30 to 40 will be the best decade.  I hope that I can continue to grow.  I will catch you all later. Technorati Tags: Turning 30,Wisdom

    Read the article

  • Developer Training – Difficult Questions and Alternative Perspective – Part 3

    - by pinaldave
    Developer Training - Importance and Significance - Part 1 Developer Training – Employee Morals and Ethics – Part 2 Developer Training – Difficult Questions and Alternative Perspective - Part 3 Developer Training – Various Options for Developer Training – Part 4 Developer Training – A Conclusive Summary- Part 5 Congratulations!  You are now a fully trained developer!  You spent hours in a classroom, watching webinars, and reading materials.  You are now more educated and more prepared than ever before.  Now what? Stay or Quit The simple answer is that you now have two options – stay where you are or move on to a new job.  Even though you might now be smarter than you have ever felt before, this can still be a tough decision to make.  You feel extra trained and ready for a promotion or a raise, but you and your employer might not see eye to eye on this issue.  The logical conclusion is to go on a job hunt, but that might not be the most ethical thing to do. Click Image to Enlarge Manager’s Perspective Click Image to Enlarge Try to see the issue from your manager’s perspective.  You feel that you have just spent a lot of time and energy getting trained, and you should be rewarded.  But they have invested their time and energy in you.  They might see the training as a way to help you complete the goals they require from you, or as a way to help you complete tasks that will ultimately end in a reward or promotion. Moral Compass As in most cases, honesty is the best policy.  Be open with your manager about your expectations, and ask them to explain their goals.  When there is open and honest communication, everyone can walk away happy.  If you’re unable to discuss with your manager for one reason or another, just try to keep the company policy in mind and follow your own moral compass.  If all else fails, and your company is unwilling to make allowances for your new value, offer to pay the company back for the training before moving on your way. Whether you stay at your old job or move on to a new one, you are still faced with the question of what you’re going to do with all your new knowledge.  If you feel comfortable, offer to train others around you who are interested in the same subject.  This can look very good on your resume, and if you are working in a team environment it is sure to help you in the long run! What Next? You can even offer to train other trainers at the company – managers, those above you, or even report back to your original trainer about how your education is helping you in the work place.  Obviously this should be completely voluntary on the trainer’s part.  Taking advice from a “newbie” may not be their favorite idea, but it could also show the company that you are open to expanding your horizons and being helpful to everyone around you. Last in Line for Opportunity Click Image to Enlarge At this time, let us address a subject related to training and what to do with it – what if you are always overlooked for training?  This can as thorny a problem as receiving training in the first place.  The best advice is to let your supervisors know that you are always open to training and very interested in certain topics.  If you are consistently passed over, be patient.  Your turn will probably come, but the company as a whole has to focus on other problems at the moment.  If you feel that there are more personal issues at play, be sure to bring this up with your supervisor in a calm and professional manager so that everything can be worked out best for both parties. You, Yourself and Your Future! If all else fails, offer to pay for training yourself.  Perhaps money problems are at the root of being passed over.  Even if there are other reasons, offering to pay your own way shows your dedication and could work out well for you in the long run.  Always remember – in life you have to go out and make your own way, you cannot always sit and wait for things to land in your lap. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Developer Training, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • SQL SERVER – Online Index Rebuilding Index Improvement in SQL Server 2012

    - by pinaldave
    Have you ever faced situation when you see something working and you feel it should not be working? Well, I had similar moments few days ago. I know that SQL Server 2008 supports online indexing. However, I also know that I cannot rebuild index ONLINE if I have used VARCHAR(MAX), NVARCHAR(MAX) or few other data types. While I held my belief very strongly I came across situation, where I had to go online and do little bit reading from Book Online. Here is the similar example. First of all – run following code in SQL Server 2008 or SQL Server 2008 R2. USE TempDB GO CREATE TABLE TestTable (ID INT, FirstCol NVARCHAR(10), SecondCol NVARCHAR(MAX)) GO CREATE CLUSTERED INDEX [IX_TestTable] ON TestTable (ID) GO CREATE NONCLUSTERED INDEX [IX_TestTable_Cols] ON TestTable (FirstCol) INCLUDE (SecondCol) GO USE [tempdb] GO ALTER INDEX [IX_TestTable_Cols] ON [dbo].[TestTable] REBUILD WITH (ONLINE = ON) GO DROP TABLE TestTable GO Now run the same code in SQL Server 2012 version. Observe the difference between both of the execution. You will be get following resultset. In SQL Server 2008/R2 it will throw following error: Msg 2725, Level 16, State 2, Line 1 An online operation cannot be performed for index ‘IX_TestTable_Cols’ because the index contains column ‘SecondCol’ of data type text, ntext, image, varchar(max), nvarchar(max), varbinary(max), xml, or large CLR type. For a non-clustered index, the column could be an include column of the index. For a clustered index, the column could be any column of the table. If DROP_EXISTING is used, the column could be part of a new or old index. The operation must be performed offline. In SQL Server 2012 it will run successfully and will not throw any error. Command(s) completed successfully. I always thought it will throw an error if there is VARCHAR(MAX) or NVARCHAR(MAX) used in table schema definition. When I saw this result it was clear to me that it will be for sure not bug enhancement in SQL Server 2012. For matter for the fact, I always wanted this feature to be added in SQL Server Engine as this will enable ONLINE Index Rebuilding for mission critical tables which needs to be always online. I quickly searched online and landed on Jacob Sebastian’s blog where he has blogged about it as well. Well, is there any other new feature in SQL Server 2012 which gave you good surprise? Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Index, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Best practices when creating/modeling databases?

    - by Oscar Mederos
    I learned at the University some steps to model a database: Model the problem using the Extended Entity-Relationship Model. Extract the functional dependencies Apply some algorithms to normalize the database (3NF or Boyce-Codd) Create the database I'm studying Computer Science and since I received that course I'm wondering if I always need to do those steps when creating a complex database for an specified problem. For example, do PHP / .NET / .. programmers always do that? or there are some tools to simplify that process, maybe using another way of represent the problem instead of the EERM?

    Read the article

  • C# 4.0: Covariance And Contravariance In Generics

    - by Paulo Morgado
    C# 4.0 (and .NET 4.0) introduced covariance and contravariance to generic interfaces and delegates. But what is this variance thing? According to Wikipedia, in multilinear algebra and tensor analysis, covariance and contravariance describe how the quantitative description of certain geometrical or physical entities changes when passing from one coordinate system to another.(*) But what does this have to do with C# or .NET? In type theory, a the type T is greater (>) than type S if S is a subtype (derives from) T, which means that there is a quantitative description for types in a type hierarchy. So, how does covariance and contravariance apply to C# (and .NET) generic types? In C# (and .NET), variance applies to generic type parameters and not to the resulting generic type. A generic type parameter is: covariant if the ordering of the generic types follows the ordering of the generic type parameters: Generic<T> = Generic<S> for T = S. contravariant if the ordering of the generic types is reversed from the ordering of the generic type parameters: Generic<T> = Generic<S> for T = S. invariant if neither of the above apply. If this definition is applied to arrays, we can see that arrays have always been covariant because this is valid code: object[] objectArray = new string[] { "string 1", "string 2" }; objectArray[0] = "string 3"; objectArray[1] = new object(); However, when we try to run this code, the second assignment will throw an ArrayTypeMismatchException. Although the compiler was fooled into thinking this was valid code because an object is being assigned to an element of an array of object, at run time, there is always a type check to guarantee that the runtime type of the definition of the elements of the array is greater or equal to the instance being assigned to the element. In the above example, because the runtime type of the array is array of string, the first assignment of array elements is valid because string = string and the second is invalid because string = object. This leads to the conclusion that, although arrays have always been covariant, they are not safely covariant – code that compiles is not guaranteed to run without errors. In C#, the way to define that a generic type parameter as covariant is using the out generic modifier: public interface IEnumerable<out T> { IEnumerator<T> GetEnumerator(); } public interface IEnumerator<out T> { T Current { get; } bool MoveNext(); } Notice the convenient use the pre-existing out keyword. Besides the benefit of not having to remember a new hypothetic covariant keyword, out is easier to remember because it defines that the generic type parameter can only appear in output positions — read-only properties and method return values. In a similar way, the way to define a type parameter as contravariant is using the in generic modifier: public interface IComparer<in T> { int Compare(T x, T y); } Once again, the use of the pre-existing in keyword makes it easier to remember that the generic type parameter can only be used in input positions — write-only properties and method non ref and non out parameters. Because covariance and contravariance apply only to the generic type parameters, a generic type definition can have both covariant and contravariant generic type parameters in its definition: public delegate TResult Func<in T, out TResult>(T arg); A generic type parameter that is not marked covariant (out) or contravariant (in) is invariant. All the types in the .NET Framework where variance could be applied to its generic type parameters have been modified to take advantage of this new feature. In summary, the rules for variance in C# (and .NET) are: Variance in type parameters are restricted to generic interface and generic delegate types. A generic interface or generic delegate type can have both covariant and contravariant type parameters. Variance applies only to reference types; if you specify a value type for a variant type parameter, that type parameter is invariant for the resulting constructed type. Variance does not apply to delegate combination. That is, given two delegates of types Action<Derived> and Action<Base>, you cannot combine the second delegate with the first although the result would be type safe. Variance allows the second delegate to be assigned to a variable of type Action<Derived>, but delegates can combine only if their types match exactly. If you want to learn more about variance in C# (and .NET), you can always read: Covariance and Contravariance in Generics — MSDN Library Exact rules for variance validity — Eric Lippert Events get a little overhaul in C# 4, Afterward: Effective Events — Chris Burrows Note: Because variance is a feature of .NET 4.0 and not only of C# 4.0, all this also applies to Visual Basic 10.

    Read the article

  • Collision filtering by object, team

    - by Bill Zimmerman
    Hi, I am looking for a good method to determine which objects will be considered for collision with other objects. My current idea is that each object has the following properties: alwaysCollidesWith = [list of objects that will always trigger a collision check] neverCollidesWith = [lost of objects that will never be considered] teamCollidesWith = [list of objects that will be checked, provided they belong to a different team] For example: -projectiles never have to be checked for collisions with other projectiles -players are always checked for collisions with players, regardless of team -projectiles are only considered for collisions if they collide with another teams players Does anyone see any weaknesses with this approach? Can anyone recommend a better approach?

    Read the article

  • What Did You Do? is a Bad Question

    - by Ajarn Mark Caldwell
    Brian Moran (blog | Twitter) did a great presentation today for the PASS Professional Development Virtual Chapter on The Art of Questions.  One of the points that Brian made was that there are good questions and bad (or at least not-as-good) questions.  Good questions tend to open-up the conversation and engender positive reactions (perhaps even trust and respect) between the participants; and bad questions tend to close-down a conversation either through the narrow list of possible responses (e.g. strictly Yes/No) or through the negative reactions they can produce.  And this explains why I so frequently had problems troubleshooting real-time problems with users in the past.  I’ll explain that in more detail below, but before we go on, let me recommend that you watch the recording of Brian’s presentation to learn why the question Why is often problematic in the U.S. and yet we so often resort to it. For a short portion (3 years) of my career, I taught basic computer skills and Office applications in an adult vocational school, and this gave me ample opportunity to do live troubleshooting of user challenges with computers.  And like many people who ended up in computer related jobs, I also have had numerous times where I was called upon by less computer-savvy individuals to help them with some challenge they were having, whether it was part of my job or not.  One of the things that I noticed, especially during my time as a teacher, was that when I was helping somebody, typically the first question I would ask them was, “What did you do?”  This seemed to me like a good way to start my detective work trying to figure out what happened, what went wrong, how to fix it, and how to help the person avoid it again in the future.  I always asked it in a polite tone of voice as I was just trying to gather the facts before diving in deeper.  However; 99.999% of the time, I always got the same answer, “Nothing!”  For a long time this frustrated me because (remember I’m in detective mode at that point) I knew it could not possibly be true.  They HAD to have done SOMETHING…just tell me what were the last actions you took before this problem presented itself.  But no, they always stuck with “Nothing”.  At which point, with frustration growing, and not a little bit of disdain for their lack of helpfulness, I would usually ask them to move aside while I took over their machine and got them out of whatever they had gotten themselves into.  After a while I just grew used to the fact that this was the answer I would usually receive, but I always kept asking because for the .001% of the people who would actually tell me, I could then help them understand what went wrong and how to avoid it in the future. Now, after hearing Brian’s talk, I understand what the problem was.  Even though I meant to just be in an information gathering mode, the words I was using, “What did YOU do?” have such a strong negative connotation that people would instinctively go into defense-mode and stop sharing information that might make them look bad.  Many of them probably were not even consciously aware that they had gone on the defensive, but the self-preservation instinct, especially self-preservation of the ego, is so strong that people would end up there without even realizing it. So, if “What did you do” is a bad question, what would have been better?  Well, one suggestion that Brian makes in his talk is something along the lines of, “Can you tell me what led up to this?” or “what was happening on the computer right before this came up?”  It’s subtle, but the point is to take the focus off of the person and their behavior; instead depersonalizing it and talk about events from more of a 3rd-party observer point of view.  With this approach, people will be more likely to talk about what the computer did and what they did in response to it without feeling the interrogation spotlight is on them.  They are also more likely to mention other events that occurred around the same time that may or may not be related, but which could certainly help you troubleshoot a larger problem if it is not just user actions.  And that is the ultimate goal of your asking the questions.  So yes, it does matter how you ask the question; and there are such things as good questions and bad questions.  Excellent topic Brian!  Thanks for getting the thinking gears churning! (Cross-posted to the Professional Development Virtual Chapter blog.)

    Read the article

  • Best practices when creating/modeling databases?

    - by Oscar Mederos
    Hello, I learned at the University some steps to model a database: Model the problem using the Extended Entity-Relationship Model. Extract the functional dependencies Apply some algorithms to normalize the database (3NF or Boyce-Codd) Create the database I'm studying Computer Science and since I received that course I'm wondering if I always need to do those steps when creating a complex database for an specified problem. For example, do PHP / .NET / .. programmers always do that? or there are some tools to simplify that process, maybe using another way of represent the problem instead of the EERM?

    Read the article

  • Opinion on LastPass's security for the Average Joe [closed]

    - by Rook
    This is borderline on objective/subjective, but I'm posting it here since I'm more interested in objective facts, without going into too much technical details, than I am in user reviews of LastPass. I've always used offline ways for (password / sensitive data) storage, but lately I keep hearing good things about LastPass. Indeed, it is more practical having it always accessible from every computer you're using without syncing and related problems, but the security aspect still troubles me. How (in a nutshell for dummies) does LastPass keep your data secure / can their employees see your data, and what is your opinion for such storage of more than usual keeping of sensitive data (bank PIN codes, some financial / business related stuff and so on - you know, the things that would practically hurt if lost / phished)? What are your opinions of it, and do you trust it for such? Any bad experiences? If someone for example is sniffing your wifi network, would such data be easier than usual to sniff out?

    Read the article

  • Finally home - and something fully off topic

    - by Mike Dietrich
    Arrived at Munich Pasing last night at 0:50am ... finally :-) On Sunday I've left the Dylan Hotel in Dublin (thanks to the staff there as well: you were REALLY helpful!!) around 7:30pm to go to the port - and came home on Tuesday morning 1:15am. So all together 29:45hrs door-to-door - not bad for nearly 2000km just relying on public transport. And could have been faster if there were seats in ealier TGV's left. But I don't complain at all ;-) Just checked the website of Dublin Airport - it says currently: 17.00pm: Latest on flight disruptions at Dublin Airport The IAA have advised us that based on the latest Volcanic Ash Advisory Centre London Dublin Airport will remain closed for all inbound and outbound commercial flights until 20.00hours. This effectively means that no flights will land or take off at Dublin Airport until then. A further update will be posted this afternoon. When traveling I have always my iPod with me. It has gotten a bit old now (I think I've bought it 3 years ago in November 2007) but it has a 160GB hard disk in it so it fits most of my music collection (not the entire collection anymore as I'm currently re-riping everything to Apple Lossless because at least for my ears it makes a big difference - but I listen to good ol' vinyl as well ...and I don't download compressed music ;-) ). The battery of my little travel companion is still good for more than 20 hours consistent music playback - and there was a band from Texas being in my ears most of the whole journey called Midlake. I haven't heard of them before until I asked a lady at a Munich store some few weeks ago what she's playing on the speakers in the shop. She was amazed and came back with the CD cover but I hesitated to buy it as I always want to listen the tunes before - and at this day I had no time left to do so. But in Dublin I had a bit of spare time on Saturday and I always enter record stores - and the Tower Records was the sort of store I really enjoy and so I've spent there nearly two hours - leaving with 3 Midlake CDs in my bag. So if you are interested just listen those tunes which may remind some people on Fleetwood Mac: As I said in the title, fully off topic ;-)

    Read the article

  • Don’t Program by Fear, Question Everything

    - by João Angelo
    Perusing some code base I’ve recently came across with a code comment that I would like to share. It was something like this: class Animal { public Animal() { this.Id = Guid.NewGuid(); } public Guid Id { get; private set; } } class Cat : Animal { public Cat() : base() // Always call base since it's not always done automatically { } } Note: All class names were changed to protect the innocent. To clear any possible doubts the C# specification explicitly states that: If an instance constructor has no constructor initializer, a constructor initializer of the form base() is implicitly provided. Thus, an instance constructor declaration of the form C(...) {...} is exactly equivalent to C(...): base() {...} So in conclusion it’s clearly an incorrect comment but what I find alarming is how a comment like that gets into a code base and survives the test of time. Not to forget what it can do to someone who is making a jump from other technologies to C# and reads stuff like that.

    Read the article

  • SQL SERVER – SSMS: Database Consistency History Report

    - by Pinal Dave
    Doctor and Database The last place I like to visit is always a hospital. With the monsoon season starting, intermittent rains, it has become sort of a routine to get a cycle of fever every other year (seriously I hate it). So when I visit my doctor, it is always interesting in the way he quizzes me. The routine question of – “How many days have you had this?”, “Is there any pattern?”, “Did you drench in rain?”, “Do you have any other symptom?” and so on. The idea here is that the doctor wants to find any anomaly or a pattern that will guide him to a viral or bacterial type. Most of the time they get it based on experience and sometimes after a battery of tests. So if there is consistent behavior to your problem, there is always a solution out. SQL Server has its way to find if the server data / files are in consistent state using the DBCC commands. Back to SQL Server In real life, Database consistency check is one of the critical operations a DBA generally doesn’t give much priority. Many readers of my blogs have asked many times, how do we know if the database is consistent? How do I read output of DBCC CHECKDB and find if everything is right or not? My common answer to all of them is – look at the bottom of checkdb (or checktable) output and look for below line. CHECKDB found 0 allocation errors and 0 consistency errors in database ‘DatabaseName’. Above is a “good sign” because we are seeing zero allocation and zero consistency error. If you are seeing non-zero errors then there is some problem with the database. Sample output is shown as below: CHECKDB found 0 allocation errors and 2 consistency errors in database ‘DatabaseName’. repair_allow_data_loss is the minimum repair level for the errors found by DBCC CHECKDB (DatabaseName). If we see non-zero error then most of the time (not always) we get repair options depending on the level of corruption. There is risk involved with above option (repair_allow_data_loss), that is – we would lose the data. Sometimes the option would be repair_rebuild which is little safer. Though these options are available, it is important to find the root cause to the problem. In standard report, there is a report which can show the history of checkdb executed for the selected database. Since this is a database level report, we need to right click on database, click Reports, click Standard Reports and then choose “Database Consistency History” report. The information in this report is picked from default trace. If default trace is disabled or there is no checkdb run or information is not there in default trace (because it’s rolled over), we would get report like below. As we can see report says it very clearly: Currently, no execution history of CHECKDB is available or default trace is not enabled. To demonstrate, I have caused corruption in one of the database and did below steps. Run CheckDB so that errors are reported. Fix the corruption by losing the data using repair option Run CheckDB again to check if corruption is cleared. After that I have launched the report and below is what we would see. If you are lazy like me and don’t want to run the report manually for each database then below query would be handy to provide same report for all database. This query is runs behind the scenes by the report. All I have done is remove the filter for database name (at the last – highlighted). DECLARE @curr_tracefilename VARCHAR(500); DECLARE @base_tracefilename VARCHAR(500); DECLARE @indx INT; SELECT @curr_tracefilename = path FROM sys.traces WHERE is_default = 1; SET @curr_tracefilename = REVERSE(@curr_tracefilename); SELECT @indx  = PATINDEX('%\%', @curr_tracefilename) ; SET @curr_tracefilename = REVERSE(@curr_tracefilename); SET @base_tracefilename = LEFT( @curr_tracefilename,LEN(@curr_tracefilename) - @indx) + '\log.trc'; SELECT  SUBSTRING(CONVERT(NVARCHAR(MAX),TEXTData),36, PATINDEX('%executed%',TEXTData)-36) AS command ,       LoginName ,       StartTime ,       CONVERT(INT,SUBSTRING(CONVERT(NVARCHAR(MAX),TEXTData),PATINDEX('%found%',TEXTData) +6,PATINDEX('%errors %',TEXTData)-PATINDEX('%found%',TEXTData)-6)) AS errors ,       CONVERT(INT,SUBSTRING(CONVERT(NVARCHAR(MAX),TEXTData),PATINDEX('%repaired%',TEXTData) +9,PATINDEX('%errors.%',TEXTData)-PATINDEX('%repaired%',TEXTData)-9)) repaired ,       SUBSTRING(CONVERT(NVARCHAR(MAX),TEXTData),PATINDEX('%time:%',TEXTData)+6,PATINDEX('%hours%',TEXTData)-PATINDEX('%time:%',TEXTData)-6)+':'+SUBSTRING(CONVERT(NVARCHAR(MAX),TEXTData),PATINDEX('%hours%',TEXTData) +6,PATINDEX('%minutes%',TEXTData)-PATINDEX('%hours%',TEXTData)-6)+':'+SUBSTRING(CONVERT(NVARCHAR(MAX),TEXTData),PATINDEX('%minutes%',TEXTData) +8,PATINDEX('%seconds.%',TEXTData)-PATINDEX('%minutes%',TEXTData)-8) AS time FROM::fn_trace_gettable( @base_tracefilename, DEFAULT) WHERE EventClass = 22 AND SUBSTRING(TEXTData,36,12) = 'DBCC CHECKDB' -- AND DatabaseName = @DatabaseName; Don’t get worried about the logic above. All it is doing is reading the trace files, parsing below entry and getting out information for underlined words. DBCC CHECKDB (CorruptedDatabase) executed by sa found 2 errors and repaired 0 errors. Elapsed time: 0 hours 0 minutes 0 seconds.  Internal database snapshot has split point LSN = 00000029:00000030:0001 and first LSN = 00000029:00000020:0001. Hopefully now onwards you would run checkdb and understand the importance of it. As responsible DBAs I am sure you are already doing it, let me know how often do you actually run them on you production environment? Reference: Pinal Dave (http://blog.sqlauthority.com)Filed under: PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Server Management Studio, SQL Tips and Tricks, T SQL Tagged: SQL Reports

    Read the article

  • Linux Mint is Brilliant

    - by Simon Moon
    Most of my blog posts sound way too whiny. I'm not that guy. (Am I?) I've been using SUSE-flavored Linux for personal projects since 2002 (SUSE Linux 8.1). This past weekend, I made the heart-wrenching decision to abandon openSUSE (version 12.1) in favor of Linux Mint (version Maya). OpenSUSE had just become too burdensome. Packages that installed easily on RedHat or Debian always had issues running on top of OpenSUSE. And I never could get the Heroku Toolbelt installed in any kind of usable state.And so, ...I'm beginning again with this enticing young thing -- Mint with the Cinnamon window environment. Delicious. And while I'll always have fond memories of my years with openSUSE, I've got to admit that Mint makes running Linux feel good again. http://blog.linuxmint.com/?p=2031

    Read the article

  • SQL Server 2012 edition comparison details are published

    - by DavidWimbush
    Interesting stuff, particularly if you're doing BI. BISM tabular and Power View will not be in Standard Edition, only in the new - presumably more expensive - Business Intelligence Edition. That kind of makes sense as you need a fairly pricey edition of SharePoint to really get all the benefits, but it's a shame there won't be some kind of limited version in Standard Edition. And Always On will be in Standard Edition but limited to 2 nodes. I really expected Always On to be Enterprise-only so this is a great decision. It allows those of us working at a more modest scale to benefit and raises the fault tolerance of SQL Server as a product to a new level.Read all about it here: http://www.microsoft.com/sqlserver/en/us/future-editions/sql2012-editions.aspx

    Read the article

< Previous Page | 111 112 113 114 115 116 117 118 119 120 121 122  | Next Page >