Search Results

Search found 88749 results on 3550 pages for 'code tuning'.

Page 7/3550 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >

  • Working with Legacy code #5: The blackhole.

    - by andrewstopford
    Someone creates a class or series of classes for something, the classes are big in size with large complicated methods. The effort is a sea of technical debt for the entire team but in the thick of the daily chaos it is lost. With out the coder talking to the team, with no team code policy and no code reviews (and action points) it remains. Pretty soon the team forget about that code. A few weeks\months\years goes by, some of the team may have left, some may remain but business asks for the team to add to that code. The team is now looking at a black hole, no one knows how it works, what it does, what it is for, it is a smelly hell hole and the deadline is fast approaching. The team now tries to change the code, with no approach at unit tests or refactoring in fear of breaking the black hole the team do just that and the business have just lost money. If you are faced with a black hole you need to look back over my series, even a black hole in what might seem like a clean unit tested application. Don't be fooled into thinking that legacy code does not apply to your code base.  The next stage is don't let blackholes in your codebase. Effective code reviews, team communication and good overal team coding policies will really help. Even if you are faced with a deadline do not let them appear, stop, take stock, what can be done and who can help. If you allow them through they will grow and grow and grow and the technical debt will hit you like a tidal wave soon enough,.  

    Read the article

  • Code Generation and IDE vs writing per Hand

    - by sytycs
    I have been programming for about a year now. Pretty soon I realized that I need a great Tool for writing code and learned Vim. I was happy with C and Ruby and never liked the idea of an IDE. Which was encouraged by a lot of reading about programming.[1] However I started with (my first) Java Project. In a CS Course we were using Visual Paradigm and encouraged to let the program generate our code from a class diagram. I did not like that Idea because: Our class diagram was buggy. Students more experienced in Java said they would write the code per hand. I had never written any Java before and would not understand a lot of the generated code. So I took a different approach and wrote all methods per Hand (getter and Setter included). My Team-members have written their parts (partly generated by VP) in an IDE and I was "forced" to use it too. I realized they had generated equal amounts of code in a shorter amount of time and did not spend a lot of time setting their CLASSPATH and writing scripts for compiling that son of a b***. Additionally we had to implement a GUI and I dont see how we could have done that in a sane matter in Vim. So here is my Problem: I fell in love with Vim and the Unix way. But it looks like for getting this job done (on time) the IDE/Code generation approach is superior. Do you have equal experiences? Is Java by the nature of the language just more suitable for an IDE/Code generated approach? Or am I lacking the knowledge to produce equal amounts of code "per Hand"? [1] http://heather.cs.ucdavis.edu/~matloff/eclipse.html

    Read the article

  • How to promote code reuse and documentation?

    - by Graviton
    As a team lead of about 10+ developers, I would want to promote code reuse. We have written a lot of code-- a lot of them are repetitive over the past few years. The problem now is that a lot of these code are just duplicate of some other code or a slight variation of them. I have started the movement ( discussion) on how to make code into components so that they can be reused for the future projects, but the problem is that I afraid the new developers or other developers who are ignorant of the components will just go forward and write their own thing. Is there anyway to remind the developers to reuse the components/ improve the documentation/ contribute to the underlying component instead of duplicating the existing code and tweaking on it or just write their own? How to make the components easily discover-able, easily usable so that everyone will use it? Edit: I think every developer knows about the benefit of reusable components and wants to use them, it's just that we don't know how to make them discoverable. Also, the developers when they are writing code, they know they should write reusable code but lack of the motivation to do so.

    Read the article

  • Books and resources for Java Performance tuning - when working with databases, huge lists

    - by Arvind
    Hi All, I am relatively new to working on huge applications in Java. I am working on a Java web service which is pretty heavily used by various clients. The service basically queries the database (hibernate) and then works with a lot of Lists (there are adapters to convert list returned from DB to the interface which the service publishes) and I am seeing lot of issues with the service like high CPU usage or high heap space. While I can troubleshoot the performance issues using a profiler, I want to actually learn about what all I need to take care when I actually write code. Like what kind of List to use or things like using StringBuilder instead of String, etc... Is there any book or blogs which I can refer which will help me while I write new services? Also my application is multithreaded - each service call from a client is a new thread, and I want to know some best practices around that area as well. I did search the web but I found many tips which are not relevant in the latest Java 6 releases, so wanted to know what kind of resources would help a developer starting out now on Java for heavily used applications. Arvind

    Read the article

  • Code Chess: Fibonacci Sequence

    - by SLaks
    Building upon the proven success of Code Golf, I would like to introduce Code Chess. Unlike Code Golf, which strives for concision, Code Chess will strive for cleverness. Can you create a clever or unexpected Fibonacci generator? Your code must print or return either the nth Fibonacci number or the first n Fibonacci numbers.

    Read the article

  • Reconciling the Boy Scout Rule and Opportunistic Refactoring with code reviews

    - by t0x1n
    I am a great believer in the Boy Scout Rule: Always check a module in cleaner than when you checked it out." No matter who the original author was, what if we always made some effort, no matter how small, to improve the module. What would be the result? I think if we all followed that simple rule, we'd see the end of the relentless deterioration of our software systems. Instead, our systems would gradually get better and better as they evolved. We'd also see teams caring for the system as a whole, rather than just individuals caring for their own small little part. I am also a great believer in the related idea of Opportunistic Refactoring: Although there are places for some scheduled refactoring efforts, I prefer to encourage refactoring as an opportunistic activity, done whenever and wherever code needs to cleaned up - by whoever. What this means is that at any time someone sees some code that isn't as clear as it should be, they should take the opportunity to fix it right there and then - or at least within a few minutes Particularly note the following excerpt from the refactoring article: I'm wary of any development practices that cause friction for opportunistic refactoring ... My sense is that most teams don't do enough refactoring, so it's important to pay attention to anything that is discouraging people from doing it. To help flush this out be aware of any time you feel discouraged from doing a small refactoring, one that you're sure will only take a minute or two. Any such barrier is a smell that should prompt a conversation. So make a note of the discouragement and bring it up with the team. At the very least it should be discussed during your next retrospective. Where I work, there is one development practice that causes heavy friction - Code Review (CR). Whenever I change anything that's not in the scope of my "assignment" I'm being rebuked by my reviewers that I'm making the change harder to review. This is especially true when refactoring is involved, since it makes "line by line" diff comparison difficult. This approach is the standard here, which means opportunistic refactoring is seldom done, and only "planned" refactoring (which is usually too little, too late) takes place, if at all. I claim that the benefits are worth it, and that 3 reviewers will work a little harder (to actually understand the code before and after, rather than look at the narrow scope of which lines changed - the review itself would be better due to that alone) so that the next 100 developers reading and maintaining the code will benefit. When I present this argument my reviewers, they say they have no problem with my refactoring, as long as it's not in the same CR. However I claim this is a myth: (1) Most of the times you only realize what and how you want to refactor when you're in the midst of your assignment. As Martin Fowler puts it: As you add the functionality, you realize that some code you're adding contains some duplication with some existing code, so you need to refactor the existing code to clean things up... You may get something working, but realize that it would be better if the interaction with existing classes was changed. Take that opportunity to do that before you consider yourself done. (2) Nobody is going to look favorably at you releasing "refactoring" CRs you were not supposed to do. A CR has a certain overhead and your manager doesn't want you to "waste your time" on refactoring. When it's bundled with the change you're supposed to do, this issue is minimized. The issue is exacerbated by Resharper, as each new file I add to the change (and I can't know in advance exactly which files would end up changed) is usually littered with errors and suggestions - most of which are spot on and totally deserve fixing. The end result is that I see horrible code, and I just leave it there. Ironically, I feel that fixing such code not only will not improve my standings, but actually lower them and paint me as the "unfocused" guy who wastes time fixing things nobody cares about instead of doing his job. I feel bad about it because I truly despise bad code and can't stand watching it, let alone call it from my methods! Any thoughts on how I can remedy this situation ?

    Read the article

  • ASP.NET MVC 3: Implicit and Explicit code nuggets with Razor

    - by ScottGu
    This is another in a series of posts I’m doing that cover some of the new ASP.NET MVC 3 features: New @model keyword in Razor (Oct 19th) Layouts with Razor (Oct 22nd) Server-Side Comments with Razor (Nov 12th) Razor’s @: and <text> syntax (Dec 15th) Implicit and Explicit code nuggets with Razor (today) In today’s post I’m going to discuss how Razor enables you to both implicitly and explicitly define code nuggets within your view templates, and walkthrough some code examples of each of them.  Fluid Coding with Razor ASP.NET MVC 3 ships with a new view-engine option called “Razor” (in addition to the existing .aspx view engine).  You can learn more about Razor, why we are introducing it, and the syntax it supports from my Introducing Razor blog post. Razor minimizes the number of characters and keystrokes required when writing a view template, and enables a fast, fluid coding workflow. Unlike most template syntaxes, you do not need to interrupt your coding to explicitly denote the start and end of server blocks within your HTML. The Razor parser is smart enough to infer this from your code. This enables a compact and expressive syntax which is clean, fast and fun to type. For example, the Razor snippet below can be used to iterate a collection of products and output a <ul> list of product names that link to their corresponding product pages: When run, the above code generates output like below: Notice above how we were able to embed two code nuggets within the content of the foreach loop.  One of them outputs the name of the Product, and the other embeds the ProductID within a hyperlink.  Notice that we didn’t have to explicitly wrap these code-nuggets - Razor was instead smart enough to implicitly identify where the code began and ended in both of these situations.  How Razor Enables Implicit Code Nuggets Razor does not define its own language.  Instead, the code you write within Razor code nuggets is standard C# or VB.  This allows you to re-use your existing language skills, and avoid having to learn a customized language grammar. The Razor parser has smarts built into it so that whenever possible you do not need to explicitly mark the end of C#/VB code nuggets you write.  This makes coding more fluid and productive, and enables a nice, clean, concise template syntax.  Below are a few scenarios that Razor supports where you can avoid having to explicitly mark the beginning/end of a code nugget, and instead have Razor implicitly identify the code nugget scope for you: Property Access Razor allows you to output a variable value, or a sub-property on a variable that is referenced via “dot” notation: You can also use “dot” notation to access sub-properties multiple levels deep: Array/Collection Indexing: Razor allows you to index into collections or arrays: Calling Methods: Razor also allows you to invoke methods: Notice how for all of the scenarios above how we did not have to explicitly end the code nugget.  Razor was able to implicitly identify the end of the code block for us. Razor’s Parsing Algorithm for Code Nuggets The below algorithm captures the core parsing logic we use to support “@” expressions within Razor, and to enable the implicit code nugget scenarios above: Parse an identifier - As soon as we see a character that isn't valid in a C# or VB identifier, we stop and move to step 2 Check for brackets - If we see "(" or "[", go to step 2.1., otherwise, go to step 3  Parse until the matching ")" or "]" (we track nested "()" and "[]" pairs and ignore "()[]" we see in strings or comments) Go back to step 2 Check for a "." - If we see one, go to step 3.1, otherwise, DO NOT ACCEPT THE "." as code, and go to step 4 If the character AFTER the "." is a valid identifier, accept the "." and go back to step 1, otherwise, go to step 4 Done! Differentiating between code and content Step 3.1 is a particularly interesting part of the above algorithm, and enables Razor to differentiate between scenarios where an identifier is being used as part of the code statement, and when it should instead be treated as static content: Notice how in the snippet above we have ? and ! characters at the end of our code nuggets.  These are both legal C# identifiers – but Razor is able to implicitly identify that they should be treated as static string content as opposed to being part of the code expression because there is whitespace after them.  This is pretty cool and saves us keystrokes. Explicit Code Nuggets in Razor Razor is smart enough to implicitly identify a lot of code nugget scenarios.  But there are still times when you want/need to be more explicit in how you scope the code nugget expression.  The @(expression) syntax allows you to do this: You can write any C#/VB code statement you want within the @() syntax.  Razor will treat the wrapping () characters as the explicit scope of the code nugget statement.  Below are a few scenarios where we could use the explicit code nugget feature: Perform Arithmetic Calculation/Modification: You can perform arithmetic calculations within an explicit code nugget: Appending Text to a Code Expression Result: You can use the explicit expression syntax to append static text at the end of a code nugget without having to worry about it being incorrectly parsed as code: Above we have embedded a code nugget within an <img> element’s src attribute.  It allows us to link to images with URLs like “/Images/Beverages.jpg”.  Without the explicit parenthesis, Razor would have looked for a “.jpg” property on the CategoryName (and raised an error).  By being explicit we can clearly denote where the code ends and the text begins. Using Generics and Lambdas Explicit expressions also allow us to use generic types and generic methods within code expressions – and enable us to avoid the <> characters in generics from being ambiguous with tag elements. One More Thing….Intellisense within Attributes We have used code nuggets within HTML attributes in several of the examples above.  One nice feature supported by the Razor code editor within Visual Studio is the ability to still get VB/C# intellisense when doing this. Below is an example of C# code intellisense when using an implicit code nugget within an <a> href=”” attribute: Below is an example of C# code intellisense when using an explicit code nugget embedded in the middle of a <img> src=”” attribute: Notice how we are getting full code intellisense for both scenarios – despite the fact that the code expression is embedded within an HTML attribute (something the existing .aspx code editor doesn’t support).  This makes writing code even easier, and ensures that you can take advantage of intellisense everywhere. Summary Razor enables a clean and concise templating syntax that enables a very fluid coding workflow.  Razor’s ability to implicitly scope code nuggets reduces the amount of typing you need to perform, and leaves you with really clean code. When necessary, you can also explicitly scope code expressions using a @(expression) syntax to provide greater clarity around your intent, as well as to disambiguate code statements from static markup. Hope this helps, Scott P.S. In addition to blogging, I am also now using Twitter for quick updates and to share links. Follow me at: twitter.com/scottgu

    Read the article

  • Code Reuse and Abstraction in FP vs OOP

    - by Electric Coffee
    I've been told that code reuse and abstraction in OOP is far more difficult to do than it is in FP, and that all the claims that have been made about Object Orientedness (for lack of a better term) being great at reusing code have been flat out lies So I was wondering if anyone here could tell me why that is, and perhaps show me some code to back up these claims, I'm not saying I don't believe you Functional programmers, it's just that I've been "indoctrinated" to think Object Orientedly, and thus can't (yet) think Functionally enough to see it myself To quote Jimmy Hoffa (from an answer to one of my previous questions): The cake is a lie, code reuse in OO is far more difficult than in FP. For all that OO has claimed code reuse over the years, I have seen it follow through a minimum of times. (feel free to just say I must be doing it wrong, I'm comfortable with how well I write OO code having had to design and maintain OO systems for years, I know the quality of my own results) That quote is the basis of my question, I want to see if there's anything to the claim or not

    Read the article

  • Website where you can see how other programmers write their code

    - by CuiPengFei
    I remember seeing a website where people upload videos of themselves writing code. However, I can not find that site now. The purpose is to see how others code, to see how they refactor their code, to see how they use their paradigms, etc. Update: I remember that the video contains almost no audio, it's only one guy writing code, making mistakes, typos, fixing mistakes. If I read the final code, I can figure out how it works, but if I see how the code was wrote and what kind of mistakes were made along the way, then I can better understand it. I guess this is the main reason that they make this kind of video.

    Read the article

  • Self Documenting Code Vs. Commented Code

    - by Phill
    I had a search but didn't find what I was looking for, please feel free to link me if this question has already being asked. Earlier this month this post was made: http://net.tutsplus.com/tutorials/php/why-youre-a-bad-php-programmer/ Basically to sum it up, you're a bad programmer if you don't write comments. My personal opinion is that code should be descriptive and mostly not require comment's unless the code cannot be self describing. In the example given // Get the extension off the image filename $pieces = explode('.', $image_name); $extension = array_pop($pieces); The author said this code should be given a comment, my personal opinion is the code should be a function call that is descriptive: $extension = GetFileExtension($image_filename); However in the comments someone actually made just that suggestion: http://net.tutsplus.com/tutorials/php/why-youre-a-bad-php-programmer/comment-page-2/#comment-357130 The author responded by saying the commenter was "one of those people", i.e, a bad programmer. What are everyone elses views on Self Describing Code vs Commenting Code?

    Read the article

  • Is there any text editor for windows which can save files with code highlight for viewing

    - by user1713836
    I want some software for Windows where I can save code snippets and other daily usable commands in one file. Everyday I find some small code snippets which I want to save in a single file. Just like we have code snippet savers online, I want something offline on Windows, basically with all the features Microsoft Word has, but with code highlight. It should be lightweight like Notepad++. I mean if I select the code and then press some button, it should change the color according to the language. Currently I use Notepad++, but in it, I can't select small code snippets on one page. It either highlights the entire file or nothing.

    Read the article

  • How to code review without offending other developers [duplicate]

    - by Justin984
    This question already has an answer here: How to deal with someone who dislikes the idea of code reviews? 6 answers How can I tactfully suggest improvements to others' badly designed code during review? 14 answers How do I approach a coworker about his or her code quality? 12 answers I work on a team that does frequent code reviews. But it seems like more of a formality than anything. No one really points out problems in the code for fear of offending other developers. The few times I've tried to ask for changes were met with very defensive and reluctant attitudes. This is of course not good. Not only are we spending the time to code review, but we're getting literally zero value from it. Is this an issue that needs to be addressed by individual developers, or are there techniques for suggesting changes without stepping on other people's toes?

    Read the article

  • Tool to aid Code Review

    - by Prakash
    For our small team of 20 developers, we used do code review like: Make a label in svn and publish the label to the reviewers Reviewers checkout the code and add comments in line (with marker like: // REVIEWER_NAME::REVIEW COMMENT:) After all comments are in, reviewer checks in the code, preferably with new label. Developer checks the comments and makes changes (if appropriate) Developer keeps an excel sheet report for considered changes and reasons for ignored comments Problem: Developer needs to keep track of multiple labels which might have same comments Sometimes we even do One on One review and if we really have time, even do Table review (team of reviewers looks at the code on projector, on the fly, and pass comment) I was wondering: Are you guys using any specific tool which helps to do code reviews smoother? I have heard of Code Collaborator. But have anyone used that? Is it worth the money?

    Read the article

  • iPhone openGLES performance tuning

    - by genesys
    Hey there! I'm trying now for quite a while to optimize the framerate of my game without really making progress. I'm running on the newest iPhone SDK and have a iPhone 3G 3.1.2 device. I invoke arround 150 drawcalls, rendering about 1900 Triangles in total (all objects are textured using two texturelayers and multitexturing. most textures come from the same textureAtlasTexture stored in pvrtc 2bpp compressed texture). This renders on my phone at arround 30 fps, which appears to me to be way too low for only 1900 triangles. I tried many things to optimize the performance, including batching together the objects, transforming the vertices on the CPU and rendering them in a single drawcall. this yelds 8 drawcalls (as oposed to 150 drawcalls), but performance is about the same (fps drop to arround 26fps) I'm using 32byte vertices stored in an interleaved array (12bytes position, 12bytes normals, 8bytes uv). I'm rendering triangleLists and the vertices are ordered in TriStrip order. I did some profiling but I don't really know how to interprete it. instruments-sampling using Instruments and Sampling yelds this result: http://neo.cycovery.com/instruments_sampling.gif telling me that a lot of time is spent in "mach_msg_trap". I googled for it and it seems this function is called in order to wait for some other things. But wait for what?? instruments-openGL instruments with the openGL module yelds this result: http://neo.cycovery.com/intstruments_openglES_debug.gif but here i have really no idea what those numbers are telling me shark profiling: profiling with shark didn't tell me much either: http://neo.cycovery.com/shark_profile_release.gif the largest number is 10%, spent by DrawTriangles - and the whole rest is spent in very small percentage functions Can anyone tell me what else I could do in order to figure out the bottleneck and could help me to interprete those profiling information? Thanks a lot!

    Read the article

  • SQL Server high CPU and I/O activity database tuning

    - by zapping
    Our application tends to be running very slow recently. On debugging and tracing found out that the process is showing high cpu cycles and SQL Server shows high I/O activity. Can you please guide as to how it can be optimised? The application is now about an year old and the database file sizes are not very big or anything. The database is set to auto shrink. Its running on win2003, SQL Server 2005 and the application is a web application coded in c# i.e vs2005

    Read the article

  • Tuning MySQL to take advantage of a 4GB VPS

    - by alistair.mp
    Hello, We're running a large site at the moment which has a dedicated VPS for it's database server which is running MySQL and nothing else. At the moment all four CPU cores are running at close to 100% all of the time but the memory usage sticks at around 268MB out of an available 4096MB. I'm wondering what we can do to better utilise the memory and reduce the CPU load by tweaking MySQL's settings? Here is what we currently have in my.cnf: http://pastie.org/private/hxeji9o8n3u9up9mvtinbq Thanks

    Read the article

  • As our favorite imperative languages gain functional constructs, should loops be considered a code s

    - by Michael Buen
    In allusion to Dare Obasanjo's impressions on Map, Reduce, Filter (Functional Programming in C# 3.0: How Map/Reduce/Filter can Rock your World) "With these three building blocks, you could replace the majority of the procedural for loops in your application with a single line of code. C# 3.0 doesn't just stop there." Should we increasingly use them instead of loops? And should be having loops(instead of those three building blocks of data manipulation) be one of the metrics for coding horrors on code reviews? And why? [NOTE] I'm not advocating fully functional programming on those codes that could be simply translated to loops(e.g. tail recursions) Asking for politer term. Considering that the phrase "code smell" is not so diplomatic, I posted another question http://stackoverflow.com/questions/432492/whats-the-politer-word-for-code-smell about the right word for "code smell", er.. utterly bad code. Should that phrase have a place in our programming parlance?

    Read the article

  • Parameter Tuning for Perceptron Learning Algorithm

    - by Albert Diego
    Hi, I'm having sort of an issue trying to figure out how to tune the parameters for my perceptron algorithm so that it performs relatively well on unseen data. I've implemented a verified working perceptron algorithm and I'd like to figure out a method by which I can tune the numbers of iterations and the learning rate of the perceptron. These are the two parameters I'm interested in. I know that the learning rate of the perceptron doesn't affect whether or not the algorithm converges and completes. I'm trying to grasp how to change n. Too fast and it'll swing around a lot, and too low and it'll take longer. As for the number of iterations, I'm not entirely sure how to determine an ideal number. In any case, any help would be appreciated. Thanks.

    Read the article

  • MSSQL Server high CPU and I/O activity database tuning

    - by zapping
    Our application tends to be running very slow recently. On debugging and tracing found out that the process is showing high cpu cycles and SQL Server shows high I/O activity. Can you please guide as to how it can be optimised? The application is now about an year old and the database file sizes are not very big or anything. The database is set to auto shrink. Its running on win2003, SQL Server 2005 and the application is a web application coded in c# i.e vs2005

    Read the article

  • MS SQL tuning tools for finding overload

    - by SkyFox
    I use MS SQL server as a DBMS for my very big corporate DB (with different financial data). And some times my system go down. I don't understand why. What programs/tools I can use for finding process/program/thread, that overload my SQL-server? Thanks for all answers!

    Read the article

  • SQL SERVER – Performance Tuning Resolution

    - by pinaldave
    This blog post is written in response to T-SQL Tuesday hosted by MidnightDBAs. Taking resolutions is such an interesting subject. I think just like records, these are broken way more often. I find this is the funniest thing as we all take resolutions every year but not every year, we can manage to keep them. Well, does it mean we should not take resolutions? In fact I support resolutions. Every year, I take a resolution that I will strive reduce my body weight and I usually manage to keep eating healthy till the end of January. When February begins, I begin to loose focus from my goal and as March starts, the “As usual” eating habits begin. Looking at the positive side, what would happen if every year I do not eat healthy in January, I think that might cause terrible consequences to my health in the long run. So keeping resolutions is a good practise and following them to the extent one can is commendable. Let us come back to the world of SQL Server. What is my resolution for year 2011 for SQL Server? There are many, I am going to list three of very important resolutions that I have taken this new year over here. To understand SQL Server Performance Tuning at a deeper Level I think I am already half way through. I have been being very much busy during any given month doing hands-on performance tuning for at least 12 days on an average. That means, I am doing this activity for almost doing 2 weeks a month. I believe that I have a good understanding of the subject. Note that the word that I have used is “good,” and not “best.” There are often cases when I am stumped, and I have no clue of what to do next. Then, I usually go for my “trial and error” method - whichever method works, I make sure to keep a note on my blog. My goal is that I should never ever go for the trial and error method again to achieve the same solution. I should know the solution right away when I see the problem. I do understand that Performance Tuning can be a strange animal at times and one cannot guess the right step every time. However, aiming a high goal never hurts and I am going to learn more and more in this focused area. Going further from Basic BI understanding I do fairly decent with BI concepts. I know the nbasics of SSIS, SSRS, SSAS, PowerPivot and SharePoint (and few other things MDS, StreamInsight, etc). However, I still consider myself as a beginner. I do not have hands-on experience like many other BI Gurus around. I think I want to take my learning further in this direction. I do not want to be a BI expert as the first step but the goal is to move ahead from basic level towards an advanced level. I am going to start presenting in User Group Sessions and other places on this subject. When I have to prepare new subject for presentations, I think I force myself to learn more. I am committed to learn a bit more in this direction. Learning new features SQL Server 2011 Denali This is new thing from “Microsoft” for all the SQL Geeks. I am eagerly waiting for final product later this year and I am planning to learn it well. I think if I follow my above two goals, I think this goal will be automatically covered. I am eager and excited for this new offering from Microsoft. I guess, these are my resolutions; may be next year about the same time, I must revisit this post and see how much successful I am in following my goal. On a lighter note, I am particularly fan of following cartoon strip (Courtesy: Calvin and Hobbes). I think when we cannot resolve our resolutions, we tend to act like Calvin. Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: About Me, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL, Technology

    Read the article

  • Annotate source code with diagrams as comments

    - by Steven Lu
    I write a lot of (primarily c++ and javascript) code that touches upon computational geometry and graphics and those kinds of topics, so I have found that visual diagrams have been an indispensable part of the process of solving problems. I have determined just now that "oh, wouldn't it just be fantastic if I could somehow attach a hand-drawn diagram to a piece of code as a comment", and this would allow me to come back to something I worked on, days, weeks, months earlier and far more quickly re-grok my algorithms. As a visual learner, I feel like this has the potential to improve my productivity with almost every type of programming because simple diagrams can help with understanding and reasoning about any type of non-trivial data structure. Graphs for example. During graph theory class at university I had only ever been able to truly comprehend the graph relationships that I could actually draw diagrammatical representations of. So... No IDE to my knowledge lets you save a picture as a comment to code. My thinking was that I or someone else could come up with some reasonably easy-to-use tool that can convert an image into a base64 binary string which I can then insert into my code. If the conversion/insertion process can be streamlined enough it would allow a far better connection between the diagram and the actual code, so I no longer need to chronographically search through my notebooks. Even more awesome: plugins for the IDEs to automatically parse out and display the image. There is absolutely nothing difficult about this from a theoretical point of view. My guess is that it would take some extra time for me to actually figure out how to extend my favorite IDEs and maintain these plugins, so I'd be totally happy with a sort of code post-processor which would do the same parsing out and rendering of the images and show them side by side with the code, inside of a browser or something. Since I'm a javascript programmer by trade. What do people think? Would anyone pay for this? I would.

    Read the article

  • Advanced TSQL Tuning: Why Internals Knowledge Matters

    - by Paul White
    There is much more to query tuning than reducing logical reads and adding covering nonclustered indexes.  Query tuning is not complete as soon as the query returns results quickly in the development or test environments.  In production, your query will compete for memory, CPU, locks, I/O and other resources on the server.  Today’s entry looks at some tuning considerations that are often overlooked, and shows how deep internals knowledge can help you write better TSQL. As always, we’ll need some example data.  In fact, we are going to use three tables today, each of which is structured like this: Each table has 50,000 rows made up of an INTEGER id column and a padding column containing 3,999 characters in every row.  The only difference between the three tables is in the type of the padding column: the first table uses CHAR(3999), the second uses VARCHAR(MAX), and the third uses the deprecated TEXT type.  A script to create a database with the three tables and load the sample data follows: USE master; GO IF DB_ID('SortTest') IS NOT NULL DROP DATABASE SortTest; GO CREATE DATABASE SortTest COLLATE LATIN1_GENERAL_BIN; GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest', SIZE = 3GB, MAXSIZE = 3GB ); GO ALTER DATABASE SortTest MODIFY FILE ( NAME = 'SortTest_log', SIZE = 256MB, MAXSIZE = 1GB, FILEGROWTH = 128MB ); GO ALTER DATABASE SortTest SET ALLOW_SNAPSHOT_ISOLATION OFF ; ALTER DATABASE SortTest SET AUTO_CLOSE OFF ; ALTER DATABASE SortTest SET AUTO_CREATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_SHRINK OFF ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS ON ; ALTER DATABASE SortTest SET AUTO_UPDATE_STATISTICS_ASYNC ON ; ALTER DATABASE SortTest SET PARAMETERIZATION SIMPLE ; ALTER DATABASE SortTest SET READ_COMMITTED_SNAPSHOT OFF ; ALTER DATABASE SortTest SET MULTI_USER ; ALTER DATABASE SortTest SET RECOVERY SIMPLE ; USE SortTest; GO CREATE TABLE dbo.TestCHAR ( id INTEGER IDENTITY (1,1) NOT NULL, padding CHAR(3999) NOT NULL,   CONSTRAINT [PK dbo.TestCHAR (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestMAX ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAX (id)] PRIMARY KEY CLUSTERED (id), ) ; CREATE TABLE dbo.TestTEXT ( id INTEGER IDENTITY (1,1) NOT NULL, padding TEXT NOT NULL,   CONSTRAINT [PK dbo.TestTEXT (id)] PRIMARY KEY CLUSTERED (id), ) ; -- ============= -- Load TestCHAR (about 3s) -- ============= INSERT INTO dbo.TestCHAR WITH (TABLOCKX) ( padding ) SELECT padding = REPLICATE(CHAR(65 + (Data.n % 26)), 3999) FROM ( SELECT TOP (50000) n = ROW_NUMBER() OVER (ORDER BY (SELECT 0)) - 1 FROM master.sys.columns C1, master.sys.columns C2, master.sys.columns C3 ORDER BY n ASC ) AS Data ORDER BY Data.n ASC ; -- ============ -- Load TestMAX (about 3s) -- ============ INSERT INTO dbo.TestMAX WITH (TABLOCKX) ( padding ) SELECT CONVERT(VARCHAR(MAX), padding) FROM dbo.TestCHAR ORDER BY id ; -- ============= -- Load TestTEXT (about 5s) -- ============= INSERT INTO dbo.TestTEXT WITH (TABLOCKX) ( padding ) SELECT CONVERT(TEXT, padding) FROM dbo.TestCHAR ORDER BY id ; -- ========== -- Space used -- ========== -- EXECUTE sys.sp_spaceused @objname = 'dbo.TestCHAR'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAX'; EXECUTE sys.sp_spaceused @objname = 'dbo.TestTEXT'; ; CHECKPOINT ; That takes around 15 seconds to run, and shows the space allocated to each table in its output: To illustrate the points I want to make today, the example task we are going to set ourselves is to return a random set of 150 rows from each table.  The basic shape of the test query is the same for each of the three test tables: SELECT TOP (150) T.id, T.padding FROM dbo.Test AS T ORDER BY NEWID() OPTION (MAXDOP 1) ; Test 1 – CHAR(3999) Running the template query shown above using the TestCHAR table as the target, we find that the query takes around 5 seconds to return its results.  This seems slow, considering that the table only has 50,000 rows.  Working on the assumption that generating a GUID for each row is a CPU-intensive operation, we might try enabling parallelism to see if that speeds up the response time.  Running the query again (but without the MAXDOP 1 hint) on a machine with eight logical processors, the query now takes 10 seconds to execute – twice as long as when run serially. Rather than attempting further guesses at the cause of the slowness, let’s go back to serial execution and add some monitoring.  The script below monitors STATISTICS IO output and the amount of tempdb used by the test query.  We will also run a Profiler trace to capture any warnings generated during query execution. DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TC.id, TC.padding FROM dbo.TestCHAR AS TC ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; Let’s take a closer look at the statistics and query plan generated from this: Following the flow of the data from right to left, we see the expected 50,000 rows emerging from the Clustered Index Scan, with a total estimated size of around 191MB.  The Compute Scalar adds a column containing a random GUID (generated from the NEWID() function call) for each row.  With this extra column in place, the size of the data arriving at the Sort operator is estimated to be 192MB. Sort is a blocking operator – it has to examine all of the rows on its input before it can produce its first row of output (the last row received might sort first).  This characteristic means that Sort requires a memory grant – memory allocated for the query’s use by SQL Server just before execution starts.  In this case, the Sort is the only memory-consuming operator in the plan, so it has access to the full 243MB (248,696KB) of memory reserved by SQL Server for this query execution. Notice that the memory grant is significantly larger than the expected size of the data to be sorted.  SQL Server uses a number of techniques to speed up sorting, some of which sacrifice size for comparison speed.  Sorts typically require a very large number of comparisons, so this is usually a very effective optimization.  One of the drawbacks is that it is not possible to exactly predict the sort space needed, as it depends on the data itself.  SQL Server takes an educated guess based on data types, sizes, and the number of rows expected, but the algorithm is not perfect. In spite of the large memory grant, the Profiler trace shows a Sort Warning event (indicating that the sort ran out of memory), and the tempdb usage monitor shows that 195MB of tempdb space was used – all of that for system use.  The 195MB represents physical write activity on tempdb, because SQL Server strictly enforces memory grants – a query cannot ‘cheat’ and effectively gain extra memory by spilling to tempdb pages that reside in memory.  Anyway, the key point here is that it takes a while to write 195MB to disk, and this is the main reason that the query takes 5 seconds overall. If you are wondering why using parallelism made the problem worse, consider that eight threads of execution result in eight concurrent partial sorts, each receiving one eighth of the memory grant.  The eight sorts all spilled to tempdb, resulting in inefficiencies as the spilled sorts competed for disk resources.  More importantly, there are specific problems at the point where the eight partial results are combined, but I’ll cover that in a future post. CHAR(3999) Performance Summary: 5 seconds elapsed time 243MB memory grant 195MB tempdb usage 192MB estimated sort set 25,043 logical reads Sort Warning Test 2 – VARCHAR(MAX) We’ll now run exactly the same test (with the additional monitoring) on the table using a VARCHAR(MAX) padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TM.id, TM.padding FROM dbo.TestMAX AS TM ORDER BY NEWID() OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query takes around 8 seconds to complete (3 seconds longer than Test 1).  Notice that the estimated row and data sizes are very slightly larger, and the overall memory grant has also increased very slightly to 245MB.  The most marked difference is in the amount of tempdb space used – this query wrote almost 391MB of sort run data to the physical tempdb file.  Don’t draw any general conclusions about VARCHAR(MAX) versus CHAR from this – I chose the length of the data specifically to expose this edge case.  In most cases, VARCHAR(MAX) performs very similarly to CHAR – I just wanted to make test 2 a bit more exciting. MAX Performance Summary: 8 seconds elapsed time 245MB memory grant 391MB tempdb usage 193MB estimated sort set 25,043 logical reads Sort warning Test 3 – TEXT The same test again, but using the deprecated TEXT data type for the padding column: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) TT.id, TT.padding FROM dbo.TestTEXT AS TT ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; This time the query runs in 500ms.  If you look at the metrics we have been checking so far, it’s not hard to understand why: TEXT Performance Summary: 0.5 seconds elapsed time 9MB memory grant 5MB tempdb usage 5MB estimated sort set 207 logical reads 596 LOB logical reads Sort warning SQL Server’s memory grant algorithm still underestimates the memory needed to perform the sorting operation, but the size of the data to sort is so much smaller (5MB versus 193MB previously) that the spilled sort doesn’t matter very much.  Why is the data size so much smaller?  The query still produces the correct results – including the large amount of data held in the padding column – so what magic is being performed here? TEXT versus MAX Storage The answer lies in how columns of the TEXT data type are stored.  By default, TEXT data is stored off-row in separate LOB pages – which explains why this is the first query we have seen that records LOB logical reads in its STATISTICS IO output.  You may recall from my last post that LOB data leaves an in-row pointer to the separate storage structure holding the LOB data. SQL Server can see that the full LOB value is not required by the query plan until results are returned, so instead of passing the full LOB value down the plan from the Clustered Index Scan, it passes the small in-row structure instead.  SQL Server estimates that each row coming from the scan will be 79 bytes long – 11 bytes for row overhead, 4 bytes for the integer id column, and 64 bytes for the LOB pointer (in fact the pointer is rather smaller – usually 16 bytes – but the details of that don’t really matter right now). OK, so this query is much more efficient because it is sorting a very much smaller data set – SQL Server delays retrieving the LOB data itself until after the Sort starts producing its 150 rows.  The question that normally arises at this point is: Why doesn’t SQL Server use the same trick when the padding column is defined as VARCHAR(MAX)? The answer is connected with the fact that if the actual size of the VARCHAR(MAX) data is 8000 bytes or less, it is usually stored in-row in exactly the same way as for a VARCHAR(8000) column – MAX data only moves off-row into LOB storage when it exceeds 8000 bytes.  The default behaviour of the TEXT type is to be stored off-row by default, unless the ‘text in row’ table option is set suitably and there is room on the page.  There is an analogous (but opposite) setting to control the storage of MAX data – the ‘large value types out of row’ table option.  By enabling this option for a table, MAX data will be stored off-row (in a LOB structure) instead of in-row.  SQL Server Books Online has good coverage of both options in the topic In Row Data. The MAXOOR Table The essential difference, then, is that MAX defaults to in-row storage, and TEXT defaults to off-row (LOB) storage.  You might be thinking that we could get the same benefits seen for the TEXT data type by storing the VARCHAR(MAX) values off row – so let’s look at that option now.  This script creates a fourth table, with the VARCHAR(MAX) data stored off-row in LOB pages: CREATE TABLE dbo.TestMAXOOR ( id INTEGER IDENTITY (1,1) NOT NULL, padding VARCHAR(MAX) NOT NULL,   CONSTRAINT [PK dbo.TestMAXOOR (id)] PRIMARY KEY CLUSTERED (id), ) ; EXECUTE sys.sp_tableoption @TableNamePattern = N'dbo.TestMAXOOR', @OptionName = 'large value types out of row', @OptionValue = 'true' ; SELECT large_value_types_out_of_row FROM sys.tables WHERE [schema_id] = SCHEMA_ID(N'dbo') AND name = N'TestMAXOOR' ; INSERT INTO dbo.TestMAXOOR WITH (TABLOCKX) ( padding ) SELECT SPACE(0) FROM dbo.TestCHAR ORDER BY id ; UPDATE TM WITH (TABLOCK) SET padding.WRITE (TC.padding, NULL, NULL) FROM dbo.TestMAXOOR AS TM JOIN dbo.TestCHAR AS TC ON TC.id = TM.id ; EXECUTE sys.sp_spaceused @objname = 'dbo.TestMAXOOR' ; CHECKPOINT ; Test 4 – MAXOOR We can now re-run our test on the MAXOOR (MAX out of row) table: DECLARE @read BIGINT, @write BIGINT ; SELECT @read = SUM(num_of_bytes_read), @write = SUM(num_of_bytes_written) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; SET STATISTICS IO ON ; SELECT TOP (150) MO.id, MO.padding FROM dbo.TestMAXOOR AS MO ORDER BY NEWID() OPTION (MAXDOP 1, RECOMPILE) ; SET STATISTICS IO OFF ; SELECT tempdb_read_MB = (SUM(num_of_bytes_read) - @read) / 1024. / 1024., tempdb_write_MB = (SUM(num_of_bytes_written) - @write) / 1024. / 1024., internal_use_MB = ( SELECT internal_objects_alloc_page_count / 128.0 FROM sys.dm_db_task_space_usage WHERE session_id = @@SPID ) FROM tempdb.sys.database_files AS DBF JOIN sys.dm_io_virtual_file_stats(2, NULL) AS FS ON FS.file_id = DBF.file_id WHERE DBF.type_desc = 'ROWS' ; TEXT Performance Summary: 0.3 seconds elapsed time 245MB memory grant 0MB tempdb usage 193MB estimated sort set 207 logical reads 446 LOB logical reads No sort warning The query runs very quickly – slightly faster than Test 3, and without spilling the sort to tempdb (there is no sort warning in the trace, and the monitoring query shows zero tempdb usage by this query).  SQL Server is passing the in-row pointer structure down the plan and only looking up the LOB value on the output side of the sort. The Hidden Problem There is still a huge problem with this query though – it requires a 245MB memory grant.  No wonder the sort doesn’t spill to tempdb now – 245MB is about 20 times more memory than this query actually requires to sort 50,000 records containing LOB data pointers.  Notice that the estimated row and data sizes in the plan are the same as in test 2 (where the MAX data was stored in-row). The optimizer assumes that MAX data is stored in-row, regardless of the sp_tableoption setting ‘large value types out of row’.  Why?  Because this option is dynamic – changing it does not immediately force all MAX data in the table in-row or off-row, only when data is added or actually changed.  SQL Server does not keep statistics to show how much MAX or TEXT data is currently in-row, and how much is stored in LOB pages.  This is an annoying limitation, and one which I hope will be addressed in a future version of the product. So why should we worry about this?  Excessive memory grants reduce concurrency and may result in queries waiting on the RESOURCE_SEMAPHORE wait type while they wait for memory they do not need.  245MB is an awful lot of memory, especially on 32-bit versions where memory grants cannot use AWE-mapped memory.  Even on a 64-bit server with plenty of memory, do you really want a single query to consume 0.25GB of memory unnecessarily?  That’s 32,000 8KB pages that might be put to much better use. The Solution The answer is not to use the TEXT data type for the padding column.  That solution happens to have better performance characteristics for this specific query, but it still results in a spilled sort, and it is hard to recommend the use of a data type which is scheduled for removal.  I hope it is clear to you that the fundamental problem here is that SQL Server sorts the whole set arriving at a Sort operator.  Clearly, it is not efficient to sort the whole table in memory just to return 150 rows in a random order. The TEXT example was more efficient because it dramatically reduced the size of the set that needed to be sorted.  We can do the same thing by selecting 150 unique keys from the table at random (sorting by NEWID() for example) and only then retrieving the large padding column values for just the 150 rows we need.  The following script implements that idea for all four tables: SET STATISTICS IO ON ; WITH TestTable AS ( SELECT * FROM dbo.TestCHAR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id = ANY (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAX ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestTEXT ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; WITH TestTable AS ( SELECT * FROM dbo.TestMAXOOR ), TopKeys AS ( SELECT TOP (150) id FROM TestTable ORDER BY NEWID() ) SELECT T1.id, T1.padding FROM TestTable AS T1 WHERE T1.id IN (SELECT id FROM TopKeys) OPTION (MAXDOP 1) ; SET STATISTICS IO OFF ; All four queries now return results in much less than a second, with memory grants between 6 and 12MB, and without spilling to tempdb.  The small remaining inefficiency is in reading the id column values from the clustered primary key index.  As a clustered index, it contains all the in-row data at its leaf.  The CHAR and VARCHAR(MAX) tables store the padding column in-row, so id values are separated by a 3999-character column, plus row overhead.  The TEXT and MAXOOR tables store the padding values off-row, so id values in the clustered index leaf are separated by the much-smaller off-row pointer structure.  This difference is reflected in the number of logical page reads performed by the four queries: Table 'TestCHAR' logical reads 25511 lob logical reads 000 Table 'TestMAX'. logical reads 25511 lob logical reads 000 Table 'TestTEXT' logical reads 00412 lob logical reads 597 Table 'TestMAXOOR' logical reads 00413 lob logical reads 446 We can increase the density of the id values by creating a separate nonclustered index on the id column only.  This is the same key as the clustered index, of course, but the nonclustered index will not include the rest of the in-row column data. CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestCHAR (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAX (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestTEXT (id); CREATE UNIQUE NONCLUSTERED INDEX uq1 ON dbo.TestMAXOOR (id); The four queries can now use the very dense nonclustered index to quickly scan the id values, sort them by NEWID(), select the 150 ids we want, and then look up the padding data.  The logical reads with the new indexes in place are: Table 'TestCHAR' logical reads 835 lob logical reads 0 Table 'TestMAX' logical reads 835 lob logical reads 0 Table 'TestTEXT' logical reads 686 lob logical reads 597 Table 'TestMAXOOR' logical reads 686 lob logical reads 448 With the new index, all four queries use the same query plan (click to enlarge): Performance Summary: 0.3 seconds elapsed time 6MB memory grant 0MB tempdb usage 1MB sort set 835 logical reads (CHAR, MAX) 686 logical reads (TEXT, MAXOOR) 597 LOB logical reads (TEXT) 448 LOB logical reads (MAXOOR) No sort warning I’ll leave it as an exercise for the reader to work out why trying to eliminate the Key Lookup by adding the padding column to the new nonclustered indexes would be a daft idea Conclusion This post is not about tuning queries that access columns containing big strings.  It isn’t about the internal differences between TEXT and MAX data types either.  It isn’t even about the cool use of UPDATE .WRITE used in the MAXOOR table load.  No, this post is about something else: Many developers might not have tuned our starting example query at all – 5 seconds isn’t that bad, and the original query plan looks reasonable at first glance.  Perhaps the NEWID() function would have been blamed for ‘just being slow’ – who knows.  5 seconds isn’t awful – unless your users expect sub-second responses – but using 250MB of memory and writing 200MB to tempdb certainly is!  If ten sessions ran that query at the same time in production that’s 2.5GB of memory usage and 2GB hitting tempdb.  Of course, not all queries can be rewritten to avoid large memory grants and sort spills using the key-lookup technique in this post, but that’s not the point either. The point of this post is that a basic understanding of execution plans is not enough.  Tuning for logical reads and adding covering indexes is not enough.  If you want to produce high-quality, scalable TSQL that won’t get you paged as soon as it hits production, you need a deep understanding of execution plans, and as much accurate, deep knowledge about SQL Server as you can lay your hands on.  The advanced database developer has a wide range of tools to use in writing queries that perform well in a range of circumstances. By the way, the examples in this post were written for SQL Server 2008.  They will run on 2005 and demonstrate the same principles, but you won’t get the same figures I did because 2005 had a rather nasty bug in the Top N Sort operator.  Fair warning: if you do decide to run the scripts on a 2005 instance (particularly the parallel query) do it before you head out for lunch… This post is dedicated to the people of Christchurch, New Zealand. © 2011 Paul White email: @[email protected] twitter: @SQL_Kiwi

    Read the article

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >