Search Results

Search found 13376 results on 536 pages for 'performance engineers'.

Page 10/536 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >

  • Recommended website performance monitoring services? [closed]

    - by Dennis G.
    I'm looking for a good performance monitoring service for websites. I know about some of the available general monitoring services that check for uptime and notify you about unavailable services. But I'm specifically looking for a service with an emphasis on performance. I.e., I would like to see reports with detailed performance statistics from multiple locations world-wide, with a break-down on how long it took to fetch the different website resources, including third-party scripts such as Google Analytics and so on (the report should contain similar details such as the FireBug Net tab). Are there any such services and if so, which one is the best?

    Read the article

  • Revisiting ANTS Performance Profiler 7.4

    - by James Michael Hare
    Last year, I did a small review on the ANTS Performance Profiler 6.3, now that it’s a year later and a major version number higher, I thought I’d revisit the review and revise my last post. This post will take the same examples as the original post and update them to show what’s new in version 7.4 of the profiler. Background A performance profiler’s main job is to keep track of how much time is typically spent in each unit of code. This helps when we have a program that is not running at the performance we expect, and we want to know where the program is experiencing issues. There are many profilers out there of varying capabilities. Red Gate’s typically seem to be the very easy to “jump in” and get started with very little training required. So let’s dig into the Performance Profiler. I’ve constructed a very crude program with some obvious inefficiencies. It’s a simple program that generates random order numbers (or really could be any unique identifier), adds it to a list, sorts the list, then finds the max and min number in the list. Ignore the fact it’s very contrived and obviously inefficient, we just want to use it as an example to show off the tool: 1: // our test program 2: public static class Program 3: { 4: // the number of iterations to perform 5: private static int _iterations = 1000000; 6: 7: // The main method that controls it all 8: public static void Main() 9: { 10: var list = new List<string>(); 11: 12: for (int i = 0; i < _iterations; i++) 13: { 14: var x = GetNextId(); 15: 16: AddToList(list, x); 17: 18: var highLow = GetHighLow(list); 19: 20: if ((i % 1000) == 0) 21: { 22: Console.WriteLine("{0} - High: {1}, Low: {2}", i, highLow.Item1, highLow.Item2); 23: Console.Out.Flush(); 24: } 25: } 26: } 27: 28: // gets the next order id to process (random for us) 29: public static string GetNextId() 30: { 31: var random = new Random(); 32: var num = random.Next(1000000, 9999999); 33: return num.ToString(); 34: } 35: 36: // add it to our list - very inefficiently! 37: public static void AddToList(List<string> list, string item) 38: { 39: list.Add(item); 40: list.Sort(); 41: } 42: 43: // get high and low of order id range - very inefficiently! 44: public static Tuple<int,int> GetHighLow(List<string> list) 45: { 46: return Tuple.Create(list.Max(s => Convert.ToInt32(s)), list.Min(s => Convert.ToInt32(s))); 47: } 48: } So let’s run it through the profiler and see what happens! Visual Studio Integration First, let’s look at how the ANTS profilers integrate with Visual Studio’s menu system. Once you install the ANTS profilers, you will get an ANTS menu item with several options: Notice that you can either Profile Performance or Launch ANTS Performance Profiler. These sound similar but achieve two slightly different actions: Profile Performance: this immediately launches the profiler with all defaults selected to profile the active project in Visual Studio. Launch ANTS Performance Profiler: this launches the profiler much the same way as starting it from the Start Menu. The profiler will pre-populate the application and path information, but allow you to change the settings before beginning the profile run. So really, the main difference is that Profile Performance immediately begins profiling with the default selections, where Launch ANTS Performance Profiler allows you to change the defaults and attach to an already-running application. Let’s Fire it Up! So when you fire up ANTS either via Start Menu or Launch ANTS Performance Profiler menu in Visual Studio, you are presented with a very simple dialog to get you started: Notice you can choose from many different options for application type. You can profile executables, services, web applications, or just attach to a running process. In fact, in version 7.4 we see two new options added: ASP.NET Web Application (IIS Express) SharePoint web application (IIS) So this gives us an additional way to profile ASP.NET applications and the ability to profile SharePoint applications as well. You can also choose your level of detail in the Profiling Mode drop down. If you choose Line-Level and method-level timings detail, you will get a lot more detail on the method durations, but this will also slow down profiling somewhat. If you really need the profiler to be as unintrusive as possible, you can change it to Sample method-level timings. This is performing very light profiling, where basically the profiler collects timings of a method by examining the call-stack at given intervals. Which method you choose depends a lot on how much detail you need to find the issue and how sensitive your program issues are to timing. So for our example, let’s just go with the line and method timing detail. So, we check that all the options are correct (if you launch from VS2010, the executable and path are filled in already), and fire it up by clicking the [Start Profiling] button. Profiling the Application Once you start profiling the application, you will see a real-time graph of CPU usage that will indicate how much your application is using the CPU(s) on your system. During this time, you can select segments of the graph and bookmark them, giving them mnemonic names. This can be useful if you want to compare performance in one part of the run to another part of the run. Notice that once you select a block, it will give you the call tree breakdown for that selection only, and the relative performance of those calls. Once you feel you have collected enough information, you can click [Stop Profiling] to stop the application run and information collection and begin a more thorough analysis. Analyzing Method Timings So now that we’ve halted the run, we can look around the GUI and see what we can see. By default, the times are shown in terms of percentage of time of the total run of the application, though you can change it in the View menu item to milliseconds, ticks, or seconds as well. This won’t affect the percentages of methods, it only affects what units the times are shown. Notice also that the major hotspot seems to be in a method without source, ANTS Profiler will filter these out by default, but you can right-click on the line and remove the filter to see more detail. This proves especially handy when a bottleneck is due to a method in the BCL. So now that we’ve removed the filter, we see a bit more detail: In addition, ANTS Performance Profiler gives you the ability to decompile the methods without source so that you can dive even deeper, though typically this isn’t necessary for our purposes. When looking at timings, there are generally two types of timings for each method call: Time: This is the time spent ONLY in this method, not including calls this method makes to other methods. Time With Children: This is the total of time spent in both this method AND including calls this method makes to other methods. In other words, the Time tells you how much work is being done exclusively in this method, and the Time With Children tells you how much work is being done inclusively in this method and everything it calls. You can also choose to display the methods in a tree or in a grid. The tree view is the default and it shows the method calls arranged in terms of the tree representing all method calls and the parent method that called them, etc. This is useful for when you find a hot-spot method, you can see who is calling it to determine if the problem is the method itself, or if it is being called too many times. The grid method represents each method only once with its totals and is useful for quickly seeing what method is the trouble spot. In addition, you can choose to display Methods with source which are generally the methods you wrote (as opposed to native or BCL code), or Any Method which shows not only your methods, but also native calls, JIT overhead, synchronization waits, etc. So these are just two ways of viewing the same data, and you’re free to choose the organization that best suits what information you are after. Analyzing Method Source If we look at the timings above, we see that our AddToList() method (and in particular, it’s call to the List<T>.Sort() method in the BCL) is the hot-spot in this analysis. If ANTS sees a method that is consuming the most time, it will flag it as a hot-spot to help call out potential areas of concern. This doesn’t mean the other statistics aren’t meaningful, but that the hot-spot is most likely going to be your biggest bang-for-the-buck to concentrate on. So let’s select the AddToList() method, and see what it shows in the source window below: Notice the source breakout in the bottom pane when you select a method (from either tree or grid view). This shows you the timings in this method per line of code. This gives you a major indicator of where the trouble-spot in this method is. So in this case, we see that performing a Sort() on the List<T> after every Add() is killing our performance! Of course, this was a very contrived, duh moment, but you’d be surprised how many performance issues become duh moments. Note that this one line is taking up 86% of the execution time of this application! If we eliminate this bottleneck, we should see drastic improvement in the performance. So to fix this, if we still wanted to maintain the List<T> we’d have many options, including: delay Sort() until after all Add() methods, using a SortedSet, SortedList, or SortedDictionary depending on which is most appropriate, or forgoing the sorting all together and using a Dictionary. Rinse, Repeat! So let’s just change all instances of List<string> to SortedSet<string> and run this again through the profiler: Now we see the AddToList() method is no longer our hot-spot, but now the Max() and Min() calls are! This is good because we’ve eliminated one hot-spot and now we can try to correct this one as well. As before, we can then optimize this part of the code (possibly by taking advantage of the fact the list is now sorted and returning the first and last elements). We can then rinse and repeat this process until we have eliminated as many bottlenecks as possible. Calls by Web Request Another feature that was added recently is the ability to view .NET methods grouped by the HTTP requests that caused them to run. This can be helpful in determining which pages, web services, etc. are causing hot spots in your web applications. Summary If you like the other ANTS tools, you’ll like the ANTS Performance Profiler as well. It is extremely easy to use with very little product knowledge required to get up and running. There are profilers built into the higher product lines of Visual Studio, of course, which are also powerful and easy to use. But for quickly jumping in and finding hot spots rapidly, Red Gate’s Performance Profiler 7.4 is an excellent choice. Technorati Tags: Influencers,ANTS,Performance Profiler,Profiler

    Read the article

  • How to measure disk performance?

    - by Jakub Šturc
    I am going to "fix" a friend's computer this weekend. By the symptoms he describes it looks like he has a disk performance problem with his 5400 rpm disk. I want to be sure that disk is the problem so I want to "scientificaly" measure the performance. Which tools do you recommend me for this job? Is there any standard set of numbers I can compare the result of measurement with?

    Read the article

  • Question about network topology and routing performance

    - by algorithms
    Hello I am currently working on a uni project about routing protocols and network performance, one of the criteria i was going to test under was to see what effect lan topology has, ie workstations arranged in mesh, star, ring etc, but i am having doubts as to whether that would have any affect on the routing performance thus would be useless to do, rather i'm thinking it would be better to test under the topology of the routers themselves, ie routers arranged in either star, mesh ring etc. I would appreciate some feedback on this as I am rather confused. Thank You

    Read the article

  • VMWare Server - Writing files to virtual hard drive performance

    - by Ardman
    We have just moved our infrastructure from physical servers to virtual machines. Everything is running great and we are happy with the result of the move. We have identified one problem, and that is reading/writing performance. We have an application that compiles files and writes to disk. This is considerably slower on the new virtual machines compared to the physical machines. Is there a performance bottleneck when writing to a virtual hard drive compared to a physical hard drive?

    Read the article

  • After writing SQL statements in MySQL, how to measure the speed / performance of them?

    - by Jian Lin
    I saw something from an "execution plan" article: 10 rows fetched in 0.0003s (0.7344s) How come there are 2 durations shown? What if I don't have large data set yet. For example, if I have only 20, 50, or even just 100 records, I can't really measure how faster 2 different SQL statements compare in term of speed in real life situation? In other words, there needs to be at least hundreds of thousands of records, or even a million records to accurately compares the performance of 2 different SQL statements?

    Read the article

  • VMWare - Writing files to virtual hard drive performance

    - by Ardman
    We have just moved our infrastructure from physical servers to virtual machines. Everything is running great and we are happy with the result of the move. We have identified one problem, and that is reading/writing performance. We have an application that compiles files and writes to disk. This is considerably slower on the new virtual machines compared to the physical machines. Is there a performance bottleneck when writing to a virtual hard drive compared to a physical hard drive?

    Read the article

  • Linux RAID-0 performance doesn't scale up over 1 GB/s

    - by wazoox
    I have trouble getting the max throughput out of my setup. The hardware is as follow : dual Quad-Core AMD Opteron(tm) Processor 2376 16 GB DDR2 ECC RAM dual Adaptec 52245 RAID controllers 48 1 TB SATA drives set up as 2 RAID-6 arrays (256KB stripe) + spares. Software : Plain vanilla 2.6.32.25 kernel, compiled for AMD-64, optimized for NUMA; Debian Lenny userland. benchmarks run : disktest, bonnie++, dd, etc. All give the same results. No discrepancy here. io scheduler used : noop. Yeah, no trick here. Up until now I basically assumed that striping (RAID 0) several physical devices should augment performance roughly linearly. However this is not the case here : each RAID array achieves about 780 MB/s write, sustained, and 1 GB/s read, sustained. writing to both RAID arrays simultaneously with two different processes gives 750 + 750 MB/s, and reading from both gives 1 + 1 GB/s. however when I stripe both arrays together, using either mdadm or lvm, the performance is about 850 MB/s writing and 1.4 GB/s reading. at least 30% less than expected! running two parallel writer or reader processes against the striped arrays doesn't enhance the figures, in fact it degrades performance even further. So what's happening here? Basically I ruled out bus or memory contention, because when I run dd on both drives simultaneously, aggregate write speed actually reach 1.5 GB/s and reading speed tops 2 GB/s. So it's not the PCIe bus. I suppose it's not the RAM. It's not the filesystem, because I get exactly the same numbers benchmarking against the raw device or using XFS. And I also get exactly the same performance using either LVM striping and md striping. What's wrong? What's preventing a process from going up to the max possible throughput? Is Linux striping defective? What other tests could I run?

    Read the article

  • Optimize windows 2008 performance

    - by Giorgi
    Hello, I have windows server 2008 sp2 installed as virtual machine on my personal laptop. I use it only for source control (visual svn) and continuous integration (teamcity). As the virtual machine resources are limited I'd like to optimize it's performance by disabling services and features that are not necessary for my purposes. Can anyone recommend where to start or provide with tips for getting better performance. Thanks.

    Read the article

  • Randomly poor 2D performance in Linux Mint 11 when using nvidia driver

    - by SDD
    I am using: - Linux Mint 11 - Geforce 560ti - nVidia driver (installed via helper programm, not from nvidia page) The third party nvidia drivers radomly cause very poor 2D performance. Radomly because the performance can be very great, but after the next reboot or login become very poor. After another reboot or login, this might change again to better or worse. I have no idea why and how and I need your help. Thank you.

    Read the article

  • How to measure disk-performance under Windows?

    - by Alphager
    I'm trying to find out why my application is very slow on a certain machine (runs fine everywhere else). I think i have traced the performance-problems to hard-disk reads and writes and i think it's simply the very slow disk. What tool could i use to measure hd read and write performance under Windows 2003 in a non-destructive way (the partitions on the drives have to remain intact)?

    Read the article

  • Encouraging software engineers to track time

    - by M. Dudley
    How can I encourage my coworkers to track the time they spend resolving issues and implementing features? We have software to do this, but they just don't enter the numbers. I want the team to get better at providing project estimates by comparing our past estimates to actual time spent. I suspect that my coworkers don't see the personal benefit, since they're not often involved in project scheduling.

    Read the article

  • best way to "introduce" OOP/OOD to team of experienced C++ engineers

    - by DXM
    I am looking for an efficient way, that also doesn't come off as an insult, to introduce OOP concepts to existing team members? My teammates are not new to OO languages. We've been doing C++/C# for a long time so technology itself is familiar. However, I look around and without major infusion of effort (mostly in the form of code reviews), it seems what we are producing is C code that happens to be inside classes. There's almost no use of single responsibility principle, abstractions or attempts to minimize coupling, just to name a few. I've seen classes that don't have a constructor but get memset to 0 every time they are instantiated. But every time I bring up OOP, everyone always nods and makes it seem like they know exactly what I'm talking about. Knowing the concepts is good, but we (some more than others) seem to have very hard time applying them when it comes to delivering actual work. Code reviews have been very helpful but the problem with code reviews is that they only occur after the fact so to some it seems we end up rewriting (it's mostly refactoring, but still takes lots of time) code that was just written. Also code reviews only give feedback to an individual engineer, not the entire team. I am toying with the idea of doing a presentation (or a series) and try to bring up OOP again along with some examples of existing code that could've been written better and could be refactored. I could use some really old projects that no one owns anymore so at least that part shouldn't be a sensitive issue. However, will this work? As I said most people have done C++ for a long time so my guess is that a) they'll sit there thinking why I'm telling them stuff they already know or b) they might actually take it as an insult because I'm telling them they don't know how to do the job they've been doing for years if not decades. Is there another approach which would reach broader audience than a code review would, but at the same time wouldn't feel like a punishment lecture? I'm not a fresh kid out of college who has utopian ideals of perfectly designed code and I don't expect that from anyone. The reason I'm writing this is because I just did a review of a person who actually had decent high-level design on paper. However if you picture classes: A - B - C - D, in the code B, C and D all implement almost the same public interface and B/C have one liner functions so that top-most class A is doing absolutely all the work (down to memory management, string parsing, setup negotiations...) primarily in 4 mongo methods and, for all intents and purposes, calls almost directly into D. Update: I'm a tech lead(6 months in this role) and do have full support of the group manager. We are working on a very mature product and maintenance costs are definitely letting themselves be known.

    Read the article

  • Training Courses/Material for Fresh Software Engineers

    - by H_Miri
    I am familiar with the C++ programming language, and I have been using it for some time now. Recently, I have been applying for C++ Software Development/Engineering jobs, but I feel like there is still so much I need to learn. When first-class companies, such as Google, hire a software programmer, they obviously put them through some initial training. How/Where can I find a similar source of training material/source that would prepare me for commercial/industrial programming in C++? I know there is millions of good tutorials on line, but I would rather work through something different. I actually don't know how an organization training courses in software development would be different to text books. Has anyone been in the same boat - feeling under-confident at the employment stage and then realizing they shouldn't have been?

    Read the article

  • More CPU cores may not always lead to better performance – MAXDOP and query memory distribution in spotlight

    - by sqlworkshops
    More hardware normally delivers better performance, but there are exceptions where it can hinder performance. Understanding these exceptions and working around it is a major part of SQL Server performance tuning.   When a memory allocating query executes in parallel, SQL Server distributes memory to each task that is executing part of the query in parallel. In our example the sort operator that executes in parallel divides the memory across all tasks assuming even distribution of rows. Common memory allocating queries are that perform Sort and do Hash Match operations like Hash Join or Hash Aggregation or Hash Union.   In reality, how often are column values evenly distributed, think about an example; are employees working for your company distributed evenly across all the Zip codes or mainly concentrated in the headquarters? What happens when you sort result set based on Zip codes? Do all products in the catalog sell equally or are few products hot selling items?   One of my customers tested the below example on a 24 core server with various MAXDOP settings and here are the results:MAXDOP 1: CPU time = 1185 ms, elapsed time = 1188 msMAXDOP 4: CPU time = 1981 ms, elapsed time = 1568 msMAXDOP 8: CPU time = 1918 ms, elapsed time = 1619 msMAXDOP 12: CPU time = 2367 ms, elapsed time = 2258 msMAXDOP 16: CPU time = 2540 ms, elapsed time = 2579 msMAXDOP 20: CPU time = 2470 ms, elapsed time = 2534 msMAXDOP 0: CPU time = 2809 ms, elapsed time = 2721 ms - all 24 cores.In the above test, when the data was evenly distributed, the elapsed time of parallel query was always lower than serial query.   Why does the query get slower and slower with more CPU cores / higher MAXDOP? Maybe you can answer this question after reading the article; let me know: [email protected].   Well you get the point, let’s see an example.   The best way to learn is to practice. To create the below tables and reproduce the behavior, join the mailing list by using this link: www.sqlworkshops.com/ml and I will send you the table creation script.   Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001. update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1 update Employees set Zip = 2001 where EmployeeID % 50 != 1 go update statistics Employees with fullscan go   Let’s create the temporary table #FireDrill with all possible Zip codes. drop table #FireDrill go create table #FireDrill (Zip int primary key) insert into #FireDrill select distinct Zip from Employees update statistics #FireDrill with fullscan go  Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --First serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 1) goThe query took 1011 ms to complete.   The execution plan shows the 77816 KB of memory was granted while the estimated rows were 799624.  No Sort Warnings in SQL Server Profiler.  Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 0) go The query took 1912 ms to complete.  The execution plan shows the 79360 KB of memory was granted while the estimated rows were 799624.  The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead. Sort properties shows the rows are unevenly distributed over the 4 threads.   Sort Warnings in SQL Server Profiler.   Intermediate Summary: The reason for the higher duration with parallel plan was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001. Now let’s update the Employees table and distribute employees evenly across all Zip codes.   update Employees set Zip = EmployeeID / 400 + 1 go update statistics Employees with fullscan go  Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 1) go   The query took 751 ms to complete.  The execution plan shows the 77816 KB of memory was granted while the estimated rows were 784707.  No Sort Warnings in SQL Server Profiler.   Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 0) go The query took 661 ms to complete.  The execution plan shows the 79360 KB of memory was granted while the estimated rows were 784707.  Sort properties shows the rows are evenly distributed over the 4 threads. No Sort Warnings in SQL Server Profiler.    Intermediate Summary: When employees were distributed unevenly, concentrated on 1 Zip code, parallel sort spilled while serial sort performed well without spilling to tempdb. When the employees were distributed evenly across all Zip codes, parallel sort and serial sort did not spill to tempdb. This shows uneven data distribution may affect the performance of some parallel queries negatively. For detailed discussion of memory allocation, refer to webcasts available at www.sqlworkshops.com/webcasts.     Some of you might conclude from the above execution times that parallel query is not faster even when there is no spill. Below you can see when we are joining limited amount of Zip codes, parallel query will be fasted since it can use Bitmap Filtering.   Let’s update the Employees table with 49 out of 50 employees located in Zip code 2001. update Employees set Zip = EmployeeID / 400 + 1 where EmployeeID % 50 = 1 update Employees set Zip = 2001 where EmployeeID % 50 != 1 go update statistics Employees with fullscan go  Let’s create the temporary table #FireDrill with limited Zip codes. drop table #FireDrill go create table #FireDrill (Zip int primary key) insert into #FireDrill select distinct Zip       from Employees where Zip between 1800 and 2001 update statistics #FireDrill with fullscan go  Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 1) go The query took 989 ms to complete.  The execution plan shows the 77816 KB of memory was granted while the estimated rows were 785594. No Sort Warnings in SQL Server Profiler.  Now let’s execute the query in parallel with MAXDOP 0. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 0) go The query took 1799 ms to complete.  The execution plan shows the 79360 KB of memory was granted while the estimated rows were 785594.  Sort Warnings in SQL Server Profiler.    The estimated number of rows between serial and parallel plan are the same. The parallel plan has slightly more memory granted due to additional overhead.  Intermediate Summary: The reason for the higher duration with parallel plan even with limited amount of Zip codes was sort spill. This is due to uneven distribution of employees over Zip codes, especially concentration of 49 out of 50 employees in Zip code 2001.   Now let’s update the Employees table and distribute employees evenly across all Zip codes. update Employees set Zip = EmployeeID / 400 + 1 go update statistics Employees with fullscan go Let’s execute the query serially with MAXDOP 1. --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --Serially with MAXDOP 1 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 1) go The query took 250  ms to complete.  The execution plan shows the 9016 KB of memory was granted while the estimated rows were 79973.8.  No Sort Warnings in SQL Server Profiler.  Now let’s execute the query in parallel with MAXDOP 0.  --Example provided by www.sqlworkshops.com --Execute query with uneven Zip code distribution --In parallel with MAXDOP 0 set statistics time on go declare @EmployeeID int, @EmployeeName varchar(48),@zip int select @EmployeeName = e.EmployeeName, @zip = e.Zip from Employees e       inner join #FireDrill fd on (e.Zip = fd.Zip)       order by e.Zip option (maxdop 0) go The query took 85 ms to complete.  The execution plan shows the 13152 KB of memory was granted while the estimated rows were 784707.  No Sort Warnings in SQL Server Profiler.    Here you see, parallel query is much faster than serial query since SQL Server is using Bitmap Filtering to eliminate rows before the hash join.   Parallel queries are very good for performance, but in some cases it can hinder performance. If one identifies the reason for these hindrances, then it is possible to get the best out of parallelism. I covered many aspects of monitoring and tuning parallel queries in webcasts (www.sqlworkshops.com/webcasts) and articles (www.sqlworkshops.com/articles). I suggest you to watch the webcasts and read the articles to better understand how to identify and tune parallel query performance issues.   Summary: One has to avoid sort spill over tempdb and the chances of spills are higher when a query executes in parallel with uneven data distribution. Parallel query brings its own advantage, reduced elapsed time and reduced work with Bitmap Filtering. So it is important to understand how to avoid spills over tempdb and when to execute a query in parallel.   I explain these concepts with detailed examples in my webcasts (www.sqlworkshops.com/webcasts), I recommend you to watch them. The best way to learn is to practice. To create the above tables and reproduce the behavior, join the mailing list at www.sqlworkshops.com/ml and I will send you the relevant SQL Scripts.   Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.   Disclaimer and copyright information:This article refers to organizations and products that may be the trademarks or registered trademarks of their various owners. Copyright of this article belongs to R Meyyappan / www.sqlworkshops.com. You may freely use the ideas and concepts discussed in this article with acknowledgement (www.sqlworkshops.com), but you may not claim any of it as your own work. This article is for informational purposes only; you use any of the suggestions given here entirely at your own risk.   Register for the upcoming 3 Day Level 400 Microsoft SQL Server 2008 and SQL Server 2005 Performance Monitoring & Tuning Hands-on Workshop in London, United Kingdom during March 15-17, 2011, click here to register / Microsoft UK TechNet.These are hands-on workshops with a maximum of 12 participants and not lectures. For consulting engagements click here.   R Meyyappan [email protected] LinkedIn: http://at.linkedin.com/in/rmeyyappan  

    Read the article

  • Benchmarking a file server

    - by Joel Coel
    I'm working on building a new file server... a simple Windows Server box with a few terabytes of disk space to share on the LAN. Pain for current hard drive prices aside :( -- I would like to get some benchmarks for this device under load compared to our old server. The old server was installed in 2005 and had 5 136GB 10K disks in RAID 5. The new server has 8 1TB disks in two RAID 10 volumes (plus a hot spare for each volume), but they're only 7.2K rpm, and of course with a much larger cache size. I'd like to get an idea of the performance expectations of the new server relative to the old. Where do I get started? I'd like to know both raw potential under different kinds of load for each server, as well an idea of what our real-world load looks like and how it will translate. Will disk load even matter, or will performance be more driven by the network connection? I could probably fumble through some disk i/o and wait counters in performance monitor, but I don't really know what to look for, which counters to watch, or for how long and when. FWIW, I'm expecting a nice improvement because of the benefits of having two different volumes and the better RAID 10 performance vs RAID 5, in spite of using slower disks... but I'd like to get an idea of how much.

    Read the article

  • How can dev teams prevent slow performance in consumer apps?

    - by Crashworks
    When I previously asked what's responsible for slow software, a few answers I've received suggested it was a social and management problem: This isn't a technical problem, it's a marketing and management problem.... Utimately, the product mangers are responsible to write the specs for what the user is supposed to get. Lots of things can go wrong: The product manager fails to put button response in the spec ... The QA folks do a mediocre job of testing against the spec ... if the product management and QA staff are all asleep at the wheel, we programmers can't make up for that. —Bob Murphy People work on good-size apps. As they work, performance problems creep in, just like bugs. The difference is - bugs are "bad" - they cry out "find me, and fix me". Performance problems just sit there and get worse. Programmers often think "Well, my code wouldn't have a performance problem. Rather, management needs to buy me a newer/bigger/faster machine." The fact is, if developers periodically just hunt for performance problems (which is actually very easy) they could simply clean them out. —Mike Dunlavey So, if this is a social problem, what social mechanisms can an organization put into place to avoid shipping slow software to its customers?

    Read the article

  • Openfiler iSCSI performance

    - by Justin
    Hoping someone can point me in the right direction with some iSCSI performance issues I'm having. I'm running Openfiler 2.99 on an older ProLiant DL360 G5. Dual Xeon processor, 6GB ECC RAM, Intel Gigabit Server NIC, SAS controller with and 3 10K SAS drives in a RAID 5. When I run a simple write test from the box directly the performance is very good: [root@localhost ~]# dd if=/dev/zero of=tmpfile bs=1M count=1000 1000+0 records in 1000+0 records out 1048576000 bytes (1.0 GB) copied, 4.64468 s, 226 MB/s So I created a LUN, attached it to another box I have running ESXi 5.1 (Core i7 2600k, 16GB RAM, Intel Gigabit Server NIC) and created a new datastore. Once I created the datastore I was able to create and start a VM running CentOS with 2GB of RAM and 16GB of disk space. The OS installed fine and I'm able to use it but when I ran the same test inside the VM I get dramatically different results: [root@localhost ~]# dd if=/dev/zero of=tmpfile bs=1M count=1000 1000+0 records in 1000+0 records out 1048576000 bytes (1.0 GB) copied, 26.8786 s, 39.0 MB/s [root@localhost ~]# Both servers have brand new Intel Server NIC's and I have Jumbo Frames enabled on the switch, the openfiler box as well as the VMKernel adapter on the ESXi box. I can confirm this is set up properly by using the vmkping command from the ESXi host: ~ # vmkping 10.0.0.1 -s 9000 PING 10.0.0.1 (10.0.0.1): 9000 data bytes 9008 bytes from 10.0.0.1: icmp_seq=0 ttl=64 time=0.533 ms 9008 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=0.736 ms 9008 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=0.570 ms The only thing I haven't tried as far as networking goes is bonding two interfaces together. I'm open to trying that down the road but for now I am trying to keep things simple. I know this is a pretty modest setup and I'm not expecting top notch performance but I would like to see 90-100MB/s. Any ideas?

    Read the article

  • Terminal server performance over high latency links

    - by holz
    Our datacenter and head office is currently in Brisbane, Australia, and we have a branch office in the UK. We have a private WAN with a 768k link to our UK office and the latency is at about 350ms. The terminal server performance is reeeeealy bad. Applications that don't have too much animation or any images seem to be okay. But as soon as they do, the session is almost unusable. Powerpoint and internet explorer are good examples of apps that make it run slow. And if there is an image in your email signature, outlook will hang for about 10 seconds each time a new line is inserted, while the image gets moved down a few pixels. We are currently running server 2003. I have tried Server 2008 R2 RDS, and also a third party solution called Blaze by a company called Ericom, but it is still not too much better. We currently have a 5 levels dynamic class of service with the priority in the following order. VoIP Video Terminal Services Printing Everything else When testing the terminal server performance, the link monitored using net-flows, and have plenty we of bandwidth available, so I believe that it is a latency issue rather than bandwidth. Is there anything that can be done to improve performance. Would citrix help at all?

    Read the article

  • mysql medium int vs. int performance?

    - by aviv
    Hi, I have a simple users table, i guess the maximum users i am going to have is 300,000. Currently i am using: CREATE TABLE users ( id INT UNSIGEND AUTOINCEREMENT PRIMARY KEY, .... Of course i have many other tables that the users(id) is a FOREIGN KEY in them. I read that since the id is not going to use the full maximum of INT it is better to use: MEDIUMINT and it will give better performance. Is it true? (I am using mysql on Windows Server 2008) Thanks.

    Read the article

  • Performance monitoring on Linux/Unix

    - by ervingsb
    I run a few Windows servers and (Debian and Ubuntu) Linux and AIX servers. I would like to continously monitor performance on these systems in order to easily identify bottlenecks as well as to have an overview of the general activity on the servers. On Windows, I use Windows Performance Monitor (perfmon) for this. I set up these counters: For bottlenecks: Processor utilization : System\Processor Queue Length Memory utilization : Memory\Pages Input/Sec Disk Utilization : PhysicalDisk\Current Disk Queue Length\driveletter Network problems: Network Interface\Output Queue Length\nic name For general activity: Processor utilization : Processor\% Processor Time_Total Memory utilization : Process\Working Set_Total (or per specific process) Memory utilization : Memory\Available MBytes Disk Utilization : PhysicalDisk\Bytes/sec_Total (or per process) Network Utilization : Network Interface\Bytes Total/Sec\nic name (More information on the choice of these counters on: http://itcookbook.net/blog/windows-perfmon-top-ten-counters ) This works really well. It allows me to look in one place and identify most common bottlenecks. So my question is, how can I do something equivalent (or just very similar) on Linux servers? I have looked a bit on nmon (http://www.ibm.com/developerworks/aix/library/au-analyze_aix/) which is a free performance monitoring tool developed for AIX but also availble for Linux. However, I am not sure if nmon allows me to set up the above counters. Maybe it is because Linux and AIX does not allow monitoring these exact same measures. Is so, which ones should I choose and why? If nmon is not the tool to use for this, then what do you recommend?

    Read the article

  • SQL Server: One 12-drive RAID-10 array or 2 arrays of 8-drives and 4-drives

    - by ben
    Setting up a box for SQL Server 2008, which would give the best performance (heavy OLTP)? The more drives in a RAID-10 array the better performance, but will losing 4 drives to dedicate them to the transaction logs give us more performance. 12-drives in RAID-10 plus one hot spare. OR 8-drives in RAID-10 for database and 4-drives RAID-10 for transaction logs plus 2 hot spares (one for each array). We have 14-drive slots to work with and it's an older PowerVault that doesn't support global hot spares.

    Read the article

  • Read only array, deep copy or retrieve copies one by one? (Performance and Memory)

    - by Arthur Wulf White
    In a garbage collection based system, what is the most effective way to handle a read only array if such a structure does not exist natively in the language. Is it better to return a copy of an array or allow other classes to retrieve copies of the objects stored in the array one by one? @JustinSkiles: It is not very broad. It is performance related and can actually be answered specifically for two general cases. You only need very few items: in this situation it's more effective to retrieve copies of the objects one by one. You wish to iterate over 30% or more objects. In this cases it is superior to retrieve all the array at once. This saves on functions calls. Function calls are very expansive when compared to reading directly from an array. A good specific answer could include performance, reading from an array and reading indirectly through a function. It is a simple performance related question.

    Read the article

  • Performance problems when running Java desktop applications on Citrix Metaframe

    - by demetriusnunes
    We have a desktop Java application running within a Citrix Metaframe server farm and the performance, specially while starting up the app, is very unreliable. Sometimes it takes 15 seconds and sometimes it takes over a minute. It's really unpredicatable. Is there any way to optimize running Java desktop applications within Citrix Metaframe Terminal server sessions to a more reliable performance level? Are there any optimization directed specifically toward Java, such as pre-load JVMs or something like that? Any help would be greatly appreciated.

    Read the article

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >