Search Results

Search found 13300 results on 532 pages for 'exalytics performance tuning'.

Page 47/532 | < Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >

  • Can a call to WaitHandle.SignalAndWait be ignored for performance profiling purposes?

    - by Dan Tao
    I just downloaded the trial version of ANTS Performance Profiler from Red Gate and am investigating some of my team's code. Immediately I notice that there's a particular section of code that ANTS is reporting as eating up to 99% CPU time. I am completely unfamiliar with ANTS or performance profiling in general (that is, aside from self-profiling using what I'm sure are extremely crude and frowned-upon methods such as double timeToComplete = (endTime - startTime).TotalSeconds), so I'm still fiddling around with the application and figuring out how it's used. But I did call the developer responsible for the code in question and his immediate reaction was "Yeah, that doesn't surprise me that it says that; but that code calls SignalAndWait [which I could see for myself, thanks to ANTS], which doesn't use any CPU, it just sits there waiting for something to do." He advised me to simply ignore that code and look for anything ELSE I could find. My question: is it true that SignalAndWait requires NO CPU overhead (and if so, how is this possible?), and is it reasonable that a performance profiler would view it as taking up 99% CPU time? I find this particularly curious because, if it's at 99%, that would suggest that our application is often idle, wouldn't it? And yet its performance has become rather sluggish lately. Like I said, I really am just a beginner when it comes to this tool, and I don't know anything about the WaitHandle class. So ANY information to help me to understand what's going on here would be appreciated.

    Read the article

  • What is the performance impact of CSS's universal selector?

    - by Bungle
    I'm trying to find some simple client-side performance tweaks in a page that receives millions of monthly pageviews. One concern that I have is the use of the CSS universal selector (*). As an example, consider a very simple HTML document like the following: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/> <title>Example</title> <style type="text/css"> * { margin: 0; padding: 0; } </head> <body> <h1>This is a heading</h1> <p>This is a paragraph of text.</p> </body> </html> The universal selector will apply the above declaration to the body, h1 and p elements, since those are the only ones in the document. In general, would I see better performance from a rule such as: body, h1, p { margin: 0; padding: 0; } Or would this have exactly the same net effect? Essentially, what I'm asking is if these rules are effectively equivalent in this case, or if the universal selector has to perform more unnecessary work that I may not be aware of. I realize that the performance impact in this example may be very small, but I'm hoping to learn something that may lead to more significant performance improvements in real-world situations. Thanks for any help!

    Read the article

  • Is there a IDE/compiler PC benchmark I can use to compare my PCs performance?

    - by RickL
    I'm looking for a benchmark (and results on other PCs) which would give me an idea of the development performance gain I could get by upgrading my PC, also the benchmark could be used to justify the upgrade to my boss. I use Visual Studio 2008 for my development, so I'd like to get an idea of by what factor the build times would be improved, and also it would be good if the benchmark could incorporate IDE performance (i.e. when editing, using intellisense, opening code files etc) into its result. I currently have an AMD 3800x2, with 2GB RAM on Vista 32. For example, I'd like to know what kind of performance gain I'd see in Visual Studio 2008 with a Q6600, 4GB RAM on Vista 64. And also with other processors, and other RAM sizes... also see whether hard disk performance is a big factor. EDIT: I mentioned Vista 64 because I'm aware that Vista 32 can only use 3GB RAM maximum. So I'd presume that wanting to use more RAM would require Vista 64, but perhaps it could still be slower overall there is a large overhead in using the 32 bit VS 2008 on 64 bit OS.

    Read the article

  • Performance of stored proc when updating columns selectively based on parameters?

    - by kprobst
    I'm trying to figure out if this is relatively well-performing T-SQL (this is SQL Server 2008). I need to create a stored procedure that updates a table. The proc accepts as many parameters as there are columns in the table, and with the exception of the PK column, they all default to NULL. The body of the procedure looks like this: CREATE PROCEDURE proc_repo_update @object_id bigint ,@object_name varchar(50) = NULL ,@object_type char(2) = NULL ,@object_weight int = NULL ,@owner_id int = NULL -- ...etc AS BEGIN update object_repo set object_name = ISNULL(@object_name, object_name) ,object_type = ISNULL(@object_type, object_type) ,object_weight = ISNULL(@object_weight, object_weight) ,owner_id = ISNULL(@owner_id, owner_id) -- ...etc where object_id = @object_id return @@ROWCOUNT END So basically: Update a column only if its corresponding parameter was provided, and leave the rest alone. This works well enough, but as the ISNULL call will return the value of the column if the received parameter was null, will SQL Server optimize this somehow? This might be a performance bottleneck on the application where the table might be updated heavily (insertion will be uncommon so the performance there is not a problem). So I'm trying to figure out what's the best way to do this. Is there a way to condition the column expressions with something like CASE WHEN or something? The table will be indexed up the wazoo as well for read performance. Is this the best approach? My alternative at this point is to create the UPDATE expression in code (e.g. inline SQL) and execute it against the server. This would solve my doubts about performance, but I'd rather leave this in a stored proc if possible.

    Read the article

  • Recent improvements in Console Performance

    - by loren.konkus
    Recently, the WebLogic Server development and support organizations have worked with a number of customers to quantify and improve the performance of the Administration Console in large, distributed configurations where there is significant latency in the communications between the administration server and managed servers. These improvements fall into two categories: Constraining the amount of time that the Console stalls waiting for communication Reducing and streamlining the amount of data required for an update A few releases ago, we added support for a configurable domain-wide mbean "Invocation Timeout" value on the Console's configuration: general, advanced section for a domain. The default value for this setting is 0, which means wait indefinitely and was chosen for compatibility with the behavior of previous releases. This configuration setting applies to all mbean communications between the admin server and managed servers, and is the first line of defense against being blocked by a stalled or completely overloaded managed server. Each site should choose an appropriate timeout value for their environment and network latency. In the next release of WebLogic Server, we've added an additional console preference, "Management Operation Timeout", to the Console's shared preference page. This setting further constrains how long certain console pages will wait for slowly responding servers before returning partial results. While not all Console pages support this yet, key pages such as the Servers Configuration and Control table pages and the Deployments Control pages have been updated to support this. For example, if a user requests a Servers Table page and a Management Operation Timeout occurs, the table is displayed with both local configuration and remote runtime information from the responding managed servers and only local configuration information for servers that did not yet respond. This means that a troublesome managed server does not impede your ability to manage your domain using the Console. To support these changes, these Console pages have been re-written to use the Work Management feature of WebLogic Server to interact with each server or deployment concurrently, which further improves the responsiveness of these pages. The basic algorithm for these pages is: For each configuration mbean (ie, Servers) populate rows with configuration attributes from the fast, local mbean server Find a WorkManager For each server, Create a Work instance to obtain runtime mbean attributes for the server Schedule Work instance in the WorkManager Call WorkManager.waitForAll to wait WorkItems to finish, constrained by Management Operation Timeout For each WorkItem, if the runtime information obtained was not complete, add a message indicating which server has incomplete data Display collected data in table In addition to these changes to constrain how long the console waits for communication, a number of other changes have been made to reduce the amount and scope of managed server interactions for key pages. For example, in previous releases the Deployments Control table looked at the status of a deployment on every managed server, even those servers that the deployment was not currently targeted on. (This was done to handle an edge case where a deployment's target configuration was changed while it remained running on previously targeted servers.) We decided supporting that edge case did not warrant the performance impact for all, and instead only look at the status of a deployment on the servers it is targeted to. Comprehensive status continues to be available if a user clicks on the 'status' field for a deployment. Finally, changes have been made to the System Status portlet to reduce its impact on Console page display times. Obtaining health information for this display requires several mbean interactions with managed servers. In previous releases, this mbean interaction occurred with every display, and any delay or impediment in these interactions was reflected in the display time for every page. To reduce this impact, we've made several changes in this portlet: Using Work Management to obtain health concurrently Applying the operation timeout configuration to constrain how long we will wait Caching health information to reduce the cost during rapid navigation from page to page and only obtaining new health information if the previous information is over 30 seconds old. Eliminating heath collection if this portlet is minimized. Together, these Console changes have resulted in significant performance improvements for the customers with large configurations and high latency that we have worked with during their development, and some lesser performance improvements for those with small configurations and very fast networks. These changes will be included in the 11g Rel 1 patch set 2 (10.3.3.0) release of WebLogic Server.

    Read the article

  • How do I find the cause for a huge difference in performance between two identical Ubuntu servers?

    - by the.duckman
    I am running two Dell R410 servers in the same rack of a data center. Both have the same hardware configuration, run Ubuntu 10.4, have the same packages installed and run the same Java web servers. No other load. One of them is 20-30% faster than the other, very consistently. I used dstat to figure out, if there are more context switches, IO, swapping or anything, but I see no reason for the difference. With the same workload, (no swapping, virtually no IO), the cpu usage and load is higher on one server. So the difference appears to be mainly CPU bound, but while a simple cpu benchmark using sysbench (with all other load turned off) did yield a difference, it was only 6%. So maybe it is not only CPU but also memory performance. I tried to figure out if the BIOS settings differ in some parameter, did a dump using dmidecode, but that yielded no difference. I compared /proc/cpuinfo, no difference. I compared the output of cpufreq-info, no difference. I am lost. What can I do, to figure out, what is going on?

    Read the article

  • mysql - moving to a lower performance server, how small can I go?

    - by pedalpete
    I've been running a site for a few years now which really isn't growing in traffic, and I want to save some money on hosting, but keep it going for the loyal users of the site and api. The database has one a nearly 4 million row table, and on a 4gb dual xeon 5320 server. When I check server stats on this server with ps -aux, i get returns of mysql running at about 11% capacity, so no serious load. The main query against mysql runs in about 0.45 seconds. I popped over to linode.com to see what kind of performance I could get out of one of their tiny boxes, and their 360mb ram XEN vps returns the same query in 20 seconds. Clearly not good enough. I've looked at the mysql variables, and they are both very similar (I've included the show variables output below, if anybody is interested). Is there a good way to decide on what size server is needed based on what I'm coming from? Is it RAM that is likely making the difference with the large table size? Is there a way for me to figure out how much ram would be ideal?? Here's the output of the show variables (though I'm not sure it is important). +---------------------------------+------------------------------------------------------------+ | Variable_name | Value | +---------------------------------+------------------------------------------------------------+ | auto_increment_increment | 1 | | auto_increment_offset | 1 | | automatic_sp_privileges | ON | | back_log | 50 | | basedir | /usr/ | | bdb_cache_size | 8384512 | | bdb_home | /var/lib/mysql/ | | bdb_log_buffer_size | 262144 | | bdb_logdir | | | bdb_max_lock | 10000 | | bdb_shared_data | OFF | | bdb_tmpdir | /tmp/ | | binlog_cache_size | 32768 | | bulk_insert_buffer_size | 8388608 | | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | | character_sets_dir | /usr/share/mysql/charsets/ | | collation_connection | latin1_swedish_ci | | collation_database | latin1_swedish_ci | | collation_server | latin1_swedish_ci | | completion_type | 0 | | concurrent_insert | 1 | | connect_timeout | 10 | | datadir | /var/lib/mysql/ | | date_format | %Y-%m-%d | | datetime_format | %Y-%m-%d %H:%i:%s | | default_week_format | 0 | | delay_key_write | ON | | delayed_insert_limit | 100 | | delayed_insert_timeout | 300 | | delayed_queue_size | 1000 | | div_precision_increment | 4 | | keep_files_on_create | OFF | | engine_condition_pushdown | OFF | | expire_logs_days | 0 | | flush | OFF | | flush_time | 0 | | ft_boolean_syntax | + - For some reason, that table formats properly in the preview, but apparently not when viewing the question. Hopefully it isn't needed anyway.

    Read the article

  • Windows 7 host with Ubuntu Guest and a performance hit, memory locks?

    - by Cyrylski
    I have a brand new Lenovo T510 with Core i5 and 4GB of RAM with Windows 7 on it. I Installed Ubuntu 10.10 in a Virtualbox. For some reason system gets really slow on this setup which makes me really angry. There's a video card shared with full 3D support enabled and 1GB of RAM allocated for the Ubuntu machine. It may sound stupid, but WHY is the whole memory consumed in an instant when I run Virtualbox? I struggled for like 10 minutes restraining myself from a brutal reset, and now everything runs smooth but memory "in use" in Resource Monitor is 3GB flat with only Chrome running. I'm new to Windows 7, but I'm really disappointed with performance at this point... I used to work in a different environment with much slower hardware and there was no such problem (WinXP over Ubuntu, 1GB out of 2GB allocated for WinXP guest on intel GMA). This is, until I clogged RAM totally there. But I was capable of running Chrome, Firefox and Apache server on a 1GB RAM in Ubuntu there and Photoshop CS4 on Windows XP and it worked. In this case I can't go beyond setting up Ubuntu properly. I bet I'm doing something wrong.

    Read the article

  • Save object states in .data or attr - Performance vs CSS?

    - by Neysor
    In response to my answer yesterday about rotating an Image, Jamund told me to use .data() instead of .attr() First I thought that he is right, but then I thought about a bigger context... Is it always better to use .data() instead of .attr()? I looked in some other posts like what-is-better-data-or-attr or jquery-data-vs-attrdata The answers were not satisfactory for me... So I moved on and edited the example by adding CSS. I thought it might be useful to make a different Style on each image if it rotates. My style was the following: .rp[data-rotate="0"] { border:10px solid #FF0000; } .rp[data-rotate="90"] { border:10px solid #00FF00; } .rp[data-rotate="180"] { border:10px solid #0000FF; } .rp[data-rotate="270"] { border:10px solid #00FF00; } Because design and coding are often separated, it could be a nice feature to handle this in CSS instead of adding this functionality into JavaScript. Also in my case the data-rotate is like a special state which the image currently has. So in my opinion it make sense to represent it within the DOM. I also thought this could be a case where it is much better to save with .attr() then with .data(). Never mentioned before in one of the posts I read. But then i thought about performance. Which function is faster? I built my own test following: <!DOCTYPE HTML> <html> <head> <title>test</title> <script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.7.1/jquery.min.js"></script> <script type="text/javascript"> function runfirst(dobj,dname){ console.log("runfirst "+dname); console.time(dname+"-attr"); for(i=0;i<10000;i++){ dobj.attr("data-test","a"+i); } console.timeEnd(dname+"-attr"); console.time(dname+"-data"); for(i=0;i<10000;i++){ dobj.data("data-test","a"+i); } console.timeEnd(dname+"-data"); } function runlast(dobj,dname){ console.log("runlast "+dname); console.time(dname+"-data"); for(i=0;i<10000;i++){ dobj.data("data-test","a"+i); } console.timeEnd(dname+"-data"); console.time(dname+"-attr"); for(i=0;i<10000;i++){ dobj.attr("data-test","a"+i); } console.timeEnd(dname+"-attr"); } $().ready(function() { runfirst($("#rp4"),"#rp4"); runfirst($("#rp3"),"#rp3"); runlast($("#rp2"),"#rp2"); runlast($("#rp1"),"#rp1"); }); </script> </head> <body> <div id="rp1">Testdiv 1</div> <div id="rp2" data-test="1">Testdiv 2</div> <div id="rp3">Testdiv 3</div> <div id="rp4" data-test="1">Testdiv 4</div> </body> </html> It should also show if there is a difference with a predefined data-test or not. One result was this: runfirst #rp4 #rp4-attr: 515ms #rp4-data: 268ms runfirst #rp3 #rp3-attr: 505ms #rp3-data: 264ms runlast #rp2 #rp2-data: 260ms #rp2-attr: 521ms runlast #rp1 #rp1-data: 284ms #rp1-attr: 525ms So the .attr() function did always need more time than the .data() function. This is an argument for .data() I thought. Because performance is always an argument! Then I wanted to post my results here with some questions, and in the act of writing I compared with the questions Stack Overflow showed me (similar titles) And true enough, there was one interesting post about performance I read it and run their example. And now I am confused! This test showed that .data() is slower then .attr() !?!! Why is that so? First I thought it is because of a different jQuery library so I edited it and saved the new one. But the result wasn't changing... So now my questions to you: Why are there some differences in the performance in these two examples? Would you prefer to use data- HTML5 attributes instead of data, if it represents a state? Although it wouldn't be needed at the time of coding? Why - Why not? Now depending on the performance: Would performance be an argument for you using .attr() instead of data, if it shows that .attr() is better? Although data is meant to be used for .data()? UPDATE 1: I did see that without overhead .data() is much faster. Misinterpreted the data :) But I'm more interested in my second question. :) Would you prefer to use data- HTML5 attributes instead of data, if it represents a state? Although it wouldn't be needed at the time of coding? Why - Why not? Are there some other reasons you can think of, to use .attr() and not .data()? e.g. interoperability? because .data() is jquery style and HTML Attributes can be read by all... UPDATE 2: As we see from T.J Crowder's speed test in his answer attr is much faster then data! which is again confusing me :) But please! Performance is an argument, but not the highest! So give answers to my other questions please too!

    Read the article

  • General monitoring for SQL Server Analysis Services using Performance Monitor

    - by Testas
    A recent customer engagement required a setup of a monitoring solution for SSAS, due to the time restrictions placed upon this, native Windows Performance Monitor (Perfmon) and SQL Server Profiler Monitoring Tools was used as using a third party tool would have meant the customer providing an additional monitoring server that was not available.I wanted to outline the performance monitoring counters that was used to monitor the system on which SSAS was running. Due to the slow query performance that was occurring during certain scenarios, perfmon was used to establish if any pressure was being placed on the Disk, CPU or Memory subsystem when concurrent connections access the same query, and Profiler to pinpoint how the query was being managed within SSAS, profiler I will leave for another blogThis guide is not designed to provide a definitive list of what should be used when monitoring SSAS, different situations may require the addition or removal of counters as presented by the situation. However I hope that it serves as a good basis for starting your monitoring of SSAS. I would also like to acknowledge Chris Webb’s awesome chapters from “Expert Cube Development” that also helped shape my monitoring strategy:http://cwebbbi.spaces.live.com/blog/cns!7B84B0F2C239489A!6657.entrySimulating ConnectionsTo simulate the additional connections to the SSAS server whilst monitoring, I used ascmd to simulate multiple connections to the typical and worse performing queries that were identified by the customer. A similar sript can be downloaded from codeplex at http://www.codeplex.com/SQLSrvAnalysisSrvcs.     File name: ASCMD_StressTestingScripts.zip. Performance MonitorWithin performance monitor,  a counter log was created that contained the list of counters below. The important point to note when running the counter log is that the RUN AS property within the counter log properties should be changed to an account that has rights to the SSAS instance when monitoring MSAS counters. Failure to do so means that the counter log runs under the system account, no errors or warning are given while running the counter log, and it is not until you need to view the MSAS counters that they will not be displayed if run under the default account that has no right to SSAS. If your connection simulation takes hours, this could prove quite frustrating if not done beforehand JThe counters used……  Object Counter Instance Justification System Processor Queue legnth N/A Indicates how many threads are waiting for execution against the processor. If this counter is consistently higher than around 5 when processor utilization approaches 100%, then this is a good indication that there is more work (active threads) available (ready for execution) than the machine's processors are able to handle. System Context Switches/sec N/A Measures how frequently the processor has to switch from user- to kernel-mode to handle a request from a thread running in user mode. The heavier the workload running on your machine, the higher this counter will generally be, but over long term the value of this counter should remain fairly constant. If this counter suddenly starts increasing however, it may be an indicating of a malfunctioning device, especially if the Processor\Interrupts/sec\(_Total) counter on your machine shows a similar unexplained increase Process % Processor Time sqlservr Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process % Processor Time msmdsrv Definately should be used if Processor\% Processor Time\(_Total) is maxing at 100% to assess the effect of the SQL Server process on the processor Process Working Set sqlservr If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Process Working Set msmdsrv If the Memory\Available bytes counter is decreaing this counter can be run to indicate if the process is consuming larger and larger amounts of RAM. Process(instance)\Working Set measures the size of the working set for each process, which indicates the number of allocated pages the process can address without generating a page fault. Processor % Processor Time _Total and individual cores measures the total utilization of your processor by all running processes. If multi-proc then be mindful only an average is provided Processor % Privileged Time _Total To see how the OS is handling basic IO requests. If kernel mode utilization is high, your machine is likely underpowered as it's too busy handling basic OS housekeeping functions to be able to effectively run other applications. Processor % User Time _Total To see how the applications is interacting from a processor perspective, a high percentage utilisation determine that the server is dealing with too many apps and may require increasing thje hardware or scaling out Processor Interrupts/sec _Total  The average rate, in incidents per second, at which the processor received and serviced hardware interrupts. Shoulr be consistant over time but a sudden unexplained increase could indicate a device malfunction which can be confirmed using the System\Context Switches/sec counter Memory Pages/sec N/A Indicates the rate at which pages are read from or written to disk to resolve hard page faults. This counter is a primary indicator of the kinds of faults that cause system-wide delays, this is the primary counter to watch for indication of possible insufficient RAM to meet your server's needs. A good idea here is to configure a perfmon alert that triggers when the number of pages per second exceeds 50 per paging disk on your system. May also want to see the configuration of the page file on the Server Memory Available Mbytes N/A is the amount of physical memory, in bytes, available to processes running on the computer. if this counter is greater than 10% of the actual RAM in your machine then you probably have more than enough RAM. monitor it regularly to see if any downward trend develops, and set an alert to trigger if it drops below 2% of the installed RAM. Physical Disk Disk Transfers/sec for each physical disk If it goes above 10 disk I/Os per second then you've got poor response time for your disk. Physical Disk Idle Time _total If Disk Transfers/sec is above  25 disk I/Os per second use this counter. which measures the percent time that your hard disk is idle during the measurement interval, and if you see this counter fall below 20% then you've likely got read/write requests queuing up for your disk which is unable to service these requests in a timely fashion. Physical Disk Disk queue legnth For the OLAP and SQL physical disk A value that is consistently less than 2 means that the disk system is handling the IO requests against the physical disk Network Interface Bytes Total/sec For the NIC Should be monitored over a period of time to see if there is anb increase/decrease in network utilisation Network Interface Current Bandwidth For the NIC is an estimate of the current bandwidth of the network interface in bits per second (BPS). MSAS 2005: Memory Memory Limit High KB N/A Shows (as a percentage) the high memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Limit Low KB N/A Shows (as a percentage) the low memory limit configured for SSAS in C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Config\msmdsrv.ini MSAS 2005: Memory Memory Usage KB N/A Displays the memory usage of the server process. MSAS 2005: Memory File Store KB N/A Displays the amount of memory that is reserved for the Cache. Note if total memory limit in the msmdsrv.ini is set to 0, no memory is reserved for the cache MSAS 2005: Storage Engine Query Queries from Cache Direct / sec N/A Displays the rate of queries answered from the cache directly MSAS 2005: Storage Engine Query Queries from Cache Filtered / Sec N/A Displays the Rate of queries answered by filtering existing cache entry. MSAS 2005: Storage Engine Query Queries from File / Sec N/A Displays the Rate of queries answered from files. MSAS 2005: Storage Engine Query Average time /query N/A Displays the average time of a query MSAS 2005: Connection Current connections N/A Displays the number of connections against the SSAS instance MSAS 2005: Connection Requests / sec N/A Displays the rate of query requests per second MSAS 2005: Locks Current Lock Waits N/A Displays thhe number of connections waiting on a lock MSAS 2005: Threads Query Pool job queue Length N/A The number of queries in the job queue MSAS 2005:Proc Aggregations Temp file bytes written/sec N/A Shows the number of bytes of data processed in a temporary file MSAS 2005:Proc Aggregations Temp file rows written/sec N/A Shows the number of bytes of data processed in a temporary file 

    Read the article

  • SQL Server Prefetch and Query Performance

    Prefetching can make a surprising difference to SQL Server query execution times where there is a high incidence of waiting for disk i/o operations, but the benefits come at a cost. Mostly, the Query Optimizer gets it right, but occasionally there are queries that would benefit from tuning. Get smart with SQL Backup ProGet faster, smaller backups with integrated verification.Quickly and easily DBCC CHECKDB your backups. Learn more.

    Read the article

  • SQL SERVER – Introduction to Wait Stats and Wait Types – Wait Type – Day 1 of 28

    - by pinaldave
    I have been working a lot on Wait Stats and Wait Types recently. Last Year, I requested blog readers to send me their respective server’s wait stats. I appreciate their kind response as I have received  Wait stats from my readers. I took each of the results and carefully analyzed them. I provided necessary feedback to the person who sent me his wait stats and wait types. Based on the feedbacks I got, many of the readers have tuned their server. After a while I got further feedbacks on my recommendations and again, I collected wait stats. I recorded the wait stats and my recommendations and did further research. At some point at time, there were more than 10 different round trips of the recommendations and suggestions. Finally, after six month of working my hands on performance tuning, I have collected some real world wisdom because of this. Now I plan to share my findings with all of you over here. Before anything else, please note that all of these are based on my personal observations and opinions. They may or may not match the theory available at other places. Some of the suggestions may not match your situation. Remember, every server is different and consequently, there is more than one solution to a particular problem. However, this series is written with kept wait stats in mind. While I was working on various performance tuning consultations, I did many more things than just tuning wait stats. Today we will discuss how to capture the wait stats. I use the script diagnostic script created by my friend and SQL Server Expert Glenn Berry to collect wait stats. Here is the script to collect the wait stats: -- Isolate top waits for server instance since last restart or statistics clear WITH Waits AS (SELECT wait_type, wait_time_ms / 1000. AS wait_time_s, 100. * wait_time_ms / SUM(wait_time_ms) OVER() AS pct, ROW_NUMBER() OVER(ORDER BY wait_time_ms DESC) AS rn FROM sys.dm_os_wait_stats WHERE wait_type NOT IN ('CLR_SEMAPHORE','LAZYWRITER_SLEEP','RESOURCE_QUEUE','SLEEP_TASK' ,'SLEEP_SYSTEMTASK','SQLTRACE_BUFFER_FLUSH','WAITFOR', 'LOGMGR_QUEUE','CHECKPOINT_QUEUE' ,'REQUEST_FOR_DEADLOCK_SEARCH','XE_TIMER_EVENT','BROKER_TO_FLUSH','BROKER_TASK_STOP','CLR_MANUAL_EVENT' ,'CLR_AUTO_EVENT','DISPATCHER_QUEUE_SEMAPHORE', 'FT_IFTS_SCHEDULER_IDLE_WAIT' ,'XE_DISPATCHER_WAIT', 'XE_DISPATCHER_JOIN', 'SQLTRACE_INCREMENTAL_FLUSH_SLEEP')) SELECT W1.wait_type, CAST(W1.wait_time_s AS DECIMAL(12, 2)) AS wait_time_s, CAST(W1.pct AS DECIMAL(12, 2)) AS pct, CAST(SUM(W2.pct) AS DECIMAL(12, 2)) AS running_pct FROM Waits AS W1 INNER JOIN Waits AS W2 ON W2.rn <= W1.rn GROUP BY W1.rn, W1.wait_type, W1.wait_time_s, W1.pct HAVING SUM(W2.pct) - W1.pct < 99 OPTION (RECOMPILE); -- percentage threshold GO This script uses Dynamic Management View sys.dm_os_wait_stats to collect the wait stats. It omits the system-related wait stats which are not useful to diagnose performance-related bottleneck. Additionally, not OPTION (RECOMPILE) at the end of the DMV will ensure that every time the query runs, it retrieves new data and not the cached data. This dynamic management view collects all the information since the time when the SQL Server services have been restarted. You can also manually clear the wait stats using the following command: DBCC SQLPERF('sys.dm_os_wait_stats', CLEAR); Once the wait stats are collected, we can start analysis them and try to see what is causing any particular wait stats to achieve higher percentages than the others. Many waits stats are related to one another. When the CPU pressure is high, all the CPU-related wait stats show up on top. But when that is fixed, all the wait stats related to the CPU start showing reasonable percentages. It is difficult to have a sure solution, but there are good indications and good suggestions on how to solve this. I will keep this blog post updated as I will post more details about wait stats and how I reduce them. The reference to Book On Line is over here. Of course, I have selected February to run this Wait Stats series. I am already cheating by having the smallest month to run this series. :) Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: DMV, Pinal Dave, PostADay, SQL, SQL Authority, SQL Optimization, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, SQL Wait Stats, SQL Wait Types, T SQL, Technology

    Read the article

  • Speeding up procedural texture generation

    - by FalconNL
    Recently I've begun working on a game that takes place in a procedurally generated solar system. After a bit of a learning curve (having neither worked with Scala, OpenGL 2 ES or Libgdx before), I have a basic tech demo going where you spin around a single procedurally textured planet: The problem I'm running into is the performance of the texture generation. A quick overview of what I'm doing: a planet is a cube that has been deformed to a sphere. To each side, a n x n (e.g. 256 x 256) texture is applied, which are bundled in one 8n x n texture that is sent to the fragment shader. The last two spaces are not used, they're only there to make sure the width is a power of 2. The texture is currently generated on the CPU, using the updated 2012 version of the simplex noise algorithm linked to in the paper 'Simplex noise demystified'. The scene I'm using to test the algorithm contains two spheres: the planet and the background. Both use a greyscale texture consisting of six octaves of 3D simplex noise, so for example if we choose 128x128 as the texture size there are 128 x 128 x 6 x 2 x 6 = about 1.2 million calls to the noise function. The closest you will get to the planet is about what's shown in the screenshot and since the game's target resolution is 1280x720 that means I'd prefer to use 512x512 textures. Combine that with the fact the actual textures will of course be more complicated than basic noise (There will be a day and night texture, blended in the fragment shader based on sunlight, and a specular mask. I need noise for continents, terrain color variation, clouds, city lights, etc.) and we're looking at something like 512 x 512 x 6 x 3 x 15 = 70 million noise calls for the planet alone. In the final game, there will be activities when traveling between planets, so a wait of 5 or 10 seconds, possibly 20, would be acceptable since I can calculate the texture in the background while traveling, though obviously the faster the better. Getting back to our test scene, performance on my PC isn't too terrible, though still too slow considering the final result is going to be about 60 times worse: 128x128 : 0.1s 256x256 : 0.4s 512x512 : 1.7s This is after I moved all performance-critical code to Java, since trying to do so in Scala was a lot worse. Running this on my phone (a Samsung Galaxy S3), however, produces a more problematic result: 128x128 : 2s 256x256 : 7s 512x512 : 29s Already far too long, and that's not even factoring in the fact that it'll be minutes instead of seconds in the final version. Clearly something needs to be done. Personally, I see a few potential avenues, though I'm not particularly keen on any of them yet: Don't precalculate the textures, but let the fragment shader calculate everything. Probably not feasible, because at one point I had the background as a fullscreen quad with a pixel shader and I got about 1 fps on my phone. Use the GPU to render the texture once, store it and use the stored texture from then on. Upside: might be faster than doing it on the CPU since the GPU is supposed to be faster at floating point calculations. Downside: effects that cannot (easily) be expressed as functions of simplex noise (e.g. gas planet vortices, moon craters, etc.) are a lot more difficult to code in GLSL than in Scala/Java. Calculate a large amount of noise textures and ship them with the application. I'd like to avoid this if at all possible. Lower the resolution. Buys me a 4x performance gain, which isn't really enough plus I lose a lot of quality. Find a faster noise algorithm. If anyone has one I'm all ears, but simplex is already supposed to be faster than perlin. Adopt a pixel art style, allowing for lower resolution textures and fewer noise octaves. While I originally envisioned the game in this style, I've come to prefer the realistic approach. I'm doing something wrong and the performance should already be one or two orders of magnitude better. If this is the case, please let me know. If anyone has any suggestions, tips, workarounds, or other comments regarding this problem I'd love to hear them.

    Read the article

  • Intel Extreme Tuning utility options are greyed

    - by Abhishek Sha
    I'm having a ASUS K55VM with Intel Core i7 3610QM (IvyBridge) with a NVIDIA GT630M. I'm trying to operate the Intel XTU, but as you can see in the screenshot, all the options are greyed out. Can you please help with this situation. Another are is the CPU Throttling (Intel SpeedStep) which is always shown as 0%. But in the Intel Turbo Monitor, the Speed keeps dynamically changing. Then why is the CPU Throttling always at 0%?:

    Read the article

  • Tuning Linux IP routing parameters -- secret_interval and tcp_mem

    - by Jeff Atwood
    We had a little failover problem with one of our HAProxy VMs today. When we dug into it, we found this: Jan 26 07:41:45 haproxy2 kernel: [226818.070059] __ratelimit: 10 callbacks suppressed Jan 26 07:41:45 haproxy2 kernel: [226818.070064] Out of socket memory Jan 26 07:41:47 haproxy2 kernel: [226819.560048] Out of socket memory Jan 26 07:41:49 haproxy2 kernel: [226822.030044] Out of socket memory Which, per this link, apparently has to do with low default settings for net.ipv4.tcp_mem. So we increased them by 4x from their defaults (this is Ubuntu Server, not sure if the Linux flavor matters): current values are: 45984 61312 91968 new values are: 183936 245248 367872 After that, we started seeing a bizarre error message: Jan 26 08:18:49 haproxy1 kernel: [ 2291.579726] Route hash chain too long! Jan 26 08:18:49 haproxy1 kernel: [ 2291.579732] Adjust your secret_interval! Shh.. it's a secret!! This apparently has to do with /proc/sys/net/ipv4/route/secret_interval which defaults to 600 and controls periodic flushing of the route cache The secret_interval instructs the kernel how often to blow away ALL route hash entries regardless of how new/old they are. In our environment this is generally bad. The CPU will be busy rebuilding thousands of entries per second every time the cache is cleared. However we set this to run once a day to keep memory leaks at bay (though we've never had one). While we are happy to reduce this, it seems odd to recommend dropping the entire route cache at regular intervals, rather than simply pushing old values out of the route cache faster. After some investigation, we found /proc/sys/net/ipv4/route/gc_elasticity which seems to be a better option for keeping the route table size in check: gc_elasticity can best be described as the average bucket depth the kernel will accept before it starts expiring route hash entries. This will help maintain the upper limit of active routes. We adjusted elasticity from 8 to 4, in the hopes of the route cache pruning itself more aggressively. The secret_interval does not feel correct to us. But there are a bunch of settings and it's unclear which are really the right way to go here. /proc/sys/net/ipv4/route/gc_elasticity (8) /proc/sys/net/ipv4/route/gc_interval (60) /proc/sys/net/ipv4/route/gc_min_interval (0) /proc/sys/net/ipv4/route/gc_timeout (300) /proc/sys/net/ipv4/route/secret_interval (600) /proc/sys/net/ipv4/route/gc_thresh (?) rhash_entries (kernel parameter, default unknown?) We don't want to make the Linux routing worse, so we're kind of afraid to mess with some of these settings. Can anyone advise which routing parameters are best to tune, for a high traffic HAProxy instance?

    Read the article

  • Solaris TCP stack tuning

    - by disserman
    We have a large web project (about 2-3k requests per second), using haproxy (http://haproxy.1wt.eu/) as a frontend and load balancer between the java application servers. The frontend (haproxy) is running on Linux but we are going to migrate it to the Solaris 10 as all our other servers are running under Solaris. After switching a traffic I see the two things: a) the web site became loading slower (5-10 seconds with images in comparison to 2-3 seconds on Linux) b) sometimes haproxy fails to perform a "lifecheck" (get a special web page and analyze http response code) due to the socket timeout. After switching traffic back to Linux everything is okay. I've tried to tune all params I found in /dev/tcp but no progress. I believe the problem is in some open socket limitations. If someone can point me to the answer, I would be greatly appreciated. p.s. haproxy is running under Xen DomU on Linux (Kernel 2.6.18, Debian 5), under zone on Solaris (10 u8). the only thing we did on Linux is increasing of ip_conntrack_max (I believe Solaris option tcp_conn_req_max_q is the equivalent).

    Read the article

  • Why do I see a large performance hit with DRBD?

    - by BHS
    I see a much larger performance hit with DRBD than their user manual says I should get. I'm using DRBD 8.3.7 (Fedora 13 RPMs). I've setup a DRBD test and measured throughput of disk and network without DRBD: dd if=/dev/zero of=/data.tmp bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 4.62985 s, 116 MB/s / is a logical volume on the disk I'm testing with, mounted without DRBD iperf: [ 4] 0.0-10.0 sec 1.10 GBytes 941 Mbits/sec According to Throughput overhead expectations, the bottleneck would be whichever is slower, the network or the disk and DRBD should have an overhead of 3%. In my case network and I/O seem to be pretty evenly matched. It sounds like I should be able to get around 100 MB/s. So, with the raw drbd device, I get dd if=/dev/zero of=/dev/drbd2 bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 6.61362 s, 81.2 MB/s which is slower than I would expect. Then, once I format the device with ext4, I get dd if=/dev/zero of=/mnt/data.tmp bs=512M count=1 oflag=direct 536870912 bytes (537 MB) copied, 9.60918 s, 55.9 MB/s This doesn't seem right. There must be some other factor playing into this that I'm not aware of. global_common.conf global { usage-count yes; } common { protocol C; } syncer { al-extents 1801; rate 33M; } data_mirror.res resource data_mirror { device /dev/drbd1; disk /dev/sdb1; meta-disk internal; on cluster1 { address 192.168.33.10:7789; } on cluster2 { address 192.168.33.12:7789; } } For the hardware I have two identical machines: 6 GB RAM Quad core AMD Phenom 3.2Ghz Motherboard SATA controller 7200 RPM 64MB cache 1TB WD drive The network is 1Gb connected via a switch. I know that a direct connection is recommended, but could it make this much of a difference? Edited I just tried monitoring the bandwidth used to try to see what's happening. I used ibmonitor and measured average bandwidth while I ran the dd test 10 times. I got: avg ~450Mbits writing to ext4 avg ~800Mbits writing to raw device It looks like with ext4, drbd is using about half the bandwidth it uses with the raw device so there's a bottleneck that is not the network.

    Read the article

  • Tuning Windows 7 for use in a VM

    - by intuited
    I'm running Windows 7 in a VirtualBox Virtual Machine, and would like to make it run in a more streamlined fashion. I'll be using the install primarily for testing web apps, and have no need for it to run quickly. I would like it to run with minimal memory requirements, and with minimal changes to its virtual hard drive's contents. Changes to the hard drive contents, for example the paging file, result in larger snapshot sizes. Another recent post of mine seems to be related to this issue, but does not directly address issues with Windows. One concern that I have is that Windows seems to be using 17% of its paging file even with over 900MB of memory marked "Standby" or "Free". My uneducated guess is that this is being used to store indexes or some other data that helps to speed up the system but is not really necessary. I'm also wondering if it's normal for Windows to use over 500 MB of "In Use" memory with no apps running. Will this amount decrease if I reduce the amount of "installed" memory in the VM? What steps can I take to reduce the system's memory footprint without incurring an increase in paging file usage?

    Read the article

  • postgres memory allocation tuning 2

    - by pstanton
    i've got a Ubuntu Linux system with 12Gb memory most of which (at least 10Gb) can be allocated solely to postgres. the system also has a 6 disk 15k SCSI RAID 10 setup. The process i'm trying to optimise is twofold. firstly a single threaded, single connection will do many inserts into 2-4 tables linked by foreign key. secondly many different complex queries are run against the resulting data, using group by extensively. this part especially needs to be optimised. i have four of these processes running at once in order to make use of the quad core CPU, therefore there will generally be no more than 5 concurrent connections (1 spare for admin tasks). what configuration changes to the default Postgres config would you recommend? I'm looking for the optimum values for things like work_mem, shared_buffers etc. relevant doco thanks!

    Read the article

  • Fine-tuning a LNMP stack

    - by Norman
    I'm in the process of setting up a server with 4GB RAM and 2 CPUs. The stack will be CentOS + NGINX + MySQL + PHP (with APC) and spawn-fcgi. It will be used to serve 10 Wordpress blogs, 3 of which receive about 20,000 hits per day. Each Wordpress instance is equipped with the W3 TotalCache. I have a few variables to play with: NGINX (How many worker_processes, worker_connections, etc) PHP (What parameters in php.ini should I change? What about apc?) Spawn-fcgi (Right now I have 6 php-cgi spawned. How many of them should I have?) I realize it's hard to tell without testing, but if you could please provide me with some ballpark numbers, that would be helpful too.

    Read the article

  • Exchange Server 2007 message tracking log tuning ?

    - by Albert Widjaja
    Hi All, what is the best practice if I want to have a retention of let say 6 months ? I'm confused which parameter that is should/can be changes. Get-ExchangeServer | where {$_.isHubTransportServer -eq $true} | Get-TransportServer | select Name, *MessageTracking* | ft -AutoSize Name MessageTrackingLogEnabled MessageTrackingLogMaxAge MessageTrackingLogMaxDirectorySize MessageTrackingLogMaxFileSize MessageTrackingLogPat h ---- ------------------------- ------------------------ ---------------------------------- ----------------------------- --------------------- ExHTServer1 True 20.00:00:00 250MB 10MB D:\Program Files\M... ExHTServer2 True 20.00:00:00 250MB 10MB D:\Program Files\M... ExHTServer3 True 20.00:00:00 250MB 10MB D:\Program Files\M... Thanks, Albert

    Read the article

  • memory tuning with rails/unicorn running on ubuntu

    - by user970193
    I am running unicorn on Ubuntu 11, Rails 3.0, and Ruby 1.8.7. It is an 8 core ec2 box, and I am running 15 workers. CPU never seems to get pinned, and I seem to be handling requests pretty nicely. My question concerns memory usage, and what concerns I should have with what I am seeing. (if any) Here is the scenario: Under constant load (about 15 reqs/sec coming in from nginx), over the course of an hour, each server in the 3 server cluster loses about 100MB / hour. This is a linear slope for about 6 hours, then it appears to level out, but still maybe appear to lose about 10MB/hour. If I drop my page caches using the linux command echo 1 /proc/sys/vm/drop_caches, the available free memory shoots back up to what it was when I started the unicorns, and the memory loss pattern begins again over the hours. Before: total used free shared buffers cached Mem: 7130244 5005376 2124868 0 113628 422856 -/+ buffers/cache: 4468892 2661352 Swap: 33554428 0 33554428 After: total used free shared buffers cached Mem: 7130244 4467144 2663100 0 228 11172 -/+ buffers/cache: 4455744 2674500 Swap: 33554428 0 33554428 My Ruby code does use memoizations and I'm assuming Ruby/Rails/Unicorn is keeping its own caches... what I'm wondering is should I be worried about this behaviour? FWIW, my Unicorn config: worker_processes 15 listen "#{CAPISTRANO_ROOT}/shared/pids/unicorn_socket", :backlog = 1024 listen 8080, :tcp_nopush = true timeout 180 pid "#{CAPISTRANO_ROOT}/shared/pids/unicorn.pid" GC.respond_to?(:copy_on_write_friendly=) and GC.copy_on_write_friendly = true before_fork do |server, worker| STDERR.puts "XXXXXXXXXXXXXXXXXXX BEFORE FORK" print_gemfile_location defined?(ActiveRecord::Base) and ActiveRecord::Base.connection.disconnect! defined?(Resque) and Resque.redis.client.disconnect old_pid = "#{CAPISTRANO_ROOT}/shared/pids/unicorn.pid.oldbin" if File.exists?(old_pid) && server.pid != old_pid begin Process.kill("QUIT", File.read(old_pid).to_i) rescue Errno::ENOENT, Errno::ESRCH # already killed end end File.open("#{CAPISTRANO_ROOT}/shared/pids/unicorn.pid.ok", "w"){|f| f.print($$.to_s)} end after_fork do |server, worker| defined?(ActiveRecord::Base) and ActiveRecord::Base.establish_connection defined?(Resque) and Resque.redis.client.connect end Is there a need to experiment enforcing more stringent garbage collection using OobGC (http://unicorn.bogomips.org/Unicorn/OobGC.html)? Or is this just normal behaviour, and when/as the system needs more memory, it will empty the caches by itself, without me manually running that cache command? Basically, is this normal, expected behaviour? tia

    Read the article

  • Tuning OpenVZ Containers to work better with Java?

    - by Daniel
    I have a 8 GB RAM Server (Dedicated) and currently have KVM Virtual Machines running on there (successfully) however i'm considering moving to OpenVZ as KVM seems a bit overkill with a lot of overhead for what i use it for. In the past i have used OpenVZ Containers, hosted by myself and from other providers and Java doesn't seem to work well with them.. One example is that if i give a container 2 GB RAM ( No burst) (with or without vswap doesn't matter) a java instance can only be tuned to use at very most 1500 MB of that RAM (-Xmx, -Xms). Ideally, i wish to be able to create "Mini" containers with about 256MB, 512MB, 768 RAM and run some java instances in them. My question is: I'm trying to find an ideal way to tune a OpenVZ container configuration to work better with Java memory. Please, don't suggest anything related to Java settings, i'm looking for OpenVZ specific answers.. Though i welcome any suggestion if you feel it may help me. Much Appreciated, Daniel

    Read the article

  • Tuning MySQL to consume less memory

    - by Alex
    I have a VM which has 2GB Ram, (full specs) And I am setting up a site which has one table in particular with over a million records. There's little or no usage of this particular database (perhaps once or twice a day) but simply running mysql grinds the whole server to a halt. I've looked through the top results but nothing is really denting the CPU however the memory seems to be the issue. The site isnt even live of taking requests yet. the memory situation looks like this: # free -m total used free shared buffers cached Mem: 2006 1880 126 0 3 53 -/+ buffers/cache: 1823 183 Swap: 2047 345 1702 Are there any good pointers to tune mysql to stop hogging the system memory? Thanks very much EDIT: (requested by 8bit): http://tny.cz/b41a0b12

    Read the article

< Previous Page | 43 44 45 46 47 48 49 50 51 52 53 54  | Next Page >