Search Results

Search found 13249 results on 530 pages for 'performance tuning'.

Page 20/530 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >

Performance decrease in every game and application

- by Márk Vincze

When I start a game, initially it runs smoothly, but after a couple of minutes, the performance gradually decreases to the point of being unplayable (1-2 FPS). The sound also starts to lag at this point. This does not happen every time I start my PC, usually exiting the game, rebooting, then starting the game again solves the problem, and I can play with perfect FPS for as long as I want. I could not find any deterministic reason when this happens and when doesn't. It happens in every game I tried (SWTOR, Diablo 3, Skyrim), and not even games, but simple applications like a browser or the Control Panel can get unusably slow. This is a brand new PC I bought three months ago, and this problem occurs since the first day I've been using it. Could you provide any advice how to further diagnose the problem? I tried to reinstall Windows, and tried different video card drivers, but it did not help. It would be important to know whether this is a hardware or software problem, because I can use the warranty if it is a hardware issue. (I did not want to return the PC yet, because I can't reproduce the issue deterministically.) Spec of the pc: Motherboard: ASROCK H61M-HVS CPU: INTEL Core i3-2120 3.30GHz 1155 BOX Memory: KINGMAX 4096MB DDR3 1333MHz KIT Video card: GIGABYTE GV-R685OC-1GD HD6850 1GB GDDR5 PCIE HDD: SEAGATE 500GB Barracuda 7200rpm 16MB SATA3 ST500DM002 I am using Windows 7 64 bit. Thanks a lot in advance!

Read the article
file read performance degrades as number of files increases

- by bfallik-bamboom

We're observing poor file read IO results that we'd like to better understand. We can use fio to write 100 files with a sustained aggregate throughput of ~700MB/s. When we switch the test to read instead of write, the aggregate throughput is only ~55MB/s. The drop seems related to the number of files since the throughput for read and write are comparable for a single file then diverge proportionally as we increase the number of files. The test server has 24 CPU cores, 48GB of memory, and is running CentOS 6.0. The disk hardware is a RAID 6 array with 12 disks and a Dell H800 controller. This device is partitioned with ext4 using the default settings. Increasing the readahead (using blockdev) improves the read throughput significantly but it still doesn't match write speed. For instance, increasing the readahead from 128KB to 1M improved the read throughput to ~145MB/s. Is this a known performance issue in our OS/disk/filesystem configuration? If so, how can we tell? If not, what tools or tests can we use to further isolate the issue? Thanks.

Read the article
VMWare Esxi Looking for Bottlenecks

- by nextgenneo

I have a VMWare ESxi box, 22GB ram, Dual Quad Core Xeon, 2 Sas drives + Write caching raid controller etc. Anyways, have about 30 small XP VM's running on it and starting to get some very slow boot times and other performance issues. I THINK its I/O but looking at the graphs not too sure what to look for. Any ideas on what to look for would be appreciated. Here is the data I've got so far: (I feel like my IO is high but not sure what to bench it against)

Read the article
IIS/ASP.NET performance incident - Perfmon Current Annonymous Users going through roof but Requests/sec low

- by Laurence

Setup: ASP.NET 4.0 website on IIS 6.0 on Win 2003 64 bit, 8xCPUs, 16GB memory, separate SQL 2005 DB server. Had a serious slowdown today with any otherwise fairly well performing ASP.NET site. For a period of a couple of hours all page requests were taking a very long time to be served - e.g. 30-60s compared to usual 2s. The w3wp.exe's CPU and memory usage on the webserver was not much higher than normal. The application pool was not in the middle of recycling (and it hadn't recycled for several hours). Bottlenecks in the database were ruled out - no blocks occurring and query results were being returned quickly. I couldn't make any sense of it and set up the following Perfmon counters: Current Anonymous Users (for site in question) Get requests/sec (ditto) Requests/sec for the ASP.NET application running the site Get requests/sec was averaging 100-150. Requests/sec for ASP.NET was averaging 5-10. However Current Anonymous Users was around 200. And then as I was watching, the Current Anonymous Users began to climb steeply going up to about 500 within a few minutes. All this time Get requests/sec & Requests/sec for ASP.NET was if anything going down. I did a whole load of things (in a panic!) to try to get the site working, like shutting it down, recycling the app pool, and adding another worker process to the pool. I also extended the expiration time for content (in IIS under HTTP Headers) in an attempt to lower the number of requests for static files (there are a lot of images on the site). The site is now back to normal, and the counters are fairly steady and reading (added Current Connections counter): Current Anonymous Users : average 30 Get requests/sec : average 100 Requests/sec for ASP.NET : 5 Current Connections : average 300 I have also observed an inverse relationship between Get requests/sec & Current Anonymous Users. Usually both are fairly steady but there will be short periods when Get requests/sec will go down dramatically and Current Anonymous Users will go up in a perfect mirror image. Then they will flip back to their usual levels. So, my questions are: Thinking of the original performance issue - if w3wp.exe CPU, memory usage were normal and there was no DB bottleneck, what could explain page requests taking 20 times longer to be served than usual? What other counters should I be looking at if this happens again? What explains the inverse relationship between Get requests/sec & Current Anonymous Users? What could explain Current Anonymous Users going from 200 to 500 within a few minutes? Many thanks for any insight into this.

Read the article
Baseline / Benchmark Physical and virtual server performance

- by EyeonTech

I am setting up a new server and there are some options. I want to perform some benchmarks and I need your help in determining the best tools and if possible run pre-configured benchmarks designed for SQL servers on Windows Server 2008/2012. Step 1. Run a performance monitor on the current Live SQL server (Windows Server 2008 Virtual machine running on ESXi. New server Hardware rundown: Intel® Server System R1304BTLSHBN - 1U Rack, LGA1155 http://ark.intel.com/products/53559/Intel-Server-System-R1304BTLSHBN Intel Xeon E3-1270V2 2x Intel SSD 330 Series 240GB 2.5in SATA 6Gb/s 25nm 1x WD 2TB WD2002FAEX 2TB 64M SATA3 CAVIAR BLACK 4x 8GB 1333MHz DDR3 ECC CL9 DIMM There are several options for configurations and I want to benchmark some of them and share the results. Option 1. Configure 2x SSDs at RAID 0. Install Windows Server 2008 directly to the 2TB WD Caviar HDD. Store Database files on the RAID 0 Volume. Benchmark the OS direct on the hardware as an SQL Server. Store SQL Backup databases on the 2TB WD Caviar HDD. Option 2. Configure 2x SSDs at RAID 0. Install Windows Server 2012 directly to the 2TB WD Caviar HDD. Install Hyper-V. Install the SQL Server (Server 2008) as a virtual machine. Store the Virtual Hard Disks on the SSDs. Option 3. Configure 2x SSDs at RAID 0. Install VMWare ESXi on a partition of the 2TB WD Caviar HDD. Install the SQL Server (Server 2008) as a virtual machine. Store the Virtual Hard Disks on the SSDs. I have a few tools in mind from http://technet.microsoft.com/en-us/library/cc768530(v=bts.10).aspx. Any tools with pre-configured test would be fantastic. Specifically if there are pre-configured perfmon sets avaliable. Any opinions on the setup to gain the best results is welcome. Thanks in advance.

Read the article
Project Performance Evaluation and Finding Weak Areas

- by pramodc84

I'm working in J2EE web project, which has lots of Java, SQL scripts, JS, AJAX stuff. Its been 5 years for project still running fine. I have assigned with work of performance evaluation on the project as there might be some memory usage issues, DB fetching logic delays and other similar weak performance areas. From where should I begin? Any best practices to make project better?

Read the article
What is recommended minimum object size for gzip performance benefits?

- by utt73

I'm working on improving page speed display times, and one of the methods is to gzip content from the webserver. Google recommends: Note that gzipping is only beneficial for larger resources. Due to the overhead and latency of compression and decompression, you should only gzip files above a certain size threshold; we recommend a minimum range between 150 and 1000 bytes. Gzipping files below 150 bytes can actually make them larger. We serve our content through Akamai, using their network for a proxy and CDN. What they've told me: Following up on your question regarding what is the minimum size Akamai will compress the requested object when sending it to the end user: The minimum size is 860 bytes. My reply: What is the reason(s) for why Akamai's minimum size is 860 bytes? And why, for example, is this not the case for files Akamai serves for facebook? (see below) Google recommends to gzip more agressively. And that seems appropriate on our site where the most frequent hits, by far, are AJAX calls that are <860 bytes. Akamai's response: The reasons 860 bytes is the minimum size for compression is twofold: (1) The overhead of compressing an object under 860 bytes outweighs performance gain. (2) Objects under 860 bytes can be transmitted via a single packet anyway, so there isn't a compelling reason to compress them. So I'm here for some fact checking. Is the 860 byte limit due to packet size the end of this reasoning? Why would high traffic sites push this down to the 150 byte limit... just to save on bandwidth costs (since CDNs base their charges on bandwith offloaded from origin), or is there a performance gain in doing so?

Read the article
How do I measure performance of a virtual server?

- by Sergey

I've got a VPS running Ubuntu. Being a virtual server, I understand that it shares resources with unknown number of other servers, and I'm noticing that it's considerably slower than my desktop machine. Is there some tool to measure the performance of the virtual machine? I'd be curious to see some approximate measure similar to bogomips, possibly for CPU (operations/sec), memory and disk read/write speed. I'd like to be able to compare those numbers to my desktop machine. I'm not interested in the specs of the actual physical machine my VPS is running on - by doing cat /proc/cpuinfo I can see that it's a nice quad-core Xeon machine, but it doesn't matter to me. I'm basically interested in how fast a program would run in my VPS - how many CPU operations it can make in a second, how many bytes to write to RAM or to disk. I only have ssh access to the machine so the tool need to be command-line. I could write a script which, say, does some calculations in a loop for a second and counts how many loops it was able to do, or something similar to measure disk and RAM performance. But I'm sure something like this already exists.

Read the article
Performing client-side OAuth authorized Twitter API calls versus server side, how much of a difference is there in terms of performance?

- by Terence Ponce

I'm working on a Twitter application in Ruby on Rails. One of the biggest arguments that I have with other people on the project is the method of calling the Twitter API. Before, everything was done on the server: OAuth login, updating the user's Twitter data, and retrieving tweets. Retrieving tweets was the heaviest thing to do since we don't store the tweets in our database, so viewing the tweets means that we have to call the API every time. One of the people in the project suggested that we call the tweets through Javascript instead to lessen the load on the server. We used GET search, which, correct me if I'm wrong, will be removed when v1.0 becomes completely deprecated, but that really isn't a concern now. When the Twitter API has migrated completely to v1.1 (again, correct me if I'm wrong), every calls to the API must be authenticated, so we have to authenticate our Javascript requests to the API. As said here: We don't support or recommend performing OAuth directly through Javascript -- it's insecure and puts your application at risk. The only acceptable way to perform it is if you kept all keys and secrets server-side, computed the OAuth signatures and parameters server side, then issued the request client-side from the server-generated OAuth values. If we do exactly what Twitter suggests, the only difference between this and doing everything server-side is that our server won't have to contact the Twitter API anymore every time the user wants to view tweets. Here's how I would picture what's happening every time the user makes a request: If we do it through Javascript, it would be harder on my part because I would have to create the signatures manually for every request, but I will gladly do it if the boost in performance is worth all the trouble. Doing it through Ruby on Rails would be very easy since the Twitter gem does most of the grunt work already, so I'm really encouraging the other people in the project to agree with me. Is the difference in performance trivial or is it significant enough to switch to Javascript?

Read the article
How to squeeze the maximum performance out of Unity and GNOME 3?

- by melvincv

I see that I do not get good performance with the new Unity desktop, but I should say that Unity has improved a lot since the last edition Ubuntu 11.10. How to squeeze the maximum performance out of 1. Unity 2. GNOME 3 My system specs: -Processors- Intel(R) Pentium(R) Dual CPU E2180 @ 2.00GHz -Memory- Total Memory : 2049996 kB -PCI Devices- Host bridge : Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller (rev 10) PCI bridge : Intel Corporation 82G33/G31/P35/P31 Express PCI Express Root Port (rev 10) (prog-if 00 [Normal decode]) VGA compatible controller : Intel Corporation 82G33/G31 Express Integrated Graphics Controller (rev 10) (prog-if 00 [VGA controller]) USB controller : Intel Corporation N10/ICH 7 Family USB UHCI Controller #1 (rev 01) (prog-if 00 [UHCI]) USB controller : Intel Corporation N10/ICH 7 Family USB UHCI Controller #2 (rev 01) (prog-if 00 [UHCI]) USB controller : Intel Corporation N10/ICH 7 Family USB UHCI Controller #3 (rev 01) (prog-if 00 [UHCI]) USB controller : Intel Corporation N10/ICH 7 Family USB UHCI Controller #4 (rev 01) (prog-if 00 [UHCI]) USB controller : Intel Corporation N10/ICH 7 Family USB2 EHCI Controller (rev 01) (prog-if 20 [EHCI]) PCI bridge : Intel Corporation 82801 PCI Bridge (rev e1) (prog-if 01 [Subtractive decode]) ISA bridge : Intel Corporation 82801GB/GR (ICH7 Family) LPC Interface Bridge (rev 01) IDE interface : Intel Corporation 82801G (ICH7 Family) IDE Controller (rev 01) (prog-if 8a [Master SecP PriP]) IDE interface : Intel Corporation N10/ICH7 Family SATA Controller [IDE mode] (rev 01) (prog-if 8f [Master SecP SecO PriP PriO]) SMBus : Intel Corporation N10/ICH 7 Family SMBus Controller (rev 01) Ethernet controller : Intel Corporation PRO/100 VE Network Connection (rev 01)

Read the article
Premature-Optimization and Performance Anxiety

- by James Michael Hare

While writing my post analyzing the new .NET 4 ConcurrentDictionary class (here), I fell into one of the classic blunders that I myself always love to warn about. After analyzing the differences of time between a Dictionary with locking versus the new ConcurrentDictionary class, I noted that the ConcurrentDictionary was faster with read-heavy multi-threaded operations. Then, I made the classic blunder of thinking that because the original Dictionary with locking was faster for those write-heavy uses, it was the best choice for those types of tasks. In short, I fell into the premature-optimization anti-pattern. Basically, the premature-optimization anti-pattern is when a developer is coding very early for a perceived (whether rightly-or-wrongly) performance gain and sacrificing good design and maintainability in the process. At best, the performance gains are usually negligible and at worst, can either negatively impact performance, or can degrade maintainability so much that time to market suffers or the code becomes very fragile due to the complexity. Keep in mind the distinction above. I'm not talking about valid performance decisions. There are decisions one should make when designing and writing an application that are valid performance decisions. Examples of this are knowing the best data structures for a given situation (Dictionary versus List, for example) and choosing performance algorithms (linear search vs. binary search). But these in my mind are macro optimizations. The error is not in deciding to use a better data structure or algorithm, the anti-pattern as stated above is when you attempt to over-optimize early on in such a way that it sacrifices maintainability. In my case, I was actually considering trading the safety and maintainability gains of the ConcurrentDictionary (no locking required) for a slight performance gain by using the Dictionary with locking. This would have been a mistake as I would be trading maintainability (ConcurrentDictionary requires no locking which helps readability) and safety (ConcurrentDictionary is safe for iteration even while being modified and you don't risk the developer locking incorrectly) -- and I fell for it even when I knew to watch out for it. I think in my case, and it may be true for others as well, a large part of it was due to the time I was trained as a developer. I began college in in the 90s when C and C++ was king and hardware speed and memory were still relatively priceless commodities and not to be squandered. In those days, using a long instead of a short could waste precious resources, and as such, we were taught to try to minimize space and favor performance. This is why in many cases such early code-bases were very hard to maintain. I don't know how many times I heard back then to avoid too many function calls because of the overhead -- and in fact just last year I heard a new hire in the company where I work declare that she didn't want to refactor a long method because of function call overhead. Now back then, that may have been a valid concern, but with today's modern hardware even if you're calling a trivial method in an extremely tight loop (which chances are the JIT compiler would optimize anyway) the results of removing method calls to speed up performance are negligible for the great majority of applications. Now, obviously, there are those coding applications where speed is absolutely king (for example drivers, computer games, operating systems) where such sacrifices may be made. But I would strongly advice against such optimization because of it's cost. Many folks that are performing an optimization think it's always a win-win. That they're simply adding speed to the application, what could possibly be wrong with that? What they don't realize is the cost of their choice. For every piece of straight-forward code that you obfuscate with performance enhancements, you risk the introduction of bugs in the long term technical debt of the application. It will become so fragile over time that maintenance will become a nightmare. I've seen such applications in places I have worked. There are times I've seen applications where the designer was so obsessed with performance that they even designed their own memory management system for their application to try to squeeze out every ounce of performance. Unfortunately, the application stability often suffers as a result and it is very difficult for anyone other than the original designer to maintain. I've even seen this recently where I heard a C++ developer bemoaning that in VS2010 the iterators are about twice as slow as they used to be because Microsoft added range checking (probably as part of the 0x standard implementation). To me this was almost a joke. Twice as slow sounds bad, but it almost never as bad as you think -- especially if you're gaining safety. The only time twice is really that much slower is when once was too slow to begin with. Think about it. 2 minutes is slow as a response time because 1 minute is slow. But if an iterator takes 1 microsecond to move one position and a new, safer iterator takes 2 microseconds, this is trivial! The only way you'd ever really notice this would be in iterating a collection just for the sake of iterating (i.e. no other operations). To my mind, the added safety makes the extra time worth it. Always favor safety and maintainability when you can. I know it can be a hard habit to break, especially if you started out your career early or in a language such as C where they are very performance conscious. But in reality, these type of micro-optimizations only end up hurting you in the long run. Remember the two laws of optimization. I'm not sure where I first heard these, but they are so true: For beginners: Do not optimize. For experts: Do not optimize yet. This is so true. If you're a beginner, resist the urge to optimize at all costs. And if you are an expert, delay that decision. As long as you have chosen the right data structures and algorithms for your task, your performance will probably be more than sufficient. Chances are it will be network, database, or disk hits that will be your slow-down, not your code. As they say, 98% of your code's bottleneck is in 2% of your code so premature-optimization may add maintenance and safety debt that won't have any measurable impact. Instead, code for maintainability and safety, and then, and only then, when you find a true bottleneck, then you should go back and optimize further.

Read the article
Server configurations for hosting MySQL database

- by shyam

I have a web application which uses a MySQL database hosted on a virtual server. I've been using this server when I started the application and when the database was really small. Now it has grown and the server is not able to handle the db, causing frequent db errors. I'm planning to get a server and I need suggestions for that. Like I said, the db is now 9 GB, and is growing considerably fast. There are a number of tables with millions of rows, which are frequently updated and queried. The most frequent error the db shows is Lock wait timeout exceeded. Previously there used to be "The total number of locks exceeds the lock table size" errors too, but I could avoid it by increasing Innodb buffer pool size. Please suggest what configurations should I look for in the server I should buy. I read somewhere that the db should ideally have a buffer pool size greater than the size of its data, so in my case I guess I'd need memory gt 9 GB. What other things should I look for in the server? Just tell me if I should give you more info about the

Read the article
How do I objectively measure an application's load on a server

- by Joe

All, I'm not even sure where to begin looking for resources to answer my question, and I realize that speculation about this kind of thing is highly subjective. I need help determining what class of server I should purchase to host a MS Silverlight application with a MSSQL server back-end on a Windows Server 2008 platform. It's an interactive program, so I can't simply generate a list of URLs to test against, and run it with 1000 simultaneous users. What tools are out there to help me determine what kind of load the application will put on a server at varying levels of concurrent users? Would you all suggest separating the SQL server form the web server, to better differentiate the generated load on the different parts of the stack?

Read the article
Find out which task is generating a lot of context switches on linux

- by Gaks

According to vmstat, my Linux server (2xCore2 Duo 2.5 GHz) is constantly doing around 20k context switches per second. # vmstat 3 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 2 0 7292 249472 82340 2291972 0 0 0 0 0 0 7 13 79 0 0 0 7292 251808 82344 2291968 0 0 0 184 24 20090 1 1 99 0 0 0 7292 251876 82344 2291968 0 0 0 83 17 20157 1 0 99 0 0 0 7292 251876 82344 2291968 0 0 0 73 12 20116 1 0 99 0 ... but uptime shows small load: load average: 0.01, 0.02, 0.01 and top doesn't show any process with high %CPU usage. How do I find out what exactly is generating those context switches? Which process/thread? I tried to analyze pidstat output: # pidstat -w 10 1 12:39:13 PID cswch/s nvcswch/s Command 12:39:23 1 0.20 0.00 init 12:39:23 4 0.20 0.00 ksoftirqd/0 12:39:23 7 1.60 0.00 events/0 12:39:23 8 1.50 0.00 events/1 12:39:23 89 0.50 0.00 kblockd/0 12:39:23 90 0.30 0.00 kblockd/1 12:39:23 995 0.40 0.00 kirqd 12:39:23 997 0.60 0.00 kjournald 12:39:23 1146 0.20 0.00 svscan 12:39:23 2162 5.00 0.00 kjournald 12:39:23 2526 0.20 2.00 postgres 12:39:23 2530 1.00 0.30 postgres 12:39:23 2534 5.00 3.20 postgres 12:39:23 2536 1.40 1.70 postgres 12:39:23 12061 10.59 0.90 postgres 12:39:23 14442 1.50 2.20 postgres 12:39:23 15416 0.20 0.00 monitor 12:39:23 17289 0.10 0.00 syslogd 12:39:23 21776 0.40 0.30 postgres 12:39:23 23638 0.10 0.00 screen 12:39:23 25153 1.00 0.00 sshd 12:39:23 25185 86.61 0.00 daemon1 12:39:23 25190 12.19 35.86 postgres 12:39:23 25295 2.00 0.00 screen 12:39:23 25743 9.99 0.00 daemon2 12:39:23 25747 1.10 3.00 postgres 12:39:23 26968 5.09 0.80 postgres 12:39:23 26969 5.00 0.00 postgres 12:39:23 26970 1.10 0.20 postgres 12:39:23 26971 17.98 1.80 postgres 12:39:23 27607 0.90 0.40 postgres 12:39:23 29338 4.30 0.00 screen 12:39:23 31247 4.10 23.58 postgres 12:39:23 31249 82.92 34.77 postgres 12:39:23 31484 0.20 0.00 pdflush 12:39:23 32097 0.10 0.00 pidstat Looks like some postgresql tasks are doing 10 context swiches per second, but it doesn't all sum up to 20k anyway. Any idea how to dig a little deeper for an answer?

Read the article
Changing memory allocator to Jemalloc Centos 6

- by Brian Lovett

After reading this blog post about the impact of memory allocators like jemalloc on highly threaded applications, I wanted to test things on a larger scale on some of our cluster of servers. We run sphinx, and apache using threads, and on 24 core machines. Installing jemalloc was simple enough. We are running Centos 6, so yum install jemalloc jemalloc-devel did the trick. My question is, how do we change everything on the system over to using jemalloc instead of the default malloc built into Centos. Research pointed me at this as a potential option: LD_PRELOAD=$LD_PRELOAD:/usr/lib64/libjemalloc.so.1 Would this be sufficient to get everything using jemalloc?

Read the article
How to best tune a Linux development machine?

- by cletus

How to best tune a Linux PC for development purposes?

Read the article
Why MySQL sat for 2 minutes doing nothing?

- by Alex R

This was a one-time thing, not reproducible... But I saved the show innodb status output. Can anybody tell what's going on here? The simple insert took almost 3 minutes to complete. | InnoDB | | ===================================== 110201 15:58:10 INNODB MONITOR OUTPUT ===================================== Per second averages calculated from the last 34 seconds ---------- SEMAPHORES ---------- OS WAIT ARRAY INFO: reservation count 11963, signal count 11766 --Thread 1824 has waited at .\btr\btr0cur.c line 443 for 118.00 seconds the sema phore: S-lock on RW-latch at 09D6453C created in file .\buf\buf0buf.c line 550 a writer (thread id 1824) has reserved it in mode wait exclusive number of readers 1, waiters flag 1 Last time read locked in file .\buf\buf0flu.c line 599 Last time write locked in file .\btr\btr0cur.c line 443 Mutex spin waits 0, rounds 527817, OS waits 7133 RW-shared spins 2532, OS waits 1226; RW-excl spins 1652, OS waits 1118 ------------ TRANSACTIONS ------------ Trx id counter 0 95830 Purge done for trx's n:o < 0 95814 undo n:o < 0 0 History list length 11 LIST OF TRANSACTIONS FOR EACH SESSION: ---TRANSACTION 0 0, not started, OS thread id 3704 MySQL thread id 551, query id 2702112 localhost 127.0.0.1 root show innodb status ---TRANSACTION 0 95829, not started, OS thread id 3132 MySQL thread id 534, query id 2702020 localhost 127.0.0.1 root ---TRANSACTION 0 95828, not started, OS thread id 3152 MySQL thread id 527, query id 2701973 localhost 127.0.0.1 root ---TRANSACTION 0 95827, ACTIVE 118 sec, OS thread id 1824 inserting, thread decl ared inside InnoDB 500 mysql tables in use 1, locked 1 1 lock struct(s), heap size 320, 0 row lock(s) MySQL thread id 526, query id 2701972 localhost 127.0.0.1 root update INSERT INTO log_searchcriteria (userid,search_criteria,date,search_type) VALUES ( NAME_CONST('userid',NULL), NAME_CONST('search_criteria',_latin1' SELECT SQL_C ALC_FOUND_ROWS idx_search.CTCX_LATITUDE, idx_search.CTCX_LONGITUDE, idx_search.b uilding_id, idx_search.LN_LIST_NUMBER, idx_search.LP_LIST_PRICE, idx_search.HSN_ ADRESS_HOUSE_NUMBER, idx_search.STR_ADDRESS_STREET, idx_search.CP_ADDRESS_COMPAS S_POINT, idx_search.UN_UNIT, idx_search.CIT_CITY, idx_search.ZP_ZIP_CODE, idx_se arch.AR_AREA_NAME, idx_search.BR_BEDROOMS, idx_search.BTH_BATHS, idx_search.ST_S TATUS, idx_search.CTCX_STYLE_TYPE, idx_s -------- FILE I/O -------- I/O thread 0 state: wait Windows aio (insert buffer thread) I/O thread 1 state: wait Windows aio (log thread) I/O thread 2 state: wait Windows aio (read thread) I/O thread 3 state: wait Windows aio (write thread) Pending normal aio reads: 0, aio writes: 1, ibuf aio reads: 0, log i/o's: 0, sync i/o's: 0 Pending flushes (fsync) log: 0; buffer pool: 0 151006 OS file reads, 120758 OS file writes, 6844 OS fsyncs 0.00 reads/s, 0 avg bytes/read, 0.00 writes/s, 0.00 fsyncs/s ------------------------------------- INSERT BUFFER AND ADAPTIVE HASH INDEX ------------------------------------- Ibuf: size 1, free list len 5, seg size 7, 24664 inserts, 24664 merged recs, 4612 merges Hash table size 553253, node heap has 629 buffer(s) 0.00 hash searches/s, 0.00 non-hash searches/s --- LOG --- Log sequence number 5 2318193115 Log flushed up to 5 2318193115 Last checkpoint at 5 2318129891 0 pending log writes, 0 pending chkp writes 3036 log i/o's done, 0.00 log i/o's/second ---------------------- BUFFER POOL AND MEMORY ---------------------- Total memory allocated 213459462; in additional pool allocated 1720192 Dictionary memory allocated 240416 Buffer pool size 8192 Free buffers 0 Database pages 7563 Modified db pages 18 Pending reads 0 Pending writes: LRU 0, flush list 18, single page 0 Pages read 150973, created 28788, written 115137 0.00 reads/s, 0.00 creates/s, 0.00 writes/s No buffer pool page gets since the last printout -------------- ROW OPERATIONS -------------- 1 queries inside InnoDB, 0 queries in queue 1 read views open inside InnoDB Main thread id 2992, state: flushing buffer pool pages Number of rows inserted 794294, updated 89203, deleted 13698, read 1453084305 0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s ---------------------------- END OF INNODB MONITOR OUTPUT ============================ Thanks

Read the article
How to check for bottlenecks in Windows Server 2008 R2

- by Phil Koury

I recently switched out a 10 year old server for a brand new server in a small office and upgraded from Windows Server 2000 to Windows Server 2008 R2. After the switch was complete and some configurations were changed around we are running into what appear to be some bottlenecks in the network speed. Accessing programs on the server is slower (resulting in long loading times, slower report generation, etc.) than it was on the old server hardware. I am wondering what options or tools I have, if there are any at all, to find out exactly where these hang ups might be coming from.

Read the article
Benchmarking Java programs

- by stefan-ock

For university, I perform bytecode modifications and analyze their influence on performance of Java programs. Therefore, I need Java programs---in best case used in production---and appropriate benchmarks. For instance, I already got HyperSQL and measure its performance by the benchmark program PolePosition. The Java programs running on a JVM without JIT compiler. Thanks for your help! P.S.: I cannot use programs to benchmark the performance of the JVM or of the Java language itself (such as Wide Finder).

Read the article
How to know if my nginx is in good health?

- by Howard

I am running a nginx on EC2 (m1.small) for SSL termination. I am using 2 workers on Ubuntu, with latest nginx (stable), the network throughput is around 2Mbps and system load average is around 2 to 3. I am wondering if this system is in good health for now, e.g. what is the queue length (I know nginx can handle a lot of concurrent request, but I mean before the request is being served, how many of them need to wait before being served) what is the average queue time for a given request to be served. I want to know because if my nginx is cpu bounded (e.g. due to SSL), I will need to upgrade to a faster instance. My current nginx status Active connections: 4076 server accepts handled requests 90664283 90664283 104117012 Reading: 525 Writing: 81 Waiting: 3470

Read the article
What perfmon counters should I track for improvements to the query plan cache?

- by Bill Paetzke

I am parameterizing my web app's ad hoc sql. As a result, I expect the query plan cache to reduce in size and have a higher hit ratio. What should I track in perfmon regarding this? How could I report on this change? (I'd also accept an answer that queries the DMV tables or whatever else to report on the change).

Read the article
Can compressing Program Files save space *and* give a significant boost to SSD performance?

- by Christopher Galpin

Considering solid-state disk space is still an expensive resource, compressing large folders has appeal. Thanks to VirtualStore, could Program Files be a case where it might even improve performance? Discovery In particular I have been reading: SSD and NTFS Compression Speed Increase? Does NTFS compression slow SSD/flash performance? Will somebody benchmark whole disk compression (HD,SSD) please? (may have to scroll up) The first link is particularly dreamy, but maybe head a little too far in the clouds. The third link has this sexy semi-log graph (logarithmic scale!). Quote (with notes): Using highly compressable data (IOmeter), you get at most a 30x performance increase [for reads], and at least a 49x performance DECREASE [for writes]. Assuming I interpreted and clarified that sentence correctly, this single user's benchmark has me incredibly interested. Although write performance tanks wretchedly, read performance still soars. It gave me an idea. Idea: VirtualStore It so happens that thanks to sanity saving security features introduced in Windows Vista, write access to certain folders such as Program Files is virtualized for non-administrator processes. Which means, in normal (non-elevated) usage, a program or game's attempt to write data to its install location in Program Files (which is perhaps a poor location) is redirected to %UserProfile%\AppData\Local\VirtualStore, somewhere entirely different. Thus, to my understanding, writes to Program Files should primarily only occur when installing an application. This makes compressing it not only a huge source of space gain, but also a potential candidate for performance gain. Testing The beginning of this post has me a bit timid, it suggests benchmarking NTFS compression on a whole drive is difficult because turning it off "doesn't decompress the objects". However it seems to me the compact command is perfectly capable of doing so for both drives and individual folders. Could it be only marking them for decompression the next time the OS reads from them? I need to find the answer before I begin my own testing.

Read the article
SSIS Lookup component tuning tips

- by jamiet

Yesterday evening I attended a London meeting of the UK SQL Server User Group at Microsoft’s offices in London Victoria. As usual it was both a fun and informative evening and in particular there seemed to be a few questions arising about tuning the SSIS Lookup component; I rattled off some comments and figured it would be prudent to drop some of them into a dedicated blog post, hence the one you are reading right now. Scene setting A popular pattern in SSIS is to use a Lookup component to determine whether a record in the pipeline already exists in the intended destination table or not and I cover this pattern in my 2006 blog post Checking if a row exists and if it does, has it changed? (note to self: must rewrite that blog post for SSIS2008). Fundamentally the SSIS lookup component (when using FullCache option) sucks some data out of a database and holds it in memory so that it can be compared to data in the pipeline. One of the big benefits of using SSIS dataflows is that they process data one buffer at a time; that means that not all of the data from your source exists in the dataflow at the same time and is why a SSIS dataflow can process data volumes that far exceed the available memory. However, that only applies to data in the pipeline; for reasons that are hopefully obvious ALL of the data in the lookup set must exist in the memory cache for the duration of the dataflow’s execution which means that any memory used by the lookup cache will not be available to be used as a pipeline buffer. Moreover, there’s an obvious correlation between the amount of data in the lookup cache and the time it takes to charge that cache; the more data you have then the longer it will take to charge and the longer you have to wait until the dataflow actually starts to do anything. For these reasons your goal is simple: ensure that the lookup cache contains as little data as possible. General tips Here is a simple tick list you can follow in order to tune your lookups: Use a SQL statement to charge your cache, don’t just pick a table from the dropdown list made available to you. (Read why in SELECT *... or select from a dropdown in an OLE DB Source component?) Only pick the columns that you need, ignore everything else Make the database columns that your cache is populated from as narrow as possible. If a column is defined as VARCHAR(20) then SSIS will allocate 20 bytes for every value in that column – that is a big waste if the actual values are significantly less than 20 characters in length. Do you need DT_WSTR typed columns or will DT_STR suffice? DT_WSTR uses twice the amount of space to hold values that can be stored using a DT_STR so if you can use DT_STR, consider doing so. Same principle goes for the numerical datatypes DT_I2/DT_I4/DT_I8. Only populate the cache with data that you KNOW you will need. In other words, think about your WHERE clause! Thinking outside the box It is tempting to build a large monolithic dataflow that does many things, one of which is a Lookup. Often though you can make better use of your available resources by, well, mixing things up a little and here are a few ideas to get your creative juices flowing: There is no rule that says everything has to happen in a single dataflow. If you have some particularly resource intensive lookups then consider putting that lookup into a dataflow all of its own and using raw files to pass the pipeline data in and out of that dataflow. Know your data. If you think, for example, that the majority of your incoming rows will match with only a small subset of your lookup data then consider chaining multiple lookup components together; the first would use a FullCache containing that data subset and the remaining data that doesn’t find a match could be passed to a second lookup that perhaps uses a NoCache lookup thus negating the need to pull all of that least-used lookup data into memory. Do you need to process all of your incoming data all at once? If you can process different partitions of your data separately then you can partition your lookup cache as well. For example, if you are using a lookup to convert a location into a [LocationId] then why not process your data one region at a time? This will mean your lookup cache only has to contain data for the location that you are currently processing and with the ability of the Lookup in SSIS2008 and beyond to charge the cache using a dynamically built SQL statement you’ll be able to achieve it using the same dataflow and simply loop over it using a ForEach loop. Taking the previous data partitioning idea further … a dataflow can contain more than one data path so why not split your data using a conditional split component and, again, charge your lookup caches with only the data that they need for that partition. Lookups have two uses: to (1) find a matching row from the lookup set and (2) put attributes from that matching row into the pipeline. Ask yourself, do you need to do these two things at the same time? After all once you have the key column(s) from your lookup set then you can use that key to get the rest of attributes further downstream, perhaps even in another dataflow. Are you using the same lookup data set multiple times? If so, consider the file caching option in SSIS 2008 and beyond. Above all, experiment and be creative with different combinations. You may be surprised at what works. Final thoughts If you want to know more about how the Lookup component differs in SSIS2008 from SSIS2005 then I have a dedicated blog post about that at Lookup component gets a makeover. I am on a mini-crusade at the moment to get a BULK MERGE feature into the database engine, the thinking being that if the database engine can quickly merge massive amounts of data in a similar manner to how it can insert massive amounts using BULK INSERT then that’s a lot of work that wouldn’t have to be done in the SSIS pipeline. If you think that is a good idea then go and vote for BULK MERGE on Connect. If you have any other tips to share then please stick them in the comments. Hope this helps! @Jamiet Share this post: email it! | bookmark it! | digg it! | reddit! | kick it! | live it!

Read the article
Performance question: Inverting an array of pointers in-place vs array of values

- by Anders

The background for asking this question is that I am solving a linearized equation system (Ax=b), where A is a matrix (typically of dimension less than 100x100) and x and b are vectors. I am using a direct method, meaning that I first invert A, then find the solution by x=A^(-1)b. This step is repated in an iterative process until convergence. The way I'm doing it now, using a matrix library (MTL4): For every iteration I copy all coeffiecients of A (values) in to the matrix object, then invert. This the easiest and safest option. Using an array of pointers instead: For my particular case, the coefficients of A happen to be updated between each iteration. These coefficients are stored in different variables (some are arrays, some are not). Would there be a potential for performance gain if I set up A as an array containing pointers to these coefficient variables, then inverting A in-place? The nice thing about the last option is that once I have set up the pointers in A before the first iteration, I would not need to copy any values between successive iterations. The values which are pointed to in A would automatically be updated between iterations. So the performance question boils down to this, as I see it: - The matrix inversion process takes roughly the same amount of time, assuming de-referencing of pointers is non-expensive. - The array of pointers does not need the extra memory for matrix A containing values. - The array of pointers option does not have to copy all NxN values of A between each iteration. - The values that are pointed to the array of pointers option are generally NOT ordered in memory. Hopefully, all values lie relatively close in memory, but *A[0][1] is generally not next to *A[0][0] etc. Any comments to this? Will the last remark affect performance negatively, thus weighing up for the positive performance effects?

Read the article
Performance impact: What is the optimal payload for SqlBulkCopy.WriteToServer()?

- by Linchi Shea

For many years, I have been using a C# program to generate the TPC-C compliant data for testing. The program relies on the SqlBulkCopy class to load the data generated by the program into the SQL Server tables. In general, the performance of this C# data loader is satisfactory. Lately however, I found myself in a situation where I needed to generate a much larger amount of data than I typically do and the data needed to be loaded within a confined time frame. So I was driven to look into the code...(read more)

Read the article

Search Results

Search found 13249 results on 530 pages for 'performance tuning'.

Page 20/530 | < Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >

- by Márk Vincze

- by bfallik-bamboom

- by nextgenneo

- by Laurence

- by EyeonTech

- by pramodc84

- by utt73

- by Sergey

- by Terence Ponce

- by melvincv

- by James Michael Hare

- by shyam

- by Joe

- by Gaks

- by Brian Lovett

- by cletus

- by Alex R

- by Phil Koury

- by stefan-ock

- by Howard

- by Bill Paetzke

- by Christopher Galpin

- by jamiet

- by Anders

- by Linchi Shea

< Previous Page | 16 17 18 19 20 21 22 23 24 25 26 27 | Next Page >