Search Results

Search found 22447 results on 898 pages for 'cpu load'.


  • Predictive vs Least Connection Load Balancing Techniques

    - by Mani
    I have a Windows-based desktop application that communicates via TCP to the application servers (Windows 2003). No sticky sessions between client calls. We have exactly two servers to load balance, and we are thinking of using an F5 hardware NLB. The application is a heavy-load type: not much business logic in the services, but it retrieves quite a large amount of data most of the time, maybe 5,000 to 10,000 records on average. It is used mainly for storing and retrieving data, with no special processing of data or calculations running on the server side. I am favouring 'predictive', since my services sometimes take a while to return data, and tracking that feedback (as predictive does) should yield better routing. I am not sure the given data is sufficient to suggest ideas, but considering all this, what would be some suggestions/things to consider/the better choice between Predictive and Least Connections? Thanks.

    Read the article

  • SQLServer 2008 FailOver and Load Balancing

    - by Jedi Master Spooky
    I have a project with a 2TB database (450,000,000 rows). I need to provide the project with a solution that gives failover and load balancing; what do you recommend? We are going to use a NetApp filer for the data files and for the file system of the project. I read that SQL clustering does not provide load balancing. If I cannot have this feature and have to go with failover only, what server would you recommend (I presume the key feature here is memory)? We are adding 1,000,000 rows a day. Once a row is inserted we do a lot of updates to it for about a week, then the row goes static. Because of this I am thinking of some kind of history table or database or something like that. I am open on the OS/server implementation; I was thinking of a Windows 2008 server with clustering, but this depends on the database solution.
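
    If the history-table route looks right, a nightly job along these lines is the usual sketch; the table and column names here (Orders, OrdersHistory, ModifiedDate) are illustrative, not from the question:

        -- Age out rows that have gone static (no updates for ~1 week).
        -- In production you would batch this and wrap it in a transaction.
        INSERT INTO dbo.OrdersHistory
        SELECT * FROM dbo.Orders
        WHERE ModifiedDate < DATEADD(DAY, -7, GETDATE());

        DELETE FROM dbo.Orders
        WHERE ModifiedDate < DATEADD(DAY, -7, GETDATE());

    Keeping the hot table small this way also keeps the working set in memory, which matters more than raw CPU for an insert-then-update workload like this.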

    Read the article

  • ECMP Load Balancing in JUNOS

    - by SpacemanSpiff
    I'm trying to figure out how to use ECMP load balancing in JUNOS. I know this isn't the best way to load balance, but it's quick and dirty and gets done what I need to. In ScreenOS this was pretty easy. Device: SRX220. JunOS: 10.3R2.11. Here's what I've got so far:

        routing-options {
            static {
                route 0.0.0.0/0 {
                    next-hop [ 1.1.1.1 1.1.1.2 ];
                    metric 10;
                }
            }
            maximum-paths 2;
        }

    Will that do it? Tom
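
    One thing worth checking: a static route with two next-hops is usually not enough by itself, because by default JUNOS installs a single next-hop for forwarding. You normally also export a load-balancing policy to the forwarding table. A sketch, where the policy name LB-PER-FLOW is made up (and 'per-packet' is the historical keyword; on SRX it actually balances per flow):

        policy-options {
            policy-statement LB-PER-FLOW {
                then {
                    load-balance per-packet;
                }
            }
        }
        routing-options {
            forwarding-table {
                export LB-PER-FLOW;
            }
        }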

    Read the article

  • Server load is spiking from 2 to 250

    - by Hakzona
    Hello, I'm using Wordpress 2.9 on an 8GB web server from Hostgator. I've been fighting this problem for a long time but still cannot find the solution. PHP was switched to run as an Apache module (PHP 5 in DSO), with Apache suEXEC and eAccelerator installed, but this configuration started producing huge load on the server. The load spikes from 1 to 250 (4 CPUs) and the server stops; after a period of time it's back again, then stops in about 10 minutes. It started happening when the Hostgator support team installed eAccelerator on the server. What could cause this problem, and how can I fix it?

    Read the article

  • How to create a basic load balancer?

    - by Ilya Rusanen
    Hi guys. I'd like to deploy my app on two different servers, located in the US and Germany. As I understand it, I need to set up some kind of load balancer that would determine which country my user is in, and resolve them to the US or Germany server. The general aim is to give the user the ability to work with the closest server (a CDN is not a solution, because we don't serve shared static content). Where should I place the load balancer that resolves users to the USA/GER servers? In the USA? In Germany? What should it look like? A usual server with some specific app, or what? Thank you.
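
    One common approach that avoids a separate balancer box entirely is GeoDNS: have the authoritative DNS answer with the US server's address for US clients and the German server's for everyone else. A rough sketch using BIND views (this assumes a BIND build with GeoIP support, roughly 9.10+, and one zone file per region; all names are illustrative):

        view "usa" {
            match-clients { geoip country US; };
            zone "example.com" { type master; file "db.example.com.us"; };   # A record points at the US server
        };
        view "default" {
            match-clients { any; };
            zone "example.com" { type master; file "db.example.com.de"; };   # A record points at the German server
        };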

    Read the article

  • Apache SSL losing session over load balancer

    - by SaltyNuts
    I have two physical Apache servers behind a load balancer. The load balancer was supposed to be set up so that a user would always be sent to the same physical server after the first request, to preserve sessions. This worked fine for our web apps until we added SSL to the setup. Now the user can successfully login, see the home page, but clicking on any other internal links logs the user right out. I traced the issue to the fact that while initial authentication is performed by server 1, clicking on internal links leads to having the request sent to server 2. Server 2 does not share sessions with server 1, and the user is kicked out. How can I fix it? Do I need to share sessions between the two servers? If so, could you point me to a good guide for doing this? Thanks.
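
    If sharing sessions turns out to be simpler than fixing balancer persistence, the usual approach is to move session storage off each server's local disk into a store both servers can reach, such as memcached. Assuming the apps are PHP with the memcache extension (an assumption, not stated in the question), the change is two php.ini lines on both web servers; the address is a placeholder:

        ; store sessions in one shared memcached instead of local files
        session.save_handler = memcache
        session.save_path    = "tcp://192.168.0.50:11211"

    With shared sessions, it no longer matters which server the balancer picks for each request.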

    Read the article

  • How to set the RpcClientAccessServer for an Exchange 2010 mailbox database to a load balancer

    - by Archit Baweja
    I have two Exchange 2010 servers, each with a mailbox database. I have also set up a hardware load balancer (a KEMP LoadMaster 2200, to be precise) to load balance the CAS role access. My HLB has an IP of 192.168.1.100. I've set up the DNS A record for mail.mydomain.com to point to 192.168.1.100. However, when I try to set the RpcClientAccessServer on a mailbox database using

        Set-MailboxDatabase "My Mailbox Database" -RpcClientAccessServer mail.mydomain.com

    I get an error saying:

        Exchange server "mail.mydomain.com" was not found. Please make sure you have typed it correctly.
            + CategoryInfo : NotSpecified: (:) [], ManagementObjectNotFoundException
            + FullyQualifiedErrorId : 4082394C

    Any ideas?
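
    A likely cause: -RpcClientAccessServer must resolve to a Client Access server or a Client Access array object, not an arbitrary A record. A hedged sketch of creating an array for the load-balanced FQDN first (the array and site names are illustrative):

        New-ClientAccessArray -Name "CAS-Array" -Fqdn mail.mydomain.com -Site "Default-First-Site-Name"
        Set-MailboxDatabase "My Mailbox Database" -RpcClientAccessServer mail.mydomain.com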

    Read the article

  • Maximizing TCP connections on HAProxy load balancer

    - by imaginative
    I am currently using HAProxy in order to load balance tcp connections from clients to my Erlang app server. The connection is persistent, which means I'm limited to roughly 64K clients on an optimized server (I'm currently running HAProxy on an m1.large EC2 instance). My app server is designed to horizontally scale based on the number of TCP connections. What's worrying me though is I'll need an equal number of HAProxy servers as app servers since it's a 1:1 connection. Is there currently a way to "proxy" the tcp connection to the app server so that once HAProxy sends the client off to my Erlang server, it can free up the connection, ready to serve another client? Are there any papers, existing solutions out there I can read so that I only have to worry about the 64K limit on my app servers, and not on the load balancing servers themselves?
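
    For what it's worth, the ~64K ceiling applies per (source IP, destination IP, destination port) tuple on HAProxy's outbound side; inbound client connections are bounded by file descriptors and memory rather than ports. So one way to stretch a single balancer is to give it extra source addresses for backend connections. A sketch (addresses illustrative):

        backend erlang_nodes
            mode tcp
            balance roundrobin
            # same backend reached from two source IPs: roughly doubles the
            # usable ephemeral-port space towards that node
            server app1a 10.0.1.10:5000 source 10.0.0.10
            server app1b 10.0.1.10:5000 source 10.0.0.11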

    Read the article

  • What's going on with my server? High load, lots of idle CPU time, low disk utilization

    - by Jonathan
    I run a web site and send a legitimate opt-in, daily email newsletter to subscribers. Both the web hosting and email sending are done by the same machine. I have about 100,000 subscribers who have opted in to my daily email newsletter. My PHP script did a pretty good job sending mail to all of them until fairly recently, but as the list has grown I can't keep up. When I run top, I have very high load, usually at least 6 or 7, sometimes as high as 15, even though I only have two CPUs. However, when I run sar, my CPU is idle an average of about 30% of the time. So, it seems I'm not CPU bound. When I run iostat, it seems as though I'm not disk bound either, because my %util for each device is very low (no more than 5%). Given that I don't seem to be CPU bound or disk bound, why is top reporting such high load? And why is my email sending script not able to keep up?

    Here's what I see when running top:

        top - 11:33:28 up 74 days, 18:49, 2 users, load average: 7.65, 8.79, 8.28
        Tasks: 168 total, 5 running, 162 sleeping, 0 stopped, 1 zombie
        Cpu(s): 38.9%us, 58.6%sy, 0.8%ni, 0.0%id, 0.7%wa, 0.2%hi, 0.8%si, 0.0%st
        Mem: 3083012k total, 2144436k used, 938576k free, 281136k buffers
        Swap: 2048248k total, 39164k used, 2009084k free, 1470412k cached

    Here's what I see when running iostat -mx:

        avg-cpu: %user %nice %system %iowait %steal %idle
                 34.80 1.20 55.24 0.37 0.00 8.38

        Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await svctm %util
        sda 0.19 71.70 1.59 29.45 0.02 0.07 5.90 0.55 17.82 1.16 3.59
        sda1 0.00 0.00 0.00 0.00 0.00 0.00 7.10 0.00 13.80 13.72 0.00
        sda2 0.05 50.45 1.13 24.57 0.01 0.29 24.25 0.35 13.43 1.15 2.97
        sda3 0.05 10.17 0.20 2.33 0.01 0.05 43.75 0.05 20.96 2.45 0.62
        sda4 0.00 0.00 0.00 0.00 0.00 0.00 2.00 0.00 70.50 70.50 0.00
        sda5 0.07 0.22 0.03 0.07 0.00 0.00 32.84 0.08 856.19 8.03 0.08
        sda6 0.02 5.45 0.03 0.72 0.00 0.02 67.55 0.02 26.72 5.26 0.39
        sda7 0.00 1.56 0.00 0.42 0.00 0.01 38.04 0.00 8.88 5.84 0.24
        sda8 0.01 3.84 0.20 1.35 0.00 0.02 28.55 0.05 31.90 4.08 0.63

    Here's what I see when running sar:

        09:40:02 AM CPU %user %nice %system %iowait %steal %idle
        09:50:01 AM all 30.59 1.01 49.80 0.23 0.00 18.37
        10:00:08 AM all 31.73 0.92 51.66 0.13 0.00 15.55
        10:10:06 AM all 30.43 0.99 48.94 0.26 0.00 19.38
        10:20:01 AM all 29.58 1.00 47.76 0.25 0.00 21.42
        10:30:01 AM all 29.37 1.02 47.30 0.18 0.00 22.13
        10:40:06 AM all 32.50 1.01 52.94 0.16 0.00 13.39
        10:50:01 AM all 30.49 1.00 49.59 0.15 0.00 18.77
        11:00:01 AM all 29.43 0.99 47.71 0.17 0.00 21.71
        11:10:07 AM all 30.26 0.93 49.48 0.83 0.00 18.50
        11:20:02 AM all 29.83 0.81 48.51 1.32 0.00 19.52
        11:30:06 AM all 31.18 0.88 51.33 1.15 0.00 15.47
        Average: all 26.21 1.15 42.62 0.48 0.00 29.54

    Here are the top handful of processes listed at the particular time I happened to run top -c:

        PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        8180 mysql 16 0 57448 19m 2948 S 26.6 0.7 4702:26 /usr/sbin/mysqld --basedir=/ --datadir=/var/lib/mysql --user=mysql --pid-file=/var/lib/mysql/bristno.pid --skip-external-locking
        26956 brristno 17 0 0 0 0 Z 8.0 0.0 0:00.24 [php] <defunct>
        26958 brristno 17 0 94408 43m 37m R 5.0 1.4 0:00.15 /usr/bin/php /home/brristno/public_html/dbv.php
        22852 nobody 16 0 9628 2900 1524 S 0.7 0.1 0:00.17 /usr/local/apache/bin/httpd -k start -DSSL
        8591 brristno 34 19 96896 13m 6652 S 0.3 0.4 0:29.82 /usr/local/bin/php /home/brristno/bin/mailer.php 1qwqyb6 i0gbor
        24469 nobody 16 0 9628 2880 1508 S 0.3 0.1 0:00.08 /usr/local/apache/bin/httpd -k start -DSSL
        25495 nobody 15 0 9628 2876 1500 S 0.3 0.1 0:00.06 /usr/local/apache/bin/httpd -k start -DSSL
        26149 nobody 15 0 9628 2864 1504 S 0.3 0.1 0:00.04 /usr/local/apache/bin/httpd -k start -DSSL

    Read the article

  • Increasing load capacity for growing website

    - by markxi
    My website currently runs on a dedicated web server (with LiteSpeed) and a dedicated MySQL database server. It's a download-based site with a lot of user-generated content, which can be streamed and downloaded; there are also thousands of thumbnails and much static content. I'm at the stage where the web server can no longer handle the amount of traffic, so I'm looking at how best to increase capacity considering the large amount of downloadable content. My host suggests mirroring everything on a second web server and distributing the load between them using either DNS Made Easy, or my own load balancer (using ldirectord) in front of the two web servers. Could anyone advise whether the above method would be the best option? Does anyone have any experience with DNS Made Easy and/or ldirectord? I'd appreciate any help.

    Read the article

  • Load balancing + NAT issue on BNT GBE 2-7 gear

    - by Clément Game
    Hi guys, I'm having trouble configuring a hardware load balancer with NAT functions. I have the following architecture:

        Internet === VIP (public) LB (private IP) ==== privately addressed servers

    When a connection is initialised from the outside (the internet), the LB correctly forwards the SYN packet to one of the private servers. But when these servers want to reply with a SYN/ACK there is a problem: the initial SYN packet arrived with an IP header of VIP => Private_server_Address, but the private servers cannot reach the VIP from their side (this is normal, since it's NATed), and so cannot produce a correct reply. Do you guys have any solution to forward the packets to their correct destination? Note: the load balancer, which is the default gateway for the servers, also has a NAT rule for "masquerading" (actually more SNAT than real masquerading). Regards, Clément.

    Read the article

  • First request too slow even if I have a load balancer in the back

    - by adrian7
    I have Apache 2 on CentOS + BIND with a Wordpress website on it (e.g. example.com). I have also set up, on another server in a different country, a load balancer (varnish:80 + nginx 127.0.0.1:8080) for it, whose task is to serve all static content under /wp-content/. Using Simple DNS editor I added an A entry for cdn.example.com pointing to that server's IP, so there is no extra work from a second DNS server. Then using htaccess I redirect all requests for jpg|gif|css|js files to cdn.example.com. That works, and all files are saved on the "cdn" server and served right away. My problem is that the first time I visit example.com (e.g. after restarting the computer or closing the browser), the load time is 1 to 3 seconds, while any subsequent page loads take only 300 to 600 milliseconds. I know it might be a DNS issue, but I have done a cache check on several websites and cdn.example.com resolves to the right IP. Do you have any ideas where I should dig to solve this first-time slowness?

    Read the article

  • Amazon EC2 Web server in Load Balancer gives 503

    - by dale
    We've been running our web servers at Amazon with a load balancer and auto-scaling for over a year with no problem. All of a sudden today, requests began to be aborted with the error:

        503 ... Backend server is at capacity

    The web servers are at 1% CPU and no other alarms trigger. We use Amazon's load balancer and nginx. Lots of requests like this are showing up in the access_log:

        10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
        10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
        10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
        10.246.114.93 - - [05/Jun/2014:20:16:09 +0000] "-" 400 0 "-" "-"
        10.246.114.93 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
        10.246.114.93 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
        10.246.114.93 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
        10.229.15.214 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"
        10.229.15.214 - - [05/Jun/2014:20:16:10 +0000] "-" 400 0 "-" "-"

    Any thoughts?

    Read the article

  • PostgreSQL 9.0 HA load balancing between servers

    - by Vijay Ramachandran
    Hey folks, I'm bashing my head trying to configure load balancing between two database servers, and I have no clue whether I can find a mechanism to implement this. I already tried to implement Heartbeat clustering, but it requires a virtual IP, and I can't create a virtual IP or assign my own IP address in Amazon EC2. Is there a way to configure PostgreSQL database servers with something similar to Amazon's load balancing? If so, please suggest the solution. Thanks in advance.
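
    One option that needs no floating virtual IP is pgpool-II in front of the two servers, spreading SELECTs across both while writes go to the primary. A pgpool.conf sketch (the hostnames are placeholders):

        # pgpool.conf: two backends, read load balancing enabled
        load_balance_mode = on
        backend_hostname0 = 'db1.internal'
        backend_port0     = 5432
        backend_weight0   = 1
        backend_hostname1 = 'db2.internal'
        backend_port1     = 5432
        backend_weight1   = 1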

    Read the article

  • PHP and load balancing

    - by StCee
    I have one major domain, but the server spec behind it is not good enough. Hence I want to relay the traffic, in particular PHP/MySQL queries, to multiple smaller servers. How is that normally done? (BTW, I wonder how much traffic, or how many PHP/MySQL requests, a normal setup on an EC2 micro instance can handle?) I did have a look at the EC2 load balancer, but is it only possible to load balance across machines in your own account?
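
    A common application-level answer is read/write splitting: send writes to one primary and spread reads across replicas. A minimal PHP sketch of the idea (hosts and credentials are placeholders):

        <?php
        // Pick a random replica for reads; writes always go to the primary.
        $primary  = 'db-primary.internal';
        $replicas = ['db-replica1.internal', 'db-replica2.internal'];

        function connect(string $host): mysqli {
            return new mysqli($host, 'appuser', 'secret', 'appdb');
        }

        $read  = connect($replicas[array_rand($replicas)]);
        $write = connect($primary);

        $posts = $read->query('SELECT id, title FROM posts LIMIT 10'); // scales out
        $write->query("INSERT INTO hits (page) VALUES ('home')");      // stays on one box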

    Read the article

  • Load balanced proxies to avoid an API request limit

    - by ClickClickClick
    There is a certain API out there which limits the number of requests per day per IP. My plan is to create a bunch of EC2 instances with elastic IPs to sidestep the limitation. I'm familiar with EC2 and am just interested in the configuration of the proxies and a software load balancer. I think I want to run a simple TCP proxy on each instance and a software load balancer on the machine I will be requesting from: something that allows the following to return a response from a different IP each time (round robin, availability, doesn't really matter):

        curl http://www.bbc.co.uk -x http://myproxyloadbalancer:port

    Could anyone recommend a combination of software, or even a link to an article that details a pleasing way to pull it off? (My client won't be curl but is proxy aware; I'll be making the requests from a Ruby script.)
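
    Since the client is proxy-aware, the balancer itself can be a plain TCP round-robin in front of the EC2 proxy instances; HAProxy in tcp mode is a natural fit. A sketch (ports and addresses illustrative):

        # haproxy.cfg: local entry point fanning out to the remote proxies
        frontend proxy_pool
            bind *:8888
            mode tcp
            default_backend ec2_proxies

        backend ec2_proxies
            mode tcp
            balance roundrobin
            server proxy1 203.0.113.10:3128 check
            server proxy2 203.0.113.11:3128 check

    The curl line then becomes: curl http://www.bbc.co.uk -x http://localhost:8888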

    Read the article

  • What is the difference between "load" and "fetch"?

    - by DragonLord
    I often encounter the words load and fetch in contexts where data are being read from some source, and they seem to have slightly different meanings. What's the difference? I've done some research and couldn't find any specific technical difference in general usage. While the term fetch can refer to one stage in CPU instruction execution, I've seen it used in contexts not related to CPUs, and I'm looking for an answer that is not specific to CPUs.

    Read the article

  • Why doesn't jquery .load() load a text file from an external website?

    - by Edward Tanguay
    In the example below, when I click the button, it says "Load was performed" but no text is shown. I have a clientaccesspolicy.xml in the root directory and am able to asynchronously load the same file from Silverlight, so I would think I should be able to access it from AJAX as well. What do I have to change so that the text of the file http://www.tanguay.info/knowsite/data.txt is properly displayed in the #content element?

        <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
        <html xmlns="http://www.w3.org/1999/xhtml">
        <head>
            <script type="text/javascript" src="http://www.google.com/jsapi"></script>
            <script type="text/javascript">
                google.load("jquery", "1.3.2");
                google.setOnLoadCallback(function() {
                    $('#loadButton').click(loadDataFromExernalWebsite);
                });
                function loadDataFromExernalWebsite() {
                    $('#content').load('http://www.tanguay.info/knowsite/data.txt', function() {
                        alert('Load was performed.');
                    });
                }
            </script>
        </head>
        <body>
            <p>Click the button to load content:</p>
            <p id="content"></p>
            <input id="loadButton" type="button" value="load content"/>
        </body>
        </html>
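
    The behaviour described is exactly what the browser's same-origin policy produces: XMLHttpRequest (which jQuery's .load() uses) cannot read a response from another domain, and clientaccesspolicy.xml only applies to Silverlight/Flash, not to AJAX. The standard workaround is a small same-domain proxy; a sketch in PHP (the file name proxy.php is illustrative):

        <?php
        // proxy.php: the browser talks only to our own domain;
        // the server fetches the remote file on its behalf.
        header('Content-Type: text/plain; charset=utf-8');
        echo file_get_contents('http://www.tanguay.info/knowsite/data.txt');

    and then in the page: $('#content').load('proxy.php');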

    Read the article

  • Approximate Number of CPU Cycles for Various Operations

    - by colordot
    I am trying to find a reference for approximately how many CPU cycles various operations require. I don't need exact numbers (as this will vary between CPUs), but I'd like something relatively credible that gives ballpark figures I could cite in discussion with friends. As an example, we all know that floating-point division takes more CPU cycles than, say, a bitshift. I'd guess the difference is that division is around 100 cycles, whereas a shift is 1, but I'm looking for something to cite to back that up. Can anyone recommend such a resource?

    Read the article

  • Quick CPU ring mode protection question

    - by b-gen-jack-o-neill
    Hi, me again :) I am very curious about messing with HW, but my top level of "messing" so far has been linked or inline assembler in a C program. If my understanding of CPU ring modes is right, I cannot directly access some low-level CPU features from a user-mode app, like disabling interrupts or changing protected-mode segments, so I must use system calls to do everything I want. But, if I am right, drivers can run in ring 0. I actually don't know much about drivers, but this is what I'm asking: is learning how to write my own drivers, and then calling them, the way I should go to do what I described? I know I could write a whole new OS (at least up to a point), but what I actually want is to access some low-level features of the HW from a standard Windows application. So, is a driver the way to go?

    Read the article

  • high load average, high wait, dmesg raid error messages (debian nfs server)

    - by John Stumbles
    Debian 6 on an HP ProLiant (2 CPUs) with RAID (2*1.5T RAID1 + 2*2T RAID1 joined in RAID0 to make 3.5T), running mainly nfs & imapd (plus samba for a windows share & local www for previewing web pages); a local ubuntu desktop client mounts $HOME, laptops access imap & odd files (e.g. videos) via nfs/smb; boxes are connected 100baseT or wifi via a home router/switch.

        uname -a
        Linux prole 2.6.32-5-686 #1 SMP Wed Jan 11 12:29:30 UTC 2012 i686 GNU/Linux

    The setup has been working for months but is prone to intermittently going very slow (user experience on the desktop mounting $HOME from the server, or a laptop playing videos), and is now consistently so bad I've had to delve into it to try to find what's wrong(!) The server seems OK at low load, e.g. a (laptop) client (with $HOME on local disk) connecting to the server's imapd and nfs-mounting the RAID to access one file: top shows load ~ 0.1 or less, 0 wait. But when the (desktop) client mounts $HOME and starts a user KDE session (all accessing the server), then top shows e.g.

        top - 13:41:17 up 3:43, 3 users, load average: 9.29, 9.55, 8.27
        Tasks: 158 total, 1 running, 157 sleeping, 0 stopped, 0 zombie
        Cpu(s): 0.4%us, 0.4%sy, 0.0%ni, 49.0%id, 49.7%wa, 0.0%hi, 0.5%si, 0.0%st
        Mem: 903856k total, 851784k used, 52072k free, 171152k buffers
        Swap: 0k total, 0k used, 0k free, 476896k cached

        PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
        3935 root 20 0 2456 1088 784 R 2 0.1 0:00.02 top
        1 root 20 0 2028 680 584 S 0 0.1 0:01.14 init
        2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
        3 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/0
        4 root 20 0 0 0 0 S 0 0.0 0:00.12 ksoftirqd/0
        5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0
        6 root RT 0 0 0 0 S 0 0.0 0:00.00 migration/1
        7 root 20 0 0 0 0 S 0 0.0 0:00.16 ksoftirqd/1
        8 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/1
        9 root 20 0 0 0 0 S 0 0.0 0:00.42 events/0
        10 root 20 0 0 0 0 S 0 0.0 0:02.26 events/1
        11 root 20 0 0 0 0 S 0 0.0 0:00.00 cpuset
        12 root 20 0 0 0 0 S 0 0.0 0:00.00 khelper
        13 root 20 0 0 0 0 S 0 0.0 0:00.00 netns
        14 root 20 0 0 0 0 S 0 0.0 0:00.00 async/mgr
        15 root 20 0 0 0 0 S 0 0.0 0:00.00 pm
        16 root 20 0 0 0 0 S 0 0.0 0:00.02 sync_supers
        17 root 20 0 0 0 0 S 0 0.0 0:00.02 bdi-default
        18 root 20 0 0 0 0 S 0 0.0 0:00.00 kintegrityd/0
        19 root 20 0 0 0 0 S 0 0.0 0:00.00 kintegrityd/1
        20 root 20 0 0 0 0 S 0 0.0 0:00.02 kblockd/0
        21 root 20 0 0 0 0 S 0 0.0 0:00.08 kblockd/1
        22 root 20 0 0 0 0 S 0 0.0 0:00.00 kacpid
        23 root 20 0 0 0 0 S 0 0.0 0:00.00 kacpi_notify
        24 root 20 0 0 0 0 S 0 0.0 0:00.00 kacpi_hotplug
        25 root 20 0 0 0 0 S 0 0.0 0:00.00 kseriod
        28 root 20 0 0 0 0 S 0 0.0 0:04.19 kondemand/0
        29 root 20 0 0 0 0 S 0 0.0 0:02.93 kondemand/1
        30 root 20 0 0 0 0 S 0 0.0 0:00.00 khungtaskd
        31 root 20 0 0 0 0 S 0 0.0 0:00.18 kswapd0
        32 root 25 5 0 0 0 S 0 0.0 0:00.00 ksmd
        33 root 20 0 0 0 0 S 0 0.0 0:00.00 aio/0
        34 root 20 0 0 0 0 S 0 0.0 0:00.00 aio/1
        35 root 20 0 0 0 0 S 0 0.0 0:00.00 crypto/0
        36 root 20 0 0 0 0 S 0 0.0 0:00.00 crypto/1
        203 root 20 0 0 0 0 S 0 0.0 0:00.00 ksuspend_usbd
        204 root 20 0 0 0 0 S 0 0.0 0:00.00 khubd
        205 root 20 0 0 0 0 S 0 0.0 0:00.00 ata/0
        206 root 20 0 0 0 0 S 0 0.0 0:00.00 ata/1
        207 root 20 0 0 0 0 S 0 0.0 0:00.14 ata_aux
        208 root 20 0 0 0 0 S 0 0.0 0:00.01 scsi_eh_0

    dmesg suggests there's a disk problem:
        ... (previous episode)
        [13276.966004] raid1:md0: read error corrected (8 sectors at 489900360 on sdc7)
        [13276.966043] raid1: sdb7: redirecting sector 489898312 to another mirror
        [13279.569186] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
        [13279.569211] ata4.00: irq_stat 0x40000008
        [13279.569230] ata4.00: failed command: READ FPDMA QUEUED
        [13279.569257] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
        [13279.569262] res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
        [13279.569306] ata4.00: status: { DRDY ERR }
        [13279.569321] ata4.00: error: { UNC }
        [13279.575362] ata4.00: configured for UDMA/133
        [13279.575388] ata4: EH complete
        [13283.169224] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
        [13283.169246] ata4.00: irq_stat 0x40000008
        [13283.169263] ata4.00: failed command: READ FPDMA QUEUED
        [13283.169289] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
        [13283.169294] res 41/40:00:07:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
        [13283.169331] ata4.00: status: { DRDY ERR }
        [13283.169345] ata4.00: error: { UNC }
        [13283.176071] ata4.00: configured for UDMA/133
        [13283.176104] ata4: EH complete
        [13286.224814] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
        [13286.224837] ata4.00: irq_stat 0x40000008
        [13286.224853] ata4.00: failed command: READ FPDMA QUEUED
        [13286.224879] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
        [13286.224884] res 41/40:00:06:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
        [13286.224922] ata4.00: status: { DRDY ERR }
        [13286.224935] ata4.00: error: { UNC }
        [13286.231277] ata4.00: configured for UDMA/133
        [13286.231303] ata4: EH complete
        [13288.802623] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
        [13288.802646] ata4.00: irq_stat 0x40000008
        [13288.802662] ata4.00: failed command: READ FPDMA QUEUED
        [13288.802688] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
        [13288.802693] res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
        [13288.802731] ata4.00: status: { DRDY ERR }
        [13288.802745] ata4.00: error: { UNC }
        [13288.808901] ata4.00: configured for UDMA/133
        [13288.808927] ata4: EH complete
        [13291.380430] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
        [13291.380453] ata4.00: irq_stat 0x40000008
        [13291.380470] ata4.00: failed command: READ FPDMA QUEUED
        [13291.380496] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
        [13291.380501] res 41/40:00:05:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
        [13291.380577] ata4.00: status: { DRDY ERR }
        [13291.380594] ata4.00: error: { UNC }
        [13291.386517] ata4.00: configured for UDMA/133
        [13291.386543] ata4: EH complete
        [13294.347147] ata4.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0
        [13294.347169] ata4.00: irq_stat 0x40000008
        [13294.347186] ata4.00: failed command: READ FPDMA QUEUED
        [13294.347211] ata4.00: cmd 60/08:00:00:6a:05/00:00:23:00:00/40 tag 0 ncq 4096 in
        [13294.347217] res 41/40:00:06:6a:05/00:00:23:00:00/40 Emask 0x409 (media error) <F>
        [13294.347254] ata4.00: status: { DRDY ERR }
        [13294.347268] ata4.00: error: { UNC }
        [13294.353556] ata4.00: configured for UDMA/133
        [13294.353583] sd 3:0:0:0: [sdc] Unhandled sense code
        [13294.353590] sd 3:0:0:0: [sdc] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
        [13294.353599] sd 3:0:0:0: [sdc] Sense Key : Medium Error [current] [descriptor]
        [13294.353610] Descriptor sense data with sense descriptors (in hex):
        [13294.353616] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
        [13294.353635] 23 05 6a 06
        [13294.353644] sd 3:0:0:0: [sdc] Add. Sense: Unrecovered read error - auto reallocate failed
        [13294.353657] sd 3:0:0:0: [sdc] CDB: Read(10): 28 00 23 05 6a 00 00 00 08 00
        [13294.353675] end_request: I/O error, dev sdc, sector 587557382
        [13294.353726] ata4: EH complete
        [13294.366953] raid1:md0: read error corrected (8 sectors at 489900544 on sdc7)
        [13294.366992] raid1: sdc7: redirecting sector 489898496 to another mirror

    and they're happening quite frequently, which I guess is liable to account for the performance problem(?)

        # dmesg | grep mirror
        [12433.561822] raid1: sdc7: redirecting sector 489900464 to another mirror
        [12449.428933] raid1: sdb7: redirecting sector 489900504 to another mirror
        [12464.807016] raid1: sdb7: redirecting sector 489900512 to another mirror
        [12480.196222] raid1: sdb7: redirecting sector 489900520 to another mirror
        [12495.585413] raid1: sdb7: redirecting sector 489900528 to another mirror
        [12510.974424] raid1: sdb7: redirecting sector 489900536 to another mirror
        [12526.374933] raid1: sdb7: redirecting sector 489900544 to another mirror
        [12542.619938] raid1: sdc7: redirecting sector 489900608 to another mirror
        [12559.431328] raid1: sdc7: redirecting sector 489900616 to another mirror
        [12576.553866] raid1: sdc7: redirecting sector 489900624 to another mirror
        [12592.065265] raid1: sdc7: redirecting sector 489900632 to another mirror
        [12607.621121] raid1: sdc7: redirecting sector 489900640 to another mirror
        [12623.165856] raid1: sdc7: redirecting sector 489900648 to another mirror
        [12638.699474] raid1: sdc7: redirecting sector 489900656 to another mirror
        [12655.610881] raid1: sdc7: redirecting sector 489900664 to another mirror
        [12672.255617] raid1: sdc7: redirecting sector 489900672 to another mirror
        [12672.288746] raid1: sdc7: redirecting sector 489900680 to another mirror
        [12672.332376] raid1: sdc7: redirecting sector 489900688 to another mirror
        [12672.362935] raid1: sdc7: redirecting sector 489900696 to another mirror
        [12674.201177] raid1: sdc7: redirecting sector 489900704 to another mirror
        [12698.045050] raid1: sdc7: redirecting sector 489900712 to another mirror
        [12698.089309] raid1: sdc7: redirecting sector 489900720 to another mirror
        [12698.111999] raid1: sdc7: redirecting sector 489900728 to another mirror
        [12698.134006] raid1: sdc7: redirecting sector 489900736 to another mirror
        [12719.034376] raid1: sdc7: redirecting sector 489900744 to another mirror
        [12734.545775] raid1: sdc7: redirecting sector 489900752 to another mirror
        [12734.590014] raid1: sdc7: redirecting sector 489900760 to another mirror
        [12734.624050] raid1: sdc7: redirecting sector 489900768 to another mirror
        [12734.647308] raid1: sdc7: redirecting sector 489900776 to another mirror
        [12734.664657] raid1: sdc7: redirecting sector 489900784 to another mirror
        [12734.710642] raid1: sdc7: redirecting sector 489900792 to another mirror
        [12734.721919] raid1: sdc7: redirecting sector 489900800 to another mirror
        [12734.744732] raid1: sdc7: redirecting sector 489900808 to another mirror
        [12734.779330] raid1: sdc7: redirecting sector 489900816 to another mirror
        [12782.604564] raid1: sdb7: redirecting sector 1242934216 to another mirror
        [12798.264153] raid1: sdc7: redirecting sector 1242935080 to another mirror
        [13245.832193] raid1: sdb7: redirecting sector 489898296 to another mirror
        [13261.376929] raid1: sdb7: redirecting sector 489898304 to another mirror
        [13276.966043] raid1: sdb7: redirecting sector 489898312 to another mirror
        [13294.366992] raid1: sdc7: redirecting sector 489898496 to another mirror

    although the arrays are still running on all disks - they haven't given up on any yet:

        # cat /proc/mdstat
        Personalities : [raid1] [raid0]
        md10 : active raid0 md0[0] md1[1]
              3368770048 blocks super 1.2 512k chunks
        md1 : active raid1 sde2[2] sdd2[1]
              1464087824 blocks super 1.2 [2/2] [UU]
        md0 : active raid1 sdb7[0] sdc7[2]
              1904684920 blocks super 1.2 [2/2] [UU]
        unused devices: <none>

    So I think I have some idea what the problem is, but I am not a linux sysadmin expert by the remotest stretch of the imagination and would really appreciate some clue-checking of my diagnosis and of what I need to do (see the sketch below):

      - obviously I need to source another drive for sdc (I'm guessing I could buy a larger drive if the price is right: I'm thinking that one day I'll need to grow the size of the array, and that would be one less drive to replace with a larger one)
      - then use mdadm to fail out the existing sdc, remove it and fit the new drive
      - fdisk the new drive with the same size partition for the array as the old one had
      - use mdadm to add the new drive into the array

    Does that sound OK?
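
    For the drive swap itself, the plan sounds right; a hedged sketch of the mdadm steps, assuming the failing member really is sdc7 in md0 (verify device names before running anything):

        # mark the failing half of the mirror as failed, then remove it
        mdadm /dev/md0 --fail /dev/sdc7
        mdadm /dev/md0 --remove /dev/sdc7
        # power down, swap the disk, partition it at least as large as the old sdc7,
        # then add the new partition back; the mirror rebuilds automatically
        mdadm /dev/md0 --add /dev/sdc7
        # watch the resync
        watch cat /proc/mdstat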

    Read the article

  • Load and Web Performance Testing using Visual Studio Ultimate 2010-Part 3

    - by Tarun Arora
    Welcome back once again. In Part 1 of Load and Web Performance Testing using Visual Studio 2010 I talked about why performance testing an application is important, the test tools available in Visual Studio Ultimate 2010, and various test rig topologies. In Part 2 of Load and Web Performance Testing using Visual Studio 2010 I discussed the details of web performance and load tests, as well as why it's important to follow a goal-based pattern while performance testing your application. In Part 3 I'll be discussing test result analysis, test result drill-through, test report generation, test run comparison, the ASP.NET profiler, and some closing thoughts.

    Test Results – I see some creepy worms!

    In Part 2 we put together a web performance test and a load test; let's run the load test to see how the web site responds to the load simulation. While the load test is running you will be able to see close to real-time analysis in the Load Test Analyser window. You can use the Load Test Analyser to conduct load test analysis in three ways:

      - Monitor a running load test: a condensed set of the performance counter data is maintained in memory. To prevent the results memory requirements from growing unbounded, up to 200 samples for each performance counter are maintained. This includes 100 evenly spaced samples that span the current elapsed time of the run and the most recent 100 samples.
      - After the load test run is completed: the test controller spools all collected performance counter data to a database while the test is running. Additional data, such as timing details and error details, is loaded into the database when the test completes. The performance data for a completed test is loaded from the database and analysed by the Load Test Analyser. Below you can see a screen shot of the summary view; this provides key results in a format that is compact and easy to read. You can also print the load test summary; this is generated after the test has completed or been stopped.
      - Analyse the load test results of a previously run load test: we'll see this in the section where I discuss comparison between two test runs.

    The performance counters can be plotted on graphs. You also have the option to highlight a selected part of the test and view details, and to drill down to the user activity chart, where you can hover over points to see more details of the test run.

    Generate Report => Test Run Comparisons

    The level of reporting you can generate using the Load Test Analyser is astonishing. You have the option to create Excel reports and conduct side-by-side analysis of two test results, or to track trend analysis. The tool also allows you to export the graph data either to MS Excel or to a CSV file. You can view the ASP.NET profiler report to conduct further analysis as well. View Data and Diagnostic Attachments opens the Choose Diagnostic Data Adapter Attachment dialog box to select an adapter to analyse the result type. For example, you can select an IntelliTrace adapter, click OK, and open the IntelliTrace summary for the test agent that was used in the load test.

    Compare results

    This creates a set of reports that compares the data from two load test results using tables and bar charts. I have taken these screen shots from the MSDN documentation; I would highly recommend exploring the wealth of knowledge available on MSDN.
    Leaving Thoughts

    While load testing the application with an excessive load for a long duration, I managed to bring IIS to its knees by piling up a huge queue of requests waiting to be processed. This clearly means that IIS had run out of threads, as all of them were busy processing existing requests. One easy way of fixing this is to increase the default number of allocated threads, but that might just escalate the problem; the better suggestion is to try to drill down to the actual root cause. Whenever garbage collection runs it stops processing pages, so all requests that come in during that period are queued up; realistically, though, a garbage collection completes in a fraction of a second. To understand this better, let's look at the .NET heap. It is divided into the large object heap and the small object heap: anything greater than 85 KB in size will be allocated on the large object heap. The large object heap is non-compacting, and remember that large objects are expensive to move around, so if you are allocating something on the large object heap, make sure you really need it! The small object heap, on the other hand, is divided into generations: objects that are supposed to be short-lived live in Gen-0, and long-lived objects eventually move to Gen-2 as garbage collection goes through.

    As you can see in the picture below, all objects smaller than 85 KB are first assigned to Gen-0. When a new object comes in and finds Gen-0 full, the garbage collection process starts: it identifies the dead objects as valid candidates for deletion to free up memory, and promotes all the remaining objects in Gen-0 to Gen-1. So in the future, whenever you clean up Gen-1 you have to clean up Gen-0 as well. When you fill up Gen-0 again, all of Gen-1's dead objects are collected, the rest are moved to Gen-2, and the Gen-0 objects are moved to Gen-1 to free up Gen-0; but by this time the garbage collection process has started to take much more time than it usually takes. Now, as I mentioned earlier, while garbage collection is running, all page requests that come in during that period are queued up. Does this explain why page requests might be getting queued up? Apart from this, it could also be the case that you are waiting for a long-running database process to complete.

    Let's explore the heap a bit more... The real case of crisis is when objects live just long enough to make it to Gen-2 and then die; this is definitely a high-cost operation. But sometimes you need objects in memory, for example when you cache data: you hold on to the objects because you need them right across the user session, which is acceptable. But if you want to see what extreme caching can do to your server, write a simple application that chucks a lot of data into the cache and run a load test over it for about 10-15 minutes, forcing a lot of data into memory and causing the heap to run out of memory. If you get to such a state, IIS, as a mode of recovery, restarts the worker process. That is a great way to free up all the memory in the heap, but it also clears the cache. The problem is that if a customer had 10 items in their shopping basket, and that data was stored in the application cache, the basket will now be empty, forcing them either to get frustrated and go to a competitor's website or, if the customer is really patient, to give it another try!
    How can you address this? Well, there are two ways:

    1. Workaround – An x86 processor only allows a maximum of 4 GB of RAM, which means the machine effectively has around 3.4 GB of RAM available. The OS needs about 1.5 GB of RAM to run efficiently, and IIS and the .NET framework also need their share of memory, leaving you a heap of around 800 MB to play with. Because team builds by default build your application in 'compile as Any CPU' mode, the application will run in x86 mode on an x86 processor and in x64 mode on an x64 processor. The problem with this is that not all applications are really x64 compatible, especially if you are using COM objects or external libraries. So, as a quick win, if you compile your application in x86 mode (by changing the 'Any CPU' selection to x86 in the team build), you will be able to run it on an x64 machine in x86 mode (WOW – by running Windows on Windows), and what that means is you can use 8 GB+ worth of RAM. If you take away everything else, your application will roughly get a heap size of at least 4 GB to play with, which is immense. If you need a heap size of more than 4 GB, you have either built software for NASA or there is something fundamentally wrong in your application.

    2. Solution – Now that you have a workaround in place, IIS will not restart the worker process as regularly, which means you can take a breather and start working to get to the root cause of the memory leak. But this begs the question: "How do I identify possible memory leaks in my application?" Well, I won't say there is one single tool that can tell you where the memory leak is, but trust me, performance profiling is a great starting point; it definitely gets you going in the right direction. Let's have a look at how.

    Performance Wizard – Start the Performance Wizard and select Instrumentation; this lets you measure function call counts and timings. Before running the performance session, right-click the performance session settings and choose Properties from the context menu to bring up the Performance session properties page and, as shown in the screen shot below, check the check boxes in the group '.NET memory profiling collection', namely 'Collect .NET object allocation information' and 'Also collect the .NET object lifetime information'.

    Now if you fire off the profiling session on your pages, you will notice that the results allow you to view 'Object Lifetime', which shows you the number of objects that made it to Gen-0, Gen-1, Gen-2, the large object heap, etc. Another great feature of the profiler is that if more than 5% of your application's objects die right after making it to Gen-2, a threshold alert is generated to warn you. Since you also have the option to view the most expensive methods, and by capturing the IntelliTrace data you can drill in, you can narrow down to the line of code that is the root cause of the problem. So now we have seen how crucial memory management is, and how easy Visual Studio Ultimate 2010 makes it for us to identify and reproduce the problem with best-of-breed tools in the product.

    Caching

    One of the main ways to improve performance is caching, which basically means you tell the web server that instead of going to the database for each request, you keep the data on the web server, and when the user asks for it you serve it from the web server itself. BUT that can have consequences!
    Let's look at some code; trust me, caching code is not very intuitive. I define a cache key for almost all searches made through the common search page and cache the results. The approach works fine: the first time, I get the data from the database; the second time, the data is served from the cache, a significant performance improvement, EXCEPT when two users try to do the same operation and run into each other. But it is easy to handle this by adding a lock, as you can see in the snippet below. As long as a user comes in and finds the cache empty, the user takes the lock and starts to build the cache; no more concurrency issues. But let's say you are processing 10 requests per second: by the time I have taken the lock to get the results from the database, 9 other users have come in and found the cache key null, so after I have come out and populated the cache, they will still go in and fetch the results again. The application will still be faster, because the next set of 10 users, and so on, would continue to get data from the cache. BUT if we added another null check after taking the lock and before the actual call to the DB, then the 9 users who follow me would not make the extra trip to the database at all, and that would really increase performance. But didn't I say the code won't be very intuitive? Maybe you should leave a comment, or another developer will come along and wonder why you are checking the cache key for null twice! (A sketch of this pattern follows below.)

    The downside of caching is that you are storing the data outside of the database, and the cached data can be wrong, because updates applied to the database make the data cached at the web server go out of sync. So, how do you invalidate the cache? Well, if you only had one way of updating the data, say a single entry point for data updates, you could write some logic to set the cache object to null every time new data is entered. But this approach stops working as soon as you have several ways of feeding data into the system, or your system is scaled out across a farm of web servers. The perfect solution to this is micro-caching, which means you cache the query results for a set time duration and invalidate the cache after that duration. The advantage is that whenever the user queries for that data within the time span for which you have cached the results, no calls are made to the database and the data is served right from the server, which makes the response immensely quick. Figuring out the appropriate time span for which to micro-cache the query results really depends on the application. Let's say your website gets 10 requests per second: if you retain the cached results for even 1 minute, you will have immense performance gains, reducing hits to the database for searching by 90% or more. Ever wondered why, when you go to e-bookers.com or xpedia.com or yatra.com to book a flight, you click on the book button because the fare seems too exciting, and you get an error message telling you that the fare is no longer valid? Yes, exactly: that is a cache failure! These travel sites and price-compare engines are not going to hit the database every time you hit the compare button; instead the results are served from the cache, because the query results are micro-cached. It's a perfect trade-off: by micro-caching the results the site gains huge performance benefits, but every once in a while it annoys a customer because the fare has expired.
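
    Since the snippet itself did not survive in this copy of the post, here is a minimal sketch of the double-null-check caching pattern being described, in C# against the ASP.NET cache; the Result type and the QueryDatabase helper are hypothetical stand-ins, not from the original post:

        private static readonly object CacheLock = new object();

        public IList<Result> GetSearchResults(string query)
        {
            string cacheKey = "search:" + query;
            var results = HttpRuntime.Cache[cacheKey] as IList<Result>;
            if (results == null)              // first check, taken without the lock
            {
                lock (CacheLock)
                {
                    // second check: a thread that queued on the lock may find the
                    // cache already populated and must not hit the database again
                    results = HttpRuntime.Cache[cacheKey] as IList<Result>;
                    if (results == null)
                    {
                        results = QueryDatabase(query);   // hypothetical DB call
                        // micro-cache: absolute expiry invalidates the entry after 60s
                        HttpRuntime.Cache.Insert(cacheKey, results, null,
                            DateTime.UtcNow.AddSeconds(60),
                            System.Web.Caching.Cache.NoSlidingExpiration);
                    }
                }
            }
            return results;
        }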
    But the trade-off works in favour of these sites, as they are still able to process 30+ page requests per second, which means they cater to the site traffic while maybe losing one customer every once in a while to a competitor who is using a similar caching technique; and what are the odds that the user will not come back to their site sooner or later?

    Recap

    Resources

    Below are some key resources you might like to review. I would highly recommend the documentation, walkthroughs and videos available on MSDN. You can always make use of Fiddler to debug web performance tests. Some community test extensions and plug-ins available on CodePlex might also be of interest to you.

    The Road Ahead

    Thank you for taking the time out to read this blog post; you may also want to read Part I and Part II if you haven't so far. If you enjoyed the post, remember to subscribe to http://feeds.feedburner.com/TarunArora. Questions/feedback/suggestions, etc.: please leave a comment. Next up, 'Load Testing in the Cloud': I'll be working on exploring the possibilities of running the test controller/agents in the cloud. See you on the other side! Thank you!

    Read the article
