epm performance tuning - Page 93

Is basing storage requirements based on IOPS sufficient?

- by Boden

The current system in question is running SBS 2003, and is going to be migrated on new hardware to SBS 2008. Currently I'm seeing on average 200-300 disk transfers per second total across all the arrays in the system. The array seeing the bulk of activity is a 6 disk 7200RPM RAID 6 and it struggles to keep up during high traffic times (idle time often only 10-20%; response times peaking 20-50+ ms). Based on some rough calculations this makes sense (avg ~245 IOPS on this array at 70/30 read to write ratio). I'm considering using a much simpler disk configuration using a single RAID 10 array of 10K disks. Using the same parameters for my calculations above, I'm getting 583 average random IOPS / sec. Granted SBS 2008 is not the same beast as 2003, but I'd like to make the assumption that it'll be similar in terms of disk performance, if not better (Exchange 2007 is easier on the disk and there's no ISA server). Am I correct in believing that the proposed system will be sufficient in terms of performance, or am I missing something? I've read so much about recommended disk configurations for various products like Exchange, and they often mention things like dedicating spindles to logs, etc. I understand the reasoning behind this, but if I've got more than enough random I/O overhead, does it really matter? I've always at the very least had separate spindles for the OS, but I could really reduce cost and complexity if I just had a single, good performing array. So as not to make you guys do my job for me, the generic version of this question is: if I have a projected IOPS figure for a new system, is it sufficient to use this value alone to spec the storage, ignoring "best practice" configurations? (given similar technology, not going from DAS to SAN or anything)

Read the article

Different network response for indentical co-located machines

- by Santosh

We have a situation as follows: We have a two different virtual machines (VMs) on some remote server farm. The machines are identical in terms of hardware/software(OS) configurations. We have a J2EE application running on JBoss on each of those two machines. These two applications are of different version sav V1 on VM1 and V2 on VM2. We observed some degraded response time for application V2 when accessed via public URL. When we accessed the application through a secured VPN, there is hardly any difference. The bandwidth test (upload/download speed, ping etc) shows that VM1 is responding better when accessed via secured VPN. We concluded that the application does not seem to have performance issue. Because, it that's the case the performance degradation should also be there when access via VPN. So we concluded its the network problem. But since those two identical VMs are on same network we are looking for the reasons for different responses. My question is, given the above situation, what could be reasons for such a behavior ?

Read the article

What is the best VM for developing WPF apps from within OS X?

- by MarqueIV

All of my machines are Macs (Mac Pro, MacBook Pro, MacBook Air and Mac Mini (and Apple TV 2.0 too! :) ) but for my day-job, I develop .NET/WPF applications. Normally I just boot into Boot Camp and develop that way, which of course works great, but there are times when I need to simultaneously get to things on my Mac-side of the equation, so I've bought both VMware 3.1 and Parallels 6. Both work, however, even on my Mac Pro where I paid to upgrade to the better video cards (the NVidia 8600s I think vs. the stock ATI cards) the WPF performance bites!! Now this confuses me since both boast that they support not only hardware-accelerated OpenGL 2.1, but also hardware-accelerated DirectX 9 (VMware even allegedly supports DirectX 10!) via their respective virtual drivers and both can run 3D games just fine, even in a window. But even the simple act of resizing a WPF window that has a tiled background results in some HIDEOUS repainting and resizing behaviors. It's damn near closer to what you'd expect over RDP let alone a software-only renderer (forget accelerated hardware completely!) So... can anyone please tell me WTF WPF is doing differently? More importantly, how can I speed up the WPF performance? Should I switch to VirtualBox that also has support for DirectX? Or am I just gonna have to 'byte' the bullet (sorry... had to. So I like puns! Thank Jon Stewart!) and continue using Boot Camp?

Read the article

Postfix spool on ext3 optimiziations in >=linux-2.6.34 days

- by Luke404

Given the very specific nature of the subject (we're not talking about mailboxes, just the spool; we're not talking about other filesystems, just ext3; and so on...) and the maturity of the softwares involved (linux kernel, ext3fs, postfix) I'd think there should be a more or less agreed on set of best practices to filesystem related tuning. I'm trying to get a roundup of them: data=journal became the default in recent kernels (somewhere around 2.6.30 IIRC) so we should be ok with that Wietse Venema says atime must be on, but Postfix documentation recommendsnoatime while talking about the Incoming Queue. Does that mean that postfix needs atime on just for some queue directories and will benefit from noatime on the others? can we use noatime if we just don't use ETRN? filesystem can be mounted nodev,noexec,nosuid - no* won't prevent you from setting attributes (postfix uses exec attr) they just won't have any effect (we don't run anything from the spool) the fsync() issue cited by Wietse and/or the chattr -S are probably linked to sync/async options of ext3fs but I do not understand them enough. Mouting the filesystem with async option is equivalent to chattr -R -S the whole fs? Seems like it will increase performance, but will that pose a risk of "loss of mail after a system crash" or is it really "safe on /var/spool/postfix" ? would you tune anything else on postfix-2.6.x to work better on ext3 or do you leave defaults everywhere? is there a "best" linux I/O scheduler for this kind of workload (namely CFQ or deadline?) or that's something that will vary too much based on hardware configuration? would you tune anything else in the filesystem or in the kernel? anything else? References: Postfix Performance here on SF Postfix documentation about the Incoming Queue Wietse Venema in Best file system on [email protected] here Postfix and ext3 on [email protected] here and there

Read the article

Real benefits of tcp TIME-WAIT and implications in production environment

- by user64204

SOME THEORY I've been doing some reading on tcp TIME-WAIT (here and there) and what I read is that it's a value set to 2 x MSL (maximum segment life) which keeps a connection in the "connection table" for a while to guarantee that, "before your allowed to create a connection with the same tuple, all the packets belonging to previous incarnations of that tuple will be dead". Since segments received (apart from SYN under specific circumstances) while a connection is either in TIME-WAIT or no longer existing would be discarded, why not close the connection right away? Q1: Is it because there is less processing involved in dealing with segments from old connections and less processing to create a new connection on the same tuple when in TIME-WAIT (i.e. are there performance benefits)? If the above explanation doesn't stand, the only reason I see the TIME-WAIT being useful would be if a client sends a SYN for a connection before it sends remaining segments for an old connection on the same tuple in which case the receiver would re-open the connection but then get bad segments and and would have to terminate it. Q2: Is this analysis correct? Q3: Are there other benefits to using TIME-WAIT? SOME PRACTICE I've been looking at the munin graphs on a production server that I administrate. Here is one: As you can see there are more connections in TIME-WAIT than ESTABLISHED, around twice as many most of the time, on some occasions four times as many. Q4: Does this have an impact on performance? Q5: If so, is it wise/recommended to reduce the TIME-WAIT value (and what to)? Q6: Is this ratio of TIME-WAIT / ESTABLISHED connections normal? Could this be related to malicious connection attempts?

Read the article

Some free cloud solution to enhance your business

- by Saif Bechan

I am co-owner of a small internet business. I am in charge of IT, and I try to get things done as low cost as possible. When investing in servers, resources and overall business costs your project can soon turn into a financial disaster. Cloud solutions can help you in solving some financial problems, they can help you in scalability problems, and overall performance problems of your server or web application. Recently I moved the whole internal/external communication(email,calendar,documents) of my business to the cloud. I did this by using the free version of Google Apps. This works great and is a big advantage on multiple levels. I do not have to fight spam anymore on my system, and there are less resources used on my system. Also switching servers will go a lot easier. Questions Can you name some cloud solution that you have used, or some you just recommend. They can fairy form financial benefits, organizational benefits, performance benefits. It doesn't matter as soon as it helps you spread the load of your business.

Read the article

ZFS with L2ARC (SSD) slower for random seeks than without L2ARC

- by Florian Kruse

I am currently testing ZFS (Opensolaris 2009.06) in an older fileserver to evaluate its use for our needs. Our current setup is as follows: Dual core (2,4 GHz) with 4 GB RAM 3x SATA controller with 11 HDDs (250 GB) and one SSD (OCZ Vertex 2 100 GB) We want to evaluate the use of a L2ARC, so the current ZPOOL is: $ zpool status pool: tank state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM afstank ONLINE 0 0 0 raidz1 ONLINE 0 0 0 c11t0d0 ONLINE 0 0 0 c11t1d0 ONLINE 0 0 0 c11t2d0 ONLINE 0 0 0 c11t3d0 ONLINE 0 0 0 raidz1 ONLINE 0 0 0 c13t0d0 ONLINE 0 0 0 c13t1d0 ONLINE 0 0 0 c13t2d0 ONLINE 0 0 0 c13t3d0 ONLINE 0 0 0 cache c14t3d0 ONLINE 0 0 0 where c14t3d0 is the SSD (of course). We run IO tests with bonnie++ 1.03d, size is set to 200 GB (-s 200g) so that the test sample will never be completely in ARC/L2ARC. The results without SSD are (average values over several runs which show no differences) write_chr write_blk rewrite read_chr read_blk random seeks 101.998 kB/s 214.258 kB/s 96.673 kB/s 77.702 kB/s 254.695 kB/s 900 /s With SSD it becomes interesting. My assumption was that the results should be in worst case at least the same. While write/read/rewrite rates are not different, the random seek rate differs significantly between individual bonnie++ runs (between 188 /s and 1333 /s so far), average is 548 +- 200 /s, so below the value w/o SSD. So, my questions are mainly: Why do the random seek rates differ so much? If the seeks are really random, they should not differ much (my assumption). So, even if the SSD is impairing the performance it should be the same in each bonnie++ run. Why is the random seek performance worse in most of the bonnie++ runs? I would assume that some part of the bonnie++ data is in the L2ARC and random seeks on this data performs better while random seeks on other data just performs similarly like before.

Read the article

HP Proliant DL380 G4 - Can this server still perform in 2011?

- by BSchriver

Can the HP Proliant DL380 G4 series server still perform at high a quality in the 2011 IT world? This may sound like a weird question but we are a very small company whose primary business is NOT IT related. So my IT dollars have to stretch a long way. I am in need of a good web and database server. The load and demand for a while will be fairly low so I am not looking nor do I have the money to buy a brand new HP Dl380 G7 series box for $6K. While searching around today I found a company in ATL that buys servers off business leases and then stripes them down to parts. They clean, check and test each part and then custom "rebuild" the server based on whatever specs you request. The interesting thing is they also provide a 3-year warranty on all their servers they sell. I am contemplating buying two of the following: HP Proliant DL380 G4 Dual (2) Intel Xeon 3.6 GHz 800Mhz 1MB Cache processors 8GB PC3200R ECC Memory 6 x 73GB U320 15K rpm SCSI drives Smart Array 6i Card Dual Power Supplies Plus the usual cdrom, dual nic, etc... All this for $750 each or $1500 for two pretty nicely equipped servers. The price then jumps up on the next model up which is the G5 series. It goes from $750 to like $2000 for a comparable server. I just do not have $4000 to buy two servers right now. So back to my original question, if I load Windows 2008 R2 Server and IIS 7 on one of the machines and Windows 2008 R2 server and MS SQL 2008 R2 Server on another machine, what kind of performance might I expect to see from these machines? The facts is this series is now 3 versions behind the G7's and this series of server was built when Windows 200 Server was the dominant OS and Windows 2003 Server was just coming out. If you are running Windows 2008 R2 Server on a G4 with similar or less specs I would love to hear what your performance is like.

Read the article

Caching all files in varnish

- by csgwro

I want my varnish servers to cache all files. At backend there is lighttpd hosting only static files, and there is an md5 in the url in case of file change, ex. /gfx/Bird.b6e0bc2d6cbb7dfe1a52bc45dd2b05c4.swf). However my hit ratio is very poorly (about 0.18) My config: sub vcl_recv { set req.backend=default; ### passing health to backend if (req.url ~ "^/health.html$") { return (pass); } remove req.http.If-None-Match; remove req.http.cookie; remove req.http.authenticate; if (req.request == "GET") { return (lookup); } } sub vcl_fetch { ### do not cache wrong codes if (beresp.status == 404 || beresp.status >= 500) { set beresp.ttl = 0s; } remove beresp.http.Etag; remove beresp.http.Last-Modified; } sub vcl_deliver { set resp.http.expires = "Thu, 31 Dec 2037 23:55:55 GMT"; } I have made an performance tuning: DAEMON_OPTS="${DAEMON_OPTS} -p thread_pool_min=200 -p thread_pool_max=4000 -p thread_pool_add_delay=2 -p session_linger=100" The main url which is missed is... /health.html. Is that forward to backend correctly configured? Disabling health checking hit ratio increases to 0.45. Now mostly "/crossdomain.xml" is missed (from many domains, as it is wildcard). How can I avoid that? Should I carry on other headers like User-Agent or Accept-Encoding? I thing that default hashing mechanism is using url + host/IP. Compression is used at the backend. What else can improve performance?

Read the article

Server slowdown

- by Clinton Bosch

I have a GWT application running on Tomcat on a cloud linux(Ubuntu) server, recently I released a new version of the application and suddenly my server response times have gone from 500ms average to 15s average. I have run every monitoring tool I know. iostat says my disks are 0.03% utilised mysqltuner.pl says I am OK other see below top says my processor is 99% idle and load average: 0.20, 0.31, 0.33 memory usage is 50% (-/+ buffers/cache: 3997 3974) mysqltuner output [OK] Logged in using credentials from debian maintenance account. -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.1.63-0ubuntu0.10.04.1-log [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster [--] Data in MyISAM tables: 370M (Tables: 52) [--] Data in InnoDB tables: 697M (Tables: 1749) [!!] Total fragmented tables: 1754 -------- Security Recommendations ------------------------------------------- [OK] All database users have passwords assigned -------- Performance Metrics ------------------------------------------------- [--] Up for: 19h 25m 41s (1M q [28.122 qps], 1K conn, TX: 2B, RX: 1B) [--] Reads / Writes: 98% / 2% [--] Total buffers: 1.0G global + 2.7M per thread (500 max threads) [OK] Maximum possible memory usage: 2.4G (30% of installed RAM) [OK] Slow queries: 0% (1/1M) [OK] Highest usage of available connections: 34% (173/500) [OK] Key buffer size / total MyISAM indexes: 16.0M/279.0K [OK] Key buffer hit rate: 99.9% (50K cached / 40 reads) [OK] Query cache efficiency: 61.4% (844K cached / 1M selects) [!!] Query cache prunes per day: 553779 [OK] Sorts requiring temporary tables: 0% (0 temp sorts / 34K sorts) [OK] Temporary tables created on disk: 4% (4K on disk / 102K total) [OK] Thread cache hit rate: 84% (185 created / 1K connections) [!!] Table cache hit rate: 0% (256 open / 27K opened) [OK] Open file limit used: 0% (20/2K) [OK] Table locks acquired immediately: 100% (692K immediate / 692K locks) [OK] InnoDB data size / buffer pool: 697.2M/1.0G -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance MySQL started within last 24 hours - recommendations may be inaccurate Enable the slow query log to troubleshoot bad queries Increase table_cache gradually to avoid file descriptor limits Variables to adjust: query_cache_size (> 16M) table_cache (> 256)

Read the article

Is there an objective way to measure slowness of PC/WINDOWS?

- by ekms

We've a lot of users that usually complain about that his PC is "slow". (we use win XP). We usually check startup programs, virus, fragmentation, disk health and common problems that causes slowness (Symantec AV drops disk to 1mb/s , or a seagate HD firmware error in certain models), but in those cases the slowness is pretty evident. In other hand, the most common is the user complaining about his pc but for us looks OK, even in 6 years old desktops. People sometimes even complains about his new quad core desktops speed!!! So, we are asking if there's a way to OBJECTIVELY check that a computer didn't dropped its performance, compared with similar ones o previous measures, specially for work use (I don't think that 3dmark benchmark o similar may help). The only thing that I found that was useful is HDTune, but it only check hard disk performance. Basically, what we want is something that enable us to say to our users "see? your PC is as slow as was three years ago! stop complaining! Is all in your head!"

Read the article

Why database partitioning didn't work? Extract from thedailywtf.com

- by questzen

Original link. http://thedailywtf.com/Articles/The-Certified-DBA.aspx. Article summary: The DBA suggests an approach involving rigorous partitioning, 10 partitions per disk (3 actual disks and 3 raid). The stats show that the performance is non-optimal. Then the DBA suggests an alternative of 1 partition per disk (with more added disks). This also fails. The sys-admin then sets up a single disk, single partition and saves the day. The size of disks was not mentioned but given today,s typical disk sizes (of the order of 100 GB), the partitions ; would be huge, it surprises me that a single disk with all partitions outperformed. Initially I suspect that the data was segregated and hence faster reads. But how come the performance didn't degrade as time went by with all the inserts and updates happening? Saw this on reddit, but the explanation was by far spindle/platter centered. There was no mention in the article about this. Is there any other reason? I can only guess that the tables were using a incorrect hash distribution causing non-uniform allocation across disks (wrong partitioning); this would increase fetch times. Any thoughts?

Read the article

Disk fragmentation when dealing with many small files

- by Zorlack

On a daily basis we generate about 3.4 Million small jpeg files. We also delete about 3.4 Million 90 day old images. To date, we've dealt with this content by storing the images in a hierarchical manner. The heriarchy is something like this: /Year/Month/Day/Source/ This heirarchy allows us to effectively delete days worth of content across all sources. The files are stored on a Windows 2003 server connected to a 14 disk SATA RAID6. We've started having significant performance issues when writing-to and reading-from the disks. This may be due to the performance of the hardware, but I suspect that disk fragmentation may be a culprit at well. Some people have recommended storing the data in a database, but I've been hesitant to do this. An other thought was to use some sort of container file, like a VHD or something. Does anyone have any advice for mitigating this kind of fragmentation? Additional Info: The average file size is 8-14KB Format information from fsutil: NTFS Volume Serial Number : 0x2ae2ea00e2e9d05d Version : 3.1 Number Sectors : 0x00000001e847ffff Total Clusters : 0x000000003d08ffff Free Clusters : 0x000000001c1a4df0 Total Reserved : 0x0000000000000000 Bytes Per Sector : 512 Bytes Per Cluster : 4096 Bytes Per FileRecord Segment : 1024 Clusters Per FileRecord Segment : 0 Mft Valid Data Length : 0x000000208f020000 Mft Start Lcn : 0x00000000000c0000 Mft2 Start Lcn : 0x000000001e847fff Mft Zone Start : 0x0000000002163b20 Mft Zone End : 0x0000000007ad2000

Read the article

Windows XP to remote server 2008 R2 shares - awful response times

- by nick3216

I have a network infrastructure of Windows XP clients (a mix of XP and 64-bit XP), that are accessing a network share on a Windows 2008 R2 server. Whenever users type the address of a folder into the address bar of Windows Explorer it's as snappy at determining the contents of the current folder and presenting them to you in the address bar as if you're working on a local drive. But if you open one of the subfolders users get the animated red torch and 'Searching for items...' dialog, typically for 45 seconds. Similarly when using the open folder dialog to try and select a subfolder on this share it takes, on average, 45 seconds for the dialog to expand each node and show the subfolders of each node. Also, while the Explorer instance accsesing the network share is running slowly users notice that the performance of all other Explorer windows suffers. So while Explorer is searching for files on the network share they can't switch to another task and navigate around their local drive using Explorer because it's now as slow as a dead dog at accessing anything. Are there any settings we can change which will improve the performance accessing network shares?

Read the article

Optimise Apache for EC2 micro instance

- by Shiyu Sekam

I'm running apache2 on a EC2 micro instance with ~600 mb RAM. The instance was running for almost a year without problems, but in the last weeks it just keeps crashing, because the server reached MaxClients. The server basically runs few websites, one wordpress blog(not often used), company website(most used) and 2 small sites, which are just internal. The database for the blog runs on RDS, so there's no Mysql running on this web server. When I came to the company, the server already was setup and is running apache + mod_php + prefork. We want to migrate that in the future to a nginx + php-fpm, but it still needs further testing. So for now I have to stick with the old setup. I also use CloudFlare DDOS protection in front of the server, because it was attacked a couple of the times in the last weeks. My company don't want to pay money for a better web server at this point, so I have to stick with the micro instance also. Additionally the code for the website we run is really bad and slow and sometimes a single page load can take up to 15 seconds. The whole website is dynamic and written in PHP, so caching isn't really an option here. It's a customized search for users. I've already turned off KeepAlive, which improved the performance a little bit. My prefork config looks like the following: StartServers 2 MinSpareServers 2 MaxSpareServers 5 ServerLimit 10 MaxClients 10 MaxRequestsPerChild 100 The server just becomes unresponsive after a while running and I've run the following command to see how many connections there are: netstat | grep http | wc -l 75 Trying to restart apache helps for a short moment, but after that a while the apache process(es) become unresponsive again. I've the following modules enabled(output of apache2ctl -M) Loaded Modules: core_module (static) log_config_module (static) logio_module (static) version_module (static) mpm_prefork_module (static) http_module (static) so_module (static) alias_module (shared) authz_host_module (shared) deflate_module (shared) dir_module (shared) expires_module (shared) mime_module (shared) negotiation_module (shared) php5_module (shared) rewrite_module (shared) setenvif_module (shared) ssl_module (shared) status_module (shared) Syntax OK apache2.conf # Security ServerTokens OS ServerSignature On TraceEnable On ServerName "web.example.com" ServerRoot "/etc/apache2" PidFile ${APACHE_PID_FILE} Timeout 30 KeepAlive off User www-data Group www-data AccessFileName .htaccess <Files ~ "^\.ht"> Order allow,deny Deny from all Satisfy all </Files> <Directory /> Options FollowSymLinks AllowOverride None </Directory> DefaultType none HostnameLookups Off ErrorLog /var/log/apache2/error.log LogLevel warn EnableSendfile On #Listen 80 Include /etc/apache2/mods-enabled/*.load Include /etc/apache2/mods-enabled/*.conf Include /etc/apache2/ports.conf LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined LogFormat "%h %l %u %t \"%r\" %>s %b" common LogFormat "%{Referer}i -> %U" referer LogFormat "%{User-agent}i" agent Include /etc/apache2/conf.d/*.conf Include /etc/apache2/sites-enabled/*.conf Vhost of main site <VirtualHost *:80> ServerName www.example.com ## Vhost docroot DocumentRoot /srv/www/jenkins/Web ## Directories, there should at least be a declaration for /srv/www/jenkins/Web <Directory /srv/www/jenkins/Web> AllowOverride All Order allow,deny Allow from all </Directory> ## Load additional static includes ## Logging ErrorLog /var/log/apache2/www.example.com.error.log LogLevel warn ServerSignature Off CustomLog /var/log/apache2/www.example.com.access.log combined ## Rewrite rules RewriteEngine On RewriteCond %{HTTP_HOST} !^www.example.com$ RewriteRule ^.*$ http://www.example.com%{REQUEST_URI} [R=301,L] ## Server aliases ServerAlias www.example.invalid ServerAlias example.com ## Custom fragment <Location /srv/www/jenkins/Web/library> Order Deny,Allow Deny from all </Location> <Files ~ "^\.(.+)"> Order deny,allow deny from all </Files> </VirtualHost>

Read the article

How to force two process to run on the same CPU?

- by kovan

Context: I'm programming a software system that consists of multiple processes. It is programmed in C++ under Linux. and they communicate among them using Linux shared memory. Usually, in software development, is in the final stage when the performance optimization is made. Here I came to a big problem. The software has high performance requirements, but in machines with 4 or 8 CPU cores (usually with more than one CPU), it was only able to use 3 cores, thus wasting 25% of the CPU power in the first ones, and more than 60% in the second ones. After many research, and having discarded mutex and lock contention, I found out that the time was being wasted on shmdt/shmat calls (detach and attach to shared memory segments). After some more research, I found out that these CPUs, which usually are AMD Opteron and Intel Xeon, use a memory system called NUMA, which basically means that each processor has its fast, "local memory", and accessing memory from other CPUs is expensive. After doing some tests, the problem seems to be that the software is designed so that, basically, any process can pass shared memory segments to any other process, and to any thread in them. This seems to kill performance, as process are constantly accessing memory from other processes. Question: Now, the question is, is there any way to force pairs of processes to execute in the same CPU?. I don't mean to force them to execute always in the same processor, as I don't care in which one they are executed, altough that would do the job. Ideally, there would be a way to tell the kernel: If you schedule this process in one processor, you must also schedule this "brother" process (which is the process with which it communicates through shared memory) in that same processor, so that performance is not penalized.

Read the article

Refactoring or Rewriting Monolithic PHP Spaghetti Codebase

- by nategood

I've inherited a really poorly designed PHP spaghetti code project. It's been gaining a good bit of traffic recently and is starting to have performance issues on top of the poor monolithic code base. Its maxing out performance on a chunky 16GB dedicated machine when it really shouldn't be. I'm planning on doing some performance tweaks right off the bat to help the performance issue, but this still won't really help the horrible code base. The team is small but expecting to grow very soon. I've read Joel's article on the troubles of doing a complete rewrite and see the concerns. But how bad does the code base have to be before you consider a rewrite? There is PHP handling logic interjected into what one would usually consider a "view". Even worse, in some places SQL statements are in these same files! The only real separation of presentation and logic are a few PHP scripts that serve as function libraries. These scripts do most of the ORM stuff... if you can even call it that. Trying to slowly refractor this seems like a nightmare. Open to your thoughts and opinions... however not interested in hearing, "Run away, Run away!".

Read the article

How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt?

- by Stu Thompson

Prelude: I'm a code-monkey that's increasingly taken on SysAdmin duties for my small company. My code is our product, and increasingly we provide the same app as SaaS. About 18 months ago I moved our servers from a premium hosting centric vendor to a barebones rack pusher in a tier IV data center. (Literally across the street.) This ment doing much more ourselves--things like networking, storage and monitoring. As part the big move, to replace our leased direct attached storage from the hosting company, I built a 9TB two-node NAS based on SuperMicro chassises, 3ware RAID cards, Ubuntu 10.04, two dozen SATA disks, DRBD and . It's all lovingly documented in three blog posts: Building up & testing a new 9TB SATA RAID10 NFSv4 NAS: Part I, Part II and Part III. We also setup a Cacit monitoring system. Recently we've been adding more and more data points, like SMART values. I could not have done all this without the awesome boffins at ServerFault. It's been a fun and educational experience. My boss is happy (we saved bucket loads of $$$), our customers are happy (storage costs are down), I'm happy (fun, fun, fun). Until yesterday. Outage & Recovery: Some time after lunch we started getting reports of sluggish performance from our application, an on-demand streaming media CMS. About the same time our Cacti monitoring system sent a blizzard of emails. One of the more telling alerts was a graph of iostat await. Performance became so degraded that Pingdom began sending "server down" notifications. The overall load was moderate, there was not traffic spike. After logging onto the application servers, NFS clients of the NAS, I confirmed that just about everything was experiencing highly intermittent and insanely long IO wait times. And once I hopped onto the primary NAS node itself, the same delays were evident when trying to navigate the problem array's file system. Time to fail over, that went well. Within 20 minuts everything was confirmed to be back up and running perfectly. Post-Mortem: After any and all system failures I perform a post-mortem to determine the cause of the failure. First thing I did was ssh back into the box and start reviewing logs. It was offline, completely. Time for a trip to the data center. Hardware reset, backup an and running. In /var/syslog I found this scary looking entry: Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_00], 6 Currently unreadable (pending) sectors Nov 15 06:49:44 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_07], SMART Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 171 to 170 Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 16 Currently unreadable (pending) sectors Nov 15 06:49:45 umbilo smartd[2827]: Device: /dev/twa0 [3ware_disk_10], 4 Offline uncorrectable sectors Nov 15 06:49:45 umbilo smartd[2827]: Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error Nov 15 06:49:45 umbilo smartd[2827]: # 1 Short offline Completed: read failure 90% 6576 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 2 Short offline Completed: read failure 90% 6087 3421766910 Nov 15 06:49:45 umbilo smartd[2827]: # 3 Short offline Completed: read failure 10% 5901 656821791 Nov 15 06:49:45 umbilo smartd[2827]: # 4 Short offline Completed: read failure 90% 5818 651637856 Nov 15 06:49:45 umbilo smartd[2827]: So I went to check the Cacti graphs for the disks in the array. Here we see that, yes, disk 7 is slipping away just like syslog says it is. But we also see that disk 8's SMART Read Erros are fluctuating. There are no messages about disk 8 in syslog. More interesting is that the fluctuating values for disk 8 directly correlate to the high IO wait times! My interpretation is that: Disk 8 is experiencing an odd hardware fault that results in intermittent long operation times. Somehow this fault condition on the disk is locking up the entire array Maybe there is a more accurate or correct description, but the net result has been that the one disk is impacting the performance of the whole array. The Question(s) How can a single disk in a hardware SATA RAID-10 array bring the entire array to a screeching halt? Am I being naïve to think that the RAID card should have dealt with this? How can I prevent a single misbehaving disk from impacting the entire array? Am I missing something?

Read the article

X11 performance problem after upgrading from Centos3 to Centos5 with an ATI Rage XL

- by Marcelo Santos

After upgrading a computer from Centos3 to Centos5 an application that does a lot of scrolling took a very high performance hit. top tells me that X is using a lot of CPU and that was not happening before. The machine has an ATI Rage XL with 8MB and X is using the ati driver as there is no proprietary ATI driver for this board on linux. The xorg.conf: Section "Device" Identifier "Videocard0" Driver "ati" EndSection Section "Screen" Identifier "Screen0" Device "Videocard0" DefaultDepth 24 SubSection "Display" Viewport 0 0 Depth 24 Modes "1024x768" "800x600" "640x480" EndSubSection EndSection Section "DRI" Group 0 Mode 0666 EndSection A similar machine that still has Centos3 installed is able to start DRI on the X server while this one is not, this is the Xorg.0.log for the Centos5 machine: drmOpenDevice: node name is /dev/dri/card0 drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: Open failed drmOpenDevice: node name is /dev/dri/card0 drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: Open failed [drm] failed to load kernel module "mach64" (II) ATI(0): [drm] drmOpen failed (EE) ATI(0): [dri] DRIScreenInit Failed (II) ATI(0): Largest offscreen areas (with overlaps): (II) ATI(0): 1024 x 1279 rectangle at 0,768 (II) ATI(0): 768 x 1280 rectangle at 0,768 (II) ATI(0): Using XFree86 Acceleration Architecture (XAA) Screen to screen bit blits Solid filled rectangles 8x8 mono pattern filled rectangles Indirect CPU to Screen color expansion Solid Lines Offscreen Pixmaps Setting up tile and stipple cache: 32 128x128 slots 10 256x256 slots (==) ATI(0): Backing store disabled (==) ATI(0): Silken mouse enabled (II) ATI(0): Direct rendering disabled (==) RandR enabled I also tried using EXA instead of XAA and setting: Option "AccelMethod" "XAA" Option "XAANoOffscreenPixmaps" "true" uname -a Linux sir5.erg.inpe.br 2.6.18-128.7.1.el5 #1 SMP Mon Aug 24 08:20:55 EDT 2009 i686 i686 i386 GNU/Linux rpm -qa | grep xorg-x11-server xorg-x11-server-utils-7.1-4.fc6 xorg-x11-server-sdk-1.1.1-48.52.el5 xorg-x11-server-Xvfb-1.1.1-48.52.el5 xorg-x11-server-Xnest-1.1.1-48.52.el5 xorg-x11-server-Xorg-1.1.1-48.52.el5 The drmOpenDevice error continues when using the suggested Option "AIGLX" "true".

Read the article

SAN with iSCSI-Target Performance Horrendous

- by Justin

We have a poor man's SAN setup in a 1U Ubuntu server running iSCSI-Target with two 300GB drives in RAID-0. We then are using it for block level storage for virtual machines. The hypervisor is connected to the SAN via gigabit on a dedicated VLAN and interfaces. We only have a single virtual machine setup and doing some benchmarks. If we run hdparm -t /dev/sda1 from the virtual machine, we get 'ok' performance of 75MB/s from the virtual machine to the SAN. Then we basically compile a package with ./configure and make. Things start ok, but then all the sudden the load average on the SAN grows to 7+ and things slow down to a crawl. When we SSH into the SAN and run top, sure the load is 7+, but the CPU usage is basically nothing, also the server has 1.5GB of memory available. When we kill the compile on the virtual machine, slowly the LOAD on the SAN goes back to sub 1 figures. What in the world is causing this? How can we diagnosis this further? Here are two screenshot from the SAN during high load. 1> Output of iotop on the SAN: 2> Output of top on the SAN:

Read the article

Bad Performance when SQL Server hits 99% Memory Usage

- by user15863

I've got a server that reports 8 GB of ram used up at 99%. When restart Sql Server, it drops down to about 5% usage, but gradually builds back up to 99% over about 2 hours. When I look at the sqlserver process, its reported as only using 100k ram, and generally never goes up or below that number by very much. In fact, if I add up all the processes in my TaskManager, it's barely scratching the surface of my total available (yet TaskManager still shows 99% memory usage with "All processes shown"). It appears that Sql Server has a huge memory leak going on but it's not reporting it. The server has ran fine for nearly two years, with this only starting to manifest itself in the last 3-4 weeks. Anyone seen this or have any insight into the problem? EDIT When the server hits 99%, performance goes down hill. All queries to the server, apps, etc. come to a crawl. Restarting the service makes things zippy again, until 2 hours has passed and the server hits 99% once again.

Read the article

Performance of Virtual machines on very low end machines

- by TheLQ

I am managing a few cheap servers as my user base isn't large enough to get much more powerful servers. I also don't have the money lying around to invest in a server to prepare for the larger user base. So I'm stuck with the old hardware I have. I am toying with the idea of virtualizing all the current OS's with most likely VMware vSphere Hypervisor (AKA ESXi) Xen (ESXi has too strict of an HCL, and my hardware is too old). Big reasons for doing so: Ability to upgrade and scale hardware rapidly - This is most likely what I'll be doing as I distribute services, get a bigger server, centralize (electricity bills are horrible), distribute, get a bigger server, etc... Manually doing this by reinstalling the entire OS would be a big pain Safety from me - I've made many rookie mistakes, like doing lots of risky work on a vital production server. With a VM I can just backup the state, work on my machine, test, and revert if necessary. No worries, and no OS reinstallation Safety from other factors - As I scale servers might go down, and a backup VM can instantly be started. Various other reasons. However the limiting factor here is hardware. And I mean very depressing hardware. The current server's run off of a Pentium 3 and 4, and have 512 MB and 768 MB RAM respectively (RAM can be upgraded soon however). Is the Virtualization layer small enough to run itself and a Linux OS effectively? Will performance be acceptable (50% CPU overhead for every operation isn't acceptable)? Does it leave enough RAM for the Linux OS? Is this even feasible?

Read the article

Hints on diagnosing performance issue in OpenBSD firewall

- by Tom

My OpenBSD 4.6 pf firewall has started having really bad performance in the past few weeks. I've isolated the firewall (as opposed to the WAN connection, switch, cable, etc.) as the problem, but need a hint on how to further diagnose or fix the problem. The facts: Normal setup is: DSL Modem - FW Ext. NIC - FW Int. NIC - Switch - Laptop Normal setup described above gives only 25 Kbps! Plugging the laptop straight from the DSL modem gives a 1 MBps connection (full speed, as advertised). Therefore, the DSL connection seems to be OK. Plugging the laptop directly into the firewall's internal NIC (bypassing the switch) also gives only 25 Kbps. Therefore, the switch does not seem to be a problem. I've replaced the ethernet cables, but it didn't help. Here's the weird thing. Reloading the ruleset (/sbin/pfctl -Fa -f /etc/pf.conf) causes the laptop's connection to go up to 1 Mbps (i.e. full speed) for a few minutes before it gradually degrades back down to 25Kbps again. Any ideas on what's wrong or how I could further diagnose the problem?

Read the article

Nginx + uWSGI + Django performance stuck on 100rq/s

- by dancio

I have configured Nginx with uWSGI and Django on CentOS 6 x64 (3.06GHz i3 540, 4GB), which should easily handle 2500 rq/s but when I run ab test ( ab -n 1000 -c 100 ) performance stops at 92 - 100 rq/s. Nginx: user nginx; worker_processes 2; events { worker_connections 2048; use epoll; } uWSGI: Emperor /usr/sbin/uwsgi --master --no-orphans --pythonpath /var/python --emperor /var/python/*/uwsgi.ini [uwsgi] socket = 127.0.0.2:3031 master = true processes = 5 env = DJANGO_SETTINGS_MODULE=x.settings env = HTTPS=on module = django.core.handlers.wsgi:WSGIHandler() disable-logging = true catch-exceptions = false post-buffering = 8192 harakiri = 30 harakiri-verbose = true vacuum = true listen = 500 optimize = 2 sysclt changes: # Increase TCP max buffer size setable using setsockopt() net.ipv4.tcp_rmem = 4096 87380 8388608 net.ipv4.tcp_wmem = 4096 87380 8388608 net.core.rmem_max = 8388608 net.core.wmem_max = 8388608 net.core.netdev_max_backlog = 5000 net.ipv4.tcp_max_syn_backlog = 5000 net.ipv4.tcp_window_scaling = 1 net.core.somaxconn = 2048 # Avoid a smurf attack net.ipv4.icmp_echo_ignore_broadcasts = 1 # Optimization for port usefor LBs # Increase system file descriptor limit fs.file-max = 65535 I did sysctl -p to enable changes. Idle server info: top - 13:34:58 up 102 days, 18:35, 1 user, load average: 0.00, 0.00, 0.00 Tasks: 118 total, 1 running, 117 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 3983068k total, 2125088k used, 1857980k free, 262528k buffers Swap: 2104504k total, 0k used, 2104504k free, 606996k cached free -m total used free shared buffers cached Mem: 3889 2075 1814 0 256 592 -/+ buffers/cache: 1226 2663 Swap: 2055 0 2055 **During the test:** top - 13:45:21 up 102 days, 18:46, 1 user, load average: 3.73, 1.51, 0.58 Tasks: 122 total, 8 running, 114 sleeping, 0 stopped, 0 zombie Cpu(s): 93.5%us, 5.2%sy, 0.0%ni, 0.2%id, 0.0%wa, 0.1%hi, 1.1%si, 0.0%st Mem: 3983068k total, 2127564k used, 1855504k free, 262580k buffers Swap: 2104504k total, 0k used, 2104504k free, 608760k cached free -m total used free shared buffers cached Mem: 3889 2125 1763 0 256 595 -/+ buffers/cache: 1274 2615 Swap: 2055 0 2055 iotop 30141 be/4 nginx 0.00 B/s 7.78 K/s 0.00 % 0.00 % nginx: wo~er process Where is the bottleneck ? Or what am I doing wrong ?

Read the article

X11 performance problem after upgrading from Centos3 to Centos5 with an ATI Rage XL

- by Marcelo Santos

After upgrading a computer from Centos3 to Centos5 an application that does a lot of scrolling took a very high performance hit. top tells me that X is using a lot of CPU and that was not happening before. The machine has an ATI Rage XL with 8MB and X is using the ati driver as there is no proprietary ATI driver for this board on linux. The xorg.conf: Section "Device" Identifier "Videocard0" Driver "ati" EndSection Section "Screen" Identifier "Screen0" Device "Videocard0" DefaultDepth 24 SubSection "Display" Viewport 0 0 Depth 24 Modes "1024x768" "800x600" "640x480" EndSubSection EndSection Section "DRI" Group 0 Mode 0666 EndSection A similar machine that still has Centos3 installed is able to start DRI on the X server while this one is not, this is the Xorg.0.log for the Centos5 machine: drmOpenDevice: node name is /dev/dri/card0 drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: Open failed drmOpenDevice: node name is /dev/dri/card0 drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: open result is -1, (No such device or address) drmOpenDevice: Open failed [drm] failed to load kernel module "mach64" (II) ATI(0): [drm] drmOpen failed (EE) ATI(0): [dri] DRIScreenInit Failed (II) ATI(0): Largest offscreen areas (with overlaps): (II) ATI(0): 1024 x 1279 rectangle at 0,768 (II) ATI(0): 768 x 1280 rectangle at 0,768 (II) ATI(0): Using XFree86 Acceleration Architecture (XAA) Screen to screen bit blits Solid filled rectangles 8x8 mono pattern filled rectangles Indirect CPU to Screen color expansion Solid Lines Offscreen Pixmaps Setting up tile and stipple cache: 32 128x128 slots 10 256x256 slots (==) ATI(0): Backing store disabled (==) ATI(0): Silken mouse enabled (II) ATI(0): Direct rendering disabled (==) RandR enabled I also tried using EXA instead of XAA and setting: Option "AccelMethod" "XAA" Option "XAANoOffscreenPixmaps" "true" uname -a Linux sir5.erg.inpe.br 2.6.18-128.7.1.el5 #1 SMP Mon Aug 24 08:20:55 EDT 2009 i686 i686 i386 GNU/Linux rpm -qa | grep xorg-x11-server xorg-x11-server-utils-7.1-4.fc6 xorg-x11-server-sdk-1.1.1-48.52.el5 xorg-x11-server-Xvfb-1.1.1-48.52.el5 xorg-x11-server-Xnest-1.1.1-48.52.el5 xorg-x11-server-Xorg-1.1.1-48.52.el5 The drmOpenDevice error continues when using the suggested Option "AIGLX" "true".

Search Results

Search found 13403 results on 537 pages for 'epm performance tuning'.

Page 93/537 | < Previous Page | 89 90 91 92 93 94 95 96 97 98 99 100 | Next Page >

- by Boden

- by Santosh

- by MarqueIV

- by Luke404

- by user64204

- by Saif Bechan

- by Florian Kruse

- by BSchriver

- by csgwro

- by Clinton Bosch

- by ekms

- by questzen

- by Zorlack

- by nick3216

- by Shiyu Sekam

- by kovan

- by nategood

- by Stu Thompson

- by Marcelo Santos

- by Justin

- by user15863

- by TheLQ

- by Tom

- by dancio

- by Marcelo Santos

< Previous Page | 89 90 91 92 93 94 95 96 97 98 99 100 | Next Page >