Search Results

Search found 19555 results on 783 pages for 'job performance'.

Page 114/783 | < Previous Page | 110 111 112 113 114 115 116 117 118 119 120 121  | Next Page >

  • Slow NFS and GFS2 performance

    - by Tiago
    Recently I've designed and configured a 4 node cluster for a webapp that does lots of file handling. The cluster have been broken down into 2 main roles, webserver and storage. Each role is replicated to a second server using drbd in active/passive mode. The webserver does a NFS mount of the data directory of the storage server and the latter also has a webserver running to serve files to browser clients. In the storage servers I've created a GFS2 FS to hold the data which is wired to drbd. I've chose GFS2 mainly because the announced performance and also because the volume size which has to be pretty high. Since we entered production I've been facing two problems that I think are deeply connected. First of all, the NFS mount on the webservers keeps hanging for a minute or so and then resumes normal operations. By analyzing the logs I've found out that NFS stops answering for a while and outputs the following log lines: Oct 15 18:15:42 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:44 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:46 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:47 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:47 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:47 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:48 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:48 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:51 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:52 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:52 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:55 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:55 <server hostname> kernel: nfs: server active.storage.vlan not responding, still trying Oct 15 18:15:58 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK Oct 15 18:15:59 <server hostname> kernel: nfs: server active.storage.vlan OK In this case, the hang lasted for 16 seconds but sometimes it takes 1 or 2 minutes to resume normal operations. My first guess was this was happening due to heavy load of the NFS mount and that by increasing RPCNFSDCOUNT to a higher value, this would become stable. I've increased it several times and apparently, after a while, the logs started appearing less times. The value is now on 32. After further investigating the issue, I've came across a different hang, despite the NFS messages still appear in the logs. Sometimes, the GFS2 FS simply hangs which causes both the NFS and the storage webserver to serve files. Both stay hang for a while and then they resume normal operations. This hangs leaves no trace on client side (also leaves no NFS ... not responding messages) and, on the storage side, the log system appears to be empty, even though the rsyslogd is running. The nodes connect themselves through a 10Gbps non-dedicated connection but I don't think this is an issue because the GFS2 hang is confirmed but connecting directly to the active storage server. I've been trying to solve this for a while now and I've tried different NFS configuration options, before I've found out the GFS2 FS is also hanging. The NFS mount is exported as such: /srv/data/ <ip_address>(rw,async,no_root_squash,no_all_squash,fsid=25) And the NFS client mounts with: mount -o "async,hard,intr,wsize=8192,rsize=8192" active.storage.vlan:/srv/data /srv/data After some tests, these were the configurations that yielded more performance to the cluster. I am desperate to find a solution for this as the cluster is already in production mode and I need to fix this so that this hangs won't happen in the future and I don't really know for sure what and how I should be benchmarking. What I can tell is that this is happening due to heavy loads as I have tested the cluster earlier and this problems weren't happening at all. Please tell me if you need me to provide configuration details of the cluster, and which do you want me to post. As last resort I can migrate the files to a different FS but I need some solid pointers on whether this will solve this problems as the volume size is extremely large at this point. The servers are being hosted by a third-party enterprise and I don't have physical access to them. Best regards. EDIT 1: The servers are physical servers and their specs are: Webservers: Intel Bi Xeon E5606 2x4 2.13GHz 24GB DDR3 Intel SSD 320 2 x 120GB Raid 1 Storage: Intel i5 3550 3.3GHz 16GB DDR3 12 x 2TB SATA Initially there was a VRack setup between the servers but we've upgraded one of the storage servers to have more RAM and it wasn't inside the VRack. They connect through a shared 10Gbps connection between them. Please note that it is the same connection that is used for public access. They use a single IP (using IP Failover) to connect between them and to allow for a graceful failover. NFS is therefore over a public connection and not under any private network (it was before the upgrade, were the problem still existed). The firewall was configured and tested thoroughly but I disabled it for a while to see if the problem still occurred, and it did. From my knowledge the hosting provider isn't blocking or limiting the connection between either the servers and the public domain (at least under a given bandwidth consumption threshold that hasn't been reached yet). Hope this helps figuring out the problem. EDIT 2: Relevant software versions: CentOS 2.6.32-279.9.1.el6.x86_64 nfs-utils-1.2.3-26.el6.x86_64 nfs-utils-lib-1.1.5-4.el6.x86_64 gfs2-utils-3.0.12.1-32.el6_3.1.x86_64 kmod-drbd84-8.4.2-1.el6_3.elrepo.x86_64 drbd84-utils-8.4.2-1.el6.elrepo.x86_64 DRBD configuration on storage servers: #/etc/drbd.d/storage.res resource storage { protocol C; on <server1 fqdn> { device /dev/drbd0; disk /dev/vg_storage/LV_replicated; address <server1 ip>:7788; meta-disk internal; } on <server2 fqdn> { device /dev/drbd0; disk /dev/vg_storage/LV_replicated; address <server2 ip>:7788; meta-disk internal; } } NFS Configuration in storage servers: #/etc/sysconfig/nfs RPCNFSDCOUNT=32 STATD_PORT=10002 STATD_OUTGOING_PORT=10003 MOUNTD_PORT=10004 RQUOTAD_PORT=10005 LOCKD_UDPPORT=30001 LOCKD_TCPPORT=30001 (can there be any conflict in using the same port for both LOCKD_UDPPORT and LOCKD_TCPPORT?) GFS2 configuration: # gfs2_tool gettune <mountpoint> incore_log_blocks = 1024 log_flush_secs = 60 quota_warn_period = 10 quota_quantum = 60 max_readahead = 262144 complain_secs = 10 statfs_slow = 0 quota_simul_sync = 64 statfs_quantum = 30 quota_scale = 1.0000 (1, 1) new_files_jdata = 0 Storage network environment: eth0 Link encap:Ethernet HWaddr <mac address> inet addr:<ip address> Bcast:<bcast address> Mask:<ip mask> inet6 addr: <ip address> Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:957025127 errors:0 dropped:0 overruns:0 frame:0 TX packets:1473338731 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:2630984979622 (2.3 TiB) TX bytes:1648430431523 (1.4 TiB) eth0:0 Link encap:Ethernet HWaddr <mac address> inet addr:<ip failover address> Bcast:<bcast address> Mask:<ip mask> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 The IP addresses are statically assigned with the given network configurations: DEVICE="eth0" BOOTPROTO="static" HWADDR=<mac address> ONBOOT="yes" TYPE="Ethernet" IPADDR=<ip address> NETMASK=<net mask> and DEVICE="eth0:0" BOOTPROTO="static" HWADDR=<mac address> IPADDR=<ip failover> NETMASK=<net mask> ONBOOT="yes" BROADCAST=<bcast address> Hosts file to allow for a graceful NFS failover in conjunction with NFS option fsid=25 set on both storage servers: #/etc/hosts <storage ip failover address> active.storage.vlan <webserver ip failover address> active.service.vlan As you can see, packet errors are down to 0. I've also ran ping for a long time without any packet loss. MTU size is the normal 1500. As there is no VLan by now, this is the MTU used to communicate between servers. The webservers' network environment is similar. One thing I forgot to mention is that the storage servers handle ~200GB of new files each day through the NFS connection, which is a key point for me to think this is some kind of heavy load problem with either NFS or GFS2. If you need further configuration details please tell me. EDIT 3: Earlier today we had a major filesystem crash on the storage server. I couldn't get the details of the crash right away because the server stop responding. After the reboot, I noticed the filesystem was extremely slow, and I was not being able to serve a single file through either NFS or httpd, perhaps due to cache warming or so. Nevertheless, I've been monitoring the server closely and the following error came up in dmesg. The source of the problem is clearly GFS, which is waiting for a lock and ends up starving after a while. INFO: task nfsd:3029 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. nfsd D 0000000000000000 0 3029 2 0x00000080 ffff8803814f79e0 0000000000000046 0000000000000000 ffffffff8109213f ffff880434c5e148 ffff880624508d88 ffff8803814f7960 ffffffffa037253f ffff8803815c1098 ffff8803814f7fd8 000000000000fb88 ffff8803815c1098 Call Trace: [<ffffffff8109213f>] ? wake_up_bit+0x2f/0x40 [<ffffffffa037253f>] ? gfs2_holder_wake+0x1f/0x30 [gfs2] [<ffffffff814ff42e>] __mutex_lock_slowpath+0x13e/0x180 [<ffffffff814ff2cb>] mutex_lock+0x2b/0x50 [<ffffffffa0379f21>] gfs2_log_reserve+0x51/0x190 [gfs2] [<ffffffffa0390da2>] gfs2_trans_begin+0x112/0x1d0 [gfs2] [<ffffffffa0369b05>] ? gfs2_dir_check+0x35/0xe0 [gfs2] [<ffffffffa0377943>] gfs2_createi+0x1a3/0xaa0 [gfs2] [<ffffffff8121aab1>] ? avc_has_perm+0x71/0x90 [<ffffffffa0383d1e>] gfs2_create+0x7e/0x1a0 [gfs2] [<ffffffffa037783f>] ? gfs2_createi+0x9f/0xaa0 [gfs2] [<ffffffff81188cf4>] vfs_create+0xb4/0xe0 [<ffffffffa04217d6>] nfsd_create_v3+0x366/0x4c0 [nfsd] [<ffffffffa0429703>] nfsd3_proc_create+0x123/0x1b0 [nfsd] [<ffffffffa041a43e>] nfsd_dispatch+0xfe/0x240 [nfsd] [<ffffffffa025a5d4>] svc_process_common+0x344/0x640 [sunrpc] [<ffffffff810602a0>] ? default_wake_function+0x0/0x20 [<ffffffffa025ac10>] svc_process+0x110/0x160 [sunrpc] [<ffffffffa041ab62>] nfsd+0xc2/0x160 [nfsd] [<ffffffffa041aaa0>] ? nfsd+0x0/0x160 [nfsd] [<ffffffff81091de6>] kthread+0x96/0xa0 [<ffffffff8100c14a>] child_rip+0xa/0x20 [<ffffffff81091d50>] ? kthread+0x0/0xa0 [<ffffffff8100c140>] ? child_rip+0x0/0x20

    Read the article

  • TaskFactory.StartNew versus ThreadPool.QueueUserWorkItem

    - by Dan Tao
    Apparently the TaskFactory.StartNew method in .NET 4.0 is intended as a replacement for ThreadPool.QueueUserWorkItem (according to this post, anyway). My question is simple: does anyone know why? Does TaskFactory.StartNew have better performance? Does it use less memory? Or is it mainly for the additional functionality provided by the Task class? In the latter case, does StartNew possibly have worse performance than QueueUserWorkItem? It seems to me that StartNew would actually potentially use more memory than QueueUserWorkItem, since it returns a Task object with every call and I would expect that to result in more memory allocation. In any case, I'm interested to know which is more appropriate for a high-performance scenario.

    Read the article

  • How do I learn Java EE? Where to start?

    - by bloodagar
    I'm going to be honest with you guys. I want to learn Java EE (and its related technologies) because it offers a lot of job opportunities here (the Philippines) and abroad. The problem is I don't know where to start and what to read or where to learn from. I'm almost finished reading Head First Java and I'm liking the language so far. Any tips that would increase my chances of finding a job would be greatly appreciated. Thanks in advance!

    Read the article

  • Is using ReaderWriterLockSlim a bad idea for long lived objects?

    - by uriDium
    I am trying to track down the reason that an application has periods of bad performance. I think that I have linked the bad performance to the points where Garbage Collection is run for Gen 2. I get a profiling tool (CLR Profiler) and was quite surprised by the results. In my test I was spawning and processing millions of objects. However the biggest hog of the Gen 2 space comes from something Called Threading.ReaderWriterCount which comes from System.Threading.ReaderWriterLockSlim::InitializeThreadCounts. I know nothing about the inner workings of ReaderWriterLockSlim but from what I am getting from the reports it is okay to have 1 or 2 Locks for longer lived objects but try and use other locks if you are going to have many smaller objects. Does anyone have any comments or experience with ReaderWriterLockSlim and/or what to look for if it seems that GC is killing application performance?

    Read the article

  • Title for postion needed, can someone help [closed]

    - by Jeffery
    I work in a large resort style rental community. Recently I was given a promotion, though my senior management has found it difficult to come up with a job title. I hold the position of concierge manager, spa manager and special events coordinator. So I would like to come up with a job title that would encompass the three areas. The suggested title was Hospitality Director. However it sounds like I would be working at a Hotel. Can someone please give me some ideas. Thank you in advance

    Read the article

  • Monitor database table for external changes from within Rails application

    - by jhwist
    I'm integrating some non-rails-model tables in my Rails application. Everything works out very nicely, the way I set up the model is: class Change < ActiveRecord::Base establish_connection(ActiveRecord::Base.configurations["otherdb_#{RAILS_ENV}"]) set_table_name "change" end This way I can use the Change model for all existing records with find etc. Now I'd like to run some sort of notification, when a record is added to the table. Since the model never gets created via Change.new and Change.save using ActiveRecord::Observer is not an option. Is there any way I can get some of my Rails code to be executed, whenever a new record is added? I looked at delayed_job but can't quite get my head around, how to set that up. I imagine it evolves around a cron-job, that selects all rows that where created since the job last ran and then calls the respective Rails code for each row.

    Read the article

  • how to store dynamically generated pages in html?

    - by Dharmik Bhandari
    I'm working on ASP.net MVC3 Web application that is facing scalability issue. For improving performance I want to store dynamically generated pages in html and serve them from generated html directly rather then querying database for each page request. I'm sure this will dramatically increase performance. Can any one share any hint / example / tutorial on how to do it? And what are challenges? I would also like to know how others are handling performance issue for large e-commerce sites with at-least thousand categories and 200k products with at least 200-500 concurrent visitors? What are the best approaches? Thanks in advance.

    Read the article

  • Using delayed_job to process file uploads across multiple servers

    - by Steve Klabnik
    Does anyone have any good resources on how to do this? Basically, I'm working on a project (in Rails) where people can upload files. They might be big. I'd like to process them using delayed_job before sending them to S3. I'd also like to do this processing on a separate job queue server, rather than on the webserver itself. I'd rather not have to upload the files to the webserver, then transfer them to the job queue server, and then upload them to S3 if I don't have to. Thanks.

    Read the article

  • worth of getting certified

    - by user58935
    In 6 months I will be a graduate and pursuing a masters in computers for the next 2 years in India. My options after that are to either do a post graduation (again) from a reputed college abroad, or to take up a job. Recently I came to know about about global certification programs like ccna, ccnp, ccie, oca, ocp, j2se, mcse, mcp etc. If I do these certifications, will it help me get a better job, or get into a top college ? How much does it matter? Considering that I like most areas of computers, which certifications are most beneficial? (I even had a crazy idea to do them all in 2.5 years left. Or should I try and master a few instead). Please advise.

    Read the article

  • Would using a MemoryMappedFile for IPC across AppDomains be faster than WCF/named pipes?

    - by Morten Mertner
    Context: I am loading and executing untrusted code in a separate AppDomain and am currently communicating between the two using WCF (using named pipes as the underlying transport). I am exchanging relatively simple object graphs using a reasonably coarse-grained API, but would like to use a more fine-grained API if it does not cost me performance-wise. I've noticed that 4.0 adds a MemoryMappedFile class (which doesn't need a physical file, so could be entirely memory based). What kind of performance gains could I expect to see (if any) by using this new class? I know that it would take some "infrastructure code" to get the request/response behavior of WCF, but for now I'm only interested in the performance difference.

    Read the article

  • SQL Server 2005 - Understanding ouput of DBCC SHOWCONTIG

    - by user169743
    I'm seeing some slow performance on a SQL Server 2005 database. I've been doing some research regarding SQL Server performance but I'm having difficulty fully understanding the output of SHOWCONTIG and would be very grateful if someone could have a look and offer some suggestions to improve performance. TABLE level scan performed. Pages Scanned................................: 19348 Extents Scanned..............................: 2427 Extent Switches..............................: 3829 Avg. Pages per Extent........................: 8.0 Scan Density [Best Count:Actual Count].......: 63.16% [2419:3830] Logical Scan Fragmentation ..................: 8.40% Extent Scan Fragmentation ...................: 35.15% Avg. Bytes Free per Page.....................: 938.1 Avg. Page Density (full).....................: 88.41%

    Read the article

  • Do bit operations cause programs to run slower?

    - by flashnik
    I'm dealing with a problem which needs to work with a lot of data. Currently its values are represented as an unsigned int. I know that real values do not exceed a limit of 1000. Questions I can use unsigned short to store it. An upside to this is that it'll use less storage space to store the value. Will performance suffer? If I decided to store data as short but all the calling functions use int, it's recognized that I need to convert between these datatypes when storing or extracting values. Will performance suffer? Will the loss in performance be dramatic? If I decided to not use short but just 10 bits packed into an array of unsigned int. What will happen in this case comparing with previous ones?

    Read the article

  • MS SQL 2005 - Understanding ouput of DBCC SHOWCONTIG

    - by user169743
    I'm seeing some slow performance on a MS SQL 2005 database. I've been doing some research regarding MS SQL performance but I'm having difficulty fully understanding the output of SHOWCONTIG and would be very grateful if someone could have a look and offer some suggestions to improve performance. TABLE level scan performed. Pages Scanned................................: 19348 Extents Scanned..............................: 2427 Extent Switches..............................: 3829 Avg. Pages per Extent........................: 8.0 Scan Density [Best Count:Actual Count].......: 63.16% [2419:3830] Logical Scan Fragmentation ..................: 8.40% Extent Scan Fragmentation ...................: 35.15% Avg. Bytes Free per Page.....................: 938.1 Avg. Page Density (full).....................: 88.41%

    Read the article

  • Common causes of slow performing jQuery and how to optimize the code?

    - by Polaris878
    Hello, This might be a bit of a vague or general question, but I figure it might be able to serve as a good resource for other jQuery-ers. I'm interested in common causes of slow running jQuery and how to optimize these cases. We have a good amount of jQuery/JavaScript performing actions on our page... and performance can really suffer with a large number off elements. What are some obvious performance pitfalls you know of with jQuery? What are some general optimizations a jQuery-er can do to squeeze every last bit of performance out of his/her scripts? One example: a developer may use a selector to access an element that is slower than some other way. Thanks

    Read the article

  • Delayed Jobs is not finding Records and failing..

    - by Trip
    In my app, delayed jobs isn't running automatically on my server anymore. It used to.. When I manually ssh in, and perform rake jobs:work I return this : * Starting job worker host:ip-(censored) pid:21458 * [Worker(host:ip-(censored) pid:21458)] acquired lock on PhotoJob * [JOB] host:ip-(censored) pid:21458 failed with ActiveRecord::RecordNotFound: Couldn't find Photo with ID=9237 - 4 failed attempts This returns roughly 20 times over for what I think is several jobs. Then I get a few of these: [Worker(host:ip-(censored) pid:21458)] failed to acquire exclusive lock for PhotoJob And then finally one of these : 12 jobs processed at 73.6807 j/s, 12 failed ... Any ideas what I should be mulling over? Thanks so much!

    Read the article

  • C++0x optimizing compiler quality

    - by aaa
    hello. I do some heavy numbercrunching and for me floating-point performance is very important. I like performance of Intel compiler very much and quite content with quality of assembly it produces. I am thinking at some point to try C++0x mainly for sugar parts, like auto, initializer list, etc, but also lambdas. at this point I use those features in regular C++ by the means of boost. How good of assembly code do compilers C++0x generate? specifically Intel and gcc compilers. Do they produce SSE code? is performance comparable to C++? are there any benchmarks? My Google search did not reveal much. Thank you.

    Read the article

  • Process AJAX response with long runing tasks

    - by mpz
    I have long time task in controller action. I use delayed job for it. (Also in heroku it is good practice for perfomance - dyno must work for small time in each request) But my client need result of it work and users can wait on that task. It is more clear: no any addition models or records in it, simple view and js... I think about such way: On client run AJAX with very long timeout (5 min for example) Client make request to server as usual On controller in action1 def start_work (with delay work setup) i need NO any response to client After work performs (delay job finished) i need run new action2 with response to client Client recieve response after about 1-5 min It is possible?

    Read the article

  • Writing at the end of file

    - by user342534
    Hi, I'm working on a system that requires high file I/O performance (with C#). Basically, I'm filling up large files (~100MB) from the start of the file until the end of the file. Every ~5 seconds I'm adding ~5MB to the file (sequentially from the start of the file), on every bulk I'm flushing the stream. Every few minutes I need to update a structure which I write at the end of the file (some kind of metadata). When flushing each one of the bulks I have no performance issue. However, when updating the metadata at the end of the file I get really low performance. My guess is that when creating the file (which also should be done extra fast), the file doesn't really allocates the entire 100MB on the disk and when I flush the metadata it must allocates all space until the end of file. Guys/Girls, any Idea how I can overcome this problem? Thanks a lot!

    Read the article

  • USB 3 vs. eSATA

    - by Robert Nickens
    Will the full speed advantages of the future USB 3.0 be negated by the fact the most HD being mass produced are SATA 3? If so, what would you suggest a person do? For performance reasons go with eSATA or 1394 for external HDs. Why spend the money on USB 3.0 next year,even if the prices come down quickly. Given that SATA 6 is not here and may be a while.

    Read the article

  • "Task Manager" addon for Firefox?

    - by eidylon
    Hello all... I'm wondering if there is an addon for Firefox that would basically replicate the performance monitoring of Task Manager in Windows - seeing memory and cpu used - but for all the tabs in your current Firefox session. I want to be able to see which tabs are taking up the most memory or hitting hardest on the CPU. Thanks in advance!

    Read the article

  • ASPNET WMI class not available

    - by Nexus
    I need to extract the ASPNET\Requests Queued performance counter from some IIS servers via WMI. The WMI class for this sort of thing appears to be contained in Win32_PerfFormattedData_ASPNET_ASPNET. I've queried all available classes in root\cimv2 on my Win 2003/IIS6 servers, and it's not listed. It is, however, available on an unrelated Win2008/IIS7 box (which is interesting but doesn't really help me much) What gives? Why is this WMI class not available on my Windows 2003 servers?

    Read the article

  • IRP_MJ_WRITE latency up to 15 seconds

    - by racitup
    We have written an application that performs small (22kB) writes to multiple files at once (one thread performing asynchronous queued writes to multiple locations on behalf of other threads) on the same local volume (RAID1). 99.9% of the writes are low-latency but occasionally (maybe every minute or two) we get one or two huge latency writes (I have seen 10 seconds and above) without any real explanation. Platform: Win2003 Server with NTFS. Monitoring: Sysinternals Process Monitor (see link below) and our own application logging. We have tried multiple things to try and solve this that have been gleaned from a few websites, e.g.: Making the first part of file names unique to aid 8.3 name generation Writing files to multiple directories Changing Intel Disk Write Caching Windows File/Printer Sharing Minimize memory used Balance Maximize data throughput for file sharing Maximize data throughput for network applications System-Advanced-Performance-Advanced NtfsDisableLastAccessUpdate - use fsutil behavior set disablelastaccess 1 disable 8.3 name generation - use "fsutil behavior set disable8dot3 1" + restart Enable a large size file system cache Disable paging of the kernel code IO Page Lock Limit Turn Off (or On) the Indexing Service But nothing seems to make much difference. There's a whole host of things we haven't tried yet but we wondered if anyone had come across the same problem, a reason and a solution (programmatic or not)? We can reproduce the problem using IOMeter and a simple setup: Start IOMeter and remove all but the first worker thread in 'Topology' using the disconnect button. Select the Worker thread and put a cross in the box next to the disk you want to use in the Disk Targets tab and put '2000000' in Maximum Disk Size (NOTE: must have at least 1GB free space; sector size is 512 bytes) Next create a new access specification and add it to the worker thread: Transfer Request Size = 22kB 100% Sequential Percent of Access Spec = 100% Percent Read/Write = 100% Write Change Results Display Update Frequency to 5 seconds, Test Setup Run Time to 20 seconds and both 'Number of Workers to Spawn Automatically' settings to zero. Select the Worker Thread in the Topology panel and hit the Duplicate Worker button 59 times to create 60 threads with identical settings. Hit the 'Go' button (green flag) and monitor the Results tab. The 'Maximum I/O Response Time (ms)' always hits at least 3500 on our machine. Our machine isn't exactly slow (Xeon 8 core rack server with 4GB and onboard RAID). I'd be interested to see what other people get. We have a feeling it might be something to do with the NTFS filesystem (ours is currently 75% full of fragmented files) and we are going to try a few things around this principle. But it is also related to disk performance since we don't see it on a RAMDisk and it's not as severe on a RAID10 array. Any help is much appreciated. Richard Right-click and select 'Open Link in New Tab': ProcMon Result

    Read the article

< Previous Page | 110 111 112 113 114 115 116 117 118 119 120 121  | Next Page >