Search Results

Search found 146 results on 6 pages for 'slowdown'.

Page 5/6 | < Previous Page | 1 2 3 4 5 6  | Next Page >

  • VMWare Workstation 8 Disk I/O & Hard Faults

    - by Scott
    I have VMWare Workstation 8 installed on a host machine with the following specs: Intel i5 2500k CPU 16GB DDR3 1600 ram 1TB Western Digital Caviar Black HD I have two Windows 7 virtual machines configured (currently running one at a time but will be operating both at once when my 32GB RAM kit arrives in a couple days). Each one is configured with 8GB of RAM and no tweaks/performance customizations or anything done. All of the VMWare settings are the defaults. When I boot into these machines and run various programs (Visual Studio, Outlook, etc), I can hear the disk thrashing quite a bit and checking Resource Monitor, I can see that I'm getting anywhere between 300-800 hard faults per second. From the host machine, it shows they're coming from the VMWare image. If I go to the virtual machine, whatever app I'm currently loading is the image that's causing the hard faults. As I understand it, hard faults are (simply) when an address in memory has been swapped out to the page file and has to be read from the page file instead of from memory. I don't understand why this is happening though. With 8GB of ram on the guest machine and 6.5GB available, what could be causing this? I know Windows 7 supposedly improved on page file management over XP but it seems excessive for this kind of slowdown, disk thrashing and high hard fault count when I have that much free RAM. Is there anything I can to to improve the performance in my guest machines? On the host machine, I can open/run any applications at all and hard faults stays around 0 with low disk I/O.

    Read the article

  • Raspberry pi slows down my entire network

    - by gnusouth
    Whenever my Raspberry Pi is connected to the network (via ethernet) the entire network is slowed to a crawl. On my main computer, ping times for google.com go from ~10ms to ~200ms and it takes forever to load web pages. Connections are also slow on the Pi, with an apt-get update showing pathetic speeds in the order of 1KB/s. Turning off the Pi completely removes the drag from the network. I've tried static and dynamic IP addresses for the Pi, but both have the same problems. I'm currently using Raspbian (downloaded today), but also had this problem with Arch Linux. I've checked the connection's duplex with dmesg | grep -i duplex, which shows that the Pi's connection is running at 100Mbps, full-duplex, as expected. My modem/router is a Billion 7404VNPX (an Australian thing); relatively high-end, albeit a bit buggy at times (it will occassionally delete all its firewall settings). It assigns IPs in the range 192.168.1.1 to 192.168.1.20 and has 192.168.1.254 as its own IP. When I assign static IPs I tend to use the 192.168.1.200 area. Does anyone have any idea as to what could be causing this weird slowdown? Or any tests I could try? Thanks

    Read the article

  • How to tell Linux to explicitly swap out main memory of a suspended process?

    - by Vi
    I run a memory-hungry process (mkcromfs) which consumes more memory than I have physical memory on my latop, so it is paging and swappin and thrashing all the time and loadavg is about 2 (compcache is already in use with usual swap partition as well), but slowly moving forward (Although I afraid it will finally try to allocate 2GB and crash draining 2 days of thrashing). When I want to use the laptop for something else, I stop the process, start X server, firefox and other programs. The problem is that when I start Firefox the loadavg jumps to 10 and the system becomes almost unresponsive at all (long time to turn on/off caps lock, slow mouse cursor position updates, slow switching from X server to Linux console, slow login). The stopped mkcromfs still holds a lot of memory (464.8 MiB and slowly falling) and moves it to swap only when more memory is needed for some other program, which results in a great slowdown. How to tell the Linux to swap out this process entirely (e.g. I'm not intending to resume it in short term), possibly waking from swap other data? Also it will be useful to be able to specify the exact swap device to swap the given process out.

    Read the article

  • High disk I/O activity in CentOS server

    - by triiim
    I have about 16 websites in a CentOS dedicated, and I am having some problems on high traffic hours, it seems to be a high disk I/O activity causing a general slowdown. I've installed atop and this is what I see on the bottom (the server has been restarted thats why the values are so low): *** system and process activity since boot *** PID RDDSK WRDSK WCANCL DSK CMD 1/18 2176 1.7G 7.3G 854.4M 39 mysqld 671 1248K 3.0G 0K 13 flush-8:0 566 0K 1.1G 0K 5 jbd2/sda2-8 2401 124.2M 529.1M 22408K 3 crond 2032 2.2G 502.0M 0K 12 nginx 2360 425.8M 115.3M 4188K 2 httpd flush-8:0 and jbd2/sda2-8 are the processes I see with iotop using 99% on the IO column, and they are the processes that write the most on the hdd (after mysql). From what I saw in google this could be caused by some ext4 related bug, the current kernel is: Linux srvr.com 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon Jun 27 19:49:27 BST 2011 x86_64 x86_64 x86_64 GNU/Linux I asked the hosting support to update the kernel and they tried but they now say that the server wont boot with the new installed kernel and they had to go back to the previous, they are not helping very much. Does someone has any idea how could I solve the high disk usage caused by flush-8:0 and jbd2/sda2-8 processes?

    Read the article

  • why would resetting the Netgear N300 router fix my Win 7 laptop's slow wifi?

    - by rjnagle
    In the past day the wifi download speeds of my Win 7 HP 64 bit laptop have slowed considerably. I am trying to troubleshoot the problem and to figure out whether it's hardware related (i.e., is the Intel(R) Centrino(R) Wireless-N 2230 the problem?) or router related. I have a Netgear N300 router connected to my modem. I'm using Speedtest to measure my speed. First, during my problem state, my ipad can download and upload at normal speeds. It's only my Win 7 laptop which is having problems. Because my ipad downloads at normal speeds, that would tell me that the problem is specific to the laptop (either HW or SW). But when I restarted my Netgear router, the laptop wifi problems disappeared. That just doesn't make sense. If we know that one device can connect properly to the router, why would a laptop have problems? What are some possible reasons why this might happen? Also, during my problem state, I noticed that on my laptop upload speeds were faster than my download speeds. Anybody have a guess about what might cause upload speeds on one device to be faster than another? Is there any actions i could take (or options to enable) so this problem won't occur. (I initially thought my problem might be software related or memory related -- Norton AV or browser plugins. But even after I disabled everything and made sure memory footprint was minimal, the slowdown was still occurring -- and it solved itself altogether when the router was reset).

    Read the article

  • Windows mounted network drives slow after upgrading switch

    - by Kver
    On our small business network our old 10/100 consumer grade switch gave up the ghost, and we replaced it with a proper business-grade gigabyte switch. After wiring it in our Linux and Mac users immediately got back to working off of network drives; But 2 of our 3 Windows 7 PCs have suddenly experienced a tremendous slowdown with mapped network drives; Windows will become stuck "discovering" a folder causing applications to freeze when trying to open files. It will instantly display and browse files, but the moment you try to open one the bug hits. To remedy this we have our users copying files to the desktop, but it can take a few minutes while windows is stuck "calculating" the time it will take to copy. These aren't big files, mostly excel sheets less than 500KB - these operations are instant on Linux and Mac. (The third Windows machine is having no issues) I've tried remapping the drives, mapping to different drive letters, rebooting, etc. I'm at a loss, because switches are mostly transparent, and it's only after the switch was replaced that the Windows PCs started acting up. What black-magic voodoo am I missing to make Windows work? Thank you.

    Read the article

  • How to tell Linux to explicitly swap out main memory of suspended process?

    - by Vi
    I run a memory-hungry process (mkcromfs) which consumes more memory than I have physical memory on my latop, so it is paging and swappin and thrashing all the time and loadavg is about 2 (compcache is already in use with usual swap partition as well), but slowly moving forward (Although I afraid it will finally try to allocate 2GB and crash draining 2 days of thrashing). When I want to use the laptop for something else, I stop the process, start X server, firefox and other programs. The problem is that when I start Firefox the loadavg jumps to 10 and the system becomes almost unresponsive at all (long time to turn on/off caps lock, slow mouse cursor position updates, slow switching from X server to Linux console, slow login). The stopped mkcromfs still holds a lot of memory (464.8 MiB and slowly falling) and moves it to swap only when more memory is needed for some other program, which results in a great slowdown. How to tell the Linux to swap out this process entirely (e.g. I'm not intending to resume it in short term), possibly waking from swap other data? Also it will be useful to be able to specify the exact swap device to swap the given process out.

    Read the article

  • Suggestions for transitioning to new GW/private network

    - by Quinten
    I am replacing a private T1 link with a new firewall device with an ipsec tunnel for a branch office. I am trying to figure out the right way to transition folks at the new site over to the new connection, so that they default to using the much faster tunnel. Existing network: 192.168.254.0/24, gw 192.168.254.253 (Cisco router plugged in to private t1) Test network I have been using with ipsec tunnel: 192.168.1.0/24, gw 192.168.1.1 (pfsense fw plugged in to public internet), also plugged in to same switch as the old network. There are probably ~20-30 network devices in the existing subnet, about 5 with static IPs. The remote endpoint is already the firewall--I can't set up redundant links to the existing subnet. In other words, as soon as I change the tunnel configuration to point to 192.168.254.0/24, all devices in the existing subnet will stop working because they point to the wrong gateway. I'd like some ability to do this slowly--such that I can move over a few clients and verify the stability of the new link before moving critical services or less tolerant users over. What's the right way to do this? Change the netmask on all of the devices to /16, and update gateway to point to the new device? Could this cause any problems? Also, how should I handle DNS? The pfsense box is not aware of my Active Directory environment. But if I change DNS to use the local servers, it will result in a huge slowdown as DNS queries will still be routed over the private t1. I need some help coming up with a plan that's not too disruptive but will really let me thoroughly test the stability of the IPSEC tunnel before I make the final switch. The AD version is 2008R2, as are the servers. Workstations are mostly Windows XP SP3. I have not configured the 192.168.1.0/24 as a site in AD sites and services.

    Read the article

  • Windows Server 2008 R2 grinds to a screeching halt during file copy operations

    - by skolima
    When my Windows Server 2008 R2 machine is performing any large disk operations (copying 10GB files from one drive to another, copying similar file over network, merging HyperV snapshots, compressing large files), performance of the whole machine slows down terribly, everything becomes unresponsive. This is noticeable in any situation when the disk access is large enough not to fit in the cache. Are there any settings available for tuning this behaviour? I can accept slower file transfer if this would give me more responsiveness. System details: Dell Optiflex 960, Core 2 Quad Q9650, 8GB RAM, 2 SATA drives - 320GB (ST3320418AS) and 1TB (ST31000528AS), NCQ active on both, Intel 82564LM-3 Gigabit Ethernet, ATI HD 3450 graphics, Intel ICH10 bridge. We have multiple machines like this, every one is exhibiting the same behaviour. I though this was overkill for a workstation, apparently I was mistaken. Update: I guess I shouldn't have mentioned the HyperV at all. The above configuration is a standard workstation setup at the company I work for, this is not a server of any kind. I have at most 3 virtual machines working, and usually I'm the only person accessing them. Never the less, the slowdown occurs even when no VMs are running. On a Linux machine I'd simply ionice the copy process and I could forget about it, is there any way to manage IO priorities on Windows?

    Read the article

  • Should I completely turn off swap for linux webserver?

    - by Poma
    Recently my friend told me that it is a good idea to turn off swap on linux webservers with enough memory. My server has 12 GB and currently uses 4GB (not counting cache and buffers) under peak load. His argument was that in a normal situation server will never use all of its RAM so the only way it can encounter OutOfMemory situation is due to some bug/ddos/etc. So in case swap is turned off system will run out of memory that will eventually crash the program hogging memory (most likely the web server process) and probably some other processes. In case swap is turned on it will eat both RAM and swap and eventually will result in the same crash, but before that it will offload crucial processes like sshd to swap and start to do a lot of swap operations resulting in major slowdown. This way when under ddos system may go into a completely unusable condition due to huge lags and I probably will not be unable to log in and kill webserver process or deny all incoming traffic (all but ssh). Is this right? Am I missing something (like the fact that swap partition is very useful in some way even if I have enough RAM)? Should I turn it off?

    Read the article

  • How do I disable DirectDraw and Direct3D acceleration on Windows 8? [closed]

    - by Favourite Chigozie Onwuemene
    Some old games that i would really like to play run slowly on some graphic drivers when direct3d acceleration is enabled. I have tried many suggestions but none seems to work. The only thing i have not tried is disabling direct3d acceleration. Is it possible to disable DirectDraw and Direct3D acceleration on my Windows 8 pc? There are certain bad versions of GeForce drivers that cause this problem. This is a problem in the drivers themselves and is unfortunately completely outside our control. The recommended way to fix this problem is to update your graphics card drivers (go to NVIDIA's web site for this). Alternatively, there is a workaround that alleviates or solves the slowdown problem altogether. Try this: Right-click on your desktop and select "Properties". Go to the "Settings" tab and click on "Advanced...". Click on the "Troubleshooting" tab and move the slider to the left until it says that all DirectDraw and Direct3D accelerations have been disabled (around the middle of the range). Finally, click on "OK". Note that this workaround might cause other games on your computer to slow down, so you may have to switch back and forth between settings, but it's certainly worth a try if you can't obtain an updated graphics driver. -source: interactionstudios

    Read the article

  • Grid pathfinding with a lot of entities

    - by Vee
    I'd like to explain this problem with a screenshot from a released game, DROD: Gunthro's Epic Blunder, by Caravel Games. The game is turn-based and tile-based. I'm trying to create something very similar (a clone of the game), and I've got most of the fundamentals done, but I'm having trouble implementing pathfinding. Look at the screenshot. The guys in yellow are friendly, and want to kill the roaches. Every turn, every guy in yellow pathfinds to the closest roach, and every roach pathfinds to the closest guy in yellow. By closest I mean the target with the shortest path, not a simple distance calculation. All of this without any kind of slowdown when loading the level or when passing turns. And all of the entities change position every turn. Also (not shown in screenshot), there can be doors that open and close and change the level's layout. Impressive. I've tried implementing pathfinding in my clone. First attempt was making every roach find a path to a yellow guy every turn, using a breadth-first search algorithm. Obviously incredibly slow with more than a single roach, and would get exponentially slower with more than a single yellow guy. Second attempt was mas making every yellow guy generate a pathmap (still breadth-first search) every time he moved. Worked perfectly with multiple roaches and a single yellow guy, but adding more yellow guys made the game slow and unplayable. Last attempt was implementing JPS (jump point search). Every entity would individually calculate a path to its target. Fast, but with a limited number of entities. Having less than half the entities in the screenshot would make the game slow. And also, I had to get the "closest" enemy by calculating distance, not shortest path. I've asked on the DROD forums how they did it, and a user replied that it was breadth-first search. The game is open source, and I took a look at the source code, but it's C++ (I'm using C#) and I found it confusing. I don't know how to do it. Every approach I tried isn't good enough. And I believe that DROD generates global pathmaps, somehow, but I can't understand how every entity find the best individual path to other entities that move every turn. What's the trick? This is a reply I just got on the DROD forums: Without having looked at the code I'd wager it's two (or so) pathmaps for the whole room: One to the nearest enemy, and one to the nearest friendly for every tile. There's no need to make a separate pathmap for every entity when the overall goal is "move towards nearest enemy/friendly"... just mark every tile with the number of moves it takes to the nearest target and have the entity chose the move that takes it to the tile with the lowest number. To be honest, I don't understand it that well.

    Read the article

  • How much multiple style sheets slow down to website?

    - by metal-gear-solid
    Here is 3 css file (one is only for IE) <link rel="stylesheet" href="css/blueprint/screen.css" type="text/css" media="screen, projection"> <link rel="stylesheet" href="css/blueprint/print.css" type="text/css" media="print"> <!--[if lt IE 8]><link rel="stylesheet" href="css/blueprint/ie.css" type="text/css" media="screen, projection"><![endif]--> If i keep divide scree.css into these css in my website Now it will be 6 css ( one is only for IE) <link rel="stylesheet" href="css/blueprint/reset.css" type="text/css" media="screen, projection"> <link rel="stylesheet" href="css/blueprint/grid.css" type="text/css" media="screen, projection"> <link rel="stylesheet" href="css/blueprint/typography.css" type="text/css" media="screen, projection"> <link rel="stylesheet" href="css/blueprint/forms.css" type="text/css" media="screen, projection"> <link rel="stylesheet" href="css/blueprint/print.css" type="text/css" media="print"> <!--[if lt IE 8]><link rel="stylesheet" href="css/blueprint/ie.css" type="text/css" media="screen, projection"><![endif]--> If i go for method two for a website even after production . Does it really slowdown the website page loading speed? if yes then how much? How much these 3 extra style sheet will affect site performance?

    Read the article

  • SSL Slow in IE 8.0.7600.16385IC

    - by discovery.jerrya
    I'm having a performance problem on my company's web site using a specific version of IE 8 to load a page using https. Here's what I know. Server: Virtual machine running on VMWare ESX Windows Server 2003 Enterprise Edition SP 2 Tomcat 6.0.16 Client: Windows XP and Window 7 Internet Explorer 8.0.7600.16385IC Page loads/refreshes in under 1 second using HTTP. Page loads/refreshes in 15-16 seconds in HTTPS using this version of IE. Problem reproduced on multiple client machines with same IE version. Problem reproduced on multiple client machines with different Windows versions (XP and 7). No performance problem using Chrome, Firefox, Opera, or Safari from same machine. No performance problem using other versions of IE 8 on other machines. Slow load causes virtually no CPU, memory, or I/O spike on server or client machine. No performance problem on other sites using HTTPS on same client machine. The pages in question use JavaScript and innerHTML to replace the contents of div elements to create a collapsible menu, and an iframe to display some content. A couple of the div elements contain images. If I remove the iframe and the JavaScript, the performance issues go away. However, rewriting the entire site to make these changes would be very time consuming. We're in the process of replacing the whole site, but it may be 2-3 months before we do so and we really cannot live with this slowdown that long. I've already looked at several IE tuning options, such as disabling add ons, running IE-rereg, and resetting IE, with no luck. Does anyone have any suggestions?

    Read the article

  • Why is setting HTML5's CanvasPixelArray values ridiculously slow and how can I do it faster?

    - by Nixuz
    I am trying to do some dynamic visual effects using the HTML 5 canvas' pixel manipulation, but I am running into a problem where setting pixels in the CanvasPixelArray is ridiculously slow. For example if I have code like: imageData = ctx.getImageData(0, 0, 500, 500); for (var i = 0; i < imageData.length; i += 4){ imageData.data[i] = buffer[i]; imageData.data[i + 1] = buffer[i + 1]; imageData.data[i + 2] = buffer[i + 2]; } ctx.putImageData(imageData, 0, 0); Profiling with Chrome reveals, it runs 44% slower than the following code where CanvasPixelArray is not used. tempArray = new Array(500 * 500 * 4); imageData = ctx.getImageData(0, 0, 500, 500); for (var i = 0; i < imageData.length; i += 4){ tempArray[i] = buffer[i]; tempArray[i + 1] = buffer[i + 1]; tempArray[i + 2] = buffer[i + 2]; } ctx.putImageData(imageData, 0, 0); My guess is that the reason for this slowdown is due to the conversion between the Javascript doubles and the internal unsigned 8bit integers, used by the CanvasPixelArray. Is this guess correct? Is there anyway to reduce the time spent setting values in the CanvasPixelArray?

    Read the article

  • Avoiding GC thrashing with WSE 3.0 MTOM service

    - by Leon Breedt
    For historical reasons, I have some WSE 3.0 web services that I cannot upgrade to WCF on the server side yet (it is also a substantial amount of work to do so). These web services are being used for file transfers from client to server, using MTOM encoding. This can also not be changed in the short term, for reasons of compatibility. Secondly, they are being called from both Java and .NET, and therefore need to be cross-platform, hence MTOM. How it works is that an "upload" WebMethod is called by the client, sending up a chunk of data at a time, since files being transferred could potentially be gigabytes in size. However, due to not being able to control parts of the stack before the WebMethod is invoked, I cannot control the memory usage patterns of the web service. The problem I am running into is for file sizes from 50MB or so onwards, performance is absolutely killed because of GC, since it appears that WSE 3.0 buffers each chunk received from the client in a new byte[] array, and by the time we've done 50MB we're spending 20-30% of time doing GC. I've played with various chunk sizes, from 16k to 2MB, with no real great difference in results. Smaller chunks are killed by the latency involved with round-tripping, and larger chunks just postpone the slowdown until GC kicks in. Any bright ideas on cutting down on the garbage created by WSE? Can I plug into the pipeline somehow and jury-rig something that has access to the client's request stream and streams it to the WebMethod? I'm aware that it is possible to "stream" responses to the client using WSE (albeit very ugly), but this problem is with requests from the client.

    Read the article

  • Why is setting HTML5's CanvasPixelArray values is ridiculously slow and how can I do it faster?

    - by Nixuz
    I am trying to do some dynamic visual effects using the HTML 5 canvas' pixel manipulation, but I am running into a problem where setting pixels in the CanvasPixelArray is ridiculously slow. For example if I have code like: imageData = ctx.getImageData(0, 0, 500, 500); for (var i = 0; i < imageData.length; i += 4){ imageData.data[index] = buffer[i]; imageData.data[index + 1] = buffer[i]; imageData.data[index + 2] = buffer[i]; } ctx.putImageData(imageData, 0, 0); Profiling with Chrome reveals, it runs 44% slower than the following code where CanvasPixelArray is not used. tempArray = new Array(500 * 500 * 4); imageData = ctx.getImageData(0, 0, 500, 500); for (var i = 0; i < imageData.length; i += 4){ tempArray[index] = buffer[i]; tempArray[index + 1] = buffer[i]; tempArray[index + 2] = buffer[i]; } ctx.putImageData(imageData, 0, 0); My guess is that the reason for this slowdown is due to the conversion between the Javascript doubles and the internal unsigned 8bit integers, used by the CanvasPixelArray. Is this guess correct? Is there anyway to reduce the time spent setting values in the CanvasPixelArray?

    Read the article

  • How does loop address alignment affect the speed on Intel x86_64?

    - by Alexander Gololobov
    I'm seeing 15% performance degradation of the same C++ code compiled to exactly same machine instructions but located on differently aligned addresses. When my tiny main loop starts at 0x415220 it's faster then when it is at 0x415250. I'm running this on Intel Core2 Duo. I use gcc 4.4.5 on x86_64 Ubuntu. Can anybody explain the cause of slowdown and how I can force gcc to optimally align the loop? Here is the disassembly for both cases with profiler annotation: 415220 576 12.56% |XXXXXXXXXXXXXX 48 c1 eb 08 shr $0x8,%rbx 415224 110 2.40% |XX 0f b6 c3 movzbl %bl,%eax 415227 0.00% | 41 0f b6 04 00 movzbl (%r8,%rax,1),%eax 41522c 40 0.87% | 48 8b 04 c1 mov (%rcx,%rax,8),%rax 415230 806 17.58% |XXXXXXXXXXXXXXXXXXX 4c 63 f8 movslq %eax,%r15 415233 186 4.06% |XXXX 48 c1 e8 20 shr $0x20,%rax 415237 102 2.22% |XX 4c 01 f9 add %r15,%rcx 41523a 414 9.03% |XXXXXXXXXX a8 0f test $0xf,%al 41523c 680 14.83% |XXXXXXXXXXXXXXXX 74 45 je 415283 ::Run(char const*, char const*)+0x4b3 41523e 0.00% | 41 89 c7 mov %eax,%r15d 415241 0.00% | 41 83 e7 01 and $0x1,%r15d 415245 0.00% | 41 83 ff 01 cmp $0x1,%r15d 415249 0.00% | 41 89 c7 mov %eax,%r15d 415250 679 13.05% |XXXXXXXXXXXXXXXX 48 c1 eb 08 shr $0x8,%rbx 415254 124 2.38% |XX 0f b6 c3 movzbl %bl,%eax 415257 0.00% | 41 0f b6 04 00 movzbl (%r8,%rax,1),%eax 41525c 43 0.83% |X 48 8b 04 c1 mov (%rcx,%rax,8),%rax 415260 828 15.91% |XXXXXXXXXXXXXXXXXXX 4c 63 f8 movslq %eax,%r15 415263 388 7.46% |XXXXXXXXX 48 c1 e8 20 shr $0x20,%rax 415267 141 2.71% |XXX 4c 01 f9 add %r15,%rcx 41526a 634 12.18% |XXXXXXXXXXXXXXX a8 0f test $0xf,%al 41526c 749 14.39% |XXXXXXXXXXXXXXXXXX 74 45 je 4152b3 ::Run(char const*, char const*)+0x4c3 41526e 0.00% | 41 89 c7 mov %eax,%r15d 415271 0.00% | 41 83 e7 01 and $0x1,%r15d 415275 0.00% | 41 83 ff 01 cmp $0x1,%r15d 415279 0.00% | 41 89 c7 mov %eax,%r15d

    Read the article

  • _dl_runtime_resolve -- When do the shared objects get loaded in to memory?

    - by windfinder
    We have a message processing system with high performance demands. Recently we have noticed that the first message takes many times longer then subsequent messages. A bunch of transformation and message augmentation happens as this goes through our system, much of it done by way of external lib. I just profiled this issue (using callgrind), comparing a "run" of just one message with a "run" of many messages (providing a baseline of comparison). The main difference I see is the function "do_lookup_x" taking up a huge amount of time. Looking at the various calls to this function, they all seem to be called by the common function: _dl_runtime_resolve. Not sure what this function does, but to me this looks like the first time the various shared libraries are being used, and are then being loaded in to memory by the ld. Is this a correct assumption? That the binary will not load the shared libraries in to memory until they are being prepped for use, therefore we will see a massive slowdown on the first message, but on none of the subsequent? How do we go about avoiding this? Note: We operate on the microsecond scale.

    Read the article

  • Python: slow read & write for millions of small files

    - by Jami
    I am building directory tree which has tons of subdirectories and files. The total directory count is somewhere along 256^32 subdirectories with 256 files in each end which are only a few bytes long. I did this so I would have fast access to these files (since i'm not searching and i'm just directly accessing then via a known file path) I have a python script that builds this filesystem and reads & writes those files. The problem is that when I reach more than 1Gb of total filesize, the read and write methods become extremely slow. Here's the function I have that reads the contents of a file (the file contains an integer string), adds a certain number to it, then writes it back to the original file. def addInFile(path, scoreToAdd): num = scoreToAdd try: shutil.copyfile(path, '/tmp/tmp.txt') fp = open('/tmp/tmp.txt', 'r') num += int(fp.readlines()[0]) fp.close() except: pass fp = open('/tmp/tmp.txt', 'w') fp.write(str(num)) fp.close() shutil.copyfile('/tmp/tmp.txt', path) I previously tried performing linux console commands but it was slower. I copy the file to a temporary file first then access/modify it then copy it back because i found this was faster than directly accessing the file. I think the cause of the slowdown is because there're tons of files. performing this function 1000 times sometimes reach 1 minute now, but before (when there were only a few files, 1000 calls was performed for only less than 1 second) How do you suggest I fix this?

    Read the article

  • Parallel version of loop not faster than serial version

    - by Il-Bhima
    I'm writing a program in C++ to perform a simulation of particular system. For each timestep, the biggest part of the execution is taking up by a single loop. Fortunately this is embarassingly parallel, so I decided to use Boost Threads to parallelize it (I'm running on a 2 core machine). I would expect at speedup close to 2 times the serial version, since there is no locking. However I am finding that there is no speedup at all. I implemented the parallel version of the loop as follows: Wake up the two threads (they are blocked on a barrier). Each thread then performs the following: Atomically fetch and increment a global counter. Retrieve the particle with that index. Perform the computation on that particle, storing the result in a separate array Wait on a job finished barrier The main thread waits on the job finished barrier. I used this approach since it should provide good load balancing (since each computation may take differing amounts of time). I am really curious as to what could possibly cause this slowdown. I always read that atomic variables are fast, but now I'm starting to wonder whether they have their performance costs. If anybody has some ideas what to look for or any hints I would really appreciate it. I've been bashing my head on it for a week, and profiling has not revealed much.

    Read the article

  • AudioRecord problems with non-HTC devices

    - by Marc
    I'm having troubles using AudioRecord. An example using some of the code derived from the splmeter project: private static final int FREQUENCY = 8000; private static final int CHANNEL = AudioFormat.CHANNEL_CONFIGURATION_MONO; private static final int ENCODING = AudioFormat.ENCODING_PCM_16BIT; private int BUFFSIZE = 50; private AudioRecord recordInstance = null; ... android.os.Process.setThreadPriority(android.os.Process.THREAD_PRIORITY_URGENT_AUDIO); recordInstance = new AudioRecord(MediaRecorder.AudioSource.MIC, FREQUENCY, CHANNEL, ENCODING, 8000); recordInstance.startRecording(); short[] tempBuffer = new short[BUFFSIZE]; int retval = 0; while (this.isRunning) { for (int i = 0; i < BUFFSIZE - 1; i++) { tempBuffer[i] = 0; } retval = recordInstance.read(tempBuffer, 0, BUFFSIZE); ... // process the data } This works on the HTC Dream and the HTC Magic perfectly without any log warnings/errors, but causes problems on the emulators and Nexus One device. On the Nexus one, it simply never returns useful data. I cannot provide any other useful information as I'm having a remote friend do the testing. On the emulators (Android 1.5, 2.1 and 2.2), I get weird errors from the AudioFlinger and Buffer overflows with the AudioRecordThread. I also get a major slowdown in UI responsiveness (even though the recording takes place in a separate thread than the UI). Is there something apparent that I'm doing incorrectly? Do I have to do anything special for the Nexus One hardware?

    Read the article

  • Better use a tuple or numpy array for storing coordinates

    - by Ivan
    Hi, I'm porting an C++ scientific application to python, and as I'm new to python, some problems come to my mind: 1) I'm defining a class that will contain the coordinates (x,y). These values will be accessed several times, but they only will be read after the class instantiation. Is it better to use an tuple or an numpy array, both in memory and access time wise? 2) In some cases, these coordinates will be used to build a complex number, evaluated on a complex function, and the real part of this function will be used. Assuming that there is no way to separate real and complex parts of this function, and the real part will have to be used on the end, maybe is better to use directly complex numbers to store (x,y)? How bad is the overhead with the transformation from complex to real in python? The code in c++ does a lot of these transformations, and this is a big slowdown in that code. 3) Also some coordinates transformations will have to be performed, and for the coordinates the x and y values will be accessed in separate, the transformation be done, and the result returned. The coordinate transformations are defined in the complex plane, so is still faster to use the components x and y directly than relying on the complex variables? Thank you

    Read the article

  • Weird stuttering issues not related to GC.

    - by Smills
    I am getting some odd stuttering issues with my game even though my FPS never seems to drop below 30. About every 5 seconds my game stutters. I was originally getting stuttering every 1-2 seconds due to my garbage collection issues, but I have sorted those and will often go 15-20 seconds without a garbage collection. Despite this, my game still stutters periodically even when there is no GC listed in logcat anywhere near the stutter. Even when I take out most of my code and simply make my "physics" code the below code I get this weird slowdown issue. I feel that I am missing something or overlooking something. Shouldn't that "elapsed" code that I put in stop any variance in the speed of the main character related to changes in FPS? Any input/theories would be awesome. Physics: private void updatePhysics() { //get current time long now = System.currentTimeMillis(); //added this to see if I could speed it up, it made no difference Thread myThread = Thread.currentThread(); myThread.setPriority(Thread.MAX_PRIORITY); //work out elapsed time since last frame in seconds double elapsed = (now - mLastTime2) / 1000.0; mLastTime2 = now; //measures FPS and displays in logcat once every 30 frames fps+=1/elapsed; fpscount+=1; if (fpscount==30) { fps=fps/fpscount; Log.i("myActivity","FPS: "+fps+" Touch: "+touch); fpscount=0; } //this should make the main character (theoretically) move upwards at a steady pace mY-=100*elapsed; //increase amount I translate the draw to = main characters Y //location if the main character goes upwards if (mY<=viewY) { viewY=mY; } }

    Read the article

  • Computation overhead in C# - Using getters/setters vs. modifying arrays directly and casting speeds

    - by Jeffrey Kern
    I was going to write a long-winded post, but I'll boil it down here: I'm trying to emulate the graphical old-school style of the NES via XNA. However, my FPS is SLOW, trying to modify 65K pixels per frame. If I just loop through all 65K pixels and set them to some arbitrary color, I get 64FPS. The code I made to look-up what colors should be placed where, I get 1FPS. I think it is because of my object-orented code. Right now, I have things divided into about six classes, with getters/setters. I'm guessing that I'm at least calling 360K getters per frame, which I think is a lot of overhead. Each class contains either/and-or 1D or 2D arrays containing custom enumerations, int, Color, or Vector2D, bytes. What if I combined all of the classes into just one, and accessed the contents of each array directly? The code would look a mess, and ditch the concepts of object-oriented coding, but the speed might be much faster. I'm also not concerned about access violations, as any attempts to get/set the data in the arrays will done in blocks. E.g., all writing to arrays will take place before any data is accessed from them. As for casting, I stated that I'm using custom enumerations, int, Color, and Vector2D, bytes. Which data types are fastest to use and access in the .net Framework, XNA, XBox, C#? I think that constant casting might be a cause of slowdown here. Also, instead of using math to figure out which indexes data should be placed in, I've used precomputed lookup tables so I don't have to use constant multiplication, addition, subtraction, division per frame. :)

    Read the article

< Previous Page | 1 2 3 4 5 6  | Next Page >