Search Results

Search found 19055 results on 763 pages for 'high performance'.

Page 113/763 | < Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >

  • Linux software RAID6: rebuild slow

    - by Ole Tange
    I am trying to find the bottleneck in the rebuilding of a software raid6. ## Pause rebuilding when measuring raw I/O performance # echo 1 > /proc/sys/dev/raid/speed_limit_min # echo 1 > /proc/sys/dev/raid/speed_limit_max ## Drop caches so that does not interfere with measuring # sync ; echo 3 | tee /proc/sys/vm/drop_caches >/dev/null # time parallel -j0 "dd if=/dev/{} bs=256k count=4000 | cat >/dev/null" ::: sdbd sdbc sdbf sdbm sdbl sdbk sdbe sdbj sdbh sdbg 4000+0 records in 4000+0 records out 1048576000 bytes (1.0 GB) copied, 7.30336 s, 144 MB/s [... similar for each disk ...] # time parallel -j0 "dd if=/dev/{} skip=15000000 bs=256k count=4000 | cat >/dev/null" ::: sdbd sdbc sdbf sdbm sdbl sdbk sdbe sdbj sdbh sdbg 4000+0 records in 4000+0 records out 1048576000 bytes (1.0 GB) copied, 12.7991 s, 81.9 MB/s [... similar for each disk ...] So we can read sequentially at 140 MB/s in the outer tracks and 82 MB/s in the inner tracks on all the drives simultaneously. Sequential write performance is similar. This would lead me to expect a rebuild speed of 82 MB/s or more. # echo 800000 > /proc/sys/dev/raid/speed_limit_min # echo 800000 > /proc/sys/dev/raid/speed_limit_max # cat /proc/mdstat md2 : active raid6 sdbd[10](S) sdbc[9] sdbf[0] sdbm[8] sdbl[7] sdbk[6] sdbe[11] sdbj[4] sdbi[3](F) sdbh[2] sdbg[1] 27349121408 blocks super 1.2 level 6, 128k chunk, algorithm 2 [9/8] [UUU_UUUUU] [=========>...........] recovery = 47.3% (1849905884/3907017344) finish=855.9min speed=40054K/sec But we only get 40 MB/s. And often this drops to 30 MB/s. # iostat -dkx 1 sdbc 0.00 8023.00 0.00 329.00 0.00 33408.00 203.09 0.70 2.12 1.06 34.80 sdbd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdbe 13.00 0.00 8334.00 0.00 33388.00 0.00 8.01 0.65 0.08 0.06 47.20 sdbf 0.00 0.00 8348.00 0.00 33388.00 0.00 8.00 0.58 0.07 0.06 48.00 sdbg 16.00 0.00 8331.00 0.00 33388.00 0.00 8.02 0.71 0.09 0.06 48.80 sdbh 961.00 0.00 8314.00 0.00 37100.00 0.00 8.92 0.93 0.11 0.07 54.80 sdbj 70.00 0.00 8276.00 0.00 33384.00 0.00 8.07 0.78 0.10 0.06 48.40 sdbk 124.00 0.00 8221.00 0.00 33380.00 0.00 8.12 0.88 0.11 0.06 47.20 sdbl 83.00 0.00 8262.00 0.00 33380.00 0.00 8.08 0.96 0.12 0.06 47.60 sdbm 0.00 0.00 8344.00 0.00 33376.00 0.00 8.00 0.56 0.07 0.06 47.60 iostat says the disks are not 100% busy (but only 40-50%). This fits with the hypothesis that the max is around 80 MB/s. Since this is software raid the limiting factor could be CPU. top says: PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 38520 root 20 0 0 0 0 R 64 0.0 2947:50 md2_raid6 6117 root 20 0 0 0 0 D 53 0.0 473:25.96 md2_resync So md2_raid6 and md2_resync are clearly busy taking up 64% and 53% of a CPU respectively, but not near 100%. The chunk size (128k) of the RAID was chosen after measuring which chunksize gave the least CPU penalty. If this speed is normal: What is the limiting factor? Can I measure that? If this speed is not normal: How can I find the limiting factor? Can I change that?

    Read the article

  • Parallel processing slower than sequential?

    - by zebediah49
    EDIT: For anyone who stumbles upon this in the future: Imagemagick uses a MP library. It's faster to use available cores if they're around, but if you have parallel jobs, it's unhelpful. Do one of the following: do your jobs serially (with Imagemagick in parallel mode) set MAGICK_THREAD_LIMIT=1 for your invocation of the imagemagick binary in question. By making Imagemagick use only one thread, it slows down by 20-30% in my test cases, but meant I could run one job per core without issues, for a significant net increase in performance. Original question: While converting some images using ImageMagick, I noticed a somewhat strange effect. Using xargs was significantly slower than a standard for loop. Since xargs limited to a single process should act like a for loop, I tested that, and found it to be about the same. Thus, we have this demonstration. Quad core (AMD Athalon X4, 2.6GHz) Working entirely on a tempfs (16g ram total; no swap) No other major loads Results: /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 1 convert -auto-level real 0m3.784s user 0m2.240s sys 0m0.230s /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 2 convert -auto-level real 0m9.097s user 0m28.020s sys 0m0.910s /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 10 convert -auto-level real 0m9.844s user 0m33.200s sys 0m1.270s Can anyone think of a reason why running two instances of this program takes more than twice as long in real time, and more than ten times as long in processor time to complete the same task? After that initial hit, more processes do not seem to have as significant of an effect. I thought it might have to do with disk seeking, so I did that test entirely in ram. Could it have something to do with how Convert works, and having more than one copy at once means it cannot use processor cache as efficiently or something? EDIT: When done with 1000x 769KB files, performance is as expected. Interesting. /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 1 convert -auto-level real 3m37.679s user 5m6.980s sys 0m6.340s /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 1 convert -auto-level real 3m37.152s user 5m6.140s sys 0m6.530s /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 2 convert -auto-level real 2m7.578s user 5m35.410s sys 0m6.050s /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 4 convert -auto-level real 1m36.959s user 5m48.900s sys 0m6.350s /media/ramdisk/img$ time for f in *.bmp; do echo $f ${f%bmp}png; done | xargs -n 2 -P 10 convert -auto-level real 1m36.392s user 5m54.840s sys 0m5.650s

    Read the article

  • Performance of Silverlight Datagrid in Silverlight 3 vs Silverlight 4 on a mac

    - by Simon
    I'm using Silverlight Beta 4 for a LOB application. After finding out today that I'll have to wait perhaps 4 months to be able to develop with SL4 on Visual Studio 2010 I'm thinking I need to downgrade my application to SL3 but thats another question. The problem is I'm noticing absolutely abismal performance for simple datagrids that work just fine on a PC when I'm running on a Mac. These grids contain only 5-10 columns and maybe 50 rows. Paging up and down takes about 1-2 seconds sometimes. I would appreciate anybody's experience in which of the following is the best solution: reverting to Silverlight 3 and hoping DataGrid is faster switching to 3rd party datagrid such as Telerik forgetting silverlight altogether I was hoping that possibly SL4 runtime might be updated but that won't happen probably for 3-4 months. Just a reminder - this is specifically a mac issue. Performance on my PC while slightly slow to populate the grid initially is fine.

    Read the article

  • RAD/Eclipse Eclipse Test and Performance Tools Platform, export data to text file

    - by Berlin Brown
    I am using the RAD (also on Eclipse) Test and Performance Monitoring. I monitor CPU performance time with it, on particular methods, etc. It is a good tool for my monitoring my applications but I can't copy/paste or export the output to a text file format. So I can send to the others. There has to be a way to export this? Also, I can save the output to file but it is '*.trcxml' binary file? has anyone seen a parser for this file format?

    Read the article

  • iPhone Foundation - performance implications of mutable and xxxWithCapacity:0

    - by Adam Eberbach
    All of the collection classes have two versions - mutable and immutable, such as NSArray and NSMutableArray. Is the distinction merely to promote careful programming by providing a const collection or is there some performance hit when using a mutable object as opposed to immutable? Similarly each of the collection classes has a method xxxxWithCapacity, like [NSMutableArray arrayWithCapacity:0]. I often use zero as the argument because it seems a better choice than guessing wrongly how many objects might be added. Is there some performance advantage to creating a collection with capacity for enough objects in advance? If not why isn't the function something like + (id)emptyArray?

    Read the article

  • Logging strategy vs. performance

    - by vtortola
    Hi, I'm developing a web application that has to support lots of simultaneous requests, and I'd like to keep it fast enough. I have now to implement a logging strategy, I'm gonna use log4net, but ... what and how should I log? I mean: How logging impacts in performance? is it possible/recomendable logging using async calls? Is better use a text file or a database? Is it possible to do it conditional? for example, default log to the database, and if it fails, the switch to a text file. What about multithreading? should I care about synchronization when I use log4net? or it's thread safe out of the box? In the requirements appear that the application should cache a couple of things per request, and I'm afraid of the performance impact of that. Cheers.

    Read the article

  • .NET or Windows Synchronization Primitives Performance Specifications

    - by ovanes
    Hello *, I am currently writing a scientific article, where I need to be very exact with citation. Can someone point me to either MSDN, MSDN article, some published article source or a book, where I can find performance comparison of Windows or .NET Synchronization primitives. I know that these are in the descending performance order: Interlocked API, Critical Section, .NET lock-statement, Monitor, Mutex, EventWaitHandle, Semaphore. Many Thanks, Ovanes P.S. I found a great book: Concurrent Programming on Windows by Joe Duffy. This book is written by one of the head concurrency developers for .NET Framework and is simply brilliant with lots of explanations, how things work or were implemented.

    Read the article

  • NSURLConnection performance

    - by oksk
    Hi all, I'm using NSURLConnection for downloading some images in my app currently. Before implementing via this, I implemented it by NSData(dataWithContentOfURL) in NSThread. But I wanted to cancel during downloading images, So I changed it to NSURLConnection. But It happens other problem. Performance was very low after changing. For example, There is at least 5seconds for downloading images at NSThread(NSData async) But, There is more than 2 or 3 times than it at NSURLConnection(async) !! Can I enhance performance ?? How?? (* sorry about my question with NSData(dataWithContentOfFile). correct question is dataWithContentOfURL)

    Read the article

  • A way to measure performance

    - by Andrei Ciobanu
    Given Exercise 14 from 99 Haskell Problems: (*) Duplicate the elements of a list. Eg.: *Main> dupli''' [1..10] [1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10] I've implemented 4 solutions: {-- my first attempt --} dupli :: [a] -> [a] dupli [] = [] dupli (x:xs) = replicate 2 x ++ dupli xs {-- using concatMap and replicate --} dupli' :: [a] -> [a] dupli' xs = concatMap (replicate 2) xs {-- usign foldl --} dupli'' :: [a] -> [a] dupli'' xs = foldl (\acc x -> acc ++ [x,x]) [] xs {-- using foldl 2 --} dupli''' :: [a] -> [a] dupli''' xs = reverse $ foldl (\acc x -> x:x:acc) [] xs Still, I don't know how to really measure performance . So what's the recommended function (from the above list) in terms of performance . Any suggestions ?

    Read the article

  • Performance benefits of upgrading Richfaces to newer version

    - by peteDog
    I have a client that's running an application based on JBoss 4.0.5, Seam 1.2 and RichFaces 3.0.1. Their system is having performance problems due to the fact that a lot of data is coming back from the server to be displayed on screen and it seems like the rendering of that data is taking forever. The data brought back is displayed in a tabbed interface, but the tabs aren't currently being loaded individually, but all at once. I'm trying to build up a case to present to the client on the benefits of upgrading to never version of RichFaces, which, as I understand it, has added a great number of features related to tabbed panels and being able to use ajax to page the data and load the chunks you actually need to display at the moment, and not the rest that's in other tabs. The move to a newer version of RichFaces will also result in never versions of Jboss and Seam, as the current production build of RichFaces 3.2.1 requires JSF 1.2. IF anyone has some suggestions or experience on performance of current versions RichFaces, paging, etc, I would really appreciate some feedback.

    Read the article

  • How does glClear() improve performance?

    - by Jon-Eric
    Apple's Technical Q&A on addressing flickering (QA1650) includes the following paragraph. (Emphasis mine.) You must provide a color to every pixel on the screen. At the beginning of your drawing code, it is a good idea to use glClear() to initialize the color buffer. A full-screen clear of each of your color, depth, and stencil buffers (if you're using them) at the start of a frame can also generally improve your application's performance. On other platforms, I've always found it to be an optimization to not clear the color buffer if you're going to draw to every pixel. (Why waste time filling the color buffer if you're just going to overwrite that clear color?) How can a call to glClear() improve performance?

    Read the article

  • Cost of exception handlers in Python

    - by Thilo
    In another question, the accepted answer suggested replacing a (very cheap) if statement in Python code with a try/except block to improve performance. Coding style issues aside, and assuming that the exception is never triggered, how much difference does it make (performance-wise) to have an exception handler, versus not having one, versus having a compare-to-zero if-statement?

    Read the article

  • Virtual Earth Shape Rendering Performance

    - by Mike
    I am overlaying a transparent image on my VEMap control by rendering it as a single VEShape. The shape changes sizes dynamically depeding on the zoom level of my map and can be as large as 4000*4000px. In older browsers such as IE6 and early versions of Firefox 2.x, map control performance degrades rapidly when my shape gets larger than 1500*1500px. The mouse pointer moves slowly and the map responds very slowly to events. I don't see this issue at all in newer browsers (IE7+). Are there any workarounds to boost performance of rendering a large shape for IE6 users?

    Read the article

  • Entity Framework Performance Problem

    - by Steve Horn
    I'm hoping that someone can help me understand how to overcome a performance problem I'm running into with the latest version of the Entity Framework. In my test, I created my model from a database consisting of around 80 tables. The problem that I'm running into is that the cost of the very first query I run on a thread is very expensive. If I run without pre-compiling views the first query takes anywhere from 5800 to 6600 milliseconds. If I pre-compile the views (see this article) I can get the initial query cost down to about 2800 to 3200 milliseconds. 3 seconds for each request is still unacceptable for my needs. Subsequent queries are very fast. Can you please help me understand how to eliminate the poor performance of the initial query? I'm using the version of entity framework that ships with Visual Studio 2010 RC.

    Read the article

  • Switch from Microsofts STL to STLport

    - by Laserallan
    Hi! I'm using quite much STL in performance critical C++ code under windows. One possible "cheap" way to get some extra performance would be to change to a faster STL library. According to this post STLport is faster and uses less memory, however it's a few years old. Has anyone made this change recently and what were your results?

    Read the article

  • THE FASTEST Smarty Cache Handler

    - by rob.effect
    Does anyone know if there is an overview of the performance of different cache handlers for smarty? I compared smarty file cache with a memcache handler, but it seemed memcache has a negative impact on performance. I figured there would be a faster way to cache than through the filesystem... am I wrong?

    Read the article

  • Bitmapfontatlas cocos2d performance issue

    - by meboz
    This question was also asked in the cocos2d forums but they are a little slower than here. Hi, Im trying to resolve a performance issue in my game. I have all of my game images on the one spritesheet. I now have a score label for which i have generated a font file with the Hiero tool. Ideally I'd like to update my score label with the current score on every update. However there seems to be a significant performance hit when doing so, which i believe is the result of the game images texture and the font texture swapping each update. Can anyone suggest a way to avoid this? Possibly by combining the font images with the game images in the one image file to avoid the texture swapping? Cheers

    Read the article

  • Javascript performance issue

    - by Daniel
    Hello, I've built a top menu based on superfish, but the amount of displayed items in the menu is huge. And there is also alot of jquery on the top menu. Now to the problem, everytime I load any page that has the menu, the browser(ie7) feels like it looks it locks it self for about 1-2 seconds while the page is being loaded. I'm sure that the top menu is the issue, and I would like to improve the performance of the page.(besides removing the menu and removing the menu items) I've used firebug to see which calls take most of the times, and I the calls are standard jquery or superfish. The top menu is a ascx control. Are they any good ways to let the page load first and the menu later or any other goods ideas to improve the performance?

    Read the article

  • What are suitable performance indicators for programmers?

    - by Graphain
    Hi, I am wondering what performance indicators people encounter, and think are realistic, for programmers in the workplace? I've seen numerous articles (I can't recall a really good one that I read right now) that detail how programmers will optimise for the metric they are being measured by (whether that be lines of code etc.). However, is there any metric that can be used as a good performance indicator of a programmer in the workplace, and conversely be used as a milestone by a programmer when negotiating with management? Replies Thanks for the link to that one and good feedback so far!

    Read the article

  • log4j performance

    - by Bob
    Hi, I'm developing a web app, and I'd like to log some information to help me improve and observe the app. (I'm using Tomcat6) First I thought I would use StringBuilders, append the logs to them and a task would persist them into the database like every 2 minutes. Because I was worried about the out-of-the-box logging system's performance. Then I made some test. Especially with log4j. Here is my code: Main.java public static void main(String[] args) { Thread[] threads = new Thread[LoggerThread.threadsNumber]; for(int i = 0; i < LoggerThread.threadsNumber; ++i){ threads[i] = new Thread(new LoggerThread("name - " + i)); } LoggerThread.startTimestamp = System.currentTimeMillis(); for(int i = 0; i < LoggerThread.threadsNumber; ++i){ threads[i].start(); } LoggerThread.java public class LoggerThread implements Runnable{ public static int threadsNumber = 10; public static long startTimestamp; private static int counter = 0; private String name; public LoggerThread(String name) { this.name = name; } private Logger log = Logger.getLogger(this.getClass()); @Override public void run() { for(int i=0; i<10000; ++i){ log.info(name + ": " + i); if(i == 9999){ int c = increaseCounter(); if(c == threadsNumber){ System.out.println("Elapsed time: " + (System.currentTimeMillis() - startTimestamp)); } } } } private synchronized int increaseCounter(){ return ++counter; } } } log4j.properties log4j.logger.main.LoggerThread=debug, f log4j.appender.f=org.apache.log4j.RollingFileAppender log4j.appender.f.layout=org.apache.log4j.PatternLayout log4j.appender.f.layout.ConversionPattern=%d{ABSOLUTE} %5p %c{1}:%L - %m%n log4j.appender.f.File=c:/logs/logging.log log4j.appender.f.MaxFileSize=15000KB log4j.appender.f.MaxBackupIndex=50 I think this is a very common configuration for log4j. First I used log4j 1.2.14 then I realized there was a newer version, so I switched to 1.2.16 Here are the figures (all in millisec) LoggerThread.threadsNumber = 10 1.2.14: 4235, 4267, 4328, 4282 1.2.16: 2780, 2781, 2797, 2781 LoggerThread.threadsNumber = 100 1.2.14: 41312, 41014, 42251 1.2.16: 25606, 25729, 25922 I think this is very fast. Don't forget that: in every cycle the run method not just log into the file, it has to concatenate strings (name + ": " + i), and check an if test (i == 9999). When threadsNumber is 10, there are 100.000 loggings and if tests and concatenations. When it is 100, there are 1.000.000 loggings and if tests and concatenations. (I've read somewhere JVM uses StringBuilder's append for concatenation, not simple concatenation). Did I missed something? Am I doing something wrong? Did I forget any factor that could decrease the performance? If these figures are correct I think, I don't have to worry about log4j's performance even if I heavily log, do I?

    Read the article

  • Performance Counters in Server Development

    - by Mubashar Ahmad
    Dear Gurus All you be agree with the value and worth of Performance Counters while developing and maintaining a server kind application I would like to know what is the best way to implement those, Specifically using C#? Usually performance counters have the following attributes They are shared global Writing requires locks to ensure Synchronization Reading Some times requires locks too. Is it better to update them Asynchronously and what is the best way to make them so. (I am planning to use the ThreadPool.QueuWorketItem function, pls tell me you opinion on this too.) If my question seems a bit vague can you just take the example of a HelloWorld Wcf service and i wanted to know following how many times its being hit overall and within a certain period Average/min/max Response Times overall and within a certain period. Moreover if any one knows about the Specialized way provided by DotNet or WCF then please let me know as well.

    Read the article

  • Performance of fopen vs stat

    - by Alex Marshall
    Hello, I'm writing several C programs for an embedded system where every bit of performance we can squeeze out will matter. Part of that is accessing log files. When determining if a file exists, is there any performance difference between using open / fopen, and stat ? I've been using stat on the assumption that it only has to do a quick check against the file system, whereas fopen would have to actually gain access to a file and manipulate internal data structures before returning. Is there any merit to this ?

    Read the article

  • Grails benchmarks compared to other web MVC platform (Rails, Django, ASP MVC)?

    - by fabien7474
    I have been searching the web for recent benchmarks measuring Grails overall performance compared to its competitors (Rails, Django, ASP.NET MVC...), but I didn't find anything more recent than a 3 years-old article with obsolete grails version (0.5). See here and here. So, starting from grails 1.2, are there any more recent grails benchmarks you are aware of ? Or do you have your own performance tests for grails (compared to others if possible) ?

    Read the article

< Previous Page | 109 110 111 112 113 114 115 116 117 118 119 120  | Next Page >