Search Results

Search found 4580 results on 184 pages for 'faster'.

Page 10/184 | < Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >

  • c++ - what is faster ?

    - by VaioIsBorn
    If we have the following 2 snippets of code in c++ that do the same task: int a, b=somenumber; while(b > 0) { a = b % 3; b /= 3; } or int b=somenumber; while(b > 0) { int a=b%3; b /= 3; } I don't know much about computer architecture/c++ design, but i think that the first code is faster because it declares the integer a at the beginning and just uses it in the while-loop, and in the second code the integer a is being declared everytime the while-loop starts over. Can some one help me with this, am i correct or what and why ?

    Read the article

  • Why is writeSTRef faster than if expression?

    - by wenlong
    writeSTRef twice for each iteration fib3 :: Int -> Integer fib3 n = runST $ do a <- newSTRef 1 b <- newSTRef 1 replicateM_ (n-1) $ do !a' <- readSTRef a !b' <- readSTRef b writeSTRef a b' writeSTRef b $! a'+b' readSTRef b writeSTRef once for each iteration fib4 :: Int -> Integer fib4 n = runST $ do a <- newSTRef 1 b <- newSTRef 1 replicateM_ (n-1) $ do !a' <- readSTRef a !b' <- readSTRef b if a' > b' then writeSTRef b $! a'+b' else writeSTRef a $! a'+b' a'' <- readSTRef a b'' <- readSTRef b if a'' > b'' then return a'' else return b'' Benchmark, given n = 20000: benchmarking 20000/fib3 mean: 5.073608 ms, lb 5.071842 ms, ub 5.075466 ms, ci 0.950 std dev: 9.284321 us, lb 8.119454 us, ub 10.78107 us, ci 0.950 benchmarking 20000/fib4 mean: 5.384010 ms, lb 5.381876 ms, ub 5.386099 ms, ci 0.950 std dev: 10.85245 us, lb 9.510215 us, ub 12.65554 us, ci 0.950 fib3 is a bit faster than fib4.

    Read the article

  • Faster way of initializing arrays in Delphi

    - by Max
    I'm trying to squeeze every bit of performance in my Delphi application and now I came to a procedure which works with dynamic arrays. The slowest line in it is SetLength(Result, Len); which is used to initialize the dynamic array. When I look at the code for the SetLength procedure I see that it is far from optimal. The call sequence is as follows: _DynArraySetLength - DynArraySetLength DynArraySetLength gets the array length (which is zero for initialization) and then uses ReallocMem which is also unnecessary for initilization. I was doing SetLength to initialize dynamic array all the time. Maybe I'm missing something? Is there a faster way to do this?

    Read the article

  • faster implementation of sum ( for Codility test )

    - by Oscar Reyes
    How can the following simple implementation of sum be faster? private long sum( int [] a, int begin, int end ) { if( a == null ) { return 0; } long r = 0; for( int i = begin ; i < end ; i++ ) { r+= a[i]; } return r; } EDIT Background is in order. Reading latest entry on coding horror, I came to this site: http://codility.com which has this interesting programming test. Anyway, I got 60 out of 100 in my submission, and basically ( I think ) is because this implementation of sum, because those parts where I failed are the performance parts. I'm getting TIME_OUT_ERROR's So, I was wondering if an optimization in the algorithm is possible. So, no built in functions or assembly would be allowed. This my be done in C, C++, C#, Java or pretty much in any other. EDIT As usual, mmyers was right. I did profile the code and I saw most of the time was spent on that function, but I didn't understand why. So what I did was to throw away my implementation and start with a new one. This time I've got an optimal solution [ according to San Jacinto O(n) -see comments to MSN below - ] This time I've got 81% on Codility which I think is good enough. The problem is that I didn't take the 30 mins. but around 2 hrs. but I guess that leaves me still as a good programmer, for I could work on the problem until I found an optimal solution: Here's my result. I never understood what is those "combinations of..." nor how to test "extreme_first"

    Read the article

  • Which LINQ expression is faster

    - by Vlad Bezden
    Hi All In following code public class Person { public string Name { get; set; } public uint Age { get; set; } public Person(string name, uint age) { Name = name; Age = age; } } void Main() { var data = new List<Person>{ new Person("Bill Gates", 55), new Person("Steve Ballmer", 54), new Person("Steve Jobs", 55), new Person("Scott Gu", 35)}; // 1st approach data.Where (x => x.Age > 40).ToList().ForEach(x => x.Age++); // 2nd approach data.ForEach(x => { if (x.Age > 40) x.Age++; }); data.ForEach(x => Console.WriteLine(x)); } in my understanding 2nd approach should be faster since it iterates through each item once and first approach is running 2 times: Where clause ForEach on subset of items from where clause. However internally it might be that compiler translates 1st approach to the 2nd approach anyway and they will have the same performance. Any suggestions or ideas? I could do profiling like suggested, but I want to understand what is going on compiler level if those to lines of code are the same to the compiler, or compiler will treat it literally. Thanks in advance for your help.

    Read the article

  • Shouldn't prepared statements be much faster?

    - by silversky
    $s = explode (" ", microtime()); $s = $s[0]+$s[1]; $con = mysqli_connect ('localhost', 'test', 'pass', 'db') or die('Err'); for ($i=0; $i<1000; $i++) { $stmt = $con -> prepare( " SELECT MAX(id) AS max_id , MIN(id) AS min_id FROM tb "); $stmt -> execute(); $stmt->bind_result($M,$m); $stmt->free_result(); $rand = mt_rand( $m , $M ).'<br/>'; $res = $con -> prepare( " SELECT * FROM tb WHERE id >= ? LIMIT 0,1 "); $res -> bind_param("s", $rand); $res -> execute(); $res->free_result(); } $e = explode (" ", microtime()); $e = $e[0]+$e[1]; echo number_format($e-$s, 4, '.', ''); // and: $link = mysql_connect ("localhost", "test", "pass") or die (); mysql_select_db ("db") or die ("Unable to select database".mysql_error()); for ($i=0; $i<1000; $i++) { $range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM tb "); $range_row = mysql_fetch_object( $range_result ); $random = mt_rand( $range_row->min_id , $range_row->max_id ); $result = mysql_query( " SELECT * FROM tb WHERE id >= $random LIMIT 0,1 "); } defenitly prepared statements are much more safer but also every where it says that they are much faster BUT in my test on the above code I have: - 2.45 sec for prepared statements - 5.05 sec for the secon example What do you think I'm doing wrong? Should I use the second solution or I should try to optimize the prep stmt?

    Read the article

  • Python MD5 Hash Faster Calculation

    - by balgan
    Hi everyone. I will try my best to explain my problem and my line of thought on how I think I can solve it. I use this code for root, dirs, files in os.walk(downloaddir): for infile in files: f = open(os.path.join(root,infile),'rb') filehash = hashlib.md5() while True: data = f.read(10240) if len(data) == 0: break filehash.update(data) print "FILENAME: " , infile print "FILE HASH: " , filehash.hexdigest() and using start = time.time() elapsed = time.time() - start I measure how long it takes to calculate an hash. Pointing my code to a file with 653megs this is the result: root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.624 root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.373 root@Mars:/home/tiago# python algorithm-timer.py FILENAME: freebsd.iso FILE HASH: ace0afedfa7c6e0ad12c77b6652b02ab 12.540 Ok now 12 seconds +- on a 653mb file, my problem is I intend to use this code on a program that will run through multiple files, some of them might be 4/5/6Gb and it will take wayy longer to calculate. What am wondering is if there is a faster way for me to calculate the hash of the file? Maybe by doing some multithreading? I used a another script to check the use of the CPU second by second and I see that my code is only using 1 out of my 2 CPUs and only at 25% max, any way I can change this? Thank you all in advance for the given help.

    Read the article

  • faster way to change xml to array(grails to flex)

    - by Anthony Umpad
    I have a large xml passed from grails to flex. When flex receives the xml, it converts the xml into an associative array object. Given the large xml file, it takes too long to complete the loop, is there any way in flex to make conversion faster? Below is my sample code. <xml> <car> <model>Vios</model> <type>Sedan</type> <color>Blue</color> </car> <car> <model>Camry</model> <type>Luxury</type> <color>Black</color> </car> </xml> *converted to the flex associative array below.* [Vios].type = Sedan .color = Blue [Camry].type = Luxury .color = Black *Below is a code I used in flex to convert the xml to the associative array object* var tempXML=xml.children() var tempArray:Array= new Array() for(var i:int=0;i<tempXML.length();i++) { tempArray[tempXML[i].@model]= new Object(); tempArray[tempXML[i].@model].color = tempXML[i][email protected](); tempArray[tempXML[i].@model].type = tempXML[i][email protected](); }

    Read the article

  • Copy Small Bitmaps on to Large Bitmap with Transparency Blend: What is faster than graphics.DrawImag

    - by Glenn
    I have identified this call as a bottleneck in a high pressure function. graphics.DrawImage(smallBitmap, x , y); Is there a faster way to blend small semi transparent bitmaps into a larger semi transparent one? Example Usage: XY[] locations = GetLocs(); Bitmap[] bitmaps = GetBmps(); //small images sizes vary approx 30px x 30px using (Bitmap large = new Bitmap(500, 500, PixelFormat.Format32bppPArgb)) using (Graphics largeGraphics = Graphics.FromImage(large)) { for(var i=0; i < largeNumber; i++) { //this is the bottleneck largeGraphics.DrawImage(bitmaps[i], locations[i].x , locations[i].y); } } var done = new MemoryStream(); large.Save(done, ImageFormat.Png); done.Position = 0; return (done); The DrawImage calls take a small 32bppPArgb bitmaps and copies them into a larger bitmap at locations that vary and the small bitmaps might only partially overlap the larger bitmaps visible area. Both images have semi transparent contents that get blended by DrawImage in a way that is important to the output. I've done some testing with BitBlt but not seen significant speed improvement and the alpha blending didn't come out the same in my tests. I'm open to just about any method including a better call to bitblt or unsafe c# code.

    Read the article

  • Why Stream/lazy val implementation using is faster than ListBuffer one

    - by anrizal
    I coded the following implementation of lazy sieve algorithms using Stream and lazy val below : def primes(): Stream[Int] = { lazy val ps = 2 #:: sieve(3) def sieve(p: Int): Stream[Int] = { p #:: sieve( Stream.from(p + 2, 2). find(i=> ps.takeWhile(j => j * j <= i). forall(i % _ > 0)).get) } ps } and the following implementation using (mutable) ListBuffer: import scala.collection.mutable.ListBuffer def primes(): Stream[Int] = { def sieve(p: Int, ps: ListBuffer[Int]): Stream[Int] = { p #:: { val nextprime = Stream.from(p + 2, 2). find(i=> ps.takeWhile(j => j * j <= i). forall(i % _ > 0)).get sieve(nextprime, ps += nextprime) } } sieve(3, ListBuffer(3))} When I did primes().takeWhile(_ < 1000000).size , the first implementation is 3 times faster than the second one. What's the explanation for this ? I edited the second version: it should have been sieve(3, ListBuffer(3)) instead of sieve(3, ListBuffer()) .

    Read the article

  • How Optimize sql query make it faster

    - by user502083
    Hello every one : I have a very simple small database, 2 of tables are: Node (Node_ID, Node_name, Node_Date) : Node_ID is primary key Citation (Origin_Id, Target_Id) : PRIMARY KEY (Origin_Id, Target_Id) each is FK in Node Now I write a query that first find all citations that their Origin_Id has a specific date and then I want to know what are the target dates of these records. I'm using sqlite in python the Node table has 3000 record and Citation has 9000 records, and my query is like this in a function: def cited_years_list(self, date): c=self.cur try: c.execute("""select n.Node_Date,count(*) from Node n INNER JOIN (select c.Origin_Id AS Origin_Id, c.Target_Id AS Target_Id, n.Node_Date AS Date from CITATION c INNER JOIN NODE n ON c.Origin_Id=n.Node_Id where CAST(n.Node_Date as INT)={0}) VW ON VW.Target_Id=n.Node_Id GROUP BY n.Node_Date;""".format(date)) cited_years=c.fetchall() self.conn.commit() print('Cited Years are : \n ',str(cited_years)) except Exception as e: print('Cited Years retrival failed ',e) return cited_years Then I call this function for some specific years, But it's crazy slowwwwwwwww :( (around 1 min for a specific year) Although my query works fine, it is slow. would you please give me a suggestion to make it faster? I'd appreciate any idea about optimizing this query :) I also should mention that I have indices on Origin_Id and Target_Id, so the inner join should be pretty fast, but it's not!!!

    Read the article

  • PHP what is faster to use

    - by user1631500
    What is faster / better to use? To put html into variables and print them later, or to just html print / echo print the content based on condition? EXAMPLE 1:(html into variables) if(!isset($_SESSION['daddy'])) { $var = "<span class='something'>Go here:<a href='#'>Click</a></span> <span class='something'>Go here:<a href='#'>Click</a></span>" } else { $one=$_SESSION["one"]; $two=$_SESSION["two"]; $three=$_SESSION["three"]; $var = You are cool enough to view the content; } echo $var; EXAMPLE 2:(print based on condition) if(!isset($_SESSION['daddy'])) { $var = 1; } else { $one=$_SESSION["one"]; $two=$_SESSION["two"]; $three=$_SESSION["three"]; $var = 0; } if ($var==1) { ?> <span class='something'>Go here:<a href='#'>Click</a></span> <span class='something'>Go here:<a href='#'>Click</a></span <?php } else { ?> You are cool enough to view the content. <?php } ?>

    Read the article

  • C#: Any faster way of copying arrays?

    - by Yang
    I have three arrays that need to be combined in one three-dimension array. The following code shows slow performance in Performance Explorer. Is there a faster solution? for (int i = 0; i < sortedIndex.Length; i++) { if (i < num_in_left) { // add instance to the left child leftnode[i, 0] = sortedIndex[i]; leftnode[i, 1] = sortedInstances[i]; leftnode[i, 2] = sortedLabels[i]; } else { // add instance to the right child rightnode[i-num_in_left, 0] = sortedIndex[i]; rightnode[i-num_in_left, 1] = sortedInstances[i]; rightnode[i-num_in_left, 2] = sortedLabels[i]; } } Update: I'm actually trying to do the following: //given three 1d arrays double[] sortedIndex, sortedInstances, sortedLabels; // copy them over to a 3d array (forget about the rightnode for now) double[] leftnode = new double[sortedIndex.Length, 3]; // some magic happens here so that leftnode = {sortedIndex, sortedInstances, sortedLabels};

    Read the article

  • Faster Javascript text replace

    - by Stacey
    Given the following javascript (jquery) $("#username").keyup(function () { selected.username = $("#username").val(); var url = selected.protocol + (selected.prepend == true ? selected.username : selected.url) + "/" + (selected.prepend == true ? selected.url : selected.username); $("#identifier").val(url); }); This code basically reads a textbox (username), and when it is typed into, it reconstructs the url that is being displayed in another textbox (identifier). This works fine - there are no problems with its functionality. However it feels 'slow' and 'sluggish'. Is there a cleaner/faster way to accomplish this task? Here is the HTML as requested. <fieldset class="identifier delta"> <form action="/authenticate/openid" method="post" target="_top" > <input type="text" class="openid" id="identifier" name="identifier" readonly="readonly" /> <input type='text' id='username' name='username' class="left" style='display: none;'/> <input type="submit" value="Login" style="height: 32px; padding-top: 1px; margin-right: 0px;" class="login right" /> </form> </fieldset> The identifier textbox just has a value set based on the hyperlink anchor of a button.

    Read the article

  • How to make this JavaScript much faster?

    - by Ralph
    Still trying to answer this question, and I think I finally found a solution, but it runs too slow. var $div = $('<div>') .css({ 'border': '1px solid red', 'position': 'absolute', 'z-index': '65535' }) .appendTo('body'); $('body *').live('mousemove', function(e) { var topElement = null; $('body *').each(function() { if(this == $div[0]) return true; var $elem = $(this); var pos = $elem.offset(); var width = $elem.width(); var height = $elem.height(); if(e.pageX > pos.left && e.pageY > pos.top && e.pageX < (pos.left + width) && e.pageY < (pos.top + height)) { var zIndex = document.defaultView.getComputedStyle(this, null).getPropertyValue('z-index'); if(zIndex == 'auto') zIndex = $elem.parents().length; if(topElement == null || zIndex > topElement.zIndex) { topElement = { 'node': $elem, 'zIndex': zIndex }; } } }); if(topElement != null ) { var $elem = topElement.node; $div.offset($elem.offset()).width($elem.width()).height($elem.height()); } }); It basically loops through all the elements on the page and finds the top-most element beneath the cursor. Is there maybe some way I could use a quad-tree or something and segment the page so the loop runs faster?

    Read the article

  • faster way to compare rows in a data frame

    - by aguiar
    Consider the data frame below. I want to compare each row with rows below and then take the rows that are equal in more than 3 values. I wrote the code below, but it is very slow if you have a large data frame. How could I do that faster? data <- as.data.frame(matrix(c(10,11,10,13,9,10,11,10,14,9,10,10,8,12,9,10,11,10,13,9,13,13,10,13,9), nrow=5, byrow=T)) rownames(data)<-c("sample_1","sample_2","sample_3","sample_4","sample_5") >data V1 V2 V3 V4 V5 sample_1 10 11 10 13 9 sample_2 10 11 10 14 9 sample_3 10 10 8 12 9 sample_4 10 11 10 13 9 sample_5 13 13 10 13 9 tab <- data.frame(sample = NA, duplicate = NA, matches = NA) dfrow <- 1 for(i in 1:nrow(data)) { sample <- data[i, ] for(j in (i+1):nrow(data)) if(i+1 <= nrow(data)) { matches <- 0 for(V in 1:ncol(data)) { if(data[j,V] == sample[,V]) { matches <- matches + 1 } } if(matches > 3) { duplicate <- data[j, ] pair <- cbind(rownames(sample), rownames(duplicate), matches) tab[dfrow, ] <- pair dfrow <- dfrow + 1 } } } >tab sample duplicate matches 1 sample_1 sample_2 4 2 sample_1 sample_4 5 3 sample_2 sample_4 4

    Read the article

  • TcpListener is queuing connections faster than I can clear them

    - by Matthew Brindley
    As I understand it, TcpListener will queue connections once you call Start(). Each time you call AcceptTcpClient (or BeginAcceptTcpClient), it will dequeue one item from the queue. If we load test our TcpListener app by sending 1,000 connections to it at once, the queue builds far faster than we can clear it, leading (eventually) to timeouts from the client because it didn't get a response because its connection was still in the queue. However, the server doesn't appear to be under much pressure, our app isn't consuming much CPU time and the other monitored resources on the machine aren't breaking a sweat. It feels like we're not running efficiently enough right now. We're calling BeginAcceptTcpListener and then immediately handing over to a ThreadPool thread to actually do the work, then calling BeginAcceptTcpClient again. The work involved doesn't seem to put any pressure on the machine, it's basically just a 3 second sleep followed by a dictionary lookup and then a 100 byte write to the TcpClient's stream. Here's the TcpListener code we're using: // Thread signal. private static ManualResetEvent tcpClientConnected = new ManualResetEvent(false); public void DoBeginAcceptTcpClient(TcpListener listener) { // Set the event to nonsignaled state. tcpClientConnected.Reset(); listener.BeginAcceptTcpClient( new AsyncCallback(DoAcceptTcpClientCallback), listener); // Wait for signal tcpClientConnected.WaitOne(); } public void DoAcceptTcpClientCallback(IAsyncResult ar) { // Get the listener that handles the client request, and the TcpClient TcpListener listener = (TcpListener)ar.AsyncState; TcpClient client = listener.EndAcceptTcpClient(ar); if (inProduction) ThreadPool.QueueUserWorkItem(state => HandleTcpRequest(client, serverCertificate)); // With SSL else ThreadPool.QueueUserWorkItem(state => HandleTcpRequest(client)); // Without SSL // Signal the calling thread to continue. tcpClientConnected.Set(); } public void Start() { currentHandledRequests = 0; tcpListener = new TcpListener(IPAddress.Any, 10000); try { tcpListener.Start(); while (true) DoBeginAcceptTcpClient(tcpListener); } catch (SocketException) { // The TcpListener is shutting down, exit gracefully CheckBuffer(); return; } } I'm assuming the answer will be related to using Sockets instead of TcpListener, or at least using TcpListener.AcceptSocket, but I wondered how we'd go about doing that? One idea we had was to call AcceptTcpClient and immediately Enqueue the TcpClient into one of multiple Queue<TcpClient> objects. That way, we could poll those queues on separate threads (one queue per thread), without running into monitors that might block the thread while waiting for other Dequeue operations. Each queue thread could then use ThreadPool.QueueUserWorkItem to have the work done in a ThreadPool thread and then move onto dequeuing the next TcpClient in its queue. Would you recommend this approach, or is our problem that we're using TcpListener and no amount of rapid dequeueing is going to fix that?

    Read the article

  • How does the default Camera iPhone app manages to save a photo so fast?

    - by worriorbg
    Hello everyone. So far I've managed to create an app for iPhone that takes multiple images with about a 3 second interval between each. I`m processing each image in a separate thread asynchronously and everything is great till it gets to the moment for saving the image on the iPhone disk. Then it takes about 12 seconds to save the image to the disk using JPEG representation. How does Apple do it, how do they manage to save a single image so fast to the disk is there a trick they are using? I saw that the animations distract the user for a while, but still the time needed is below 12 seconds! Thanks in advance.

    Read the article

  • ASP.NET Speed up DataView sorting/paging

    - by rlb.usa
    I have a page in ASP.NET where I'm using a Repeater to display a record listing. But it's slow as molasses, I've been tasked with speeding it up (sorting,paging). I've got it set up as follows: When user enters page, grab all of the data from the database (500 records, up to 4 relation'ed records) Store it all in Application["MyDataView"] On sort or paging, simply use the data view's internal sort/page method (no db calls) and rebind. I understand that databases can take time to query, but simply to have the DataView call it's sort method (no db calls) takes 10ish seconds, that's an alarmingly slow. Two questions: Why is it taking so long? How can I speed it up? A gridview is not possible.

    Read the article

  • Performance Optimization &ndash; It Is Faster When You Can Measure It

    - by Alois Kraus
    Performance optimization in bigger systems is hard because the measured numbers can vary greatly depending on the measurement method of your choice. To measure execution timing of specific methods in your application you usually use Time Measurement Method Potential Pitfalls Stopwatch Most accurate method on recent processors. Internally it uses the RDTSC instruction. Since the counter is processor specific you can get greatly different values when your thread is scheduled to another core or the core goes into a power saving mode. But things do change luckily: Intel's Designer's vol3b, section 16.11.1 "16.11.1 Invariant TSC The time stamp counter in newer processors may support an enhancement, referred to as invariant TSC. Processor's support for invariant TSC is indicated by CPUID.80000007H:EDX[8]. The invariant TSC will run at a constant rate in all ACPI P-, C-. and T-states. This is the architectural behavior moving forward. On processors with invariant TSC support, the OS may use the TSC for wall clock timer services (instead of ACPI or HPET timers). TSC reads are much more efficient and do not incur the overhead associated with a ring transition or access to a platform resource." DateTime.Now Good but it has only a resolution of 16ms which can be not enough if you want more accuracy.   Reporting Method Potential Pitfalls Console.WriteLine Ok if not called too often. Debug.Print Are you really measuring performance with Debug Builds? Shame on you. Trace.WriteLine Better but you need to plug in some good output listener like a trace file. But be aware that the first time you call this method it will read your app.config and deserialize your system.diagnostics section which does also take time.   In general it is a good idea to use some tracing library which does measure the timing for you and you only need to decorate some methods with tracing so you can later verify if something has changed for the better or worse. In my previous article I did compare measuring performance with quantum mechanics. This analogy does work surprising well. When you measure a quantum system there is a lower limit how accurately you can measure something. The Heisenberg uncertainty relation does tell us that you cannot measure of a quantum system the impulse and location of a particle at the same time with infinite accuracy. For programmers the two variables are execution time and memory allocations. If you try to measure the timings of all methods in your application you will need to store them somewhere. The fastest storage space besides the CPU cache is the memory. But if your timing values do consume all available memory there is no memory left for the actual application to run. On the other hand if you try to record all memory allocations of your application you will also need to store the data somewhere. This will cost you memory and execution time. These constraints are always there and regardless how good the marketing of tool vendors for performance and memory profilers are: Any measurement will disturb the system in a non predictable way. Commercial tool vendors will tell you they do calculate this overhead and subtract it from the measured values to give you the most accurate values but in reality it is not entirely true. After falling into the trap to trust the profiler timings several times I have got into the habit to Measure with a profiler to get an idea where potential bottlenecks are. Measure again with tracing only the specific methods to check if this method is really worth optimizing. Optimize it Measure again. Be surprised that your optimization has made things worse. Think harder Implement something that really works. Measure again Finished! - Or look for the next bottleneck. Recently I have looked into issues with serialization performance. For serialization DataContractSerializer was used and I was not sure if XML is really the most optimal wire format. After looking around I have found protobuf-net which uses Googles Protocol Buffer format which is a compact binary serialization format. What is good for Google should be good for us. A small sample app to check out performance was a matter of minutes: using ProtoBuf; using System; using System.Diagnostics; using System.IO; using System.Reflection; using System.Runtime.Serialization; [DataContract, Serializable] class Data { [DataMember(Order=1)] public int IntValue { get; set; } [DataMember(Order = 2)] public string StringValue { get; set; } [DataMember(Order = 3)] public bool IsActivated { get; set; } [DataMember(Order = 4)] public BindingFlags Flags { get; set; } } class Program { static MemoryStream _Stream = new MemoryStream(); static MemoryStream Stream { get { _Stream.Position = 0; _Stream.SetLength(0); return _Stream; } } static void Main(string[] args) { DataContractSerializer ser = new DataContractSerializer(typeof(Data)); Data data = new Data { IntValue = 100, IsActivated = true, StringValue = "Hi this is a small string value to check if serialization does work as expected" }; var sw = Stopwatch.StartNew(); int Runs = 1000 * 1000; for (int i = 0; i < Runs; i++) { //ser.WriteObject(Stream, data); Serializer.Serialize<Data>(Stream, data); } sw.Stop(); Console.WriteLine("Did take {0:N0}ms for {1:N0} objects", sw.Elapsed.TotalMilliseconds, Runs); Console.ReadLine(); } } The results are indeed promising: Serializer Time in ms N objects protobuf-net   807 1000000 DataContract 4402 1000000 Nearly a factor 5 faster and a much more compact wire format. Lets use it! After switching over to protbuf-net the transfered wire data has dropped by a factor two (good) and the performance has worsened by nearly a factor two. How is that possible? We have measured it? Protobuf-net is much faster! As it turns out protobuf-net is faster but it has a cost: For the first time a type is de/serialized it does use some very smart code-gen which does not come for free. Lets try to measure this one by setting of our performance test app the Runs value not to one million but to 1. Serializer Time in ms N objects protobuf-net 85 1 DataContract 24 1 The code-gen overhead is significant and can take up to 200ms for more complex types. The break even point where the code-gen cost is amortized by its faster serialization performance is (assuming small objects) somewhere between 20.000-40.000 serialized objects. As it turned out my specific scenario involved about 100 types and 1000 serializations in total. That explains why the good old DataContractSerializer is not so easy to take out of business. The final approach I ended up was to reduce the number of types and to serialize primitive types via BinaryWriter directly which turned out to be a pretty good alternative. It sounded good until I measured again and found that my optimizations so far do not help much. After looking more deeper at the profiling data I did found that one of the 1000 calls did take 50% of the time. So how do I find out which call it was? Normal profilers do fail short at this discipline. A (totally undeserved) relatively unknown profiler is SpeedTrace which does unlike normal profilers create traces of your applications by instrumenting your IL code at runtime. This way you can look at the full call stack of the one slow serializer call to find out if this stack was something special. Unfortunately the call stack showed nothing special. But luckily I have my own tracing as well and I could see that the slow serializer call did happen during the serialization of a bool value. When you encounter after much analysis something unreasonable you cannot explain it then the chances are good that your thread was suspended by the garbage collector. If there is a problem with excessive GCs remains to be investigated but so far the serialization performance seems to be mostly ok.  When you do profile a complex system with many interconnected processes you can never be sure that the timings you just did measure are accurate at all. Some process might be hitting the disc slowing things down for all other processes for some seconds as well. There is a big difference between warm and cold startup. If you restart all processes you can basically forget the first run because of the OS disc cache, JIT and GCs make the measured timings very flexible. When you are in need of a random number generator you should measure cold startup times of a sufficiently complex system. After the first run you can try again getting different and much lower numbers. Now try again at least two times to get some feeling how stable the numbers are. Oh and try to do the same thing the next day. It might be that the bottleneck you found yesterday is gone today. Thanks to GC and other random stuff it can become pretty hard to find stuff worth optimizing if no big bottlenecks except bloatloads of code are left anymore. When I have found a spot worth optimizing I do make the code changes and do measure again to check if something has changed. If it has got slower and I am certain that my change should have made it faster I can blame the GC again. The thing is that if you optimize stuff and you allocate less objects the GC times will shift to some other location. If you are unlucky it will make your faster working code slower because you see now GCs at times where none were before. This is where the stuff does get really tricky. A safe escape hatch is to create a repro of the slow code in an isolated application so you can change things fast in a reliable manner. Then the normal profilers do also start working again. As Vance Morrison does point out it is much more complex to profile a system against the wall clock compared to optimize for CPU time. The reason is that for wall clock time analysis you need to understand how your system does work and which threads (if you have not one but perhaps 20) are causing a visible delay to the end user and which threads can wait a long time without affecting the user experience at all. Next time: Commercial profiler shootout.

    Read the article

  • Is parsing JSON faster than parsing XML

    - by geme_hendrix
    I'm creating a sophisticated JavaScript library for working with my company's server side framework. The server side framework encodes its data to a simple XML format. There's no fancy namespacing or anything like that. Ideally I'd like to parse all of the data in the browser as JSON. However, if I do this I need to rewrite some of the server side code to also spit out JSON. This is a pain because we have public APIs that I can't easily change. What I'm really concerned about here is performance in the browser of parsing JSON versus XML. Is there really a big difference to be concerned about? Or should I exclusively go for JSON? Does anyone have any experience or benchmarks in the performance difference between the two? I realize that most modern web developers would probably opt for JSON and I can see why. However, I really am just interested in performance. If there's a proven massive difference then I'm prepared to spend the extra effort in generating JSON server side for the client.

    Read the article

  • Faster way to transfer table data from linked server

    - by spender
    After much fiddling, I've managed to install the right ODBC driver and have successfully created a linked server on SQL Server 2008, by which I can access my PostgreSQL db from SQL server. I'm copying all of the data from some of the tables in the PgSQL DB into SQL Server using merge statements that take the following form: with mbRemote as ( select * from openquery(someLinkedDb,'select * from someTable') ) merge into someTable mbLocal using mbRemote on mbLocal.id=mbRemote.id when matched /*edit*/ /*clause below really speeds things up when many rows are unchanged*/ /*can you think of anything else?*/ and not (mbLocal.field1=mbRemote.field1 and mbLocal.field2=mbRemote.field2 and mbLocal.field3=mbRemote.field3 and mbLocal.field4=mbRemote.field4) /*end edit*/ then update set mbLocal.field1=mbRemote.field1, mbLocal.field2=mbRemote.field2, mbLocal.field3=mbRemote.field3, mbLocal.field4=mbRemote.field4 when not matched then insert ( id, field1, field2, field3, field4 ) values ( mbRemote.id, mbRemote.field1, mbRemote.field2, mbRemote.field3, mbRemote.field4 ) WHEN NOT MATCHED BY SOURCE then delete; After this statement completes, the local (SQL Server) copy is fully in sync with the remote (PgSQL server). A few questions about this approach: is it sane? it strikes me that an update will be run over all fields in local rows that haven't necessarily changed. The only prerequisite is that the local and remote id field match. Is there a more fine grained approach/a way of constraining the merge statment to only update rows that have actually changed?

    Read the article

  • Why is file_get_contents() faster than using fsock_open()?

    - by eds
    In PHP, sometimes I want to send an HTTP request to a remote site just to look at the response headers, so I declare it all manually and use the fsock_open() function. However, this goes much slower than calling file_get_contents() with a remote URL (which loads the whole page content). Why is this? Is there a good alternative way to get just the response headers (to check if a page returns a 404 error, for example) that works as fast as file_get_contents()?

    Read the article

  • Java Integer: what is faster comparison or subtraction?

    - by Vladimir
    I've found that java.lang.Integer implementation of compareTo method looks as follows: public int compareTo(Integer anotherInteger) { int thisVal = this.value; int anotherVal = anotherInteger.value; return (thisVal<anotherVal ? -1 : (thisVal==anotherVal ? 0 : 1)); } The question is why use comparison instead of subtraction: return thisVal - anotherVal;

    Read the article

< Previous Page | 6 7 8 9 10 11 12 13 14 15 16 17  | Next Page >