Search Results

Search found 3706 results on 149 pages for 'nano optimization'.

Page 51/149 | < Previous Page | 47 48 49 50 51 52 53 54 55 56 57 58  | Next Page >

  • Where is the virtual function call overhead?

    - by Semen Semenych
    Hello everybody, I'm trying to benchmark the difference between a function pointer call and a virtual function call. To do this, I have written two pieces of code that do the same mathematical computation over an array. One variant uses an array of pointers to functions and calls those in a loop. The other variant uses an array of pointers to a base class and calls its virtual function, which is overridden in the derived classes to do exactly the same thing as the functions in the first variant. Then I print the time elapsed and use a simple shell script to run the benchmark many times and compute the average run time. Here is the code:

        #include <iostream>
        #include <cstdlib>
        #include <ctime>
        #include <cmath>
        using namespace std;

        long long timespecDiff(struct timespec *timeA_p, struct timespec *timeB_p)
        {
            return ((timeA_p->tv_sec * 1000000000) + timeA_p->tv_nsec) -
                   ((timeB_p->tv_sec * 1000000000) + timeB_p->tv_nsec);
        }

        void function_not( double *d ) { *d = sin(*d); }
        void function_and( double *d ) { *d = cos(*d); }
        void function_or( double *d )  { *d = tan(*d); }
        void function_xor( double *d ) { *d = sqrt(*d); }

        void ( * const function_table[4] )( double* ) =
            { &function_not, &function_and, &function_or, &function_xor };

        int main(void)
        {
            srand(time(0));
            void ( * index_array[100000] )( double * );
            double array[100000];
            for ( long int i = 0; i < 100000; ++i ) {
                index_array[i] = function_table[ rand() % 4 ];
                array[i] = ( double )( rand() / 1000 );
            }

            struct timespec start, end;
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);
            for ( long int i = 0; i < 100000; ++i ) {
                index_array[i]( &array[i] );
            }
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);

            unsigned long long time_elapsed = timespecDiff(&end, &start);
            cout << time_elapsed / 1000000000.0 << endl;
        }

    and here is the virtual function variant:

        #include <iostream>
        #include <cstdlib>
        #include <ctime>
        #include <cmath>
        using namespace std;

        long long timespecDiff(struct timespec *timeA_p, struct timespec *timeB_p)
        {
            return ((timeA_p->tv_sec * 1000000000) + timeA_p->tv_nsec) -
                   ((timeB_p->tv_sec * 1000000000) + timeB_p->tv_nsec);
        }

        class A
        {
        public:
            virtual void calculate( double *i ) = 0;
        };

        class A1 : public A
        {
        public:
            void calculate( double *i ) { *i = sin(*i); }
        };

        class A2 : public A
        {
        public:
            void calculate( double *i ) { *i = cos(*i); }
        };

        class A3 : public A
        {
        public:
            void calculate( double *i ) { *i = tan(*i); }
        };

        class A4 : public A
        {
        public:
            void calculate( double *i ) { *i = sqrt(*i); }
        };

        int main(void)
        {
            srand(time(0));
            A *base[100000];
            double array[100000];
            for ( long int i = 0; i < 100000; ++i ) {
                array[i] = ( double )( rand() / 1000 );
                switch ( rand() % 4 ) {
                    case 0: base[i] = new A1(); break;
                    case 1: base[i] = new A2(); break;
                    case 2: base[i] = new A3(); break;
                    case 3: base[i] = new A4(); break;
                }
            }

            struct timespec start, end;
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &start);
            for ( int i = 0; i < 100000; ++i ) {
                base[i]->calculate( &array[i] );
            }
            clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &end);

            unsigned long long time_elapsed = timespecDiff(&end, &start);
            cout << time_elapsed / 1000000000.0 << endl;
        }

    My system is Linux, Fedora 13, gcc 4.4.2. The code is compiled with g++ -O3. The first one is test1, the second is test2. Now I see this in the console:

        [Ignat@localhost circuit_testing]$ ./test2 && ./test2
        0.0153142
        0.0153166

    Well, more or less, I think. And then, this:

        [Ignat@localhost circuit_testing]$ ./test2 && ./test2
        0.01531
        0.0152476

    Where is the 25% difference that should be visible? How can the first executable be even slower than the second one? I'm asking this because I'm doing a project which involves calling a lot of small functions in a row like this in order to compute the values of an array, and the code I've inherited does a very complex manipulation to avoid the virtual function call overhead. Now where is this famous call overhead?

    Read the article

  • Optimizing Oracle query

    - by Omnipresent
        SELECT MAX(verification_id)
        FROM VERIFICATION_TABLE
        WHERE head = 687422
          AND mbr = 23102
          AND RTRIM(LTRIM(lname)) = '.iq bzw'
          AND TO_CHAR(dob,'MM/DD/YYYY') = '08/10/2004'
          AND system_code = 'M';

    This query takes 153 seconds to run. There are millions of rows in VERIFICATION_TABLE. I think the query is taking so long because of the functions in the WHERE clause. However, I need to LTRIM/RTRIM the lname column, and the date has to be matched in MM/DD/YYYY format. How can I optimize this query?

    Read the article

  • How to outperform this regex replacement?

    - by spender
    After considerable measurement, I have identified a hotspot in one of our Windows services that I'd like to optimize. We are processing strings that may have multiple consecutive spaces in them, and we'd like to reduce those to single spaces. We use a static compiled regex for this task:

        private static readonly Regex regex_select_all_multiple_whitespace_chars =
            new Regex(@"\s+", RegexOptions.Compiled);

    and then use it as follows:

        var cleanString = regex_select_all_multiple_whitespace_chars.Replace(dirtyString.Trim(), " ");

    This line is being invoked several million times and is proving to be fairly intensive. I've tried to write something better, but I'm stumped. Given the fairly modest processing requirements of the regex, surely there's something faster. Could unsafe processing with pointers speed things up further?
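
    The usual regex-free alternative is a single pass that splits on whitespace runs and rejoins with single spaces. A minimal sketch of the idea in Python (the service is C#, so treat this as an illustration of the algorithm rather than a drop-in replacement):

        import re
        import timeit

        WS_RE = re.compile(r"\s+")

        def collapse_regex(s):
            return WS_RE.sub(" ", s.strip())

        def collapse_split_join(s):
            # str.split() with no arguments splits on runs of whitespace and
            # drops leading/trailing whitespace, so a join yields single spaces
            return " ".join(s.split())

        dirty = "a  lot \t of   irregular\n whitespace " * 1000
        assert collapse_regex(dirty) == collapse_split_join(dirty)

        for fn in (collapse_regex, collapse_split_join):
            print(fn.__name__, timeit.timeit(lambda: fn(dirty), number=1000))

    The C# analogue would be one loop over the characters appending into a preallocated buffer; for a pattern this simple, a hand-rolled pass typically beats a general regex engine, though only a measurement on the real workload can confirm that.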

    Read the article

  • Strange C++ performance difference?

    - by STingRaySC
    I just stumbled upon a change that seems to have counterintuitive performance ramifications. Can anyone provide a possible explanation for this behavior? Original code:

        for (int i = 0; i < ct; ++i)
        {
            // do some stuff...
            int iFreq = getFreq(i);
            double dFreq = iFreq;
            if (iFreq != 0)
            {
                // do some stuff with iFreq...
                // do some calculations with dFreq...
            }
        }

    While cleaning up this code during a "performance pass," I decided to move the definition of dFreq inside the if block, as it was only used inside the if. There are several calculations involving dFreq, so I didn't eliminate it entirely, as it does save the cost of multiple run-time conversions from int to double. I expected no performance difference, or if any at all, a negligible improvement. However, the performance decreased by nearly 10%. I have measured this many times, and this is indeed the only change I've made. The code snippet shown above executes inside a couple of other loops. I get very consistent timings across runs and can definitely confirm that the change I'm describing decreases performance by ~10%. I would expect performance to increase, because the int-to-double conversion would only occur when iFreq != 0. Changed code:

        for (int i = 0; i < ct; ++i)
        {
            // do some stuff...
            int iFreq = getFreq(i);
            if (iFreq != 0)
            {
                // do some stuff with iFreq...
                double dFreq = iFreq;
                // do some stuff with dFreq...
            }
        }

    Can anyone explain this? I am using VC++ 9.0 with /O2. I just want to understand what I'm not accounting for here.

    Read the article

  • Benchmarking a particular method in Objective-C

    - by Jasconius
    I have a critical method in an Objective-C application that I need to optimize as much as possible. I first need to take some easy benchmarks on this one single method so I can compare my progress as I optimize. What is the easiest way to track the execution time of a given method in, say, milliseconds, and print that to the console?
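
    The basic pattern is the same in any language: take a high-resolution timestamp before and after the call, average over many iterations, and print the difference. Here is a minimal sketch of that harness in Python, purely to show the shape of it; in Objective-C the timestamps would come from something like CFAbsoluteTimeGetCurrent() or mach_absolute_time() instead:

        import time

        def benchmark(fn, *args, iterations=1000):
            fn(*args)  # warm-up call so one-time setup cost isn't measured
            start = time.perf_counter()
            for _ in range(iterations):
                fn(*args)
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            print("%s: %.4f ms per call" % (fn.__name__, elapsed_ms / iterations))

        benchmark(sum, range(10000))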

    Read the article

  • Creating thousands of records in Rails

    - by willCosgrove
    Let me set the stage: my application deals with gift cards. When we create cards, they have to have a unique string that the user can use to redeem them with. So when someone orders our gift cards, like a retailer, we need to make a lot of new card objects and store them in the DB. With that in mind, I'm trying to see how quickly I can have my application generate 100,000 cards. Database expert, I am not, so I need someone to explain this little phenomenon: when I create 1,000 cards, it takes 5 seconds. When I create 100,000 cards, it should take 500 seconds, right? Now I know what you're wanting to see, the card creation method I'm using, because the first assumption would be that it's getting slower because it's checking the uniqueness of a bunch of cards, more as it goes along. But I can show you my rake task:

        desc "Creates cards for a retailer"
        task :order_cards, [:number_of_cards, :value, :retailer_name] => :environment do |t, args|
          t = Time.now
          puts "Searching for retailer"
          @retailer = Retailer.find_by_name(args[:retailer_name])
          puts "Retailer found"
          puts "Generating codes"
          value = args[:value].to_i
          number_of_cards = args[:number_of_cards].to_i
          codes = []
          top_off_codes(codes, number_of_cards)
          while codes != codes.uniq
            codes.uniq!
            top_off_codes(codes, number_of_cards)
          end
          stored_codes = Card.all.collect do |c|
            c.code
          end
          while codes != (codes - stored_codes)
            codes -= stored_codes
            top_off_codes(codes, number_of_cards)
          end
          puts "Codes are unique and generated"
          puts "Creating bundle"
          @bundle = @retailer.bundles.create!(:value => value)
          puts "Bundle created"
          puts "Creating cards"
          @bundle.transaction do
            codes.each do |code|
              @bundle.cards.create!(:code => code)
            end
          end
          puts "Cards generated in #{Time.now - t}s"
        end

        def top_off_codes(codes, intended_number)
          (intended_number - codes.size).times do
            codes << ReadableRandom.get(CODE_LENGTH)
          end
        end

    I'm using a gem called readable_random for the unique code. So if you read through all of that code, you'll see that it does all of its uniqueness testing before it ever starts creating cards. It also writes status updates to the screen while it's running, and it always sits for a while at the card-creation step, while it flies through the uniqueness tests. So my question to the stackoverflow community is: why is my database slowing down as I add more cards? Why is this not a linear function in regards to time per card? I'm sure the answer is simple and I'm just a moron who knows nothing about data storage. And if anyone has any suggestions, how would you optimize this method, and how fast do you think you could get it to create 100,000 cards? (When I plotted my times on a graph and did a quick curve fit to get my line formula, I calculated how long it would take to create 100,000 cards with my current code, and it came out to about 5.5 hours. That may be completely wrong, I'm not sure. But if it stays on the curve I fitted, it would be right around there.)
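
    One thing worth isolating, independent of Rails: creating each card with its own create! issues one INSERT per row, so the per-statement overhead is paid 100,000 times, and inserting into a table whose indexes keep growing is not free either. A rough sketch of the per-row vs. batched difference using sqlite3 in Python, purely to illustrate the idea; in Rails the equivalent would be a multi-row INSERT or a bulk-loading gem such as activerecord-import:

        import sqlite3
        import time

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE cards (code TEXT)")
        codes = ["code-%06d" % i for i in range(100000)]

        start = time.perf_counter()
        for code in codes:
            # one statement per row, as @bundle.cards.create! does
            conn.execute("INSERT INTO cards (code) VALUES (?)", (code,))
        conn.commit()
        print("row by row:", time.perf_counter() - start)

        conn.execute("DELETE FROM cards")
        start = time.perf_counter()
        # one batched call for all rows
        conn.executemany("INSERT INTO cards (code) VALUES (?)", ((c,) for c in codes))
        conn.commit()
        print("batched:   ", time.perf_counter() - start)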

    Read the article

  • Cost of exception handlers in Python

    - by Thilo
    In another question, the accepted answer suggested replacing a (very cheap) if statement in Python code with a try/except block to improve performance. Coding style issues aside, and assuming that the exception is never triggered, how much difference does it make (performance-wise) to have an exception handler, versus not having one, versus having a compare-to-zero if-statement?
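
    This is easy to measure directly with timeit; a minimal sketch covering all three variants (the usual finding is that entering a try block costs almost nothing when no exception is raised, while actually raising and catching one is comparatively expensive, but run it on your own interpreter):

        import timeit

        print(timeit.timeit("y = 1.0 / x", setup="x = 1"))
        print(timeit.timeit("if x != 0:\n    y = 1.0 / x", setup="x = 1"))
        print(timeit.timeit(
            "try:\n    y = 1.0 / x\nexcept ZeroDivisionError:\n    y = 0.0",
            setup="x = 1"))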

    Read the article

  • Software to Tune/Calibrate Properties for Heuristic Algorithms

    - by Karussell
    Today I read about a program called WinCalibra (scroll a bit down) which can take a text file with properties as input. This program can then optimize the input properties based on the output values of your algorithm. See this paper or the user documentation for more information (see the link above; sadly the doc is a zipped exe). Do you know of other software that can do the same and runs under Linux (preferably open source)? EDIT: Since I need this for a Java application, I will now invest my research in Java libraries like JGAP. Other ideas and links would be appreciated!

    Read the article

  • What's the difference between !col and col=false in MySQL?

    - by Mask
    The two statements have totally different performance:

        mysql> explain select * from jobs where createIndexed=false;
        +----+-------------+-------+------+----------------------+----------------------+---------+-------+------+-------+
        | id | select_type | table | type | possible_keys        | key                  | key_len | ref   | rows | Extra |
        +----+-------------+-------+------+----------------------+----------------------+---------+-------+------+-------+
        |  1 | SIMPLE      | jobs  | ref  | i_jobs_createIndexed | i_jobs_createIndexed | 1       | const |    1 |       |
        +----+-------------+-------+------+----------------------+----------------------+---------+-------+------+-------+
        1 row in set (0.01 sec)

        mysql> explain select * from jobs where !createIndexed;
        +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
        | id | select_type | table | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
        +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+
        |  1 | SIMPLE      | jobs  | ALL  | NULL          | NULL | NULL    | NULL | 17996 | Using where |
        +----+-------------+-------+------+---------------+------+---------+------+-------+-------------+

    Column definition and related index, for aiding analysis:

        createIndexed tinyint(1) NOT NULL DEFAULT 0,
        create index i_jobs_createIndexed on jobs(createIndexed);

    Read the article

  • Why is shrink_to_fit non-binding?

    - by Roger Pate
    The C++0x FCD states in 23.3.6.2 [vector capacity]:

        void shrink_to_fit();

        Remarks: shrink_to_fit is a non-binding request to reduce capacity() to size().
        [Note: The request is non-binding to allow latitude for implementation-specific
        optimizations. —end note]

    Why is it non-binding, and what optimizations are intended to be allowed?

    Read the article

  • Does visibility affect DOM manipulation performance?

    - by Chetan Sastry
    IE7/Windows XP. I have a third-party component in my page that does a lot of DOM manipulation to adjust itself each time the browser window is resized. Unfortunately, I have little control over what it does internally, and I have optimized everything else (such as callbacks and event handlers) as much as I can. I can't take the component out of the flow by setting display:none, because it fails to measure itself if I do so. In general, does setting the container's visibility to hidden during the resize help improve DOM rendering performance?

    Read the article

  • very quickly getting total size of folder

    - by freakazo
    I want to quickly find the total size of any folder using Python.

        def GetFolderSize(path):
            TotalSize = 0
            for item in os.walk(path):
                for file in item[2]:
                    try:
                        TotalSize = TotalSize + getsize(join(item[0], file))
                    except:
                        print("error with file: " + join(item[0], file))
            return TotalSize

    That's the simple script I wrote to get the total size of a folder; it took around 60 seconds (+-5 seconds). By using multiprocessing I got it down to 23 seconds on a quad-core machine. Using the Windows file explorer it takes only ~3 seconds (right-click, Properties, to see for yourself). So is there a faster way of finding the total size of a folder, closer to the speed at which Windows can do it? Windows 7, Python 2.6. (I did searches, but most of the time people used a method very similar to my own.) Thanks in advance.
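
    Part of the gap is that os.walk first builds per-directory name lists and then every file is stat'ed separately by getsize. Iterating directory entries directly avoids some of that, since the OS can hand back size information during iteration. A sketch using os.scandir; note this is Python 3.6+ as written, so it illustrates the approach rather than being a drop-in for 2.6 (Explorer is faster still because it reads sizes straight out of directory metadata):

        import os

        def get_folder_size(path):
            total = 0
            # DirEntry.stat() can often be served from data fetched while
            # listing the directory, avoiding a separate stat call per file
            with os.scandir(path) as entries:
                for entry in entries:
                    try:
                        if entry.is_file(follow_symlinks=False):
                            total += entry.stat(follow_symlinks=False).st_size
                        elif entry.is_dir(follow_symlinks=False):
                            total += get_folder_size(entry.path)
                    except OSError as e:
                        print("error with entry: %s (%s)" % (entry.path, e))
            return total

        print(get_folder_size("."))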

    Read the article

  • What are good memory management techniques in Flash/as3

    - by Parris
    Hello! So I am pretty familiar with memory management in Java, C and C++; however, in Flash, what constructs are there for memory management? I assume Flash has a virtual machine of sorts, like Java, and I have been assuming that things get garbage collected when they are set to null. I am not sure if this is actually the case, though. Also, is there a way to force garbage collection in Flash? Any other tips? Thanks
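
    The "set it to null and it becomes collectable" model is the standard reference-reachability story, and it can be observed directly in any garbage-collected runtime. A quick sketch in Python, purely as an illustration of the model; in ActionScript 3 the collector likewise frees objects once no references remain, and the debug Flash Player exposes flash.system.System.gc() to force a pass (production code shouldn't rely on forcing it):

        import gc
        import weakref

        class Probe(object):
            pass

        obj = Probe()
        ref = weakref.ref(obj)   # observe the object without keeping it alive

        obj = None               # drop the only strong reference ("set to null")
        gc.collect()             # force a collection pass

        print(ref() is None)     # True: the object has been collected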

    Read the article

  • performance issue in a select query from a single table

    - by daedlus
    Hi, I have a table as below:

        dbo.UserLogs
        -------------------------------------
        Id | UserId | Date | Name | P1 | Dirty
        -------------------------------------

    There can be several records per UserId [even in the millions]. I have a clustered index on the Date column and query this table very frequently over time ranges. The column 'Dirty' is non-nullable and can take only 0 or 1, so I have no indexes on 'Dirty'. I have several million records in this table, and in one particular case my application needs to query this table to get all UserIds that have at least one record marked dirty. I tried this query:

        select distinct(UserId) from UserLogs where Dirty=1

    I have 10 million records in total, and this takes about 10 minutes to run; I want it to run much faster than that. [I am able to query this table on the date column in less than a minute.] Any comments/suggestions are welcome. My environment: 64-bit, Sybase 15.0.3, Linux.

    Read the article

  • TSQL -- Make it better

    - by user319353
    Hi:

        -- Very Narrow (all IDs are passed in)
        IF(@EmpID IS NOT NULL AND @DeptID IS NOT NULL AND @CityID IS NOT NULL)
        BEGIN
            SELECT e.EmpName
                  ,d.DeptName
                  ,c.CityName
            FROM Employee e WITH (NOLOCK)
            JOIN Department d WITH (NOLOCK) ON e.deptid = d.deptid
            JOIN City c WITH (NOLOCK) ON e.CityID = c.CityID
            WHERE e.EmpID = @EmpID
        END
        -- Just 2 IDs passed in
        ELSE IF(@DeptID IS NOT NULL AND @CityID IS NOT NULL)
        BEGIN
            SELECT e.EmpName
                  ,d.DeptName
                  ,NULL AS [CityName]
            FROM Employee e WITH (NOLOCK)
            JOIN Department d WITH (NOLOCK) ON e.deptid = d.deptid
            JOIN City c WITH (NOLOCK) ON e.CityID = c.CityID
            WHERE d.deptID = @DeptID
        END
        -- Very Broad (just 1 ID passed in)
        ELSE IF(@CityID IS NOT NULL)
        BEGIN
            SELECT e.EmpName
                  ,NULL AS [DeptName]
                  ,NULL AS [CityName]
            FROM Employee e WITH (NOLOCK)
            JOIN Department d WITH (NOLOCK) ON e.deptid = d.deptid
            JOIN City c WITH (NOLOCK) ON e.CityID = c.CityID
            WHERE c.CityID = @CityID
        END
        -- None (Nothing passed in)
        ELSE
        BEGIN
            SELECT NULL AS [EmpName]
                  ,NULL AS [DeptName]
                  ,NULL AS [CityName]
        END

    Question: Is there any better way? Specifically, can I do anything without the IF...ELSE conditions?

    Read the article

  • How to open DataSet in Visual Studio 2008 faster?

    - by Ekkapop
    When I open a DataSet in Visual Studio 2008 to design or modify it, it always takes a very long time (more than five minutes) before I can continue my work. While I'm waiting I can't do anything in Visual Studio; moreover, CPU and memory usage grow dramatically. I want to know: is there any way to reduce this waiting time?

        Hardware - Desktop
        CPU: Intel Q6600
        Memory: 4 GB
        HDD: 320 GB 7200 rpm
        OS: Windows XP 32-bit with Service Pack 3

    Read the article

  • How do you optimize stunicholls' "Professional dropdown #2" with jquery?

    - by geff_chang
    Link to menu: Professional dropdown #2 I was wondering if these posts Suckerfish meets jQuery or Son of Suckerfish dropdowns in jQuery could optimize the menu above. I need the menu to be optimized for IE6, because when I use the menu as it is, the menu hangs after I click on a menu item that loads a page with heavy processing. It takes too long for the menu to be enabled again. Any ideas?

    Read the article

  • JavaScript replace with callback - performance question

    - by Tomalak
    In JavaScript, you can define a callback handler in regex string replace operations:

        str.replace(/str[123]|etc/, replaceCallback);

    Imagine you have a lookup object of strings and replacements.

        var lookup = {"str1": "repl1", "str2": "repl2", "str3": "repl3", "etc": "etc" };

    and this callback function:

        var replaceCallback = function(match) {
            if (lookup[match])
                return lookup[match];
            else
                return match;
        }

    How would you assess the performance of the above callback? Are there solid ways to improve it? Would

        if (match in lookup) //....

    or even

        return lookup[match] | match;

    lead to opportunities for the JS compiler to optimize, or is it all the same thing?
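
    Worth noting that the variants differ in behavior, not just speed: if (lookup[match]) falls back to match whenever the mapped value is falsy (a mapping to "" would be skipped), and lookup[match] | match is a bitwise OR, which coerces both sides to numbers (presumably || was intended). The single-probe-with-default pattern, sketched in Python for illustration:

        lookup = {"str1": "repl1", "str2": "repl2", "str3": "repl3", "etc": "etc"}

        def replace_callback(match):
            # one dictionary probe, with the original text as the default;
            # correct even when the mapped value is falsy (e.g. "")
            return lookup.get(match, match)

        print(replace_callback("str2"))   # repl2
        print(replace_callback("other"))  # other

    The closest JavaScript equivalent would be a hasOwnProperty check followed by a single property read, or caching the looked-up value in a local variable instead of reading lookup[match] twice.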

    Read the article

  • image archive VS image strip

    - by DevA
    Hi, I've noticed that plenty of games/applications (very commonly on mobile builds) pack numerous images into an image strip. I figure the advantages of this are making the program tidier (file-system-wise) and reducing (un)installation time. During the runtime of the application, the entire image strip is allocated and copied from the file system to RAM. Alternatively, images can be stored in an image archive and unpacked during runtime into a number of image structures in RAM. The way I see it, the image strip approach is less efficient, because of worse caching performance and because, even if the optimal rectangle-packing algorithm is used, there will be empty spaces between the stored images in the strip, wasting RAM. What are the advantages of using an image strip over an image archive file?

    Read the article

  • Why is Magento so slow?

    - by mr-euro
    Is Magento usually so terribly slow? This is my first experience with it, and the admin panel simply takes ages to load and save changes. It is a default installation with the test data. The server it is hosted on serves other non-Magento sites super fast. What is it about the PHP code that Magento uses that makes it so slow, and what can be done to fix it?

    Read the article

  • How can I encode four unsigned bytes (0-255) to a float and back again using HLSL?

    - by Statement
    Hello! I am facing a task where one of my HLSL shaders requires multiple texture lookups per pixel. My 2D textures are fixed to 256*256, so two bytes should be sufficient to address any given texel given this constraint. My idea is then to put two xy-coordinates in each float, giving me eight xy-coordinates in pixel space when packed in a Vector4 format image. These eight coordinates are then used to sample another texture(s). The reason for doing this is to save graphics memory and to try to optimize processing time, since I then don't require multiple texture lookups. By the way: does anyone know if encoding/decoding 16 bytes from/to 4 floats using 1 sampling is slower than 4 samplings with unencoded data?
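
    One arithmetic caveat with packing four bytes into one float: a 32-bit float has a 24-bit significand, so a full 32-bit payload cannot round-trip exactly, whereas 16 bits (two bytes, max 65535 < 2^24) always can. A quick Python check of that arithmetic, forcing values through a genuine 32-bit float with struct (this illustrates the encoding itself, not HLSL):

        import struct

        def through_float32(x):
            # round-trip a number through an actual 32-bit float
            return struct.unpack("<f", struct.pack("<f", float(x)))[0]

        def encode2(a, b):   # two bytes -> one float-encodable integer
            return float((a << 8) | b)

        def decode2(f):
            n = int(f)
            return (n >> 8) & 0xFF, n & 0xFF

        # 16-bit payloads survive a float32 exactly...
        assert all(decode2(through_float32(encode2(a, b))) == (a, b)
                   for a in range(256) for b in (0, 127, 255))

        # ...but a full 32-bit payload does not
        n = 0xFFFFFFFF
        print(int(through_float32(float(n))) == n)  # False: precision lost

    So one xy pair (two bytes) per float channel can be reconstructed exactly, while the intended two pairs (four bytes) per channel exceed what a 32-bit float holds losslessly and would need either a different channel format or fewer bits per coordinate.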

    Read the article

  • Speed of CSS

    - by Ólafur Waage
    This is just a question to help me understand CSS rendering better. Let's say we have a million lines of this:

        <div class="first">
            <div class="second">
                <span class="third">Hello World</span>
            </div>
        </div>

    Which would be the fastest way to change the color of Hello World to red?

        .third { color: red; }
        div.third { color: red; }
        div.second div.third { color: red; }
        div.first div.second div.third { color: red; }

    Also, what if there was a tag in the middle that had a unique id of "foo"? Which of the CSS methods above would be the fastest? I know why these methods are used, etc.; I'm just trying to get a better grasp of how browsers render CSS, and I have no idea how to write a test that times it. UPDATE: Nice answer, Gumbo. From the looks of it, on a regular site it would be quicker to give the full definition of a tag, since it finds the parents and narrows the search with each parent found. The downside is that you'd have a pretty large CSS file, though.

    Read the article
