Search Results

Search found 3481 results on 140 pages for 'convex optimization'.

Page 45/140 | < Previous Page | 41 42 43 44 45 46 47 48 49 50 51 52 | Next Page >

How to check for C++ copy ellision

- by Steve

I ran across this article on copy ellision in C++ and I've seen comments about it in the boost library. This is appealing, as I prefer my functions to look like verylargereturntype DoSomething(...) rather than void DoSomething(..., verylargereturntype& retval) So, I have two questions about this Google has virtually no documentation on this at all, how real is this? How can I check that this optimization is actually occuring? I assume it involves looking at the assembly, but lets just say that isn't my strong suit. If anyone can give a very basic example as to what successful ellision looks like, that would be very useful I won't be using copy ellision just to prettify things, but if I can be guaranteed that it works, it sounds pretty useful.

Read the article
In Python, is there a way to call a method on every item of an iterable? [closed]

- by Thane Brimhall

Possible Duplicate: Is there a map without result in python? I often come to a situation in my programs when I want to quickly/efficiently call an in-place method on each of the items contained by an iterable. (Quickly meaning the overhead of a for loop is unacceptable). A good example would be a list of sprites when I want to call draw() on each of the Sprite objects. I know I can do something like this: [sprite.draw() for sprite in sprite_list] But I feel like the list comprehension is misused since I'm not using the returned list. The same goes for the map function. Stone me for premature optimization, but I also don't want the overhead of the return value. What I want to know is if there's a method in Python that lets me do what I just explained, perhaps like the hypothetical function I suggest below: do_all(sprite_list, draw)

Read the article
SQL: Optimize insensive SELECTs on DateTime fields

- by Fedyashev Nikita

I have an application for scheduling certain events. And all these events must be reviewed after each scheduled time. So basically we have 3 tables: items(id, name) scheduled_items(id, item_id, execute_at - datetime) - item_id column has an index option. reviewed_items(id, item_id, created_at - datetime) - item_id column has an index option. So core function of the application is "give me any items(which are not yet reviewed) for the actual moment". How can I optimize this solution for speed(because it is very core business feature and not micro optimization)? I suppose that adding index to the datetime fields doesn't make any sense because the cardinality or uniqueness on that fields are very high and index won't give any(?) speed-up. Is it correct? What would you recommend? Should I try no-SQL? -- mysql -V 5.075 I use caching where it makes sence.

Read the article
Will these optimizations to my Ruby implementation of diff improve performance in a Rails app?

- by grg-n-sox

<tl;dr> In source version control diff patch generation, would it be worth it to use the optimizations listed at the very bottom of this writing (see <optimizations>) in my Ruby implementation of diff for making diff patches? </tl;dr> <introduction> I am programming something I have never done before and there might already be tools out there to do the exact thing I am programming but at this point I am having too much fun to care so I am still going to do it from scratch, even if there is a tool for this. So anyways, I am working on a Ruby on Rails app and need a certain feature. Basically I want each entry in a table of mine, let's say for example a table of video games, to have a stored chunk of text that represents a review or something of the sort for that table entry. However, I want this text to be both editable by any registered user and also keep track of different submissions in a version control system. The simplest solution I could think of is just implement a solution that keeps track of the text body and the diff patch history of different versions of the text body as objects in Ruby and then serialize it, preferably in human readable form (so I'll most likely use YAML for this) for editing if needed due to corruption by a software bug or a mistake is made by an admin doing some version editing. So at first I just tried to dive in head first into this feature to find that the problem of generating a diff patch is more difficult that I thought to do efficiently. So I did some research and came across some ideas. Some I have implemented already and some I have not. However, it all pretty much revolves around the longest common subsequence problem, as you would already know if you have already done anything with diff or diff-like features, and optimization the function that solves it. Currently I have it so it truncates the compared versions of the text body from the beginning and end until non-matching lines are found. Then it solves the problem using a comparison matrix, but instead of incrementing the value stored in a cell when it finds a matching line like in most longest common subsequence algorithms I have seen examples of, I increment when I have a non-matching line so as to calculate edit distance instead of longest common subsequence. Although as far as I can tell between the two approaches, they are essentially two sides of the same coin so either could be used to derive an answer. It then back-traces through the comparison matrix and notes when there was an incrementation and in which adjacent cell (West, Northwest, or North) to determine that line's diff entry and assumes all other lines to be unchanged. Normally I would leave it at that, but since this is going into a Rails environment and not just some stand-alone Ruby script, I started getting worried about needing to optimize at least enough so if a spammer that somehow knew how I implemented the version control system and knew my worst case scenario entry still wouldn't be able to hit the server that bad. After some searching and reading of research papers and articles through the internet, I've come across several that seem decent but all seem to have pros and cons and I am having a hard time deciding how well in this situation that the pros and cons balance out. So are the ones listed here worth it? I have listed them with known pros and cons. </introduction> <optimizations> Chop the compared sequences into multiple chucks of subsequences by splitting where lines are unchanged, and then truncating each section of unchanged lines at the beginning and end of each section. Then solve the edit distance of each subsequence. Pro: Changes the time increase as the changed area gets bigger from a quadratic increase to something more similar to a linear increase. Con: Figuring out where to split already seems like you have to solve edit distance except now you don't care how it is changed. Would be fine if this was solvable by a process closer to solving hamming distance but a single insertion would throw this off. Use a cryptographic hash function to both convert all sequence elements into integers and ensure uniqueness. Then solve the edit distance comparing the hash integers instead of the sequence elements themselves. Pro: The operation of comparing two integers is faster than the operation of comparing two strings, so a slight performance gain is received after every comparison, which can be a lot overall. Con: Using a cryptographic hash function takes time to convert all the sequence elements and may end up costing more time to do the conversion that you gain back from the integer comparisons. You could use the built in hash function for a string but that will not guarantee uniqueness. Use lazy evaluation to only calculate the three center-most diagonals of the comparison matrix and then only calculate additional diagonals as needed. And then also use this approach to possibly remove the need on some comparisons to compare all three adjacent cells as desribed here. Pro: Can turn an algorithm that always takes O(n * m) time and make it so only worst case scenario is that time, best case becomes practically linear, and average case is somewhere between the two. Con: It is an algorithm I've only seen implemented in functional programming languages and I am having a difficult time comprehending how to convert this into Ruby based on how it is described at the site linked to above. Make a C module and do the hard work at the native level in C and just make a Ruby wrapper for it so Ruby can make all the calls to it that it needs. Pro: I have to imagine that evaluating something like this in could be a LOT faster. Con: I have no idea how Rails handles apps with ruby code that has C extensions and it hurts the portability of the app. This is an optimization for after the solving of edit distance, but idea is to store additional combined diffs with the ones produced by each version to make a delta-tree data structure with the most recently made diff as the root node of the tree so getting to any version takes worst case time of O(log n) instead of O(n). Pro: Would make going back to an old version a lot faster. Con: It would mean every new commit, the delta-tree would get a new root node that will cost time to reorganize the delta-tree for an operation that will be carried out a lot more often than going back a version, not to mention the unlikelihood it will be an old version. </optimizations> So are these things worth the effort?

Read the article
faster implementation of sum ( for Codility test )

- by Oscar Reyes

How can the following simple implementation of sum be faster? private long sum( int [] a, int begin, int end ) { if( a == null ) { return 0; } long r = 0; for( int i = begin ; i < end ; i++ ) { r+= a[i]; } return r; } EDIT Background is in order. Reading latest entry on coding horror, I came to this site: http://codility.com which has this interesting programming test. Anyway, I got 60 out of 100 in my submission, and basically ( I think ) is because this implementation of sum, because those parts where I failed are the performance parts. I'm getting TIME_OUT_ERROR's So, I was wondering if an optimization in the algorithm is possible. So, no built in functions or assembly would be allowed. This my be done in C, C++, C#, Java or pretty much in any other. EDIT As usual, mmyers was right. I did profile the code and I saw most of the time was spent on that function, but I didn't understand why. So what I did was to throw away my implementation and start with a new one. This time I've got an optimal solution [ according to San Jacinto O(n) -see comments to MSN below - ] This time I've got 81% on Codility which I think is good enough. The problem is that I didn't take the 30 mins. but around 2 hrs. but I guess that leaves me still as a good programmer, for I could work on the problem until I found an optimal solution: Here's my result. I never understood what is those "combinations of..." nor how to test "extreme_first"

Read the article
How can i optimize this c# code?

- by Pandiya Chendur

I have converted my Datatable to json string use the following method... public string GetJSONString(DataTable Dt) { string[] StrDc = new string[Dt.Columns.Count]; string HeadStr = string.Empty; for (int i = 0; i < Dt.Columns.Count; i++) { StrDc[i] = Dt.Columns[i].Caption; HeadStr += "\"" + StrDc[i] + "\" : \"" + StrDc[i] + i.ToString() + "¾" + "\","; } HeadStr = HeadStr.Substring(0, HeadStr.Length - 1); StringBuilder Sb = new StringBuilder(); Sb.Append("{\"" + Dt.TableName + "\" : ["); for (int i = 0; i < Dt.Rows.Count; i++) { string TempStr = HeadStr; Sb.Append("{"); for (int j = 0; j < Dt.Columns.Count; j++) { if (Dt.Rows[i][j].ToString().Contains("'") == true) { Dt.Rows[i][j] = Dt.Rows[i][j].ToString().Replace("'", ""); } TempStr = TempStr.Replace(Dt.Columns[j] + j.ToString() + "¾", Dt.Rows[i][j].ToString()); } Sb.Append(TempStr + "},"); } Sb = new StringBuilder(Sb.ToString().Substring(0, Sb.ToString().Length - 1)); Sb.Append("]}"); return Sb.ToString(); } Is this fair enough or still there is margin for optimization to make it execute faster.... Any suggestion...

Read the article
Draw Rectangle with XNA

- by mazzzzz

Hey guys, I was working on game, and wanted to highlight a spot on the screen when something happens, I created a class to do this for me, and found a bit of code to draw the rectangle static private Texture2D CreateRectangle(int width, int height, Color colori) { Texture2D rectangleTexture = new Texture2D(game.GraphicsDevice, width, height, 1, TextureUsage.None, SurfaceFormat.Color);// create the rectangle texture, ,but it will have no color! lets fix that Color[] color = new Color[width * height];//set the color to the amount of pixels in the textures for (int i = 0; i < color.Length; i++)//loop through all the colors setting them to whatever values we want { color[i] = colori; } rectangleTexture.SetData(color);//set the color data on the texture return rectangleTexture;//return the texture } Problem is that the code above is called every update, (60 times a second), and it was not written with optimization in mind, I have no clue how else to write a code to do this though. It needs to be extremely fast (the code above freezes the game, which has only skeleton code right now).. Any suggestions. Note: Any new code would be great (WireFrame/Fill are both fine). I would like to be able to specify color. Something to point in the right direction would be great, Thanks, Max

Read the article
How to implement Excel Solver functionality in C#?

- by Vic

Hi, I have an application in C#, I need to do some optimization calculations, like Excel Solver Add-in does, one option is certainly to write my own solver implementation, but I'm kind of short of time, so I'm looking into libraries that already exist that can help me with this. I've been trying the Microsoft Solver Foundation, which seems pretty neat and cool, the problem is that it doesn't seem to work with the kind of calculations that I need to do. At the end of this question I'm adding the information about the calculations I need to perform and optimize. So basically my question is if any of you know of any other library that I can use for this purpose, or any tutorial that can help to do my own solver, or any idea that gives me a lead to solve this issue. Thanks. Additional Info: This is the data I need to calculate: I have 7 variables, lets call them var1, var2,...,var7 The constraints for these variables are: All of them need to be 0 <= varn <= 0.5 (where n is the number of the variable) The sum of all the variables should be equal to 1 The objective is to maximize the target formula, which in Excel looks like this: (MMULT(TRANSPOSE(L26:L32),M14:M20)) / (SQRT(MMULT(MMULT(TRANSPOSE(L26:L32),M4:S10),L26:L32))) The range that you see in this formula, L26:L32, is actually the range with the variables from above, var1, var2,..., varn. M14:M20 and M4:S10 are ranges with data that I get from different sources, there are more likely decimal values. As I said before, I was using Microsoft Solver Foundation, I modeled pretty much everything with it, I created functions that handle the operations of the target formula, but when I tried to solve the model it always fail, I think it is because of the complexity of the operations. In any case, I just wanted to show these data so you can have an idea about the kind of calculations that I need to implement.

Read the article
PHP Performance Metrics

- by bigstylee

I am currently developing a PHP MVC Framework for a personal project. While I am developing the framework I am interested to see any notable performance by implementing different techniques for optimization. I have implemented a crude BenchMark class that logs mircotime. The problem is I have no frame of reference for execution times. I am very near the beginnig of this project with a database connection and a few queries but no output (bar some debugging text and BenchMark log). I have a current execution time of 0.01917 seconds. I was expecting this to be lower but as I said before I have no frame of reference. I appreciate there are many variables to take into account when juding performance but I am hoping to find some sort of metric to a) techniques to measure performance for example requests per second and b) compare results for example; how a "moderately" sized PHP application on a "standard" webserver will perform. I appreciate "moderately" and "standard" are very subjective words so perhaps a table of known execution times for a particular application (eg StackOverFlow's executing time). What are other techniques of measuring performance are there other than execution time? When looking at MVC Framework Performance Comparisom it talks about Requests Per Second (RPS). How is this calculated? I am guessing with my current execution time of 0.01917 seconds can handle 52 RPS (= 1 / 0.01917 ). This seems to be significantly lower than that quoted on the graph especially when you consider my current limited funcitonality.

Read the article
Optimizing MySQL update query

- by Jernej Jerin

This is currently my MySQL UPDATE query, which is called from program written in Java: String query = "UPDATE maxday SET DatePressureREL = (SELECT Date FROM ws3600 WHERE PressureREL = (SELECT MAX" + "(PressureREL) FROM ws3600 WHERE Date >= '" + Date + "') AND Date >= '" + Date + "' ORDER BY Date DESC LIMIT 1), " + "PressureREL = (SELECT PressureREL FROM ws3600 WHERE PressureREL = (SELECT MAX(PressureREL) FROM ws3600 " + "WHERE Date >= '" + Date + "') AND Date >= '" + Date + "' ORDER BY Date DESC LIMIT 1), ..."; try { s.execute(query); } catch (SQLException e) { System.out.println("SQL error"); } catch(Exception e) { e.printStackTrace(); } Let me explain first, what does it do. I have two tables, first is ws3600, which holds columns (Date, PressureREL, TemperatureOUT, Dewpoint, ...). Then I have second table, called maxday, which holds columns like DatePressureREL, PressureREL, DateTemperatureOUT, TemperatureOUT,... Now as you can see from an example, I update each column, the question is, is there a faster way? I am asking this, because I am calling MAX twice, first to find the Date for that value and secondly to find the actual value. Now I know that I could write like that: SELECT Date, PressureREL FROM ws3600 WHERE PressureREL = (SELECT MAX(PressureREL) FROM ws3600 WHERE Date >= '" + Date + "') AND Date >= '" + Date + "' ORDER BY Date DESC LIMIT 1 That way I get the Date of the max and the max value at the same time and then update with those values the data in maxday table. But the problem of this solution is, that I have to execute many queries, which as I understand takes alot more time compared to executing one long mysql query because of overhead in sending each query to the server. If there is no better way, which solution beetwen this two should I choose. The first, which only takes one query but is very unoptimized or the second which is beter in terms of optimization, but needs alot more queries which probably means that the preformance gain is lost because of overhead in sending each query to the server?

Read the article
SQL: Speed Improvement - Cluttered union query

- by vol7ron

SELECT * FROM ( SELECT a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name FROM current_tbl a INNER JOIN import_tbl b ON ( a.user_id = b.user_id ) UNION SELECT a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name FROM current_tbl a INNER JOIN import_tbl b ON ( lower(a.f_name)=lower(b.f_name) AND lower(a.l_name)=lower(b.l_name) ) ) foo -- UNION -- SELECT a.user_id , a.f_name , a.l_name , '' , '' , '' FROM current_tbl a WHERE a.user_id NOT IN ( select user_id from( SELECT a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name FROM current_tbl a INNER JOIN import_tbl b ON ( a.user_id = b.user_id ) UNION SELECT a.user_id, a.f_name, a.l_name, b.user_id, b.f_name, b.l_name FROM current_tbl a INNER JOIN import_tbl b ON ( lower(a.f_name)=lower(b.f_name) AND lower(a.l_name)=lower(b.l_name) ) ) bar ) ORDER BY user_id Example of table population: current_tbl: ------------------------------- user_id | f_name | l_name ---------+----------+---------- A1 | Adam | Acorn A2 | Beth | Berry A3 | Calv | Chard | | import_tbl: ------------------------------- user_id | f_name | l_name ---------+----------+---------- A1 | Adam | Acorn A2 | Beth | Butcher <- last_name different | | Expected Output: ----------------------------------------------------------------------- user_id1 | f_name1 | l_name1 | user_id2 | f_name2 | l_name2 ----------+-----------+-----------+------------+-----------+----------- A1 | Adam | Acorn | A1 | Adam | Acorn A2 | Beth | Berry | A2 | Beth | Butcher A3 | Calv | Chard | | | Doing this method gets rid of conditions where the row would be: A2 | Beth | Berry | A2 | Beth | Butcher But it keeps the A3 row I hope this makes sense and I haven't overly simplified it. This is a continuation question from my other question. The succession of these improvements has dropped the query down from ~32000ms to where it's at now ~1200ms - quite an improvement. I supect I can optimize by using UNION ALL in the subquery and of course the usual index optimizations, but I'm looking for the best SQL optimization. FYI this particular case is for PostgreSQL.

Read the article
Compile and optimize for different target architectures

- by Peter Smit

Summary: I want to take advantage of compiler optimizations and processor instruction sets, but still have a portable application (running on different processors). Normally I could indeed compile 5 times and let the user choose the right one to run. My question is: how can I can automate this, so that the processor is detected at runtime and the right executable is executed without the user having to chose it? I have an application with a lot of low level math calculations. These calculations will typically run for a long time. I would like to take advantage of as much optimization as possible, preferably also of (not always supported) instruction sets. On the other hand I would like my application to be portable and easy to use (so I would not like to compile 5 different versions and let the user choose). Is there a possibility to compile 5 different versions of my code and run dynamically the most optimized version that's possible at execution time? With 5 different versions I mean with different instruction sets and different optimizations for processors. I don't care about the size of the application. At this moment I'm using gcc on Linux (my code is in C++), but I'm also interested in this for the Intel compiler and for the MinGW compiler for compilation to Windows. The executable doesn't have to be able to run on different OS'es, but ideally there would be something possible with automatically selecting 32 bit and 64 bit as well. Edit: Please give clear pointers how to do it, preferably with small code examples or links to explanations. From my point of view I need a super generic solution, which is applicable on any random C++ project I have later. Edit I assigned the bounty to ShuggyCoUk, he had a great number of pointers to look out for. I would have liked to split it between multiple answers but that is not possible. I'm not having this implemented yet, so the question is still 'open'! Please, still add and/or improve answers, even though there is no bounty to be given anymore. Thanks everybody!

Read the article
Algorithm for finding the best routes for food distribution in game

- by Tautrimas

Hello, I'm designing a city building game and got into a problem. Imagine Sierra's Caesar III game mechanics: you have many city districts with one market each. There are several granaries over the distance connected with a directed weighted graph. The difference: people (here cars) are units that form traffic jams (here goes the graph weights). Note: in Ceasar game series, people harvested food and stockpiled it in several big granaries, whereas many markets (small shops) took food from the granaries and delivered it to the citizens. The task: tell each district where they should be getting their food from while taking least time and minimizing congestions on the city's roads. Map example Sample diagram Suppose that yellow districts need 7, 7 and 4 apples accordingly. Bluish granaries have 7 and 11 apples accordingly. Suppose edges weights to be proportional to their length. Then, the solution should be something like the gray numbers indicated on the edges. Eg, first district gets 4 apples from the 1st and 3 apples from the 2nd granary, while the last district gets 4 apples from only the 2nd granary. Here, vertical roads are first occupied to the max, and then the remaining workers are sent to the diagonal paths. Question What practical and very fast algorithm should I use? I was looking at some papers (Congestion Games: Optimization in Competition etc.) describing congestion games, but could not get the big picture. Any help is very appreciated! P. S. I can post very little links and no images because of new user restriction.

Read the article
How can I improve the performance of this double-for print?

- by Florenc

I have the following static method that prints the data imported from a 40.000 lines .xls spreadsheet. Now, it takes about 27 seconds to print the data in the console and the memory consumption is huge. import org.apache.poi.hssf.usermodel.*; import org.apache.poi.ss.usermodel.*; public static void printSheetData(List<List<HSSFCell>> sheetData) { for (int i = 0; i < sheetData.size(); i++) { List<HSSFCell> list = (List<HSSFCell>) sheetData.get(i); for (int j = 0; j < list.size(); j++) { HSSFCell cell = (HSSFCell) list.get(j); System.out.print(cell.toString()); if (j < list.size() - 1) { System.out.print(", "); } } System.out.println(""); } } Disclaimer: I know, I know large data belong to a database, don't print output in the console, premature optimization is the root of all evils...

Read the article
Coding Practices which enable the compiler/optimizer to make a faster program.

- by EvilTeach

Many years ago, C compilers were not particularly smart. As a workaround K&R invented the register keyword, to hint to the compiler, that maybe it would be a good idea to keep this variable in an internal register. They also made the tertiary operator to help generate better code. As time passed, the compilers matured. They became very smart in that their flow analysis allowing them to make better decisions about what values to hold in registers than you could possibly do. The register keyword became unimportant. FORTRAN can be faster than C for some sorts of operations, due to alias issues. In theory with careful coding, one can get around this restriction to enable the optimizer to generate faster code. What coding practices are available that may enable the compiler/optimizer to generate faster code? Identifying the platform and compiler you use, would be appreciated. Why does the technique seem to work? Sample code is encouraged. Here is a related question [Edit] This question is not about the overall process to profile, and optimize. Assume that the program has been written correctly, compiled with full optimization, tested and put into production. There may be constructs in your code that prohibit the optimizer from doing the best job that it can. What can you do to refactor that will remove these prohibitions, and allow the optimizer to generate even faster code? [Edit] Offset related link

Read the article
Efficient way to maintain a sorted list of access counts in Python

- by David

Let's say I have a list of objects. (All together now: "I have a list of objects.") In the web application I'm writing, each time a request comes in, I pick out up to one of these objects according to unspecified criteria and use it to handle the request. Basically like this: def handle_request(req): for h in handlers: if h.handles(req): return h return None Assuming the order of the objects in the list is unimportant, I can cut down on unnecessary iterations by keeping the list sorted such that the most frequently used (or perhaps most recently used) objects are at the front. I know this isn't something to be concerned about - it'll make only a miniscule, undetectable difference in the app's execution time - but debugging the rest of the code is driving me crazy and I need a distraction :) so I'm asking out of curiosity: what is the most efficient way to maintain the list in sorted order, descending, by the number of times each handler is chosen? The obvious solution is to make handlers a list of (count, handler) pairs, and each time a handler is chosen, increment the count and resort the list. def handle_request(req): for h in handlers[:]: if h[1].handles(req): h[0] += 1 handlers.sort(reverse=True) return h[1] return None But since there's only ever going to be at most one element out of order, and I know which one it is, it seems like some sort of optimization should be possible. Is there something in the standard library, perhaps, that is especially well-suited to this task? Or some other data structure? (Even if it's not implemented in Python) Or should/could I be doing something completely different?

Read the article
Improving File Read Performance (single file, C++, Windows)

- by david

I have large (hundreds of MB or more) files that I need to read blocks from using C++ on Windows. Currently the relevant functions are: errorType LargeFile::read( void* data_out, __int64 start_position, __int64 size_bytes ) const { if( !m_open ) { // return error } else { seekPosition( start_position ); DWORD bytes_read; BOOL result = ReadFile( m_file, data_out, DWORD( size_bytes ), &bytes_read, NULL ); if( size_bytes != bytes_read || result != TRUE ) { // return error } } // return no error } void LargeFile::seekPosition( __int64 position ) const { LARGE_INTEGER target; target.QuadPart = LONGLONG( position ); SetFilePointerEx( m_file, target, NULL, FILE_BEGIN ); } The performance of the above does not seem to be very good. Reads are on 4K blocks of the file. Some reads are coherent, most are not. A couple questions: Is there a good way to profile the reads? What things might improve the performance? For example, would sector-aligning the data be useful? I'm relatively new to file i/o optimization, so suggestions or pointers to articles/tutorials would be helpful.

Read the article
optimize output value using a class and public member

- by wiso

Suppose you have a function, and you call it a lot of times, every time the function return a big object. I've optimized the problem using a functor that return void, and store the returning value in a public member: #include <vector> const int N = 100; std::vector<double> fun(const std::vector<double> & v, const int n) { std::vector<double> output = v; output[n] *= output[n]; return output; } class F { public: F() : output(N) {}; std::vector<double> output; void operator()(const std::vector<double> & v, const int n) { output = v; output[n] *= n; } }; int main() { std::vector<double> start(N,10.); std::vector<double> end(N); double a; // first solution for (unsigned long int i = 0; i != 10000000; ++i) a = fun(start, 2)[3]; // second solution F f; for (unsigned long int i = 0; i != 10000000; ++i) { f(start, 2); a = f.output[3]; } } Yes, I can use inline or optimize in an other way this problem, but here I want to stress on this problem: with the functor I declare and construct the output variable output only one time, using the function I do that every time it is called. The second solution is two time faster than the first with g++ -O1 or g++ -O2. What do you think about it, is it an ugly optimization?

Read the article
Speed comparison - Template specialization vs. Virtual Function vs. If-Statement

- by Person

Just to get it out of the way... Premature optimization is the root of all evil Make use of OOP etc. I understand. Just looking for some advice regarding the speed of certain operations that I can store in my grey matter for future reference. Say you have an Animation class. An animation can be looped (plays over and over) or not looped (plays once), it may have unique frame times or not, etc. Let's say there are 3 of these "either or" attributes. Note that any method of the Animation class will at most check for one of these (i.e. this isn't a case of a giant branch of if-elseif). Here are some options. 1) Give it boolean members for the attributes given above, and use an if statement to check against them when playing the animation to perform the appropriate action. Problem: Conditional checked every single time the animation is played. 2) Make a base animation class, and derive other animations classes such as LoopedAnimation and AnimationUniqueFrames, etc. Problem: Vtable check upon every call to play the animation given that you have something like a vector<Animation>. Also, making a separate class for all of the possible combinations seems code bloaty. 3) Use template specialization, and specialize those functions that depend on those attributes. Like template<bool looped, bool uniqueFrameTimes> class Animation. Problem: The problem with this is that you couldn't just have a vector<Animation> for something's animations. Could also be bloaty. I'm wondering what kind of speed each of these options offer? I'm particularly interested in the 1st and 2nd option because the 3rd doesn't allow one to iterate through a general container of Animations. In short, what is faster - a vtable fetch or a conditional?

Read the article
Why doesn't g++ pay attention to __attribute__((pure)) for virtual functions?

- by jchl

According to the GCC documentation, __attribute__((pure)) tells the compiler that a function has no side-effects, and so it can be subject to common subexpression elimination. This attribute appears to work for non-virtual functions, but not for virtual functions. For example, consider the following code: extern void f( int ); class C { public: int a1(); int a2() __attribute__((pure)); virtual int b1(); virtual int b2() __attribute__((pure)); }; void test_a1( C *c ) { if( c->a1() ) { f( c->a1() ); } } void test_a2( C *c ) { if( c->a2() ) { f( c->a2() ); } } void test_b1( C *c ) { if( c->b1() ) { f( c->b1() ); } } void test_b2( C *c ) { if( c->b2() ) { f( c->b2() ); } } When compiled with optimization enabled (either -O2 or -Os), test_a2() only calls C::a2() once, but test_b2() calls b2() twice. Is there a reason for this? Is it because, even though the implementation in class C is pure, g++ can't assume that the implementation in every subclass will also be pure? If so, is there a way to tell g++ that this virtual function and every subclass's implementation will be pure?

Read the article
Optimize Duplicate Detection

- by Dave Jarvis

Background This is an optimization problem. Oracle Forms XML files have elements such as: <Trigger TriggerName="name" TriggerText="SELECT * FROM DUAL" ... /> Where the TriggerText is arbitrary SQL code. Each SQL statement has been extracted into uniquely named files such as: sql/module=DIAL_ACCESS+trigger=KEY-LISTVAL+filename=d_access.fmb.sql sql/module=REP_PAT_SEEN+trigger=KEY-LISTVAL+filename=rep_pat_seen.fmb.sql I wrote a script to generate a list of exact duplicates using a brute force approach. Problem There are 37,497 files to compare against each other; it takes 8 minutes to compare one file against all the others. Logically, if A = B and A = C, then there is no need to check if B = C. So the problem is: how do you eliminate the redundant comparisons? The script will complete in approximately 208 days. Script Source Code The comparison script is as follows: #!/bin/bash echo Loading directory ... for i in $(find sql/ -type f -name \*.sql); do echo Comparing $i ... for j in $(find sql/ -type f -name \*.sql); do if [ "$i" = "$j" ]; then continue; fi # Case insensitive compare, ignore spaces diff -IEbwBaq $i $j > /dev/null # 0 = no difference (i.e., duplicate code) if [ $? = 0 ]; then echo $i :: $j >> clones.txt fi done done Question How would you optimize the script so that checking for cloned code is a few orders of magnitude faster? System Constraints Using a quad-core CPU with an SSD; trying to avoid using cloud services if possible. The system is a Windows-based machine with Cygwin installed -- algorithms or solutions in other languages are welcome. Thank you!

Read the article
memcached: which is faster, doing an add (and checking result), or doing a get (and set when returni

- by Mike Sherov

The title of this question isn't so clear, but the code and question is straightforward. Let's say I want to show my users an ad once per day. To accomplish this, every time they visit a page on my site, I check to see if a certain memcache key has any data stored on it. If so, don't show an ad. If not, store the value '1' in that key with an expiration of 86400. I can do this 2 ways: //version a $key='OPD_'.date('Ymd').'_'.$type.'_'.$user; if($memcache->get($key)===false){ $memcache->set($key,'1',false,$expire); //show ad } //version b $key='OPD_'.date('Ymd').'_'.$type.'_'.$user; if($memcache->add($key,'1',false,$expire)){ //show ad } Now, it might seem obvious that b is better, it always makes 1 memcache call. However, what is the overhead of "add" vs. "get"? These aren't the real comparisons... and I just made up these numbers, but let's say 1 add ~= 1 set ~= 5 get in terms of effort, and the average user views 5 pages a day: a: (5 get * 1 effort) + (1 set * 5 effort) = 10 units of effort b: (5 add * 5 effort) = 25 units of effort Would it make sense to always do the add call? Is this an unnecessary micro-optimization?

Read the article
PHP, ImageMagick, Google's Page Speed, & JPG File Size Optimization

- by Sonny

I have photo gallery code that does image re-sizing and thumbnail creation. I use ImageMagick to do this. I ran a gallery page through Google's Page Speed tool and it revealed that the re-sized images and thumbnails both have about an extra 10KB of data (JPEG files specifically). What can I add to my scripts to optimize the file size? ADDITIONAL INFORMATION I am using the imagick::FILTER_LANCZOS filter with a blur setting of 0.9 when calling the resizeImage() function. JPEGs have a quality setting of 80.

Read the article
JPG File Size Optimization - PHP, ImageMagick, & Google's Page Speed

- by Sonny

I have photo gallery code that does image re-sizing and thumbnail creation. I use ImageMagick to do this. I ran a gallery page through Google's Page Speed tool and it revealed that the re-sized images and thumbnails both have about an extra 10KB of data (JPEG files specifically). What can I add to my scripts to optimize the file size? ADDITIONAL INFORMATION I am using the imagick::FILTER_LANCZOS filter with a blur setting of 0.9 when calling the resizeImage() function. JPEGs have a quality setting of 80.

Read the article
PHP, ImageMagick, Google's Page Speed, & Image File Size Optimization

- by Sonny

I have photo gallery code that does image re-sizing and thumbnail creation. I use ImageMagick to do this. I ran a gallery page through Google's Page Speed tool and it revealed that the re-sized images and thumbnails both have about an extra 10KB of data (JPEG files specifically). What can I add to my scripts to optimize the file size?

Read the article

< Previous Page | 41 42 43 44 45 46 47 48 49 50 51 52 | Next Page >