Search Results

Search found 7551 results on 303 pages for 'pre optimization'.

Page 55/303 | < Previous Page | 51 52 53 54 55 56 57 58 59 60 61 62  | Next Page >

  • Languages and VMs: Features that are hard to optimize and why

    - by mrjoltcola
    I'm doing a survey of features in preparation for a research project. Name a mainstream language or language feature that is hard to optimize, and why the feature is or isn't worth the price paid, or instead, just debunk my theories below with anecdotal evidence. Before anyone flags this as subjective, I am asking for specific examples of languages or features, and ideas for optimization of these features, or important features that I haven't considered. Also, any references to implementations that prove my theories right or wrong. Top on my list of hard to optimize features and my theories (some of my theories are untested and are based on thought experiments): 1) Runtime method overloading (aka multi-method dispatch or signature based dispatch). Is it hard to optimize when combined with features that allow runtime recompilation or method addition. Or is it just hard, anyway? Call site caching is a common optimization for many runtime systems, but multi-methods add additional complexity as well as making it less practical to inline methods. 2) Type morphing / variants (aka value based typing as opposed to variable based) Traditional optimizations simply cannot be applied when you don't know if the type of someting can change in a basic block. Combined with multi-methods, inlining must be done carefully if at all, and probably only for a given threshold of size of the callee. ie. it is easy to consider inlining simple property fetches (getters / setters) but inlining complex methods may result in code bloat. The other issue is I cannot just assign a variant to a register and JIT it to the native instructions because I have to carry around the type info, or every variable needs 2 registers instead of 1. On IA-32 this is inconvenient, even if improved with x64's extra registers. This is probably my favorite feature of dynamic languages, as it simplifies so many things from the programmer's perspective. 3) First class continuations - There are multiple ways to implement them, and I have done so in both of the most common approaches, one being stack copying and the other as implementing the runtime to use continuation passing style, cactus stacks, copy-on-write stack frames, and garbage collection. First class continuations have resource management issues, ie. we must save everything, in case the continuation is resumed, and I'm not aware if any languages support leaving a continuation with "intent" (ie. "I am not coming back here, so you may discard this copy of the world"). Having programmed in the threading model and the contination model, I know both can accomplish the same thing, but continuations' elegance imposes considerable complexity on the runtime and also may affect cache efficienty (locality of stack changes more with use of continuations and co-routines). The other issue is they just don't map to hardware. Optimizing continuations is optimizing for the less-common case, and as we know, the common case should be fast, and the less-common cases should be correct. 4) Pointer arithmetic and ability to mask pointers (storing in integers, etc.) Had to throw this in, but I could actually live without this quite easily. My feelings are that many of the high-level features, particularly in dynamic languages just don't map to hardware. Microprocessor implementations have billions of dollars of research behind the optimizations on the chip, yet the choice of language feature(s) may marginalize many of these features (features like caching, aliasing top of stack to register, instruction parallelism, return address buffers, loop buffers and branch prediction). Macro-applications of micro-features don't necessarily pan out like some developers like to think, and implementing many languages in a VM ends up mapping native ops into function calls (ie. the more dynamic a language is the more we must lookup/cache at runtime, nothing can be assumed, so our instruction mix is made up of a higher percentage of non-local branching than traditional, statically compiled code) and the only thing we can really JIT well is expression evaluation of non-dynamic types and operations on constant or immediate types. It is my gut feeling that bytecode virtual machines and JIT cores are perhaps not always justified for certain languages because of this. I welcome your answers.

    Read the article

  • Help with optimizing C# function via C and/or Assembly

    - by MusiGenesis
    I have this C# method which I'm trying to optimize: // assume arrays are same dimensions private void DoSomething(int[] bigArray1, int[] bigArray2) { int data1; byte A1; byte B1; byte C1; byte D1; int data2; byte A2; byte B2; byte C2; byte D2; for (int i = 0; i < bigArray1.Length; i++) { data1 = bigArray1[i]; data2 = bigArray2[i]; A1 = (byte)(data1 >> 0); B1 = (byte)(data1 >> 8); C1 = (byte)(data1 >> 16); D1 = (byte)(data1 >> 24); A2 = (byte)(data2 >> 0); B2 = (byte)(data2 >> 8); C2 = (byte)(data2 >> 16); D2 = (byte)(data2 >> 24); A1 = A1 > A2 ? A1 : A2; B1 = B1 > B2 ? B1 : B2; C1 = C1 > C2 ? C1 : C2; D1 = D1 > D2 ? D1 : D2; bigArray1[i] = (A1 << 0) | (B1 << 8) | (C1 << 16) | (D1 << 24); } } The function basically compares two int arrays. For each pair of matching elements, the method compares each individual byte value and takes the larger of the two. The element in the first array is then assigned a new int value constructed from the 4 largest byte values (irrespective of source). I think I have optimized this method as much as possible in C# (probably I haven't, of course - suggestions on that score are welcome as well). My question is, is it worth it for me to move this method to an unmanaged C DLL? Would the resulting method execute faster (and how much faster), taking into account the overhead of marshalling my managed int arrays so they can be passed to the method? If doing this would get me, say, a 10% speed improvement, then it would not be worth my time for sure. If it was 2 or 3 times faster, then I would probably have to do it. Note: please, no "premature optimization" comments, thanks in advance. This is simply "optimization".

    Read the article

  • Al Zimmermann's Son of Darts

    - by polygenelubricants
    There's about 2 months left in Al Zimmermann's Son of Darts programming contest, and I'd like to improve my standing (currently in the 60s) to something more respectable. I'd like to get some ideas from the great community of stackoverflow on how best to approach this problem. The contest problem is known as the Global Postage Stamp Problem in literatures. I don't have much experience with optimization algorithms (I know of hillclimbing and simulated annealing in concept only from college), and in fact the program that I have right now is basically sheer brute force, which of course isn't feasible for the larger search spaces. Here are some papers on the subject: A Postage Stamp Problem (Alter & Barnett, 1980) Algorithms for Computing the h-Range of the Postage Stamp Problem (Mossige, 1981) A Postage Stamp Problem (Lunnon, 1986) Two New Techniques for Computing Extremal h-bases Ak (Challis, 1992) Any hints and suggestions are welcome. Also, feel free to direct me to the proper site if stackoverflow isn't it.

    Read the article

  • Error: Can't find common super class of ...

    - by PatlaDJ
    I am trying to process with Proguard a MS Windows desktop application (Java 6 SE using the SWT lib provided by Eclipse). And I get the following critical error: Unexpected error while performing partial evaluation: Class = [org/eclipse/swt/widgets/DateTime] Method = [<init>(Lorg/eclipse/swt/widgets/Composite;I)V] Exception = [java.lang.IllegalArgumentException] (Can't find common super class of [java/lang/StringBuffer] and [org/eclipse/swt/internal/win32/TCHAR]) Error: Can't find common super class of [java/lang/StringBuffer] and [org/eclipse/swt/internal/win32/TCHAR] ---------------------------- When I tried to Google the error, it came out only on two spots on the entire web, that astonished me greatly. I am newbie using Proguard and Java code optimization tools at all. Any thoughts and suggestions how to fix this, will be appreciated. Thanks in advance.

    Read the article

  • what is the best way to generate fake data for classification problem ?

    - by Berkay
    i'm working on a project and i have a subset of user's key-stroke time data.This means that the user makes n attempts and i will use these recorded attempt time data in various kinds of classification algorithms for future user attempts to verify that the login process is done by the user or some another person. (Simply i can say that this is biometrics) I have 3 different times of the user login attempt process, ofcourse this is subset of the infinite data. until now it is an easy classification problem, i decided to use WEKA but as far as i understand i have to create some fake data to feed the classification algorithm. can i use some optimization algorithms ? or is there any way to create this fake data to get min false positives ? Thanks

    Read the article

  • Is there any way to get MSVC to pass structs arguments in registers on x64?

    - by Luke
    For a function with signature: struct Pair { void *v1, *v2 }; void f(Pair p); compiled on x64, I would like Pair's fields to be passed via register, as if the function was: void f(void *v1, void *v2); Compiling a test with gcc 4.2.1 for x86_64 on OSX 10.6, I can see this is exactly what happens by examining the disassembly. However, compiling with MSVC 2008 for x64 on Windows, the disassembly shows that Pair is passed on the stack. I understand that platform ABIs can prevent this optimization; does anyone know any MSVC-specific annotations, calling conventions, flags, or other hacks that can get this to work? Thank you!

    Read the article

  • How to optimize MATLAB loops?

    - by striglia
    I have been working lately on a number of iterative algorithms in MATLAB, and been getting hit hard by MATLAB's performance (or lack thereof) when it comes to loops. I'm aware of the benefit of vectorizing code when possible, but are there any tools for optimization when you need the loop for your algorithm? I am aware of the MEX-file option to write small subroutines in C/C++, although given my algorithms, this can be a very painful option given the data structures required. I mainly use MATLAB for the simplicity and speed of prototyping, so a syntactically complex, statically typed language is not ideal for my situation. Are there any other suggestions? Even other languages (python?) which have relatively painless matrix tools are an option.

    Read the article

  • Tips for optimizing C#/.NET programs

    - by Bob
    It seems like optimization is a lost art these days. Wasn't there a time when all programmers squeezed every ounce of efficiency from their code? Often doing so while walking 5 miles in the snow? In the spirit of bringing back a lost art, what are some tips that you know of for simple (or perhaps complex) changes to optimize C#/.NET code? Since it's such a broad thing that depends on what one is trying to accomplish it'd help to provide context with your tip. For instance: When concatenating many strings together use StringBuilder instead. If you're only concatenating a handful of strings it's ok to use the + operator. Use string.Compare to compare 2 strings instead of doing something like string1.ToLower() == string2.ToLower()

    Read the article

  • Lucene.Net memory consumption and slow search when too many clauses used

    - by Umer
    I have a DB having text file attributes and text file primary key IDs and indexed around 1 million text files along with their IDs (primary keys in DB). Now, I am searching at two levels. First is straight forward DB search, where i get primary keys as result (roughly 2 or 3 million IDs) Then i make a Boolean query for instance as following +Text:"test*" +(pkID:1 pkID:4 pkID:100 pkID:115 pkID:1041 .... ) and search it in my Index file. The problem is that such query (having 2 million clauses) takes toooooo much time to give result and consumes reallly too much memory.... Is there any optimization solution for this problem ?

    Read the article

  • C++ defines for a 'better' Release mode build in VS

    - by darid
    I currently use the following preprocessor defines, and various optimization settings: WIN32_LEAN_AND_MEAN VC_EXTRALEAN NOMINMAX _CRT_SECURE_NO_WARNINGS _SCL_SECURE_NO_WARNINGS _SECURE_SCL=0 _HAS_ITERATOR_DEBUGGING=0 My question is what other things do fellow SOers use, add, define, in order to get a Release Mode build from VS C++ (2008,2010) to be as performant as possible? btw, I've tried PGO etc, it does help a bit but nothing that comes to parity, also I'm not using streams, the C++ i'm talking about its more like C but making use of templates and STL algorithms. As it stands now very simple code segments flop when compared to what GCC produces on say an equivalent x86 machine running linux (2.6+ kernel) using 02. Side-Note: I believe a lot of the issues relate directly to the STL version (Dinkum) provided by MS. Could people please elaborate on experiences using STLPort etc with VS C++.

    Read the article

  • jQuery Optimizations

    - by aepheus
    I've just come to the end of a large development project. We were on a tight timeline, so a lot of optimization was "deferred". Now that we met our deadline, we're going back and trying to optimize things. My questions is this: What are some of the most important things you look for when optimizing jQuery web sites. Alternately I'd love to hear of sites/lists that have particularly good advise for optimizing jQuery. I've already read a few articles, http://www.tvidesign.co.uk/blog/improve-your-jquery-25-excellent-tips.aspx was an especially good read.

    Read the article

  • Optimize C# Code Fragment

    - by Eric J.
    I'm profiling some C# code. The method below is one of the most expensive ones. For the purpose of this question, assume that micro-optimization is the right thing to do. Is there an approach to improve performance of this method? Changing the input parameter to p to ulong[] would create a macro inefficiency. static ulong Fetch64(byte[] p, int ofs = 0) { unchecked { ulong result = p[0 + ofs] + ((ulong)p[1 + ofs] << 8) + ((ulong)p[2 + ofs] << 16) + ((ulong)p[3 + ofs] << 24) + ((ulong)p[4 + ofs] << 32) + ((ulong)p[5 + ofs] << 40) + ((ulong)p[6 + ofs] << 48) + ((ulong)p[7 + ofs] << 56); return result; } }

    Read the article

  • Is it possible to shrink rt.jar with ProGuard?

    - by PatlaDJ
    Is there a procedure by which you can optimize/shrink/select/obfuscate only 'used by your app' classes/methods/fields from rt.jar provided by Sun by using some optimization software like ProGuard (or maybe other?). Then you would actually be able to minimize the download size of your application considerably and make it much more secure ? Right? Related questions: Do you know if Sun's "jigsaw project" which is waited to come out, is intended to automatically handle this particular issue? Did somebody manage yet to form an opinion about Avian java alternative? Please share it here. Thank you.

    Read the article

  • Compile-time trigonometry in C

    - by lhahne
    I currently have code that looks like while (very_long_loop) { ... y1 = getSomeValue(); ... x1 = y1*cos(PI/2); x2 = y2*cos(SOME_CONSTANT); ... outputValues(x1, x2, ...); } the obvious optimization would be to compute the cosines ahead-of-time. I could do this by filling an array with the values but I was wondering would it be possible to make the compiler compute these at compile-time?

    Read the article

  • In C, would !~b ever be faster than b == 0xff ?

    - by James Morris
    From a long time ago I have a memory which has stuck with me that says comparisons against zero are faster than any other value (ahem Z80). In some C code I'm writing I want to skip values which have all their bits set. Currently the type of these values is char but may change. I have two different alternatives to perform the test: if (!~b) /* skip */ and if (b == 0xff) /* skip */ Apart from the latter making the assumption that b is an 8bit char whereas the former does not, would the former ever be faster due to the old compare to zero optimization trick, or are the CPUs of today way beyond this kind of thing?

    Read the article

  • How do you design a database to allow fast multicolumn searching?

    - by Fletcher Moore
    I am creating a real estate search from RETS data, but this is a general question. When you have a variety of columns that you would like the user to be able to filter their search result by, how do you optimize this? For example, http://www.charlestonrealestateguide.com/listings.php has 16 or so optional filters. Granted, he only has up to 11,000 entries (I have the same data), but I don't imagine the search is performed with just a giant WHERE AND AND AND ... clause. Or is this typically accomplished with one giant multicolumn index? Newegg, Amazon, and countless others also have cool & fast filtering systems for large amounts of data. How do they do it? And is there a database optimization reason for the tendency to provide ranges instead of empty inputs, or is that merely for user convenience?

    Read the article

  • Optimizing MySQL statement with lot of count(row) an sum(row+row2)...

    - by Zombies
    I need to use InnoDB storage engine on a table with about 1mil or so records in it at any given time. It has records being inserted to it at a very fast rate, which are then dropped within a few days, maybe a week. The ping table has about a million rows, whereas the website table only about 10,000. My statement is this: select url from website ws, ping pi where ws.idproxy = pi.idproxy and pi.entrytime > curdate() - 3 and contentping+tcpping is not null group by url having sum(contentping+tcpping)/(count(*)-count(errortype)) < 500 and count(*) > 3 and count(errortype)/count(*) < .15 order by sum(contentping+tcpping)/(count(*)-count(errortype)) asc; I added an index on entrytime, yet no dice. Can anyone throw me a bone as to what I should consider to look into for basic optimization of this query. The result set is only like 200 rows, so I'm not getting killed there.

    Read the article

  • strange results with /fp:fast

    - by martinus
    We have some code that looks like this: inline int calc_something(double x) { if (x > 0.0) { // do something return 1; } else { // do something else return 0; } } Unfortunately, when using the flag /fp:fast, we get calc_something(0)==1 so we are clearly taking the wrong code path. This only happens when we use the method at multiple points in our code with different parameters, so I think there is some fishy optimization going on here from the compiler (Microsoft Visual Studio 2008, SP1). Also, the above problem goes away when we change the interface to inline int calc_something(const double& x) { But I have no idea why this fixes the strange behaviour. Can anyone explane this behaviour? If I cannot understand what's going on we will have to remove the /fp:fastswitch, but this would make our application quite a bit slower.

    Read the article

  • Django: Set foreign key using integer?

    - by User
    Is there a way to set foreign key relationship using the integer id of a model? This would be for optimization purposes. For example, suppose I have an Employee model: class Employee(models.Model): first_name = models.CharField(max_length=100) last_name = models.CharField(max_length=100) type = models.ForeignKey('EmployeeType') and EmployeeType(models.Model): type = models.CharField(max_length=100) I want the flexibility of having unlimited employee types, but in the deployed application there will likely be only a single type so I'm wondering if there is a way to hardcode the id and set the relationship this way. This way I can avoid a db call to get the EmployeeType object first.

    Read the article

  • Should I make my MutexLock volatile?

    - by sje397
    I have some code in a function that goes something like this: void foo() { { // scope the locker MutexLocker locker(&mutex); // do some stuff.. } bar(); } The function call bar() also locks the mutex. I am having an issue whereby the program crashes (for someone else, who has not as yet provided a stack trace or more details) unless the mutex lock inside bar is disabled. Is it possible that some optimization is messing around with the way I have scoped the locker instance, and if so, would making it volatile fix it? Is that a bad idea? Thanks.

    Read the article

  • PHP speed optimisation.

    - by Petah
    Hi, Im wondering about speed optimization in PHP. I have a series of files that are requested every page load. On average there are 20 files. Each file must be read an parsed if they have changed. And this is excluding that standard files required for a web page (HTML, CSS, images, etc). EG - client requests page - server outputs html, css, images - server outputs dynamic content (20+/- files combined and minified). What would be the best way to serve these files as fast as possible?

    Read the article

  • How to check for C++ copy ellision

    - by Steve
    I ran across this article on copy ellision in C++ and I've seen comments about it in the boost library. This is appealing, as I prefer my functions to look like verylargereturntype DoSomething(...) rather than void DoSomething(..., verylargereturntype& retval) So, I have two questions about this Google has virtually no documentation on this at all, how real is this? How can I check that this optimization is actually occuring? I assume it involves looking at the assembly, but lets just say that isn't my strong suit. If anyone can give a very basic example as to what successful ellision looks like, that would be very useful I won't be using copy ellision just to prettify things, but if I can be guaranteed that it works, it sounds pretty useful.

    Read the article

  • In Python, is there a way to call a method on every item of an iterable? [closed]

    - by Thane Brimhall
    Possible Duplicate: Is there a map without result in python? I often come to a situation in my programs when I want to quickly/efficiently call an in-place method on each of the items contained by an iterable. (Quickly meaning the overhead of a for loop is unacceptable). A good example would be a list of sprites when I want to call draw() on each of the Sprite objects. I know I can do something like this: [sprite.draw() for sprite in sprite_list] But I feel like the list comprehension is misused since I'm not using the returned list. The same goes for the map function. Stone me for premature optimization, but I also don't want the overhead of the return value. What I want to know is if there's a method in Python that lets me do what I just explained, perhaps like the hypothetical function I suggest below: do_all(sprite_list, draw)

    Read the article

  • SQL: Optimize insensive SELECTs on DateTime fields

    - by Fedyashev Nikita
    I have an application for scheduling certain events. And all these events must be reviewed after each scheduled time. So basically we have 3 tables: items(id, name) scheduled_items(id, item_id, execute_at - datetime) - item_id column has an index option. reviewed_items(id, item_id, created_at - datetime) - item_id column has an index option. So core function of the application is "give me any items(which are not yet reviewed) for the actual moment". How can I optimize this solution for speed(because it is very core business feature and not micro optimization)? I suppose that adding index to the datetime fields doesn't make any sense because the cardinality or uniqueness on that fields are very high and index won't give any(?) speed-up. Is it correct? What would you recommend? Should I try no-SQL? -- mysql -V 5.075 I use caching where it makes sence.

    Read the article

  • Will these optimizations to my Ruby implementation of diff improve performance in a Rails app?

    - by grg-n-sox
    <tl;dr> In source version control diff patch generation, would it be worth it to use the optimizations listed at the very bottom of this writing (see <optimizations>) in my Ruby implementation of diff for making diff patches? </tl;dr> <introduction> I am programming something I have never done before and there might already be tools out there to do the exact thing I am programming but at this point I am having too much fun to care so I am still going to do it from scratch, even if there is a tool for this. So anyways, I am working on a Ruby on Rails app and need a certain feature. Basically I want each entry in a table of mine, let's say for example a table of video games, to have a stored chunk of text that represents a review or something of the sort for that table entry. However, I want this text to be both editable by any registered user and also keep track of different submissions in a version control system. The simplest solution I could think of is just implement a solution that keeps track of the text body and the diff patch history of different versions of the text body as objects in Ruby and then serialize it, preferably in human readable form (so I'll most likely use YAML for this) for editing if needed due to corruption by a software bug or a mistake is made by an admin doing some version editing. So at first I just tried to dive in head first into this feature to find that the problem of generating a diff patch is more difficult that I thought to do efficiently. So I did some research and came across some ideas. Some I have implemented already and some I have not. However, it all pretty much revolves around the longest common subsequence problem, as you would already know if you have already done anything with diff or diff-like features, and optimization the function that solves it. Currently I have it so it truncates the compared versions of the text body from the beginning and end until non-matching lines are found. Then it solves the problem using a comparison matrix, but instead of incrementing the value stored in a cell when it finds a matching line like in most longest common subsequence algorithms I have seen examples of, I increment when I have a non-matching line so as to calculate edit distance instead of longest common subsequence. Although as far as I can tell between the two approaches, they are essentially two sides of the same coin so either could be used to derive an answer. It then back-traces through the comparison matrix and notes when there was an incrementation and in which adjacent cell (West, Northwest, or North) to determine that line's diff entry and assumes all other lines to be unchanged. Normally I would leave it at that, but since this is going into a Rails environment and not just some stand-alone Ruby script, I started getting worried about needing to optimize at least enough so if a spammer that somehow knew how I implemented the version control system and knew my worst case scenario entry still wouldn't be able to hit the server that bad. After some searching and reading of research papers and articles through the internet, I've come across several that seem decent but all seem to have pros and cons and I am having a hard time deciding how well in this situation that the pros and cons balance out. So are the ones listed here worth it? I have listed them with known pros and cons. </introduction> <optimizations> Chop the compared sequences into multiple chucks of subsequences by splitting where lines are unchanged, and then truncating each section of unchanged lines at the beginning and end of each section. Then solve the edit distance of each subsequence. Pro: Changes the time increase as the changed area gets bigger from a quadratic increase to something more similar to a linear increase. Con: Figuring out where to split already seems like you have to solve edit distance except now you don't care how it is changed. Would be fine if this was solvable by a process closer to solving hamming distance but a single insertion would throw this off. Use a cryptographic hash function to both convert all sequence elements into integers and ensure uniqueness. Then solve the edit distance comparing the hash integers instead of the sequence elements themselves. Pro: The operation of comparing two integers is faster than the operation of comparing two strings, so a slight performance gain is received after every comparison, which can be a lot overall. Con: Using a cryptographic hash function takes time to convert all the sequence elements and may end up costing more time to do the conversion that you gain back from the integer comparisons. You could use the built in hash function for a string but that will not guarantee uniqueness. Use lazy evaluation to only calculate the three center-most diagonals of the comparison matrix and then only calculate additional diagonals as needed. And then also use this approach to possibly remove the need on some comparisons to compare all three adjacent cells as desribed here. Pro: Can turn an algorithm that always takes O(n * m) time and make it so only worst case scenario is that time, best case becomes practically linear, and average case is somewhere between the two. Con: It is an algorithm I've only seen implemented in functional programming languages and I am having a difficult time comprehending how to convert this into Ruby based on how it is described at the site linked to above. Make a C module and do the hard work at the native level in C and just make a Ruby wrapper for it so Ruby can make all the calls to it that it needs. Pro: I have to imagine that evaluating something like this in could be a LOT faster. Con: I have no idea how Rails handles apps with ruby code that has C extensions and it hurts the portability of the app. This is an optimization for after the solving of edit distance, but idea is to store additional combined diffs with the ones produced by each version to make a delta-tree data structure with the most recently made diff as the root node of the tree so getting to any version takes worst case time of O(log n) instead of O(n). Pro: Would make going back to an old version a lot faster. Con: It would mean every new commit, the delta-tree would get a new root node that will cost time to reorganize the delta-tree for an operation that will be carried out a lot more often than going back a version, not to mention the unlikelihood it will be an old version. </optimizations> So are these things worth the effort?

    Read the article

< Previous Page | 51 52 53 54 55 56 57 58 59 60 61 62  | Next Page >