Search Results

Search found 3438 results on 138 pages for 'nonlinear optimization'.


  • Faster integer division when denominator is known?

    - by aaa
    Hi, I am working on a GPU device that has very high integer division latency, several hundred cycles, and I am looking to optimize divisions. Every denominator is in the set { 1, 3, 6, 10 }, but the numerator is a positive runtime value, roughly 32000 or less. Due to memory constraints, a lookup table is not an option. Can you think of alternatives? I have thought of computing floating-point inverses and multiplying the numerator by those. Thanks
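
One alternative to a floating-point inverse is an integer multiply followed by a shift. The sketch below is in Python purely to check the arithmetic; it is not GPU code. `find_magic`, `fast_div` and `MAX_NUMERATOR` are hypothetical names, and the 32-bit register width is an assumption about the target device. It searches for a multiplier and shift that reproduce `n // d` for every numerator up to 32000.

```python
MAX_NUMERATOR = 32000  # upper bound taken from the question


def find_magic(d, max_n=MAX_NUMERATOR, word_bits=32):
    """Return (magic, shift) such that (n * magic) >> shift == n // d
    for every 0 <= n <= max_n, with n * magic fitting in word_bits bits."""
    for shift in range(word_bits):
        magic = ((1 << shift) + d - 1) // d  # ceil(2**shift / d)
        if magic * max_n >= (1 << word_bits):
            break  # product would overflow the assumed register width
        # Exhaustive check is cheap because the numerator range is small.
        if all((n * magic) >> shift == n // d for n in range(max_n + 1)):
            return magic, shift
    raise ValueError(f"no multiplier found for divisor {d}")


# Precompute one (magic, shift) pair per allowed denominator.
MAGIC = {d: find_magic(d) for d in (1, 3, 6, 10)}


def fast_div(n, d):
    """Divide n by d using only a multiply and a shift."""
    magic, shift = MAGIC[d]
    return (n * magic) >> shift
```

On a device the four constant pairs would be hard-coded, so no table lookup in memory is needed beyond selecting among four immediates.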


  • Why is MySQL with InnoDB doing a table scan when key exists and choosing to examine 70 times more ro

    - by andysk
    Hello, I'm troubleshooting a query performance problem. Here's an expected query plan from explain: mysql> explain select * from table1 where tdcol between '2010-04-13:00:00' and '2010-04-14 03:16'; +----+-------------+--------------------+-------+---------------+--------------+---------+------+---------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------------------+-------+---------------+--------------+---------+------+---------+-------------+ | 1 | SIMPLE | table1 | range | tdcol | tdcol | 8 | NULL | 5437848 | Using where | +----+-------------+--------------------+-------+---------------+--------------+---------+------+---------+-------------+ 1 row in set (0.00 sec) That makes sense, since the index named tdcol (KEY tdcol (tdcol)) is used, and about 5M rows should be selected from this query. However, if I query for just one more minute of data, we get this query plan: mysql> explain select * from table1 where tdcol between '2010-04-13 00:00' and '2010-04-14 03:17'; +----+-------------+--------------------+------+---------------+------+---------+------+-----------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+--------------------+------+---------------+------+---------+------+-----------+-------------+ | 1 | SIMPLE | table1 | ALL | tdcol | NULL | NULL | NULL | 381601300 | Using where | +----+-------------+--------------------+------+---------------+------+---------+------+-----------+-------------+ 1 row in set (0.00 sec) The optimizer believes that the scan will be better, but it's over 70x more rows to examine, so I have a hard time believing that the table scan is better. Also, the 'USE KEY tdcol' syntax does not change the query plan. Thanks in advance for any help, and I'm more than happy to provide more info/answer questions.


  • When optimizing database queries, what exactly is the relationship between number of queries and siz

    - by williamjones
    To optimize application speed, everyone always advises to minimize the number of queries an application makes to the database, consolidating them into fewer queries that retrieve more wherever possible. However, this also always comes with the caution that data transferred is still data transferred, and just because you are making fewer queries doesn't make the data transferred free. I'm in a situation where I can over-include on the query in order to cut down the number of queries, and simply remove the unwanted data in the application code. Is there any type of a rule of thumb on how much of a cost there is to each query, to know when to optimize number of queries versus size of queries? I've tried to Google for objective performance analysis data, but surprisingly haven't been able to find anything like that. Clearly this relationship will change for factors such as when the database grows in size, making this somewhat individualized, but surely this is not so individualized that a broad sense of the landscape can't be drawn out? I'm looking for general answers, but for what it's worth, I'm running an application on Heroku.com, which means Ruby on Rails with a Postgres database.


  • will a mysql query run slower if one of the tables involved has no index defined??

    - by lock
    There's this already populated database which came from another dev. I'm not sure what went on in that dev's mind when he created the tables, but one of our scripts has this query involving 4 tables, and it runs super slow: SELECT a.col_1, a.col_2, a.col_3, a.col_4, a.col_5, a.col_6, a.col_7 FROM a, b, c, d WHERE a.id = b.id AND b.c_id = c.id AND c.id = d.c_id AND a.col_8 = '$col_8' AND d.g_id = '$g_id' AND c.private = '1' NOTE: $col_8 and $g_id are variables from a form. It's only my theory that this is due to tables b and c not having an index, although I'm guessing the dev didn't think one was necessary, since those tables only describe relations between a and d: b says that the data in a belongs to a certain user, and c says that the user belongs to a group in d. As you can see, there's not even an explicit JOIN or other extensive query function used, but this query, which returns only around 100 rows, takes 2 minutes to execute. Anyway, my question is simply this post's title: will a MySQL query run slower if one of the tables involved has no index defined?


  • Enforcing a query in MySql to use a specific index

    - by Hossein
    Hi, I have a large table consisting of only 3 columns (id (INT), bookmarkID (INT), tagID (INT)). I have two BTREE indexes, one each on the bookmarkID and tagID columns. The table has about 21 million records. I am trying to run this query: SELECT bookmarkID,COUNT(bookmarkID) AS count FROM bookmark_tag_map GROUP BY tagID,bookmarkID HAVING tagID IN (-----"tagIDList"-----) AND count >= N which takes ages to return the results. I read somewhere that if I make an index that covers tagID and bookmarkID together, I will get a much faster result. I created the index after some time and tried the query again, but it seems that the query is not using the new index. I ran EXPLAIN and saw that this is indeed the case. My question now is: how can I force a query to use a specific index? Comments on other ways to make the query faster are also welcome. Thanks


  • Movies recommendation engine conceptual database design

    - by Supyxy
    I am working on a movie recommendation engine and I'm facing a DB design issue. My actual database looks like this: MOVIES [ID,TITLE] KEYWORDS_TABLE [ID,KEY_ID] - where ID is a foreign key for MOVIES.id and KEY_ID is a key for a text keywords table. This is not the entire DB, but I have shown what's important for my problem. I have about 50,000 movies and about 1.3 million keyword correlations, and basically my algorithm consists of extracting all the movies that share keywords with a given movie, then ordering them by the number of keyword correlations. For example, I looked for a movie similar to 'Cast away' and it returned 'Six days and six nights' because it had the most keyword correlations (4 keywords): Island, Airplane crash, Stranded, Pilot. The algorithm is based on more factors, but this one is the most important and the most difficult for the approach. Basically, what I do now is get all the movies that have at least one keyword in common with the given movie and then order them by other factors, which are not important for the moment. There wouldn't be any problem if there weren't so many records; a query lasts in many cases up to 10-20 seconds, and some of them return over 5000 movies. Someone already helped me here (thanks Mark Byers) with optimizing the query, but that's not enough because it still takes too long: SELECT DISTINCT M.title FROM keywords_table K1 JOIN keywords_table K2 ON K2.key_id = K1.key_id JOIN movies M ON K2.id = M.id WHERE K1.id = 4 So I thought it would be better to pre-build those lists of movie recommendations for each movie, but I'm not sure how to design the tables. Is this a good idea, and how would you approach it?
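
The pre-computation idea in the question can be sketched outside the database. The Python below is only an illustration under assumed data structures: `movie_keywords` (a dict from title to a set of keywords), `build_keyword_index` and `similar_movies` are hypothetical names, not part of the asker's schema. It builds an inverted index from keyword to movies once, then counts shared keywords per candidate.

```python
from collections import Counter, defaultdict


def build_keyword_index(movie_keywords):
    """Map each keyword to the set of movies tagged with it."""
    index = defaultdict(set)
    for movie, keywords in movie_keywords.items():
        for keyword in keywords:
            index[keyword].add(movie)
    return index


def similar_movies(movie, movie_keywords, index, top_n=10):
    """Return up to top_n (title, shared keyword count) pairs, best first."""
    shared = Counter()
    for keyword in movie_keywords[movie]:
        for other in index[keyword]:
            if other != movie:
                shared[other] += 1
    return shared.most_common(top_n)


# Tiny made-up data set for illustration only.
sample = {
    "Cast away": {"island", "airplane crash", "stranded", "pilot"},
    "Six days and six nights": {"island", "airplane crash", "stranded", "pilot"},
    "Other movie": {"island"},
}
sample_index = build_keyword_index(sample)
recommendations = similar_movies("Cast away", sample, sample_index)
```

The resulting per-movie lists could be stored in a table keyed by movie ID and refreshed periodically, which trades freshness for query time.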


  • Strange: Planner takes decision with lower cost, but (very) long query runtime

    - by S38
    Facts: PGSQL 8.4.2, Linux I make use of table inheritance Each Table contains 3 million rows Indexes on joining columns are set Table statistics (analyze, vacuum analyze) are up-to-date Only used table is "node" with varios partitioned sub-tables Recursive query (pg = 8.4) Now here is the explained query: WITH RECURSIVE rows AS ( SELECT * FROM ( SELECT r.id, r.set, r.parent, r.masterid FROM d_storage.node_dataset r WHERE masterid = 3533933 ) q UNION ALL SELECT * FROM ( SELECT c.id, c.set, c.parent, r.masterid FROM rows r JOIN a_storage.node c ON c.parent = r.id ) q ) SELECT r.masterid, r.id AS nodeid FROM rows r QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=2742105.92..2862119.94 rows=6000701 width=16) (actual time=0.033..172111.204 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..2742105.92 rows=6000701 width=28) (actual time=0.029..172111.183 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.025..0.027 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Hash Join (cost=0.33..262208.33 rows=600070 width=28) (actual time=40628.371..57370.361 rows=1 loops=3) Hash Cond: (c.parent = r.id) -> Append (cost=0.00..211202.04 rows=12001404 width=20) (actual time=0.011..46365.669 rows=12000004 loops=3) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.002..0.002 rows=0 loops=3) -> Seq Scan on node_dataset c (cost=0.00..55001.01 rows=3000001 width=20) (actual time=0.007..3426.593 rows=3000001 loops=3) -> Seq Scan on node_stammdaten c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=0.008..9049.189 rows=3000001 loops=3) -> Seq Scan on node_stammdaten_adresse c (cost=0.00..52059.01 rows=3000001 width=20) (actual time=3.455..8381.725 rows=3000001 loops=3) -> Seq Scan on node_testdaten c (cost=0.00..52059.01 
rows=3000001 width=20) (actual time=1.810..5259.178 rows=3000001 loops=3) -> Hash (cost=0.20..0.20 rows=10 width=16) (actual time=0.010..0.010 rows=1 loops=3) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.002..0.004 rows=1 loops=3) Total runtime: 172111.371 ms (16 rows) (END) So far so bad, the planner decides to choose hash joins (good) but no indexes (bad). Now after doing the following: SET enable_hashjoins TO false; The explained query looks like that: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- CTE Scan on rows r (cost=15198247.00..15318261.02 rows=6000701 width=16) (actual time=0.038..49.221 rows=4 loops=1) CTE rows -> Recursive Union (cost=0.00..15198247.00 rows=6000701 width=28) (actual time=0.032..49.201 rows=4 loops=1) -> Index Scan using node_dataset_masterid on node_dataset r (cost=0.00..8.60 rows=1 width=28) (actual time=0.028..0.031 rows=1 loops=1) Index Cond: (masterid = 3533933) -> Nested Loop (cost=0.00..1507822.44 rows=600070 width=28) (actual time=10.384..16.382 rows=1 loops=3) Join Filter: (r.id = c.parent) -> WorkTable Scan on rows r (cost=0.00..0.20 rows=10 width=16) (actual time=0.001..0.003 rows=1 loops=3) -> Append (cost=0.00..113264.67 rows=3001404 width=20) (actual time=8.546..12.268 rows=1 loops=4) -> Seq Scan on node c (cost=0.00..24.00 rows=1400 width=20) (actual time=0.001..0.001 rows=0 loops=4) -> Bitmap Heap Scan on node_dataset c (cost=58213.87..113214.88 rows=3000001 width=20) (actual time=1.906..1.906 rows=0 loops=4) Recheck Cond: (c.parent = r.id) -> Bitmap Index Scan on node_dataset_parent (cost=0.00..57463.87 rows=3000001 width=0) (actual time=1.903..1.903 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_parent on node_stammdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=3.272..3.273 rows=0 loops=4) Index 
Cond: (c.parent = r.id) -> Index Scan using node_stammdaten_adresse_parent on node_stammdaten_adresse c (cost=0.00..8.60 rows=1 width=20) (actual time=4.333..4.333 rows=0 loops=4) Index Cond: (c.parent = r.id) -> Index Scan using node_testdaten_parent on node_testdaten c (cost=0.00..8.60 rows=1 width=20) (actual time=2.745..2.746 rows=0 loops=4) Index Cond: (c.parent = r.id) Total runtime: 49.349 ms (21 rows) (END) - incredibly faster, because indexes were used. Note: the cost of the second query is somewhat higher than that of the first query. So the main question is: why does the planner make the first decision instead of the second? Also interesting: via SET enable_seqscan TO false; I temporarily disabled seq scans. Then the planner used indexes and hash joins, and the query was still slow. So the problem seems to be the hash join. Maybe someone can help in this confusing situation? Thanks, R.


  • Optimizing WordWrap Algorithm

    - by Milo
    I have a word-wrap algorithm that basically generates lines of text that fit the width of the text. Unfortunately, it gets slow when I add too much text. I was wondering if I oversaw any major optimizations that could be made. Also, if anyone has a design that would still allow strings of lines or string pointers of lines that is better I'd be open to rewriting the algorithm. Thanks void AguiTextBox::makeLinesFromWordWrap() { textRows.clear(); textRows.push_back(""); std::string curStr; std::string curWord; int curWordWidth = 0; int curLetterWidth = 0; int curLineWidth = 0; bool isVscroll = isVScrollNeeded(); int voffset = 0; if(isVscroll) { voffset = pChildVScroll->getWidth(); } int AdjWidthMinusVoffset = getAdjustedWidth() - voffset; int len = getTextLength(); int bytesSkipped = 0; int letterLength = 0; size_t ind = 0; for(int i = 0; i < len; ++i) { //get the unicode character letterLength = _unicodeFunctions.bringToNextUnichar(ind,getText()); curStr = getText().substr(bytesSkipped,letterLength); bytesSkipped += letterLength; curLetterWidth = getFont().getTextWidth(curStr); //push a new line if(curStr[0] == '\n') { textRows.back() += curWord; curWord = ""; curLetterWidth = 0; curWordWidth = 0; curLineWidth = 0; textRows.push_back(""); continue; } //ensure word is not longer than the width if(curWordWidth + curLetterWidth >= AdjWidthMinusVoffset && curWord.length() >= 1) { textRows.back() += curWord; textRows.push_back(""); curWord = ""; curWordWidth = 0; curLineWidth = 0; } //add letter to word curWord += curStr; curWordWidth += curLetterWidth; //if we need a Vscroll bar start over if(!isVscroll && isVScrollNeeded()) { isVscroll = true; voffset = pChildVScroll->getWidth(); AdjWidthMinusVoffset = getAdjustedWidth() - voffset; i = -1; curWord = ""; curStr = ""; textRows.clear(); textRows.push_back(""); ind = 0; curWordWidth = 0; curLetterWidth = 0; curLineWidth = 0; bytesSkipped = 0; continue; } if(curLineWidth + curWordWidth >= AdjWidthMinusVoffset && 
textRows.back().length() >= 1) { textRows.push_back(""); curLineWidth = 0; } if(curStr[0] == ' ' || curStr[0] == '-') { textRows.back() += curWord; curLineWidth += curWordWidth; curWord = ""; curWordWidth = 0; } } if(curWord != "") { textRows.back() += curWord; } updateWidestLine(); }
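
One direction that is sometimes suggested for this kind of loop is to measure whole words with cached per-character widths, instead of extracting a substring and querying the font for every character. The Python sketch below illustrates that idea only; `wrap_words` and `char_width` are hypothetical names, each character is assumed to have a fixed width, and unlike the C++ above it breaks only at spaces and newlines and does not split over-long words.

```python
def wrap_words(text, max_width, char_width=lambda ch: 1):
    """Greedy wrap: measure each word once and never re-scan earlier text."""
    widths = {}  # cache of per-character widths

    def width(s):
        total = 0
        for ch in s:
            if ch not in widths:
                widths[ch] = char_width(ch)
            total += widths[ch]
        return total

    space_w = width(" ")
    rows = []
    for paragraph in text.split("\n"):
        row, row_w = "", 0
        for word in paragraph.split(" "):
            word_w = width(word)
            if row and row_w + space_w + word_w > max_width:
                rows.append(row)          # current row is full, start a new one
                row, row_w = word, word_w
            elif row:
                row += " " + word
                row_w += space_w + word_w
            else:
                row, row_w = word, word_w
        rows.append(row)                  # explicit newline ends the row
    return rows
```

Because the work per character is a dictionary lookup, the total cost grows linearly with the text length; the scroll-bar restart in the original would still need to be handled separately.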


  • Optimizing spacing of mesh containing a given set of points

    - by Feynman
    I tried to summarize this as best as possible in the title. I am writing an initial value problem solver in the most general way possible. I start with an arbitrary number of initial values at arbitrary locations (inside a boundary). The first part of my program creates a mesh/grid (I am not sure which is the correct term), with N points total, that contains all the initial values. My goal is to optimize the mesh such that the spacing is as uniform as possible. My solver seems to work half decently (it needs some more obscure debugging that is not relevant here). I am starting with one dimension. I intend to generalize the algorithm to an arbitrary number of dimensions once I get it working consistently. I am writing my code in Fortran, but feel free to reply with pseudocode or the language of your choice. Allow me to elaborate with an example: Say I am working on a closed interval [1,10] xmin=1 xmax=10 Say I have 3 initial points: xmin, 5 and xmax num_ivc=3 known(num_ivc)=[xmin,5,xmax] //my arrays start at 1. Assume "known" starts sorted I store my mesh/grid points in an array called coord. Say I want 10 points total in my mesh/grid. N=10 coord(10) Remember, all this is arbitrary--except the variable names of course. The algorithm should set coord to {1,2,3,4,5,6,7,8,9,10} Now for a less trivial example: num_ivc=3 known(num_ivc)=[xmin,5.5,xmax] or just num_ivc=1 known(num_ivc)=[5.5] Now, would you have 5 evenly spaced points on the interval [1, 5.5] and 5 evenly spaced points on the interval (5.5, 10]? But there is more space between 1 and 5.5 than between 5.5 and 10. So would you have 6 points on [1, 5.5] followed by 4 on (5.5, 10]? The key is to minimize the difference in spacing. I have been working on this for 2 days straight and I can assure you it is a lot trickier than it sounds. 
I have written code that: only works if N is large; only works if N is small; only works if the known points are close together; only works if the known points are far apart; only works if at least one of the known points is near a boundary; only works if none of the known points are near a boundary. So as you can see, I have coded the gamut of almost-solutions. I cannot figure out a way to get it to perform equally well in all possible scenarios (that is, create the optimum spacing).
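
One heuristic that could be tried for the one-dimensional case is greedy refinement: start with one segment per gap between known points and repeatedly split the gap whose current spacing is widest. The Python sketch below assumes `known` is sorted and already contains both boundaries (xmin and xmax); `build_mesh` is a hypothetical name. It aims to keep the largest spacing small and is not claimed to be optimal under every definition of uniformity.

```python
import heapq


def build_mesh(known, n_points):
    """Spread n_points over [known[0], known[-1]], keeping every known point."""
    if len(known) < 2:
        raise ValueError("known must contain at least both boundaries")
    if n_points < len(known):
        raise ValueError("need at least one mesh point per known point")
    # Each gap between neighbouring known points starts as a single segment.
    # Heap entries are (-spacing, gap index, segment count); heapq is a min-heap,
    # so the most negative entry is the gap with the widest spacing.
    heap = [(-(b - a), i, 1) for i, (a, b) in enumerate(zip(known, known[1:]))]
    heapq.heapify(heap)
    for _ in range(n_points - len(known)):
        _, i, segments = heapq.heappop(heap)
        segments += 1  # one more mesh point goes into this gap
        gap_width = known[i + 1] - known[i]
        heapq.heappush(heap, (-gap_width / segments, i, segments))
    counts = {i: segments for _, i, segments in heap}
    coord = []
    for i, (a, b) in enumerate(zip(known, known[1:])):
        coord.extend(a + (b - a) * j / counts[i] for j in range(counts[i]))
    coord.append(known[-1])
    return coord


mesh = build_mesh([1, 5, 10], 10)
```

For the first example in the question (known points 1, 5 and 10 with N=10) this reproduces the evenly spaced mesh from 1 to 10.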


  • Position of least significant bit that is set

    - by peterchen
    I am looking for an efficient way to determine the position of the least significant bit that is set in an integer, e.g. for 0x0FF0 it would be 4. A trivial implementation is this:

        unsigned GetLowestBitPos(unsigned value)
        {
            assert(value != 0); // handled separately
            unsigned pos = 0;
            while (!(value & 1)) { value >>= 1; ++pos; }
            return pos;
        }

    Any ideas how to squeeze some cycles out of it? (Note: this question is for people that enjoy such things, not for people to tell me xyz optimization is evil.) [edit] Thanks everyone for the ideas! I've learnt a few other things, too. Cool!
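
A loop-free formulation relies on the identity that `value & -value` isolates the lowest set bit in two's-complement arithmetic. The Python sketch below only demonstrates the identity; `lowest_set_bit_pos` is a hypothetical name, and whether an equivalent expression is faster than the loop on a given CPU would need measuring.

```python
def lowest_set_bit_pos(value):
    """Return the zero-based position of the least significant set bit."""
    if value <= 0:
        raise ValueError("value must be a positive integer")
    isolated = value & -value          # keeps only the lowest set bit
    return isolated.bit_length() - 1   # position of that single bit
```

For the example in the question, `lowest_set_bit_pos(0x0FF0)` returns 4.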


  • Passing variables to functions

    - by Faken
    A quick question: when I pass a variable to a function, does the program make a copy of that variable to use in the function? If it does, and I know that the function will only read the variable and never write to it, is it possible to pass the variable to the function without creating a copy, or should I just leave that to the compiler's optimizations to handle automatically for me?


  • Function-Local Static Const variable Initialization semantics.

    - by Hassan Syed
    The questions are in bold, for those that cannot be bothered reading a question in depth. This is a followup to this question. It is to do with the initialization semantics of static variables in functions. Static variables should be initialized once, and their internal state might be altered later - as I (currently) do in the linked question. However, the code in question does not require the ability to change the state of the variable later. Let me clarify my position, since I don't require the string object's internal state to change. The code is for a trait class for meta programming, and as such would benefit from a const char * const ptr -- thus ideally a local static const variable is needed. My educated guess is that in this case the string in question will be optimally placed in memory by the link-loader, and that the code is more secure and maps to the intended semantics. This leads to the semantics of such a variable: "The C++ Programming language Third Edition -- Stroustrup" does not have anything (that I could find) to say about this matter. All that is said is that the variable is initialized once when the flow of control of the thread first reaches the code. This leads me to ponder if the following code would be sensible, and if not, what are the intended semantics? #include <iostream> const char * const GetString(const char * x_in) { static const char * const x = x_in; return x; } int main() { const char * const temp = GetString("yahoo"); std::cout << temp << std::endl; const char * const temp2 = GetString("yahoo2"); std::cout << temp2 << std::endl; } This compiles on GCC and prints "yahoo" twice, which is what I want. However, it might not be standards compliant (which is why I post this question). It might be more elegant to have two functions, "SetString" and "String", where the latter forwards to the first. If it is standards compliant, does someone know of a template implementation in boost (or elsewhere)?


  • Optimize code performance when odd/even threads are doing different things in CUDA

    - by Orion Nebula
    Hi all! I have two large vectors, and I am trying to do a sort of element-wise multiplication, where an even-numbered element in the first vector is multiplied by the next odd-numbered element in the second vector, and an odd-numbered element in the first vector is multiplied by the preceding even-numbered element in the second vector. Ex. vector 1 is V1(1) V1(2) V1(3) V1(4) vector 2 is V2(1) V2(2) V2(3) V2(4) V1(1) * V2(2) V1(3) * V2(4) V1(2) * V2(1) V1(4) * V2(3) I have written CUDA code to do this (Pds has the elements of the first vector in shared memory, Nds the second vector): //instead of using %2 .. i check for the first bit to decide if number is odd/even -- faster if ((tx & 0x0001) == 0x0000) Nds[tx+1] = Pds[tx] * Nds[tx+1]; else Nds[tx-1] = Pds[tx] * Nds[tx-1]; __syncthreads(); Is there any way to further accelerate this code or avoid divergence? Thanks
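
Both branches in the snippet touch the neighbouring index within the same even/odd pair, and for a zero-based `tx` that neighbour is `tx ^ 1`. The Python sketch below only checks that index identity by emulating the threads sequentially; `multiply_branching` and `multiply_xor` are hypothetical names, the vectors are assumed to have even length, and whether removing the branch helps on real hardware would need profiling.

```python
def multiply_branching(p, n):
    """Emulates the original kernel body: one loop iteration per thread tx."""
    out = list(n)
    for tx in range(len(p)):
        if (tx & 1) == 0:
            out[tx + 1] = p[tx] * n[tx + 1]
        else:
            out[tx - 1] = p[tx] * n[tx - 1]
    return out


def multiply_xor(p, n):
    """Same result with a single expression: the partner index is tx ^ 1."""
    out = list(n)
    for tx in range(len(p)):
        partner = tx ^ 1  # tx + 1 for even tx, tx - 1 for odd tx
        out[partner] = p[tx] * n[partner]
    return out
```

Each emulated thread reads and writes only its partner slot, so no two threads touch the same element of the second vector.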


  • PHP Initialising strings as boolean first

    - by Anriëtte Myburgh
    I'm in the habit of initialising variables in PHP to false and then applying whatever (string, boolean, float) value to it later. Which would you reckon is better? $name = false; if (condition == true) { $name = $something_else; } if ($name) { …do something… } vs. $name =''; if (condition == true) { $name = $something_else; } if (!empty($name)) { …do something… } Which would you reckon can possibly give better performance? Which method would you use?


  • FxCop giving a warning on private constructor CA1823 and CA1053

    - by Luis Sánchez
    I have a class that looks like the following: Public Class Utilities Public Shared Function blah(userCode As String) As String 'doing some stuff End Function End Class I'm running FxCop 10 on it and it says: "Because type 'Utilities' contains only 'static' ( 'Shared' in Visual Basic) members, add a default private constructor to prevent the compiler from adding a default public constructor." Ok, you're right Mr. FxCop, I'll add a private constructor: Private Utilities() Now I'm getting: "It appears that field 'Utilities.Utilities' is never used or is only ever assigned to. Use this field or remove it." Any ideas on what I should do to get rid of both warnings?


  • MySQL efficiency as it relates to the database/table size

    - by mlissner
    I'm building a system using django, Sphinx and MySQL that's very quickly becoming quite large. The database currently has about 2000 rows, and I've written a program that's going to populate it with another 40,000 rows in a couple days. Since the database is live right now, and since I've never had a database with this much information in it, I'm worried about some things: Is adding all these rows going to seriously degrade the efficiency of my django app? Will I need to go back through it and optimize all my database calls so they're doing things more cleverly? Or will this make the database slow all around to the extent that I can't do anything about it at all? If you scoff at my 40k rows, then, my next question is, at what point SHOULD I be concerned? I will likely be adding another couple hundred thousand soon, so I worry, and I fret. How is sphinx going to feel about all this? Is it going to freak out when it realizes it has to index all this data? Or will it be fine? Is this normal for it? If it is, at what point should I be concerned that it's too much data for Sphinx? Thanks for any thoughts.


  • What is the Fastest Way to Check for a Keyword in a List of Keywords in Delphi?

    - by lkessler
    I have a small list of keywords. What I'd really like to do is akin to: case MyKeyword of 'CHIL': (code for CHIL); 'HUSB': (code for HUSB); 'WIFE': (code for WIFE); 'SEX': (code for SEX); else (code for everything else); end; Unfortunately the CASE statement can't be used like that for strings. I could use the straight IF THEN ELSE IF construct, e.g.: if MyKeyword = 'CHIL' then (code for CHIL) else if MyKeyword = 'HUSB' then (code for HUSB) else if MyKeyword = 'WIFE' then (code for WIFE) else if MyKeyword = 'SEX' then (code for SEX) else (code for everything else); but I've heard this is relatively inefficient. What I had been doing instead is: P := pos(' ' + MyKeyword + ' ', ' CHIL HUSB WIFE SEX '); case P of 1: (code for CHIL); 6: (code for HUSB); 11: (code for WIFE); 17: (code for SEX); else (code for everything else); end; This, of course is not the best programming style, but it works fine for me and up to now didn't make a difference. So what is the best way to rewrite this in Delphi so that it is both simple, understandable but also fast? (For reference, I am using Delphi 2009 with Unicode strings.) Followup: Toby recommended I simply use the If Then Else construct. Looking back at my examples that used a CASE statement, I can see how that is a viable answer. Unfortunately, my inclusion of the CASE inadvertently hid my real question. I actually don't care which keyword it is. That is just a bonus if the particular method can identify it like the POS method can. What I need is to know whether or not the keyword is in the set of keywords. So really I want to know if there is anything better than: if pos(' ' + MyKeyword + ' ', ' CHIL HUSB WIFE SEX ') > 0 then The If Then Else equivalent does not seem better in this case being: if (MyKeyword = 'CHIL') or (MyKeyword = 'HUSB') or (MyKeyword = 'WIFE') or (MyKeyword = 'SEX') then In Barry's comment to Kornel's question, he mentions the TDictionary Generic. 
I've not yet picked up on the new Generic collections and it looks like I should delve into them. My question here would be whether they are built for efficiency and how would using TDictionary compare in looks and in speed to the above two lines? In later profiling, I have found that the concatenation of strings as in: (' ' + MyKeyword + ' ') is VERY expensive time-wise and should be avoided whenever possible. Almost any other solution is better than doing this.


  • How can I further optimize this color difference function?

    - by aLfa
    I have made this function to calculate color differences in the CIE Lab colorspace, but it lacks speed. Since I'm not a Java expert, I wonder if any Java guru around has some tips that can improve the speed here. The code is based on the matlab function mentioned in the comment block. /** * Compute the CIEDE2000 color-difference between the sample color with * CIELab coordinates 'sample' and a standard color with CIELab coordinates * 'std' * * Based on the article: * "The CIEDE2000 Color-Difference Formula: Implementation Notes, * Supplementary Test Data, and Mathematical Observations,", G. Sharma, * W. Wu, E. N. Dalal, submitted to Color Research and Application, * January 2004. * available at http://www.ece.rochester.edu/~gsharma/ciede2000/ */ public static double deltaE2000(double[] lab1, double[] lab2) { double L1 = lab1[0]; double a1 = lab1[1]; double b1 = lab1[2]; double L2 = lab2[0]; double a2 = lab2[1]; double b2 = lab2[2]; // Cab = sqrt(a^2 + b^2) double Cab1 = Math.sqrt(a1 * a1 + b1 * b1); double Cab2 = Math.sqrt(a2 * a2 + b2 * b2); // CabAvg = (Cab1 + Cab2) / 2 double CabAvg = (Cab1 + Cab2) / 2; // G = 1 + (1 - sqrt((CabAvg^7) / (CabAvg^7 + 25^7))) / 2 double CabAvg7 = Math.pow(CabAvg, 7); double G = 1 + (1 - Math.sqrt(CabAvg7 / (CabAvg7 + 6103515625.0))) / 2; // ap = G * a double ap1 = G * a1; double ap2 = G * a2; // Cp = sqrt(ap^2 + b^2) double Cp1 = Math.sqrt(ap1 * ap1 + b1 * b1); double Cp2 = Math.sqrt(ap2 * ap2 + b2 * b2); // CpProd = (Cp1 * Cp2) double CpProd = Cp1 * Cp2; // hp1 = atan2(b1, ap1) double hp1 = Math.atan2(b1, ap1); // ensure hue is between 0 and 2pi if (hp1 < 0) { // hp1 = hp1 + 2pi hp1 += 6.283185307179586476925286766559; } // hp2 = atan2(b2, ap2) double hp2 = Math.atan2(b2, ap2); // ensure hue is between 0 and 2pi if (hp2 < 0) { // hp2 = hp2 + 2pi hp2 += 6.283185307179586476925286766559; } // dL = L2 - L1 double dL = L2 - L1; // dC = Cp2 - Cp1 double dC = Cp2 - Cp1; // computation of hue difference double dhp = 0.0; // set hue 
difference to zero if the product of chromas is zero if (CpProd != 0) { // dhp = hp2 - hp1 dhp = hp2 - hp1; if (dhp > Math.PI) { // dhp = dhp - 2pi dhp -= 6.283185307179586476925286766559; } else if (dhp < -Math.PI) { // dhp = dhp + 2pi dhp += 6.283185307179586476925286766559; } } // dH = 2 * sqrt(CpProd) * sin(dhp / 2) double dH = 2 * Math.sqrt(CpProd) * Math.sin(dhp / 2); // weighting functions // Lp = (L1 + L2) / 2 - 50 double Lp = (L1 + L2) / 2 - 50; // Cp = (Cp1 + Cp2) / 2 double Cp = (Cp1 + Cp2) / 2; // average hue computation // hp = (hp1 + hp2) / 2 double hp = (hp1 + hp2) / 2; // identify positions for which abs hue diff exceeds 180 degrees if (Math.abs(hp1 - hp2) > Math.PI) { // hp = hp - pi hp -= Math.PI; } // ensure hue is between 0 and 2pi if (hp < 0) { // hp = hp + 2pi hp += 6.283185307179586476925286766559; } // LpSqr = Lp^2 double LpSqr = Lp * Lp; // Sl = 1 + 0.015 * LpSqr / sqrt(20 + LpSqr) double Sl = 1 + 0.015 * LpSqr / Math.sqrt(20 + LpSqr); // Sc = 1 + 0.045 * Cp double Sc = 1 + 0.045 * Cp; // T = 1 - 0.17 * cos(hp - pi / 6) + // + 0.24 * cos(2 * hp) + // + 0.32 * cos(3 * hp + pi / 30) - // - 0.20 * cos(4 * hp - 63 * pi / 180) double hphp = hp + hp; double T = 1 - 0.17 * Math.cos(hp - 0.52359877559829887307710723054658) + 0.24 * Math.cos(hphp) + 0.32 * Math.cos(hphp + hp + 0.10471975511965977461542144610932) - 0.20 * Math.cos(hphp + hphp - 1.0995574287564276334619251841478); // Sh = 1 + 0.015 * Cp * T double Sh = 1 + 0.015 * Cp * T; // deltaThetaRad = (pi / 3) * e^-(36 / (5 * pi) * hp - 11)^2 double powerBase = hp - 4.799655442984406; double deltaThetaRad = 1.0471975511965977461542144610932 * Math.exp(-5.25249016001879 * powerBase * powerBase); // Rc = 2 * sqrt((Cp^7) / (Cp^7 + 25^7)) double Cp7 = Math.pow(Cp, 7); double Rc = 2 * Math.sqrt(Cp7 / (Cp7 + 6103515625.0)); // RT = -sin(delthetarad) * Rc double RT = -Math.sin(deltaThetaRad) * Rc; // de00 = sqrt((dL / Sl)^2 + (dC / Sc)^2 + (dH / Sh)^2 + RT * (dC / Sc) * (dH / Sh)) double dLSl = dL / 
Sl; double dCSc = dC / Sc; double dHSh = dH / Sh; return Math.sqrt(dLSl * dLSl + dCSc * dCSc + dHSh * dHSh + RT * dCSc * dHSh); }
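
The expression sqrt(x^7 / (x^7 + 25^7)) appears twice in the function (for G and for Rc), each time with a general-purpose power call. One micro-optimization that might be worth measuring is computing the seventh power with four multiplications. The sketch below is in Python only to check the arithmetic; `pow7` and `chroma_weight` are hypothetical helper names, and any speed-up in Java would depend on the JVM.

```python
import math

POW_25_7 = 6103515625.0  # 25^7, the constant already used in the Java code


def pow7(x):
    """x^7 using four multiplications instead of a generic pow call."""
    x2 = x * x
    x4 = x2 * x2
    return x4 * x2 * x


def chroma_weight(c):
    """sqrt(c^7 / (c^7 + 25^7)), the term shared by G and Rc."""
    c7 = pow7(c)
    return math.sqrt(c7 / (c7 + POW_25_7))
```

With such a helper, G would be `1 + (1 - chroma_weight(CabAvg)) / 2` and Rc would be `2 * chroma_weight(Cp)`.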


  • Complicated idea - how to create car racing for my RPG game's players

    - by Donator
    So, I want to create car racing for my RPG game's players. A player can create a race and choose how many participants can take part in it. After the race is created, other people can join it. When the maximum number of participants has joined, the race begins. My idea is to pick the winner instantly when the last participant joins (whoever has the best car wins), but how can I do it? If I pick the winner right after the last participant joins, then I have to run many queries on one page (select data from the table, then delete the race, then select the players' car statistics and pick the winner, and then, again using MySQL, send a message to everyone). But this is really not optimal, and it will lag badly for that last person. Maybe you have ideas on how I can avoid the lag and make this more efficient. Thank you very much.

    Read the article

  • How do you control what your C compiler Optimizes?

    - by Jordan S
    I am writing the firmware for an embedded device in C using the Silicon Labs IDE and the SDCC compiler. The device architecture is based on the 8051 family. The function in question is shown below. The function is used to set the ports on my MCU to drive a stepper motor. It gets called by an interrupt handler. The big switch statement just sets the ports to the proper value for the next motor step. The bottom part of the function looks at an input from a Hall effect sensor and the number of steps moved in order to detect whether the motor has stalled. The problem is that, for some reason, the second IF statement, which looks like this: if (StallDetector > (GapSize + 20)) { HandleStallEvent(); } always seems to get optimized out. If I try to put a breakpoint at the HandleStallEvent() call, the IDE gives me a message saying "No Address Correlation to this line number". I am not really good enough at reading assembly to tell what it is doing, but I have pasted a snippet from the asm output below. Any help would be much appreciated.
void OperateStepper(void)
{
    //static bit LastHomeMagState = HomeSensor;
    static bit LastPosMagState = PosSensor;
    if(PulseMotor)
    {
        if(MoveDirection == 1) // Go clockwise
        {
            switch(STEPPER_POSITION)
            {
                case 'A': STEPPER_POSITION = 'B'; P1 = 0xFD; break;
                case 'B': STEPPER_POSITION = 'C'; P1 = 0xFF; break;
                case 'C': STEPPER_POSITION = 'D'; P1 = 0xFE; break;
                case 'D': STEPPER_POSITION = 'A'; P1 = 0xFC; break;
                default: STEPPER_POSITION = 'A'; P1 = 0xFC;
            } //end switch
        }
        else // Go CounterClockwise
        {
            switch(STEPPER_POSITION)
            {
                case 'A': STEPPER_POSITION = 'D'; P1 = 0xFE; break;
                case 'B': STEPPER_POSITION = 'A'; P1 = 0xFC; break;
                case 'C': STEPPER_POSITION = 'B'; P1 = 0xFD; break;
                case 'D': STEPPER_POSITION = 'C'; P1 = 0xFF; break;
                default: STEPPER_POSITION = 'A'; P1 = 0xFE;
            } //end switch
        } //end else
        MotorSteps++;
        StallDetector++;
        if(PosSensor != LastPosMagState)
        {
            StallDetector = 0;
            LastPosMagState = PosSensor;
        }
        else
        {
            if (PosSensor == ON)
            {
                if (StallDetector > (MagnetSize + 20))
                {
                    HandleStallEvent();
                }
            }
            else if (PosSensor == OFF)
            {
                if (StallDetector > (GapSize + 20))
                {
                    HandleStallEvent();
                }
            }
        }
    } //end if PulseMotor
}

... and the asm output for the bottom part of this function...

; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:653: if(PosSensor != LastPosMagState)
    mov c,_P1_4
    jb _OperateStepper_LastPosMagState_1_1,00158$
    cpl c
00158$:
    jc 00126$
    C$MotionControl.c$655$3$7 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:655: StallDetector = 0;
    clr a
    mov _StallDetector,a
    mov (_StallDetector + 1),a
    C$MotionControl.c$657$3$7 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:657: LastPosMagState = PosSensor;
    mov c,_P1_4
    mov _OperateStepper_LastPosMagState_1_1,c
    ret
00126$:
    C$MotionControl.c$661$2$8 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:661: if (PosSensor == ON)
    jb _P1_4,00123$
    C$MotionControl.c$663$4$9 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:663: if (StallDetector > (MagnetSize + 20))
    mov a,_MagnetSize
    mov r2,a
    rlc a
    subb a,acc
    mov r3,a
    mov a,#0x14
    add a,r2
    mov r2,a
    clr a
    addc a,r3
    mov r3,a
    clr c
    mov a,r2
    subb a,_StallDetector
    mov a,r3
    subb a,(_StallDetector + 1)
    jnc 00130$
    C$MotionControl.c$665$5$10 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:665: HandleStallEvent();
    ljmp _HandleStallEvent
00123$:
    C$MotionControl.c$668$2$8 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:668: else if (PosSensor == OFF)
    jnb _P1_4,00130$
    C$MotionControl.c$670$4$11 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:670: if (StallDetector > (GapSize + 20))
    mov a,#0x14
    add a,_GapSize
    mov r2,a
    clr a
    addc a,(_GapSize + 1)
    mov r3,a
    clr c
    mov a,r2
    subb a,_StallDetector
    mov a,r3
    subb a,(_StallDetector + 1)
    jnc 00130$
    C$MotionControl.c$672$5$12 ==.
; C:\SiLabs\Optec Programs\HSFW_HID_SDCC_2\MotionControl.c:672: HandleStallEvent();
    C$MotionControl.c$678$2$1 ==.
    XG$OperateStepper$0$0 ==.
    ljmp _HandleStallEvent
00130$:
    ret

From the asm, it looks to me like the compiler is NOT optimizing out this second if statement, but if that is the case, why does the IDE not allow me to set a breakpoint there? Maybe it's just a dumb IDE!

    Read the article

  • explicit copy constructor or implicit parameter by value

    - by R Samuel Klatchko
    I recently read (and unfortunately forgot where) that the best way to write operator= is like this: foo &operator=(foo other) { swap(*this, other); return *this; } instead of this: foo &operator=(const foo &other) { foo copy(other); swap(*this, copy); return *this; } The idea is that if operator= is called with an rvalue, the first version can optimize away construction of a copy. So when called with an rvalue, the first version is faster, and when called with an lvalue, the two are equivalent. I'm curious what other people think about this. Would people avoid the first version because of its lack of explicitness? Am I correct that the first version can be better and can never be worse?
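
To make the comparison concrete, here is a minimal sketch of the by-value form, assuming a C++11 or later compiler. Everything other than the operator= signature is hypothetical and added only for illustration: the members n and data, the copies counter, and the make_foo helper are not from the question.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>

// Hypothetical resource-owning class; only operator= mirrors the question.
struct foo {
    static int copies;   // counts copy constructions, for demonstration only
    std::size_t n;
    int* data;

    explicit foo(std::size_t size) : n(size), data(new int[size]()) {}

    // Copy constructor: deep copy, and record that a copy happened.
    foo(const foo& other) : n(other.n), data(new int[other.n]) {
        std::copy(other.data, other.data + n, data);
        ++copies;
    }

    // Move constructor: steal the buffer, leave the source empty.
    foo(foo&& other) noexcept : n(other.n), data(other.data) {
        other.n = 0;
        other.data = nullptr;
    }

    ~foo() { delete[] data; }

    // Found by argument-dependent lookup from operator= below.
    friend void swap(foo& a, foo& b) noexcept {
        std::swap(a.n, b.n);
        std::swap(a.data, b.data);
    }

    // By-value parameter: an lvalue argument is copied into 'other',
    // while an rvalue argument is moved or elided, so no deep copy is made.
    foo& operator=(foo other) {
        swap(*this, other);
        return *this;
    }
};

int foo::copies = 0;

// Hypothetical factory that returns a temporary.
foo make_foo(std::size_t size) { return foo(size); }
```

With this sketch, assigning from a named object (a = b) should run the copy constructor once, while assigning from a temporary (a = make_foo(7)) should not run it at all. Note that the by-value form relies on the class having a cheap move constructor or on copy elision; without either, an rvalue argument would still be copied.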

    Read the article

  • Combine static files or load in parallel

    - by Niall Collins
    I am currently adding code to my site to combine CSS and JavaScript files. Is there a way to load JavaScript asynchronously or in parallel without having to include an external library? I have read on some blogs that combining files can be counterproductive, because the single combined HTTP response can be large, and that it's better to load multiple files in parallel. What are your opinions on this? I am caching my JavaScript/CSS, and I would have thought it was better to combine the files than to make multiple HTTP requests.

    Read the article

  • gcc memory alignment pragma

    - by aaa
    Hello. Does gcc have a memory alignment pragma, akin to #pragma vector aligned in the Intel compiler? I would like to tell the compiler to optimize a particular loop using aligned load/store instructions. Thanks
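
One option that GCC does provide is the builtin __builtin_assume_aligned (available since GCC 4.7), which works per pointer rather than per loop. The sketch below is only an illustration of how it might be used: the function name add_aligned and the 16-byte alignment are assumptions, and whether the loop is actually vectorized with aligned instructions still depends on the target and the optimization flags.

```cpp
#include <cstddef>

// Hypothetical helper: adds b[i] into a[i] for i in [0, n).
// The caller promises that both pointers are 16-byte aligned;
// passing a misaligned pointer is undefined behavior.
void add_aligned(float* a, const float* b, std::size_t n) {
    // __builtin_assume_aligned returns the same address, annotated
    // so the optimizer may assume the stated alignment.
    float* pa = static_cast<float*>(__builtin_assume_aligned(a, 16));
    const float* pb = static_cast<const float*>(__builtin_assume_aligned(b, 16));

    for (std::size_t i = 0; i < n; ++i) {
        pa[i] += pb[i];
    }
}
```

The arrays passed in would need to be declared with matching alignment, for example with alignas(16) in C++11 or __attribute__((aligned(16))) in GNU C.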

    Read the article

  • How to optimize neural network by using genetic algorithm?

    - by Billy Coen
    I'm quite new to this topic, so any help would be great. What I need is to optimize a neural network in MATLAB by using a GA. My network has a [2x98] input and a [1x98] target. I've tried consulting the MATLAB help, but I'm still kind of clueless about what to do :( so any help would be appreciated. Thanks in advance. Edit: I guess I didn't say what there is to be optimized, as Dan pointed out in the first answer. I guess the most important thing is the number of hidden neurons, and maybe the number of hidden layers and training parameters such as the number of epochs. Sorry for not providing enough info; I'm still learning about this.

    Read the article
