Search Results

Search found 14719 results on 589 pages for 'optimization level'.

Page 80/589 | < Previous Page | 76 77 78 79 80 81 82 83 84 85 86 87  | Next Page >

  • reduce image size in bytes without resize and quality lose in c#

    - by SR Dusad
    Hi I m using C#.NET 4.0 I have an jpeg image and i want to reduce its size in bytes .I don't want to change the image size in manner of height and width and not want to lose image quality.Some bit of reduce quality is not an issue. I try to make it a thumbnail image but it reduce the size according to height and width. I can't found any solution. Any type help will be appreciated..

    Read the article

  • What approach to take for SIMD optimizations

    - by goldenmean
    Hi, I am trying to optimize below code for SIMD operations (8way/4way/2way SIMD whiechever possible and if it gives gains in performance) I am tryin to analyze it first on paper to understand the algorithm used. How can i optimize it for SIMD:- void idct(uint8_t *dst, int stride, int16_t *input, int type) { int16_t *ip = input; uint8_t *cm = ff_cropTbl + MAX_NEG_CROP; int A, B, C, D, Ad, Bd, Cd, Dd, E, F, G, H; int Ed, Gd, Add, Bdd, Fd, Hd; int i; /* Inverse DCT on the rows now */ for (i = 0; i < 8; i++) { /* Check for non-zero values */ if ( ip[0] | ip[1] | ip[2] | ip[3] | ip[4] | ip[5] | ip[6] | ip[7] ) { A = M(xC1S7, ip[1]) + M(xC7S1, ip[7]); B = M(xC7S1, ip[1]) - M(xC1S7, ip[7]); C = M(xC3S5, ip[3]) + M(xC5S3, ip[5]); D = M(xC3S5, ip[5]) - M(xC5S3, ip[3]); Ad = M(xC4S4, (A - C)); Bd = M(xC4S4, (B - D)); Cd = A + C; Dd = B + D; E = M(xC4S4, (ip[0] + ip[4])); F = M(xC4S4, (ip[0] - ip[4])); G = M(xC2S6, ip[2]) + M(xC6S2, ip[6]); H = M(xC6S2, ip[2]) - M(xC2S6, ip[6]); Ed = E - G; Gd = E + G; Add = F + Ad; Bdd = Bd - H; Fd = F - Ad; Hd = Bd + H; /* Final sequence of operations over-write original inputs. */ ip[0] = (int16_t)(Gd + Cd) ; ip[7] = (int16_t)(Gd - Cd ); ip[1] = (int16_t)(Add + Hd); ip[2] = (int16_t)(Add - Hd); ip[3] = (int16_t)(Ed + Dd) ; ip[4] = (int16_t)(Ed - Dd ); ip[5] = (int16_t)(Fd + Bdd); ip[6] = (int16_t)(Fd - Bdd); } ip += 8; /* next row */ } ip = input; for ( i = 0; i < 8; i++) { /* Check for non-zero values (bitwise or faster than ||) */ if ( ip[1 * 8] | ip[2 * 8] | ip[3 * 8] | ip[4 * 8] | ip[5 * 8] | ip[6 * 8] | ip[7 * 8] ) { A = M(xC1S7, ip[1*8]) + M(xC7S1, ip[7*8]); B = M(xC7S1, ip[1*8]) - M(xC1S7, ip[7*8]); C = M(xC3S5, ip[3*8]) + M(xC5S3, ip[5*8]); D = M(xC3S5, ip[5*8]) - M(xC5S3, ip[3*8]); Ad = M(xC4S4, (A - C)); Bd = M(xC4S4, (B - D)); Cd = A + C; Dd = B + D; E = M(xC4S4, (ip[0*8] + ip[4*8])) + 8; F = M(xC4S4, (ip[0*8] - ip[4*8])) + 8; if(type==1){ //HACK E += 16*128; F += 16*128; } G = M(xC2S6, ip[2*8]) + M(xC6S2, ip[6*8]); H = M(xC6S2, ip[2*8]) - M(xC2S6, ip[6*8]); Ed = E - G; Gd = E + G; Add = F + Ad; Bdd = Bd - H; Fd = F - Ad; Hd = Bd + H; /* Final sequence of operations over-write original inputs. */ if(type==0){ ip[0*8] = (int16_t)((Gd + Cd ) >> 4); ip[7*8] = (int16_t)((Gd - Cd ) >> 4); ip[1*8] = (int16_t)((Add + Hd ) >> 4); ip[2*8] = (int16_t)((Add - Hd ) >> 4); ip[3*8] = (int16_t)((Ed + Dd ) >> 4); ip[4*8] = (int16_t)((Ed - Dd ) >> 4); ip[5*8] = (int16_t)((Fd + Bdd ) >> 4); ip[6*8] = (int16_t)((Fd - Bdd ) >> 4); }else if(type==1){ dst[0*stride] = cm[(Gd + Cd ) >> 4]; dst[7*stride] = cm[(Gd - Cd ) >> 4]; dst[1*stride] = cm[(Add + Hd ) >> 4]; dst[2*stride] = cm[(Add - Hd ) >> 4]; dst[3*stride] = cm[(Ed + Dd ) >> 4]; dst[4*stride] = cm[(Ed - Dd ) >> 4]; dst[5*stride] = cm[(Fd + Bdd ) >> 4]; dst[6*stride] = cm[(Fd - Bdd ) >> 4]; }else{ dst[0*stride] = cm[dst[0*stride] + ((Gd + Cd ) >> 4)]; dst[7*stride] = cm[dst[7*stride] + ((Gd - Cd ) >> 4)]; dst[1*stride] = cm[dst[1*stride] + ((Add + Hd ) >> 4)]; dst[2*stride] = cm[dst[2*stride] + ((Add - Hd ) >> 4)]; dst[3*stride] = cm[dst[3*stride] + ((Ed + Dd ) >> 4)]; dst[4*stride] = cm[dst[4*stride] + ((Ed - Dd ) >> 4)]; dst[5*stride] = cm[dst[5*stride] + ((Fd + Bdd ) >> 4)]; dst[6*stride] = cm[dst[6*stride] + ((Fd - Bdd ) >> 4)]; } } else { if(type==0){ ip[0*8] = ip[1*8] = ip[2*8] = ip[3*8] = ip[4*8] = ip[5*8] = ip[6*8] = ip[7*8] = ((xC4S4 * ip[0*8] + (IdctAdjustBeforeShift<<16))>>20); }else if(type==1){ dst[0*stride]= dst[1*stride]= dst[2*stride]= dst[3*stride]= dst[4*stride]= dst[5*stride]= dst[6*stride]= dst[7*stride]= cm[128 + ((xC4S4 * ip[0*8] + (IdctAdjustBeforeShift<<16))>>20)]; }else{ if(ip[0*8]){ int v= ((xC4S4 * ip[0*8] + (IdctAdjustBeforeShift<<16))>>20); dst[0*stride] = cm[dst[0*stride] + v]; dst[1*stride] = cm[dst[1*stride] + v]; dst[2*stride] = cm[dst[2*stride] + v]; dst[3*stride] = cm[dst[3*stride] + v]; dst[4*stride] = cm[dst[4*stride] + v]; dst[5*stride] = cm[dst[5*stride] + v]; dst[6*stride] = cm[dst[6*stride] + v]; dst[7*stride] = cm[dst[7*stride] + v]; } } } ip++; /* next column */ dst++; } }

    Read the article

  • SQL Server Management Studio – tips for improving the TSQL coding process

    - by kristof
    I used to work in a place where a common practice was to use Pair Programming. I remember how many small things we could learn from each other when working together on the code. Picking up new shortcuts, code snippets etc. with time significantly improved our efficiency of writing code. Since I started working with SQL Server I have been left on my own. The best habits I would normally pick from working together with other people which I cannot do now. So here is the question: What are you tips on efficiently writing TSQL code using SQL Server Management Studio? Please keep the tips to 2 – 3 things/shortcuts that you think improve you speed of coding Please stay within the scope of TSQL and SQL Server Management Studio 2005/2008 If the feature is specific to the version of Management Studio please indicate: e.g. “Works with SQL Server 2008 only" Thanks EDIT: I am afraid that I could have been misunderstood by some of you. I am not looking for tips for writing efficient TSQL code but rather for advice on how to efficiently use Management Studio to speed up the coding process itself. The type of answers that I am looking for are: use of templates, keyboard-shortcuts, use of IntelliSense plugins etc. Basically those little things that make the coding experience a bit more efficient and pleasant. Thanks again

    Read the article

  • Genetic algorithms

    - by daniels
    I'm trying to implement a genetic algorithm that will calculate the minimum of the Rastrigin functon and I'm having some issues. I need to represent the chromosome as a binary string and as the Rastrigin's function takes a list of numbers as a parameter, how can decode the chromosome to a list of numbers? Also the Rastrigin's wants the elements in the list to be -5.12<=x(i)<=5.12 what happens if when i generate the chromosome it will produce number not in that interval? I'm new to this so help and explanation that will aid me in understanding will be highly appreciated. Thanks.

    Read the article

  • Image rotation algorithm

    - by Stefano Driussi
    I'm looking for an algorithm that rotates an image by some degrees (input). public Image rotateImage(Image image, int degrees) (Image instances could be replaced with int[] containing each pixel RGB values, My problem is that i need to implement it for a JavaME MIDP 2.0 project so i must use code runnable on JVM prior to version 1.5 Can anyone help me out with this ? EDIT: I forgot to mention that i don't have SVG APIs available and that i need a method to rotate by arbitrary degree other than 90 - 180- 270 Also, no java.awt.* packages are available on MIDP 2.0

    Read the article

  • What is an Efficient algorithm to find Area of Overlapping Rectangles

    - by namenlos
    My situation Input: a set of rectangles each rect is comprised of 4 doubles like this: (x0,y0,x1,y1) they are not "rotated" at any angle, all they are "normal" rectangles that go "up/down" and "left/right" with respect to the screen they are randomly placed - they may be touching at the edges, overlapping , or not have any contact I will have several hundred rectangles this is implemented in C# I need to find The area that is formed by their overlap - all the area in the canvas that more than one rectangle "covers" (for example with two rectangles, it would be the intersection) I don't need the geometry of the overlap - just the area (example: 4 sq inches) Overlaps shouldn't be counted multiple times - so for example imagine 3 rects that have the same size and position - they are right on top of each other - this area should be counted once (not three times) Example The image below contains thre rectangles: A,B,C A and B overlap (as indicated by dashes) B and C overlap (as indicated by dashes) What I am looking for is the area where the dashes are shown - AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAAAAAAAAAAAAAAAA AAAAAAAAAAAAAAAA--------------BBB AAAAAAAAAAAAAAAA--------------BBB AAAAAAAAAAAAAAAA--------------BBB AAAAAAAAAAAAAAAA--------------BBB BBBBBBBBBBBBBBBBB BBBBBBBBBBBBBBBBB BBBBBBBBBBBBBBBBB BBBBBB-----------CCCCCCCC BBBBBB-----------CCCCCCCC BBBBBB-----------CCCCCCCC CCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCC

    Read the article

  • optimizing iPhone OpenGL ES fill rate

    - by NateS
    I have an Open GL ES game on the iPhone. My framerate is pretty sucky, ~20fps. Using the Xcode OpenGL ES performance tool on an iPhone 3G, it shows: Renderer Utilization: 95% to 99% Tiler Utilization: ~27% I am drawing a lot of pretty large images with a lot of blending. If I reduce the number of images drawn, framerates go from ~20 to ~40, though the performance tool results stay about the same (renderer still maxed). I think I'm being limited by the fill rate of the iPhone 3G, but I'm not sure. My questions are: How can I determine with more granularity where the bottleneck is? That is my biggest problem, I just don't know what is taking all the time. If it is fillrate, is there anything I do to improve it besides just drawing less? I am using texture atlases. I have tried to minimize image binds, though it isn't always possible (drawing order, not everything fits on one 1024x1024 texture, etc). Every frame I do 10 image binds. This seem pretty reasonable, but I could be mistaken. I'm using vertex arrays and glDrawArrays. I don't really have a lot of geometry. I can try to be more precise if needed. Each image is 2 triangles and I try to batch things were possible, though often (maybe half the time) images are drawn with individual glDrawArrays calls. Besides the images, I have ~60 triangles worth of geometry being rendered in ~6 glDrawArrays calls. I often glTranslate before calling glDrawArrays. Would it improve the framerate to switch to VBOs? I don't think it is a huge amount of geometry, but maybe it is faster for other reasons? Are there certain things to watch out for that could reduce performance? Eg, should I avoid glTranslate, glColor4g, etc? I'm using glScissor in a 3 places per frame. Each use consists of 2 glScissor calls, one to set it up, and one to reset it to what it was. I don't know if there is much of a performance impact here. If I used PVRTC would it be able to render faster? Currently all my images are GL_RGBA. I don't have memory issues. Here is a rough idea of what I'm drawing, in this order: 1) Switch to perspective matrix. 2) Draw a full screen background image 3) Draw a full screen image with translucency (this one has a scrolling texture). 4) Draw a few sprites. 5) Switch to ortho matrix. 6) Draw a few sprites. 7) Switch to perspective matrix. 8) Draw sprites and some other textured geometry. 9) Switch to ortho matrix. 10) Draw a few sprites (eg, game HUD). Steps 1-6 draw a bunch of background stuff. 8 draws most of the game content. 10 draws the HUD. As you can see, there are many layers, some of them full screen and some of the sprites are pretty large (1/4 of the screen). The layers use translucency, so I have to draw them in back-to-front order. This is further complicated by needing to draw various layers in ortho and others in perspective. I will gladly provide additional information if reqested. Thanks in advance for any performance tips or general advice on my problem!

    Read the article

  • Java: ArrayList bottleneck

    - by Jack
    Hello, while profiling a java application that calculates hierarchical clustering of thousands of elements I realized that ArrayList.get occupies like half of the CPU needed in the clusterization part of the execution. The algorithm searches the two more similar elements (so it is O(n*(n+1)/2) ), here's the pseudo code: int currentMax = 0.0f for (int i = 0 to n) for (int j = i to n) get content i-th and j-th if their similarity > currentMax update currentMax merge the two clusters So effectively there are a lot of ArrayList.get involved. Is there a faster way? I though that since ArrayList should be a linear array of references it should be the quickest way and maybe I can't do anything since there are simple too many gets.. but maybe I'm wrong. I don't think using a HashMap could work since I need to get them all on every iteration and map.values() should be backed by an ArrayList anyway.. Otherwise should I try other collection libraries that are more optimized? Like google's one, or apache one.. Thanks

    Read the article

  • Design approach, string table data, variables, stl memory usage

    - by howieh
    I have an old structure class like this: typedef vector<vector<string>> VARTYPE_T; which works as a single variable. This variable can hold from one value over a list to data like a table. Most values are long,double, string or double [3] for coordinates (x,y,z). I just convert them as needed. The variables are managed in a map like this : map<string,VARTYPE_T *> where the string holds the variable name. Sure, they are wrapped in classes. Also i have a tree of nodes, where each node can hold one of these variablemaps. Using VS 2008 SP1 for this, i detect a lot of memory fragmentation. Checking against the stlport, stlport seemed to be faster (20% ) and uses lesser memory (30%, for my test cases). So the question is: What is the best implementation to solve this requirement with fast an properly used memory ? Should i write an own allocator like a pool allocator. How would you do this ? Thanks in advance, Howie

    Read the article

  • MySQL Query Select using sub-select takes too long

    - by True Soft
    I noticed something strange while executing a select from 2 tables: SELECT * FROM table_1 WHERE id IN ( SELECT id_element FROM table_2 WHERE column_2=3103); This query took approximatively 242 seconds. But when I executed the subquery SELECT id_element FROM table_2 WHERE column_2=3103 it took less than 0.002s (and resulted 2 rows). Then, when I did SELECT * FROM table_1 WHERE id IN (/* prev.result */) it was the same: 0.002s. I was wondering why MySQL is doing the first query like that, taking much more time than the last 2 queries separately? Is it an optimal solution for selecting something based from the results of a sub-query? Other details: table_1 has approx. 9000 rows, and table_2 has 90000 rows. After I added an index on column_2 from table_2, the first query took 0.15s.

    Read the article

  • Shouldn't prepared statements be much more fsater?

    - by silversky
    $s = explode (" ", microtime()); $s = $s[0]+$s[1]; $con = mysqli_connect ('localhost', 'test', 'pass', 'db') or die('Err'); for ($i=0; $i<1000; $i++) { $stmt = $con -> prepare( " SELECT MAX(id) AS max_id , MIN(id) AS min_id FROM tb "); $stmt -> execute(); $stmt->bind_result($M,$m); $stmt->free_result(); $rand = mt_rand( $m , $M ).'<br/>'; $res = $con -> prepare( " SELECT * FROM tb WHERE id >= ? LIMIT 0,1 "); $res -> bind_param("s", $rand); $res -> execute(); $res->free_result(); } $e = explode (" ", microtime()); $e = $e[0]+$e[1]; echo number_format($e-$s, 4, '.', ''); // and: $link = mysql_connect ("localhost", "test", "pass") or die (); mysql_select_db ("db") or die ("Unable to select database".mysql_error()); for ($i=0; $i<1000; $i++) { $range_result = mysql_query( " SELECT MAX(`id`) AS max_id , MIN(`id`) AS min_id FROM tb "); $range_row = mysql_fetch_object( $range_result ); $random = mt_rand( $range_row->min_id , $range_row->max_id ); $result = mysql_query( " SELECT * FROM tb WHERE id >= $random LIMIT 0,1 "); } defenitly prepared statements are much more safer but also every where it says that they are much faster BUT in my test on the above code I have: - 2.45 sec for prepared statements - 5.05 sec for the secon example What do you think I'm doing wrong? Should I use the second solution or I should try to optimize the prep stmt?

    Read the article

  • "volatile" qualifier and compiler reorderings

    - by Checkers
    A compiler cannot eliminate or reorder reads/writes to a volatile-qualified variables. But what about the cases where other variables are present, which may or may not be volatile-qualified? Scenario 1 volatile int a; volatile int b; a = 1; b = 2; a = 3; b = 4; Can the compiler reorder first and the second, or third and the fourth assignments? Scenario 2 volatile int a; int b, c; b = 1; a = 1; c = b; a = 3; Same question, can the compiler reorder first and the second, or third and the fourth assignments?

    Read the article

  • PHP Flush: How Often and Best Practises

    - by Cory Dee
    I just finished reading this post: http://developer.yahoo.com/performance/rules.html#flush and have already implemented a flush after the top portion of my page loads (head, css, top banner/search/nav). Is there any performance hit in flushing? Is there such a thing as doing it too often? What are the best practices? If I am going to hit an external API for data, would it make sense to flush before hand so that the user isn't waiting on that data to come back, and can at least get some data before hand? Thanks to everyone in advance.

    Read the article

  • Fastest way to remove non-numeric characters from a VARCHAR in SQL Server

    - by Dan Herbert
    I'm writing an import utility that is using phone numbers as a unique key within the import. I need to check that the phone number does not already exist in my DB. The problem is that phone numbers in the DB could have things like dashes and parenthesis and possibly other things. I wrote a function to remove these things, the problem is that it is slow and with thousands of records in my DB and thousands of records to import at once, this process can be unacceptably slow. I've already made the phone number column an index. I tried using the script from this post: http://stackoverflow.com/questions/52315/t-sql-trim-nbsp-and-other-non-alphanumeric-characters But that didn't speed it up any. Is there a faster way to remove non-numeric characters? Something that can perform well when 10,000 to 100,000 records have to be compared. Whatever is done needs to perform fast. Update Given what people responded with, I think I'm going to have to clean the fields before I run the import utility. To answer the question of what I'm writing the import utility in, it is a C# app. I'm comparing BIGINT to BIGINT now, with no need to alter DB data and I'm still taking a performance hit with a very small set of data (about 2000 records). Could comparing BIGINT to BIGINT be slowing things down? I've optimized the code side of my app as much as I can (removed regexes, removed unneccessary DB calls). Although I can't isolate SQL as the source of the problem anymore, I still feel like it is.

    Read the article

  • Tips on creating user interfaces and optimizing the user experience

    - by Saif Bechan
    I am currently working on a project where a lot of user interaction is going to take place. There is also a commercial side as people can buy certain items and services. In my opinion a good blend of user interface, speed and security is essential for these types of websites. It is fairly easy to use ajax and JavaScript nowadays to do almost everything, as there are a lot of libraries available such as jQuery and others. But this can have some performance and incompatibility issues. This can lead to users just going to the next website. The overall look of the website is important too. Where to place certain buttons, where to place certain types of articles such as faq and support. Where and how to display error messages so that the user sees them but are not bothering him. And an overall color scheme is important too. The basic question is: How to create an interface that triggers a user to buy/use your services I know psychology also plays a huge role in how users interact with your website. The color scheme for example is important. When the colors are irritating on a website you just want to click away. I have not found any articles that explain those concept. Does anyone have any tips and/or recourses where i can get some articles that guide you in making the correct choices for your website.

    Read the article

  • ILOG CPLEX: how to populate IloLPMatrix while using addGe to set up the model?

    - by downer
    I have a queatoin about IloLPMatrix and addGe. I was trying to follow the example of AdMIPex5.java to generate user defined cutting planes based on the solution to the LP relaxation. The difference is that eh initial MIP model is not read in from a mps file, but set up in the code using methods like addGe, addLe etc. I think this is why I ran into problems while copying the exampe to do the following. IloLPMatrix lp = (IloLPMatrix)cplex.LPMatrixIterator().next(); lp from the above line turns to be NULL. I am wondering 1. What is the relationship between IloLPMatrix and the addLe, addGe commands? I tried to addLPMatrix() to the model, and then used model.addGe methods. but the LPMatrix seems to be empty still. How do I populate the IloLPMatrix of the moel according to the value that I had set up using addGe and addLe. Is the a method to this easily, or do I have to set them up row by row myself? I was doing this to get the number of variables and their values by doing lp.getNumVars(). Is there other methods that I can use to get the number of variables and their values wihout doing these, since my system is set up by addLe, addGe etc? Thanks a lot for your help on this.

    Read the article

  • vectorizing a for loop in numpy/scipy?

    - by user248237
    I'm trying to vectorize a for loop that I have inside of a class method. The for loop has the following form: it iterates through a bunch of points and depending on whether a certain variable (called "self.condition_met" below) is true, calls a pair of functions on the point, and adds the result to a list. Each point here is an element in a vector of lists, i.e. a data structure that looks like array([[1,2,3], [4,5,6], ...]). Here is the problematic function: def myClass: def my_inefficient_method(self): final_vector = [] # Assume 'my_vector' and 'my_other_vector' are defined numpy arrays for point in all_points: if not self.condition_met: a = self.my_func1(point, my_vector) b = self.my_func2(point, my_other_vector) else: a = self.my_func3(point, my_vector) b = self.my_func4(point, my_other_vector) c = a + b final_vector.append(c) # Choose random element from resulting vector 'final_vector' self.condition_met is set before my_inefficient_method is called, so it seems unnecessary to check it each time, but I am not sure how to better write this. Since there are no destructive operations here it is seems like I could rewrite this entire thing as a vectorized operation -- is that possible? any ideas how to do this?

    Read the article

  • SQLite3-ruby extremely slow under 1.9.1?

    - by NilObject
    I decided to upgrade my server to Ruby 1.9.1, and a lot of things are indeed much faster. However, I have a process that dumps a database to sqlite, and it's become glacially slow. What used to take 30 seconds now takes upwards of 10 minutes. The code does several create table statements, and then lots of inserts. The insert statements nearly all use placeholders (?), so SQLite is doing the heavy lifting of binding the parameters. In short, I can't see why this particular usage has slowed down so much. Does anyone know of any problems that have caused it? I'm using sqlite3-ruby (1.2.5), and I'm hoping that someone has encountered this and profiled it. If not, I guess I'm going to learn how to profile ruby code :)

    Read the article

  • How to optimize my PageRank calculation?

    - by asmaier
    In the book Programming Collective Intelligence I found the following function to compute the PageRank: def calculatepagerank(self,iterations=20): # clear out the current PageRank tables self.con.execute("drop table if exists pagerank") self.con.execute("create table pagerank(urlid primary key,score)") self.con.execute("create index prankidx on pagerank(urlid)") # initialize every url with a PageRank of 1.0 self.con.execute("insert into pagerank select rowid,1.0 from urllist") self.dbcommit() for i in range(iterations): print "Iteration %d" % i for (urlid,) in self.con.execute("select rowid from urllist"): pr=0.15 # Loop through all the pages that link to this one for (linker,) in self.con.execute("select distinct fromid from link where toid=%d" % urlid): # Get the PageRank of the linker linkingpr=self.con.execute("select score from pagerank where urlid=%d" % linker).fetchone()[0] # Get the total number of links from the linker linkingcount=self.con.execute("select count(*) from link where fromid=%d" % linker).fetchone()[0] pr+=0.85*(linkingpr/linkingcount) self.con.execute("update pagerank set score=%f where urlid=%d" % (pr,urlid)) self.dbcommit() However, this function is very slow, because of all the SQL queries in every iteration >>> import cProfile >>> cProfile.run("crawler.calculatepagerank()") 2262510 function calls in 136.006 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.000 0.000 136.006 136.006 <string>:1(<module>) 1 20.826 20.826 136.006 136.006 searchengine.py:179(calculatepagerank) 21 0.000 0.000 0.528 0.025 searchengine.py:27(dbcommit) 21 0.528 0.025 0.528 0.025 {method 'commit' of 'sqlite3.Connecti 1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler 1339864 112.602 0.000 112.602 0.000 {method 'execute' of 'sqlite3.Connec 922600 2.050 0.000 2.050 0.000 {method 'fetchone' of 'sqlite3.Cursor' 1 0.000 0.000 0.000 0.000 {range} So I optimized the function and came up with this: def calculatepagerank2(self,iterations=20): # clear out the current PageRank tables self.con.execute("drop table if exists pagerank") self.con.execute("create table pagerank(urlid primary key,score)") self.con.execute("create index prankidx on pagerank(urlid)") # initialize every url with a PageRank of 1.0 self.con.execute("insert into pagerank select rowid,1.0 from urllist") self.dbcommit() inlinks={} numoutlinks={} pagerank={} for (urlid,) in self.con.execute("select rowid from urllist"): inlinks[urlid]=[] numoutlinks[urlid]=0 # Initialize pagerank vector with 1.0 pagerank[urlid]=1.0 # Loop through all the pages that link to this one for (inlink,) in self.con.execute("select distinct fromid from link where toid=%d" % urlid): inlinks[urlid].append(inlink) # get number of outgoing links from a page numoutlinks[urlid]=self.con.execute("select count(*) from link where fromid=%d" % urlid).fetchone()[0] for i in range(iterations): print "Iteration %d" % i for urlid in pagerank: pr=0.15 for link in inlinks[urlid]: linkpr=pagerank[link] linkcount=numoutlinks[link] pr+=0.85*(linkpr/linkcount) pagerank[urlid]=pr for urlid in pagerank: self.con.execute("update pagerank set score=%f where urlid=%d" % (pagerank[urlid],urlid)) self.dbcommit() This function is 20 times faster (but uses a lot more memory for all the temporary dictionaries) because it avoids the unnecessary SQL queries in every iteration: >>> cProfile.run("crawler.calculatepagerank2()") 64802 function calls in 6.950 CPU seconds Ordered by: standard name ncalls tottime percall cumtime percall filename:lineno(function) 1 0.004 0.004 6.950 6.950 <string>:1(<module>) 1 1.004 1.004 6.946 6.946 searchengine.py:207(calculatepagerank2 2 0.000 0.000 0.104 0.052 searchengine.py:27(dbcommit) 23065 0.012 0.000 0.012 0.000 {meth 'append' of 'list' objects} 2 0.104 0.052 0.104 0.052 {meth 'commit' of 'sqlite3.Connection 1 0.000 0.000 0.000 0.000 {meth 'disable' of '_lsprof.Profiler' 31298 5.809 0.000 5.809 0.000 {meth 'execute' of 'sqlite3.Connectio 10431 0.018 0.000 0.018 0.000 {method 'fetchone' of 'sqlite3.Cursor' 1 0.000 0.000 0.000 0.000 {range} But is it possible to further reduce the number of SQL queries to speed up the function even more?

    Read the article

  • Lots of mysql Sleep processes

    - by user259284
    Hello, I am still having trouble with my mysql server. It seems that since i optimize it, the tables were growing and now sometimes is very slow again. I have no idea of how to optimize more. mySQL server has 48GB of RAM and mysqld is using about 8, most of the tables are innoDB. Site has about 2000 users online. I also run explain on every query and every one of them is indexed. mySQL processes: http://www.pik.ba/mysqlStanje.php my.cnf: # The MySQL database server configuration file. # # You can copy this to one of: # - "/etc/mysql/my.cnf" to set global options, # - "~/.my.cnf" to set user-specific options. # # One can use all long options that the program supports. # Run program with --help to get a list of available options and with # --print-defaults to see which it would actually understand and use. # # For explanations see # http://dev.mysql.com/doc/mysql/en/server-system-variables.html # This will be passed to all mysql clients # It has been reported that passwords should be enclosed with ticks/quotes # escpecially if they contain "#" chars... # Remember to edit /etc/mysql/debian.cnf when changing the socket location. [client] port = 3306 socket = /var/run/mysqld/mysqld.sock # Here is entries for some specific programs # The following values assume you have at least 32M ram # This was formally known as [safe_mysqld]. Both versions are currently parsed. [mysqld_safe] socket = /var/run/mysqld/mysqld.sock nice = 0 [mysqld] # # * Basic Settings # user = mysql pid-file = /var/run/mysqld/mysqld.pid socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /usr datadir = /var/lib/mysql tmpdir = /tmp language = /usr/share/mysql/english skip-external-locking # # Instead of skip-networking the default is now to listen only on # localhost which is more compatible and is not less secure. bind-address = 10.100.27.30 # # * Fine Tuning # key_buffer = 64M key_buffer_size = 512M max_allowed_packet = 16M thread_stack = 128K thread_cache_size = 8 # This replaces the startup script and checks MyISAM tables if needed # the first time they are touched myisam-recover = BACKUP max_connections = 1000 table_cache = 1000 join_buffer_size = 2M tmp_table_size = 2G max_heap_table_size = 2G innodb_buffer_pool_size = 3G innodb_additional_mem_pool_size = 128M innodb_log_file_size = 100M log-slow-queries = /var/log/mysql/slow.log sort_buffer_size = 5M net_buffer_length = 5M read_buffer_size = 2M read_rnd_buffer_size = 12M thread_concurrency = 10 ft_min_word_len = 3 #thread_concurrency = 10 # # * Query Cache Configuration # query_cache_limit = 1M query_cache_size = 512M # # * Logging and Replication # # Both location gets rotated by the cronjob. # Be aware that this log type is a performance killer. #log = /var/log/mysql/mysql.log # # Error logging goes to syslog. This is a Debian improvement :) # # Here you can see queries with especially long duration #log_slow_queries = /var/log/mysql/mysql-slow.log #long_query_time = 2 #log-queries-not-using-indexes # # The following can be used as easy to replay backup logs or for replication. # note: if you are setting up a replication slave, see README.Debian about # other settings you may need to change. #server-id = 1 #log_bin = /var/log/mysql/mysql-bin.log expire_logs_days = 10 max_binlog_size = 100M #binlog_do_db = include_database_name #binlog_ignore_db = include_database_name # # * BerkeleyDB # # Using BerkeleyDB is now discouraged as its support will cease in 5.1.12. skip-bdb # # * InnoDB # # InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/. # Read the manual for more InnoDB related options. There are many! # You might want to disable InnoDB to shrink the mysqld process by circa 100MB. #skip-innodb # # * Security Features # # Read the manual, too, if you want chroot! # chroot = /var/lib/mysql/ # # For generating SSL certificates I recommend the OpenSSL GUI "tinyca". # # ssl-ca=/etc/mysql/cacert.pem # ssl-cert=/etc/mysql/server-cert.pem # ssl-key=/etc/mysql/server-key.pem [mysqldump] quick quote-names max_allowed_packet = 16M [mysql] #no-auto-rehash # faster start of mysql but no tab completition [isamchk] key_buffer = 16M # # * NDB Cluster # # See /usr/share/doc/mysql-server-*/README.Debian for more information. # # The following configuration is read by the NDB Data Nodes (ndbd processes) # not from the NDB Management Nodes (ndb_mgmd processes). # # [MYSQL_CLUSTER] # ndb-connectstring=127.0.0.1 # # * IMPORTANT: Additional settings that can override those from this file! # The files must end with '.cnf', otherwise they'll be ignored. # !includedir /etc/mysql/conf.d/

    Read the article

  • MySQL Query performance - huge difference in time

    - by Damo
    I have a query that is returning in vastly different amounts of time between 2 datasets. For one set (database A) it returns in a few seconds, for the other (database B)....well I haven't waited long enough yet, but over 10 minutes. I have dumped both of these databases to my local machine where I can reproduce the issue running MySQL 5.1.37. Curiously, database B is smaller than database A. A stripped down version of the query that reproduces the problem is: SELECT * FROM po_shipment ps JOIN po_shipment_item psi USING (ship_id) JOIN po_alloc pa ON ps.ship_id = pa.ship_id AND pa.UID_items = psi.UID_items JOIN po_header ph ON pa.hdr_id = ph.hdr_id LEFT JOIN EVENT_TABLE ev0 ON ev0.TABLE_ID1 = ps.ship_id AND ev0.EVENT_TYPE = 'MAS0' LEFT JOIN EVENT_TABLE ev1 ON ev1.TABLE_ID1 = ps.ship_id AND ev1.EVENT_TYPE = 'MAS1' LEFT JOIN EVENT_TABLE ev2 ON ev2.TABLE_ID1 = ps.ship_id AND ev2.EVENT_TYPE = 'MAS2' LEFT JOIN EVENT_TABLE ev3 ON ev3.TABLE_ID1 = ps.ship_id AND ev3.EVENT_TYPE = 'MAS3' LEFT JOIN EVENT_TABLE ev4 ON ev4.TABLE_ID1 = ps.ship_id AND ev4.EVENT_TYPE = 'MAS4' LEFT JOIN EVENT_TABLE ev5 ON ev5.TABLE_ID1 = ps.ship_id AND ev5.EVENT_TYPE = 'MAS5' WHERE ps.eta >= '2010-03-22' GROUP BY ps.ship_id LIMIT 100; The EXPLAIN query plan for the first database (A) that returns in ~2 seconds is: +----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+ | 1 | SIMPLE | ps | range | PRIMARY,IX_ETA_DATE | IX_ETA_DATE | 4 | NULL | 174 | Using where; Using temporary; Using filesort | | 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | | | 1 | SIMPLE | ev1 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | | | 1 | SIMPLE | ev2 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | | | 1 | SIMPLE | ev3 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | | | 1 | SIMPLE | ev4 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | | | 1 | SIMPLE | ev5 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_PROD.ps.ship_id,const | 1 | | | 1 | SIMPLE | psi | ref | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1 | IX_po_shipment_item_po_shipment1 | 4 | UNIVIS_PROD.ps.ship_id | 1 | | | 1 | SIMPLE | pa | ref | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | FK_po_alloc_po_shipment1 | 4 | UNIVIS_PROD.psi.ship_id | 5 | Using where | | 1 | SIMPLE | ph | eq_ref | PRIMARY,IX_HDR_ID | PRIMARY | 4 | UNIVIS_PROD.pa.hdr_id | 1 | | +----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+------------------------------+------+----------------------------------------------+ The EXPLAIN query plan for the second database (B) that returns in 600 seconds is: +----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+ | 1 | SIMPLE | ps | range | PRIMARY,IX_ETA_DATE | IX_ETA_DATE | 4 | NULL | 38 | Using where; Using temporary; Using filesort | | 1 | SIMPLE | psi | ref | PRIMARY,IX_po_shipment_item_po_shipment1,FK_po_shipment_item_po_shipment1 | IX_po_shipment_item_po_shipment1 | 4 | UNIVIS_DEV01.ps.ship_id | 1 | | | 1 | SIMPLE | ev0 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | | | 1 | SIMPLE | ev1 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | | | 1 | SIMPLE | ev2 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | | | 1 | SIMPLE | ev3 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | | | 1 | SIMPLE | ev4 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.psi.ship_id,const | 1 | | | 1 | SIMPLE | ev5 | ref | IX_EVENT_ID_EVENT_TYPE | IX_EVENT_ID_EVENT_TYPE | 36 | UNIVIS_DEV01.ps.ship_id,const | 1 | | | 1 | SIMPLE | pa | ref | IX_po_alloc_po_shipment_item2,IX_po_alloc_po_details_old,FK_po_alloc_po_shipment1,FK_po_alloc_po_shipment_item1,FK_po_alloc_po_header1 | IX_po_alloc_po_shipment_item2 | 4 | UNIVIS_DEV01.ps.ship_id | 4 | Using where | | 1 | SIMPLE | ph | eq_ref | PRIMARY,IX_HDR_ID | PRIMARY | 4 | UNIVIS_DEV01.pa.hdr_id | 1 | | +----+-------------+-------+--------+----------------------------------------------------------------------------------------------------------------------------------------+----------------------------------+---------+--------------------------------+------+----------------------------------------------+ When database B is running I can look at the MySQL Administrator and the state remains at "Copying to tmp table" indefinitely. Database A also has this state but for only a second or so. There are no differences in the table structure, indexes, keys etc between these databases (I have done show create tables and diff'd them). The sizes of the tables are: database A: po_shipment 1776 po_shipment_item 1945 po_alloc 36298 po_header 71642 EVENT_TABLE 1608 database B: po_shipment 463 po_shipment_item 470 po_alloc 3291 po_header 56149 EVENT_TABLE 1089 Some points to note: Removing the WHERE clause makes the query return < 1 sec. Removing the GROUP BY makes the query return < 1 sec. Removing ev5, ev4, ev3 etc makes the query get faster for each one removed. Can anyone suggest how to resolve this issue? What have I missed? Many Thanks.

    Read the article

  • Optimizing Levenshtein Distance Algorithm

    - by Matt
    I have a stored procedure that uses Levenshtein Distance to determine the result closest to what the user typed. The only thing really affecting the speed is the function that calculates the Levenshtein Distance for all the records before selecting the record with the lowest distance (I've verified this by putting a 0 in place of the call to the Levenshtein function). The table has 1.5 million records, so even the slightest adjustment may shave off a few seconds. Right now the entire thing runs over 10 minutes. Here's the method I'm using: ALTER function dbo.Levenshtein ( @Source nvarchar(200), @Target nvarchar(200) ) RETURNS int AS BEGIN DECLARE @Source_len int, @Target_len int, @i int, @j int, @Source_char nchar, @Dist int, @Dist_temp int, @Distv0 varbinary(8000), @Distv1 varbinary(8000) SELECT @Source_len = LEN(@Source), @Target_len = LEN(@Target), @Distv1 = 0x0000, @j = 1, @i = 1, @Dist = 0 WHILE @j <= @Target_len BEGIN SELECT @Distv1 = @Distv1 + CAST(@j AS binary(2)), @j = @j + 1 END WHILE @i <= @Source_len BEGIN SELECT @Source_char = SUBSTRING(@Source, @i, 1), @Dist = @i, @Distv0 = CAST(@i AS binary(2)), @j = 1 WHILE @j <= @Target_len BEGIN SET @Dist = @Dist + 1 SET @Dist_temp = CAST(SUBSTRING(@Distv1, @j+@j-1, 2) AS int) + CASE WHEN @Source_char = SUBSTRING(@Target, @j, 1) THEN 0 ELSE 1 END IF @Dist > @Dist_temp BEGIN SET @Dist = @Dist_temp END SET @Dist_temp = CAST(SUBSTRING(@Distv1, @j+@j+1, 2) AS int)+1 IF @Dist > @Dist_temp SET @Dist = @Dist_temp BEGIN SELECT @Distv0 = @Distv0 + CAST(@Dist AS binary(2)), @j = @j + 1 END END SELECT @Distv1 = @Distv0, @i = @i + 1 END RETURN @Dist END Anyone have any ideas? Any input is appreciated. Thanks, Matt

    Read the article

  • C# File IO with Streams - Best Memory Buffer Size

    - by AJ
    Hi, I am writing a small IO library to assist with a larger (hobby) project. A part of this library performs various functions on a file, which is read / written via the FileStream object. On each StreamReader.Read(...) pass, I fire off an event which will be used in the main app to display progress information. The processing that goes on in the loop is vaired, but is not too time consuming (it could just be a simple file copy, for example, or may involve encryption...). My main question is: What is the best memory buffer size to use? Thinking about physical disk layouts, I could pick 2k, which would cover a CD sector size and is a nice multiple of a 512 byte hard disk sector. Higher up the abstraction tree, you could go for a larger buffer which could read an entire FAT cluster at a time. I realise with today's PC's, I could go for a more memory hungry option (a couple of MiB, for example), but then I increase the time between UI updates and the user perceives a less responsive app. As an aside, I'm eventually hoping to provide a similar interface to files hosted on FTP / HTTP servers (over a local network / fastish DSL). What would be the best memory buffer size for those (again, a "best-case" tradeoff between perceived responsiveness vs. performance). Thanks in advance for any ideas, Adam

    Read the article

  • Prevent full table scan for query with multiple where clauses

    - by Dave Jarvis
    A while ago I posted a message about optimizing a query in MySQL. I have since ported the data and query to PostgreSQL, but now PostgreSQL has the same problem. The solution in MySQL was to force the optimizer to not optimize using STRAIGHT_JOIN. PostgreSQL offers no such option. Here is the explain: Here is the query: SELECT avg(d.amount) AS amount, y.year FROM station s, station_district sd, year_ref y, month_ref m, daily d LEFT JOIN city c ON c.id = 10663 WHERE -- Find all the stations within a specific unit radius ... -- 6371.009 * SQRT( POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) + (COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) * POW(RADIANS(c.longitude_decimal - s.longitude_decimal), 2)) ) <= 50 AND -- Ignore stations outside the given elevations -- s.elevation BETWEEN 0 AND 2000 AND sd.id = s.station_district_id AND -- Gather all known years for that station ... -- y.station_district_id = sd.id AND -- The data before 1900 is shaky; insufficient after 2009. -- y.year BETWEEN 1980 AND 2000 AND -- Filtered by all known months ... -- m.year_ref_id = y.id AND m.month = 12 AND -- Whittled down by category ... -- m.category_id = '001' AND -- Into the valid daily climate data. -- m.id = d.month_ref_id AND d.daily_flag_id <> 'M' GROUP BY y.year It appears as though PostgreSQL is looking at the DAILY table first, which is simply not the right way to go about this query as there are nearly 300 million rows. How do I force PostgreSQL to start at the CITY table? Thank you!

    Read the article

  • MySQL queries - how expensive are they really?

    - by incrediman
    I've heard that mysql queries are very expensive, and that you should avoid at all costs making too many of them. I'm developing a site that will be used by quite a few people, and I'm wondering: How expensive are mysql queries actually? If I have 400,000 people in my database, how expensive is it to query it for one of them? How close attention do I need to pay that I don't make too many queries per client request?

    Read the article

< Previous Page | 76 77 78 79 80 81 82 83 84 85 86 87  | Next Page >