Search Results

Search found 4136 results on 166 pages for 'micro optimization'.

Page 68/166 | < Previous Page | 64 65 66 67 68 69 70 71 72 73 74 75  | Next Page >

  • In SQL Server what is most efficient way to compare records to other records for duplicates with in

    - by Glenn
    We have an SQL Server that gets daily imports of data files from clients. This data is interrelated and we are always scrubbing it and having to look for suspect duplicate records between these files. Finding and tagging suspect records can get pretty complicated. We use logic that requires some field values to be the same, allows some field values to differ, and allows a range to be specified for how different certain field values can be. The only way we've found to do it is by using a cursor-based process, and it places a heavy burden on the database. So I wanted to ask if there's a more efficient way to do this. I've heard it said that there's almost always a more efficient way to replace cursors with clever JOINs. But I have to admit I'm having a lot of trouble with this one. For a concrete example, suppose we have 1 table, an "orders" table, with the following 6 fields: order_id, customer_id, product_id, quantity, sale_date, price. We want to look through the records to find suspect duplicates on the following example criteria. These get increasingly harder. 1. Records that have the same product_id, sale_date, and quantity but different customer_id values should be marked as suspect duplicates for review. 2. Records that have the same customer_id, product_id, and quantity and have sale_dates within five days of each other should be marked as suspect duplicates for review. 3. Records that have the same customer_id and product_id, but different quantities within 20 units, and sale dates within five days of each other should be considered suspect. Is it possible to satisfy each one of these criteria with a single SQL query that uses JOINs? Is this the most efficient way to do this?
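
    The three criteria above can probably be expressed as one self-join instead of a cursor. The following is only a minimal sketch: it uses Python's sqlite3 module with an in-memory table as a stand-in for SQL Server, the sample rows are invented for illustration, "within 20 units" and "within five days" are assumed to be inclusive, and SQLite's julianday() takes the place of what would be DATEDIFF in T-SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY, customer_id INTEGER, product_id INTEGER,
    quantity INTEGER, sale_date TEXT, price REAL)""")

# Hypothetical sample data, one pair per criterion plus one unrelated row.
rows = [
    (1, 10, 500, 5, "2010-03-01", 9.99),
    (2, 11, 500, 5, "2010-03-01", 9.99),   # criterion 1 with order 1
    (3, 12, 600, 8, "2010-03-01", 4.50),
    (4, 12, 600, 8, "2010-03-04", 4.50),   # criterion 2 with order 3
    (5, 13, 700, 40, "2010-03-10", 2.00),
    (6, 13, 700, 55, "2010-03-12", 2.00),  # criterion 3 with order 5
    (7, 14, 800, 1, "2010-03-01", 1.00),   # matches nothing
]
conn.executemany("INSERT INTO orders VALUES (?,?,?,?,?,?)", rows)

# Self-join on product_id; a.order_id < b.order_id reports each pair once.
SUSPECT_PAIRS = """
SELECT a.order_id, b.order_id,
       CASE WHEN a.customer_id <> b.customer_id THEN 1
            WHEN a.quantity = b.quantity THEN 2
            ELSE 3 END AS rule
FROM orders a
JOIN orders b
  ON a.product_id = b.product_id AND a.order_id < b.order_id
WHERE (a.customer_id <> b.customer_id
       AND a.sale_date = b.sale_date AND a.quantity = b.quantity)
   OR (a.customer_id = b.customer_id AND a.quantity = b.quantity
       AND ABS(julianday(a.sale_date) - julianday(b.sale_date)) <= 5)
   OR (a.customer_id = b.customer_id AND a.quantity <> b.quantity
       AND ABS(a.quantity - b.quantity) <= 20
       AND ABS(julianday(a.sale_date) - julianday(b.sale_date)) <= 5)
ORDER BY a.order_id
"""
suspects = conn.execute(SUSPECT_PAIRS).fetchall()
print(suspects)
```

    Whether this beats the cursor on real data would still need to be measured; an index that starts with product_id is the obvious thing to try first.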

  • Feedback on Optimizing C# NET Code Block

    - by Brett Powell
    I just spent quite a few hours reading up on TCP servers and the protocol I was trying to implement, and finally got everything working great. I noticed the code looks like absolute bollocks (is that the correct usage? I'm not a Brit) and would like some feedback on optimizing it, mostly for reuse and readability. The packet formats are always int, int, int, string, string. try { BinaryReader reader = new BinaryReader(clientStream); int packetsize = reader.ReadInt32(); int requestid = reader.ReadInt32(); int serverdata = reader.ReadInt32(); Console.WriteLine("Packet Size: {0} RequestID: {1} ServerData: {2}", packetsize, requestid, serverdata); List<byte> str = new List<byte>(); byte nextByte = reader.ReadByte(); while (nextByte != 0) { str.Add(nextByte); nextByte = reader.ReadByte(); } // Password Sent to be Authenticated string string1 = Encoding.UTF8.GetString(str.ToArray()); str.Clear(); nextByte = reader.ReadByte(); while (nextByte != 0) { str.Add(nextByte); nextByte = reader.ReadByte(); } // NULL string string string2 = Encoding.UTF8.GetString(str.ToArray()); Console.WriteLine("String1: {0} String2: {1}", string1, string2); // Reply to Authentication Request MemoryStream stream = new MemoryStream(); BinaryWriter writer = new BinaryWriter(stream); writer.Write((int)(1)); // Packet Size writer.Write((int)(requestid)); // Mirror RequestID if Authenticated, -1 if Failed byte[] buffer = stream.ToArray(); clientStream.Write(buffer, 0, buffer.Length); clientStream.Flush(); } I am going to be dealing with other packet types as well that are formatted the same way (int/int/int/str/str) but carry different values. I could probably create a packet class, but how to apply it to this scenario is a bit outside my scope of knowledge. If it makes any difference, this is the protocol I am implementing: http://developer.valvesoftware.com/wiki/Source_RCON_Protocol
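
    The "packet class" idea mentioned above might look roughly like the following. This is a sketch in Python, not the poster's C#; the names Packet, read_from and _read_exact are hypothetical, and it assumes little-endian 32-bit integers and a size field that counts the bytes following it, which is how the excerpt's int/int/int/str/str layout is read here.

```python
import io
import struct
from dataclasses import dataclass


def _read_exact(stream, n):
    """Read exactly n bytes, or raise EOFError if the stream ends early."""
    data = b""
    while len(data) < n:
        chunk = stream.read(n - len(data))
        if not chunk:
            raise EOFError("stream closed mid-packet")
        data += chunk
    return data


@dataclass
class Packet:
    request_id: int
    server_data: int
    string1: str
    string2: str = ""

    def encode(self) -> bytes:
        # Body: two ints followed by two NUL-terminated UTF-8 strings.
        body = struct.pack("<ii", self.request_id, self.server_data)
        body += self.string1.encode("utf-8") + b"\x00"
        body += self.string2.encode("utf-8") + b"\x00"
        # Assumption: the size prefix counts only the bytes after it.
        return struct.pack("<i", len(body)) + body

    @classmethod
    def read_from(cls, stream) -> "Packet":
        (size,) = struct.unpack("<i", _read_exact(stream, 4))
        body = _read_exact(stream, size)
        request_id, server_data = struct.unpack_from("<ii", body, 0)
        s1, s2, _rest = body[8:].split(b"\x00", 2)
        return cls(request_id, server_data, s1.decode("utf-8"), s2.decode("utf-8"))


wire = Packet(7, 3, "secret").encode()
echoed = Packet.read_from(io.BytesIO(wire))
```

    Keeping encoding and decoding in one type means each new packet kind only has to supply different field values, which seems to be the reuse the question is after.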

  • PHP Increasing writing to page speed.

    - by Frederico
    I'm currently writing out XML and have done the following: header ("content-type: text/xml"); header ("content-length: ".strlen($xml)); where $xml is the XML to be written out. The output is about 1.8 MB of text (measured via Firebug), and it seems that writing it out takes longer than the script takes to run. Is there a way to increase this write speed? Thank you in advance.

  • How to make Visual C++ 9 not emit code that is actually never called?

    - by sharptooth
    My native C++ COM component uses ATL. In DllRegisterServer() I call CComModule::RegisterServer(): STDAPI DllRegisterServer() { return _Module.RegisterServer(FALSE); // <<< notice FALSE here } FALSE is passed to indicate to not register the type library. ATL is available as sources, so I in fact compile the implementation of CComModule::RegisterServer(). Somewhere down the call stack there's an if statement: if( doRegisterTypeLibrary ) { //<< FALSE goes here // do some stuff, then call RegisterTypeLib() } The compiler sees all of the above code and so it can see that in fact the if condition is always false, yet when I inspect the linker progress messages I see that the reference to RegisterTypeLib() is still there, so the if statement is not eliminated. Can I make Visual C++ 9 perform better static analysis and actually see that some code is never called and not emit that code?

  • Resize an image and maintain quality?

    - by JasonS
    Hi, I have a problem with resizing images. What happens is that if you upload a file larger than the stated parameters, the image is cropped, then saved at 100% quality. So if I upload a large jpeg which is 272Kb. The image is cropped by 100 odd pixels. The file size then goes up to 1.2Mb. We are saving images at a 100% quality. I assume that this is what is causing the problem. The image is exported from Photoshop at 30% quality which reduces the file size. Resaving the image at 100% quality creates the same image but I assume with a lot of redundant file data. Has anyone encountered this before? Does anyone have a solution? This is what we are using. $source_im = imagecreatefromjpeg ($file); $dest_im = imagecreatetruecolor ($newsize_x, $newsize_y); imagecopyresampled ( $dest_im, $source_im, 0, 0, $offset_x, $offset_y, $newsize_x, $newsize_y, $sourceWidth, $sourceHeight ); imagedestroy ($source_im); if ($greyscale) { $dest_im = $this->imageconvertgreyscale ($dest_im); } imagejpeg($dest_im, $save_to_file, $quality); break;

  • How to handle large table in MySQL ?

    - by Frantz Miccoli
    I have a database used to store items and properties about these items. The number of properties is extensible, so there is a join table to store each property value associated with an item. CREATE TABLE `item_property` ( `property_id` int(11) NOT NULL, `item_id` int(11) NOT NULL, `value` double NOT NULL, PRIMARY KEY (`property_id`,`item_id`), KEY `item_id` (`item_id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci; This database has two goals: storing (which has first priority and has to be very quick; I would like to perform many inserts (hundreds) in a few seconds) and retrieving data (selects using item_id and property_id; this is a second priority and can be slower, but not by too much, because that would ruin my usage of the DB). Currently this table holds 1.6 billion entries and a simple count can take up to 2 minutes. Inserting isn't fast enough to be usable. I'm using Zend_Db to access my data and would much prefer suggestions that do not require developing anything on the PHP side. Thanks for your advice!

  • Cleaner way of using modulus for columns

    - by WmasterJ
    I currently have a list (<ul>) of people that I have divided up into two columns. But after finishing the code for it I kept wondering if there is a more effective or cleaner way to do the same thing. echo "<table class='area_list'><tr>"; // Loop users within areas, divided up in 2 columns $count = count($areaArray); for($i=0 ; $i<$count ; $i++) { $uid = $areaArray[$i]; // get the modulus value + ceil for uneven numbers $rowCalc = ($i+1) % ceil($count/2); if ($rowCalc == 1) echo "<td><ul>"; // OUTPUT the actual list item echo "<li>{$users[$uid]->profile_lastname}</li>"; if ($rowCalc == 0 && $i!=0) echo "</ul></td>"; } echo "</tr></table>"; Any ideas on how to make this cleaner or do it in another way?
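
    One way to avoid the modulus bookkeeping is to split the list into chunks first and then render each chunk as a column. The sketch below is in Python rather than PHP; split_columns and render are hypothetical helper names, and plain strings stand in for the user objects. In PHP, array_chunk() could probably fill the same role as split_columns.

```python
from html import escape
from math import ceil


def split_columns(items, columns=2):
    """Split items into at most `columns` lists; earlier columns fill first."""
    per_column = max(1, ceil(len(items) / columns))
    return [items[i:i + per_column] for i in range(0, len(items), per_column)]


def render(names):
    """Render one <td><ul>...</ul></td> cell per column inside a single row."""
    cells = []
    for column in split_columns(names):
        list_items = "".join(f"<li>{escape(name)}</li>" for name in column)
        cells.append(f"<td><ul>{list_items}</ul></td>")
    return "<table class='area_list'><tr>" + "".join(cells) + "</tr></table>"


print(render(["Adams", "Baker", "Clark"]))
```

    Because every cell is opened and closed in the same place, an odd-sized or single-item list cannot leave a <ul> unclosed, which is easy to get wrong when the open and close tags depend on separate conditions.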

  • Does anybody have any suggestions on which of these two approaches is better for large delete?

    - by RPS
    Approach #1: DECLARE @count int SET @count = 2000 DECLARE @rowcount int SET @rowcount = @count WHILE @rowcount = @count BEGIN DELETE TOP (@count) FROM ProductOrderInfo WHERE ProductId = @product_id AND bCopied = 1 AND FileNameCRC = @localNameCrc SELECT @rowcount = @@ROWCOUNT WAITFOR DELAY '000:00:00.400' END Approach #2: DECLARE @count int SET @count = 2000 DECLARE @rowcount int SET @rowcount = @count WHILE @rowcount = @count BEGIN DELETE FROM ProductOrderInfo WHERE ProductId = @product_id AND FileNameCRC IN ( SELECT TOP(@count) FileNameCRC FROM ProductOrderInfo WITH (NOLOCK) WHERE bCopied = 1 AND FileNameCRC = @localNameCrc ) SELECT @rowcount = @@ROWCOUNT WAITFOR DELAY '000:00:00.400' END
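
    Both approaches share the same loop shape: delete at most one batch, commit, and stop once a batch comes back short. A minimal sketch of that shape in Python follows, with sqlite3 standing in for SQL Server; delete_in_batches and the sample rows are hypothetical, and the rowid subquery plays the role of DELETE TOP (@count), which SQLite does not offer in that form.

```python
import sqlite3


def delete_in_batches(conn, product_id, name_crc, batch_size=2000):
    """Delete matching rows in batches; return the total number deleted."""
    total = 0
    while True:
        cur = conn.execute(
            """DELETE FROM ProductOrderInfo WHERE rowid IN (
                   SELECT rowid FROM ProductOrderInfo
                   WHERE ProductId = ? AND bCopied = 1 AND FileNameCRC = ?
                   LIMIT ?)""",
            (product_id, name_crc, batch_size),
        )
        conn.commit()  # short transactions keep locks brief
        total += cur.rowcount
        if cur.rowcount < batch_size:  # a short batch means nothing is left
            return total


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ProductOrderInfo (ProductId INTEGER, bCopied INTEGER, FileNameCRC INTEGER)"
)
conn.executemany(
    "INSERT INTO ProductOrderInfo VALUES (?,?,?)",
    [(1, 1, 99)] * 5 + [(1, 0, 99), (2, 1, 99)],
)
deleted = delete_in_batches(conn, 1, 99, batch_size=2)
remaining = conn.execute("SELECT COUNT(*) FROM ProductOrderInfo").fetchone()[0]
```

    Note that the batch subquery here filters on all three columns, so only rows that meet every condition are removed; Approach #2 as posted filters bCopied only inside the subquery.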

  • Which is quicker? Memcache or file query? (using maxmind geoip.dat file)

    - by tomcritchlow
    Hi, I'm using Python on Appengine and am looking up the geolocation of an IP address like this: import pygeoip gi = pygeoip.GeoIP('GeoIP.dat') Location = gi.country_code_by_addr(self.request.remote_addr) (pygeoip can be found here: http://code.google.com/p/pygeoip/) I want to geolocate each page of my app for a user so currently I lookup the IP address once then store it in memcache. My question - which is quicker? Looking up the IP address each time from the .dat file or fetching it from memcache? Are there any other pros/cons I need to be aware of? For general queries like this, is there a good guide to teach me how to optimise my code and run speed tests myself? I'm new to python and coding in general so apologies if this is a basic concept. Thanks! Tom
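
    On the "run speed tests myself" part, the standard library's timeit module is one place to start. The sketch below only shows the measuring technique: from_cache and from_file are hypothetical stand-ins (a dictionary lookup and a deliberately heavier computation), not the real memcache or pygeoip calls, so the numbers it prints say nothing about App Engine itself.

```python
import timeit


def best_time(func, repeat=5, number=1000):
    """Best average seconds per call over `repeat` runs of `number` calls."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


# Hypothetical stand-ins for the two lookups being compared.
cache = {"1.2.3.4": "GB"}


def from_cache():
    return cache.get("1.2.3.4")


def from_file():
    return sorted(range(200))[0]


print("cache: %.2e s per call" % best_time(from_cache))
print("file:  %.2e s per call" % best_time(from_file))
```

    Taking the minimum of several runs, rather than the mean, reduces the effect of unrelated activity on the machine. To compare the real options, the two stand-ins would be replaced by the actual memcache fetch and the gi.country_code_by_addr call and run in the deployed environment.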

  • optimized grid for rectangular items

    - by peterchen
    I have N rectangular items with an aspect ratio Aitem (X:Y). I have a rectangular display area with an aspect ratio Aview. The items should be arranged in a table-like layout (i.e. r rows, c columns). What is the ideal grid (rows x columns) so that individual items are largest? (rows * columns >= N, of course, i.e. there may be "unused" grid places.) A simple algorithm could iterate over rows = 1..N, calculate the required number of columns, and keep the row/column pair with the largest items. I wonder if there's a non-iterative algorithm, though (e.g. for Aitem = Aview = 1, rows / cols can be approximated by sqrt(N)).
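
    The simple iterative algorithm described above could be sketched as follows. This is an illustration in Python; best_grid is a hypothetical name, both aspect ratios are taken as width divided by height, and the view is normalised to height 1 so that item width can serve as the size measure.

```python
from math import ceil


def best_grid(n, item_aspect, view_aspect):
    """Return (rows, cols) that maximise item size for n items.

    Aspect ratios are width / height. The view is treated as having
    height 1 and width view_aspect.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    best = None
    for rows in range(1, n + 1):
        cols = ceil(n / rows)  # fewest columns that still hold n items
        cell_w, cell_h = view_aspect / cols, 1.0 / rows
        # The item is limited by the cell's width or by its height.
        item_w = min(cell_w, cell_h * item_aspect)
        if best is None or item_w > best[0]:
            best = (item_w, rows, cols)
    return best[1], best[2]


print(best_grid(9, 1.0, 1.0))
print(best_grid(4, 1.0, 4.0))
```

    The loop is linear in N, which is cheap for any grid a person could look at, so a closed-form answer would mostly be of theoretical interest.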

  • Nginx, Apache, MySQL, Memcache on a server with 4 GB RAM: how to optimize memory usage?

    - by TomSawyer
    I have one dedicated server with Nginx as a proxy for Apache, plus Memcache and MySQL, and 4 GB of RAM. Lately the number of visitors to my site has not increased, but the server is always overloaded at certain times (9 AM to 3 PM). RAM usage climbs second by second until it is full, and at that moment the server becomes overloaded. I have to kill all the Apache and MySQL services and reboot to free memory, and then it fills up again. It is a vicious circle. Here is my RAM usage at the moment: 160 (nginx), 220 (apache), 512 (memcache), 924 (mysql). Here are the process counts: 4 (nginx), 14 (apache), 5 (memcache), 20 (mysql). And here is my my.cnf config. Can someone help me optimize it? [mysqld] datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock user=mysql skip-locking skip-networking skip-name-resolve # enable log-slow-queries log-slow-queries = /var/log/mysql-slow-queries.log long_query_time=3 max_connections=200 wait_timeout=64 connect_timeout = 10 interactive_timeout = 25 thread_stack = 512K max_allowed_packet=16M table_cache=1500 read_buffer_size=4M join_buffer_size=4M sort_buffer_size=4M read_rnd_buffer_size = 4M max_heap_table_size=256M tmp_table_size=256M thread_cache=256 query_cache_type=1 query_cache_limit=4M query_cache_size=16M thread_concurrency=8 myisam_sort_buffer_size=128M # Disabling symbolic-links is recommended to prevent assorted security risks symbolic-links=0 [mysqldump] quick max_allowed_packet=16M [mysql] no-auto-rehash [isamchk] key_buffer=256M sort_buffer=256M read_buffer=64M write_buffer=64M [myisamchk] key_buffer=256M sort_buffer=256M read_buffer=64M write_buffer=64M [mysqlhotcopy] interactive-timeout [mysql.server] user=mysql basedir=/var/lib [mysqld_safe] log-error=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid

  • Eliminate full table scan due to BETWEEN (and GROUP BY)

    - by Dave Jarvis
    Description

    According to the explain command, there is a range that is causing a query to perform a full table scan (160k rows). How do I keep the range condition and reduce the scanning? I expect the culprit to be:

      Y.YEAR BETWEEN 1900 AND 2009 AND

    Code

    Here is the code that has the range condition (the STATION_DISTRICT is likely superfluous).

      SELECT
        COUNT(1) as MEASUREMENTS,
        AVG(D.AMOUNT) as AMOUNT,
        Y.YEAR as YEAR,
        MAKEDATE(Y.YEAR,1) as AMOUNT_DATE
      FROM
        CITY C,
        STATION S,
        STATION_DISTRICT SD,
        YEAR_REF Y FORCE INDEX(YEAR_IDX),
        MONTH_REF M,
        DAILY D
      WHERE
        -- For a specific city ...
        --
        C.ID = 10663 AND
        -- Find all the stations within a specific unit radius ...
        --
        6371.009 *
          SQRT(
            POW(RADIANS(C.LATITUDE_DECIMAL - S.LATITUDE_DECIMAL), 2) +
            (COS(RADIANS(C.LATITUDE_DECIMAL + S.LATITUDE_DECIMAL) / 2) *
             POW(RADIANS(C.LONGITUDE_DECIMAL - S.LONGITUDE_DECIMAL), 2)) ) <= 50 AND
        -- Get the station district identification for the matching station.
        --
        S.STATION_DISTRICT_ID = SD.ID AND
        -- Gather all known years for that station ...
        --
        Y.STATION_DISTRICT_ID = SD.ID AND
        -- The data before 1900 is shaky; insufficient after 2009.
        --
        Y.YEAR BETWEEN 1900 AND 2009 AND
        -- Filtered by all known months ...
        --
        M.YEAR_REF_ID = Y.ID AND
        -- Whittled down by category ...
        --
        M.CATEGORY_ID = '003' AND
        -- Into the valid daily climate data.
        --
        M.ID = D.MONTH_REF_ID AND
        D.DAILY_FLAG_ID <> 'M'
      GROUP BY
        Y.YEAR

    Update

    The SQL is performing a full table scan, which results in MySQL performing a "copy to tmp table", as shown here:

      +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
      | id | select_type | table | type   | possible_keys                     | key          | key_len | ref                           | rows   | Extra       |
      +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+
      |  1 | SIMPLE      | C     | const  | PRIMARY                           | PRIMARY      | 4       | const                         |      1 |             |
      |  1 | SIMPLE      | Y     | range  | YEAR_IDX                          | YEAR_IDX     | 4       | NULL                          | 160422 | Using where |
      |  1 | SIMPLE      | SD    | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.Y.STATION_DISTRICT_ID |      1 | Using index |
      |  1 | SIMPLE      | S     | eq_ref | PRIMARY                           | PRIMARY      | 4       | climate.SD.ID                 |      1 | Using where |
      |  1 | SIMPLE      | M     | ref    | PRIMARY,YEAR_REF_IDX,CATEGORY_IDX | YEAR_REF_IDX | 8       | climate.Y.ID                  |     54 | Using where |
      |  1 | SIMPLE      | D     | ref    | INDEX                             | INDEX        | 8       | climate.M.ID                  |     11 | Using where |
      +----+-------------+-------+--------+-----------------------------------+--------------+---------+-------------------------------+--------+-------------+

    Related

      http://dev.mysql.com/doc/refman/5.0/en/how-to-avoid-table-scan.html
      http://dev.mysql.com/doc/refman/5.0/en/where-optimizations.html
      http://stackoverflow.com/questions/557425/optimize-sql-that-uses-between-clause

    Thank you!

  • How to optimize a postgreSQL server for a "write once, read many"-type infrastructure ?

    - by mhu
    Greetings, I am working on a piece of software that logs entries (and related tagging) in a PostgreSQL database for storage and retrieval. We never update any data once it has been inserted; we might remove it when the entry gets too old, but this is done at most once a day. Stored entries can be retrieved by users. The insertion of new entries can happen rather fast and regularly, thus the database will commonly hold several million elements. The tables used are pretty simple: one table for ids, raw content and insertion date; and one table storing tags and their values associated with an id. User searches mostly concern tag values, so SELECTs usually consist of JOIN queries on ids across the two tables. To sum it up: 2 tables; lots of INSERT; no UPDATE; some DELETE, once a day at most; some user-generated SELECT with JOIN; huge data set. What would an optimal server configuration (software and hardware; I assume, for example, that RAID10 could help) be for my PostgreSQL server, given these requirements? By optimal, I mean one that allows SELECT queries to complete in a reasonably short amount of time. I can provide more information about the current setup (like tables, indexes ...) if needed.

  • optimize python code

    - by user283405
    I have code that uses the BeautifulSoup library for parsing, but it is very slow. The code is written in such a way that threads cannot be used. Can anyone help me with this? I am using the BeautifulSoup library for parsing and then saving to the DB. If I comment out the save statement, it still takes a long time, so the problem is not with the database. def parse(self,text): soup = BeautifulSoup(text) arr = soup.findAll('tbody') for i in range(0,len(arr)-1): data=Data() soup2 = BeautifulSoup(str(arr[i])) arr2 = soup2.findAll('td') c=0 for j in arr2: if str(j).find("<a href=") > 0: data.sourceURL = self.getAttributeValue(str(j),'<a href="') else: if c == 2: data.Hits=j.renderContents() #and few others... #... c = c+1 data.save() Any suggestions? Note: I already asked this question here, but it was closed due to incomplete information.

  • STL vectors with uninitialized storage?

    - by Jim Hunziker
    I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values. Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements? (I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.) Also, see my comments below for a clarification about when the initialization occurs. SOME CODE: void GetsCalledALot(int* data1, int* data2, int count) { int mvSize = memberVector.size(); memberVector.resize(mvSize + count); // causes 0-initialization for (int i = 0; i < count; ++i) { memberVector[mvSize + i].d1 = data1[i]; memberVector[mvSize + i].d2 = data2[i]; } }

  • MySQL Prepared Statements vs Stored Procedures Performance

    - by amardilo
    Hi there, I have an old MySQL 4.1 database with a table that has a few million rows and an old Java application that connects to this database and frequently returns several thousand rows from this table via a simple SQL query (e.g. SELECT * FROM people WHERE first_name = 'Bob'). I think the Java application uses client-side prepared statements, but I was looking at switching this to the server; in the example mentioned, the value for first_name will vary depending on what the user enters. I would like to speed up the select query and was wondering if I should switch to server-side Prepared Statements or Stored Procedures. Is there a general rule of thumb for which is quicker or less resource intensive (or whether a combination of both is better)?

  • Limit CPU usage of a process

    - by jb
    I have a service running which periodically checks a folder for a file and then processes it (reads it, extracts the data, stores it in SQL). So I ran it on a test box and it took a little longer than expected. The file had 1.6 million rows, and it was still running after 6 hours (then I went home). The problem is the box it is running on is now absolutely crippled: remote desktop was timing out, so I can't even get on it to stop the process or attach a debugger to see how far through it is. It's solidly using 90%+ CPU, and all other running services or apps are suffering. The code is (from memory, may not compile): List<ItemDTO> items = new List<ItemDTO>(); using (StreamReader sr = fileInfo.OpenText()) { while (!sr.EndOfFile) { string line = sr.ReadLine(); try { string s = line.Substring(0,8); double y = Double.Parse(line.Substring(8,7)); //If the item isnt already in the collection, add it. if (items.Find(delegate(ItemDTO i) { return (i.Item == s); }) == null) items.Add(new ItemDTO(s,y)); } catch { /*Crash*/ } } return items; } So I am working on improving the code (any tips appreciated). But it could still be a slow affair, which is fine; I've no problem with it taking a long time as long as it's not killing my server. So what I want from you fine people is: 1) Is my code hideously un-optimized? 2) Can I limit the amount of CPU my code block may use? Cheers all
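
    On question 1, the items.Find call scans the whole list for every line, so the work grows roughly with the square of the row count; a hash-based collection makes each membership check close to constant time. The sketch below shows the idea in Python, not the poster's C#: load_items and the sample lines are hypothetical, and the fixed-width layout (8-character key, 7-character number) is taken from the Substring calls in the excerpt.

```python
def load_items(lines):
    """Keep the first value seen for each 8-character key."""
    items = {}
    for line in lines:
        try:
            key = line[0:8]
            value = float(line[8:15])
        except ValueError:
            continue  # skip malformed or too-short lines
        # Dictionary membership is a hash lookup, not a scan of all items.
        items.setdefault(key, value)
    return items


sample = [
    "ITEM0001  12.50",
    "ITEM0002   3.00",
    "ITEM0001  99.00",  # duplicate key, ignored
    "bad",              # malformed, skipped
]
loaded = load_items(sample)
print(loaded)
```

    In C# the equivalent would presumably be a Dictionary or HashSet keyed on the item string. With the lookup cost fixed, the CPU-limiting question may matter much less.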

  • Is a program compiled with -g gcc flag slower than the same program compiled without -g?

    - by e271p314
    I'm compiling a program with -O3 for performance and -g for debug symbols (so that in case of a crash I can use the core dump). One thing bothers me a lot: does the -g option result in a performance penalty? When I look at the output of the compilation with and without -g, I see that the output without -g is 80% smaller than the output of the compilation with -g. If the extra space goes to the debug symbols, I don't care about it (I guess), since this part is not used during runtime. But if, for each instruction in the compilation output without -g, I need to execute 4 more instructions in the compilation output with -g, then I would certainly prefer to stop using the -g option, even at the cost of not being able to process core dumps. How can I find out the size of the debug symbols section inside the program, and in general, does compilation with -g create a program which runs slower than the same code compiled without -g?

  • Where does the compiler store methods for C++ classes?

    - by Mashmagar
    This is more a curiosity than anything else... Suppose I have a C++ class Kitty as follows: class Kitty { void Meow() { //Do stuff } } Does the compiler place the code for Meow() in every instance of Kitty? Obviously repeating the same code everywhere requires more memory. But on the other hand, branching to a relative location in nearby memory requires fewer assembly instructions than branching to an absolute location in memory on modern processors, so this is potentially faster. I suppose this is an implementation detail, so different compilers may perform differently. Keep in mind, I'm not considering static or virtual methods here.

  • Fastest way to do a weighted tag search in SQL Server

    - by Hasan Khan
    My table is as follows: ObjectID bigint, Tag nvarchar(50), Weight float, Type tinyint. I want to search for all objects that have the tags 'big' or 'large', and I want the objectid values ordered by the sum of weights (so objects having both tags will be on top): select objectid, row_number() over (order by sum(weight) desc) as rowid from tags where tag in ('big', 'large') and type=0 group by objectid The reason for row_number() is that I want paging over the results. The query in its current form is very slow; it takes a minute to execute over 16 million tags. What should I do to make it faster? I have a non-clustered index on (objectid, tag, type). Any suggestions?

  • Java: how to read BufferedReader faster

    - by Cata
    Hello, I want to optimize this code: InputStream is = rp.getEntity().getContent(); BufferedReader reader = new BufferedReader(new InputStreamReader(is)); String text = ""; String aux = ""; while ((aux = reader.readLine()) != null) { text += aux; } The thing is that I don't know how to read the content of the BufferedReader and copy it into a String faster than what I have above. I need it to take as little time as possible. Thank you
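
    The loop above builds a new string on every iteration, so the cost of copying grows with the amount of text already read. The usual remedy is to collect the pieces and join them once at the end. The sketch below shows that pattern in Python (read_all_lines is a hypothetical name); in Java the corresponding tool would be a StringBuilder with append().

```python
import io


def read_all_lines(reader):
    """Concatenate all lines of a text stream, dropping line terminators."""
    parts = []
    for line in reader:
        parts.append(line.rstrip("\r\n"))  # mirrors readLine(), which strips them
    return "".join(parts)  # one final copy instead of one per line


text = read_all_lines(io.StringIO("ab\ncd\r\nef"))
print(text)
```

    Like the original loop, this drops the line breaks; if they are wanted in the result, the pieces could be joined with "\n" instead.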
