Search Results

Search found 1631 results on 66 pages for 'optimize'.

Page 55/66 | < Previous Page | 51 52 53 54 55 56 57 58 59 60 61 62  | Next Page >

  • Web page database query optimization

    - by morpheous
    I am putting together a web page which is quite 'expensive' in terms of database hits. I don't want to start optimizing at this stage - though with me trying to hit a deadline, I may end up not optimizing at all. Currently the page requires 18 (that's right eighteen) hits to the db. I am already using joins, and some of the queries are UNIONed to minimize the trips to the db. My local dev machine can handle this (page is not slow) however, I feel if I release this into the wild, the number of queries will quickly overwhelm my database (MySQL). I could always use memcache or something similar, but I would much rather continue with my other dev work that needs to be completed before the deadline - at least retrieving the page works - its simply a matter of optimization now (if required). My question therefore is - is 18 db queries for a single page retrieval completely outrageous - (i.e. I should put everything on hold and optimize the hell of the retrieval logic), or shall I continue as normal, meet the deadline and release on schedule and see what happens? [Edit] Just to clarify, I have already done the 'obvious' things like using (single and composite) indexes for fields used in the queries. What I haven't yet done is to run a query analyzer to see if my indexes etc are optimal.

    Read the article

  • Divide and conquer of large objects for GC performance

    - by Aperion
    At my work we're discussing different approaches to cleaning up a large amount of managed ~50-100MB memory.There are two approaches on the table (read: two senior devs can't agree) and not having the experience the rest of the team is unsure of what approach is more desirable, performance or maintainability. The data being collected is many small items, ~30000 which in turn contains other items, all objects are managed. There is a lot of references between these objects including event handlers but not to outside objects. We'll call this large group of objects and references as a single entity called a blob. Approach #1: Make sure all references to objects in the blob are severed and let the GC handle the blob and all the connections. Approach #2: Implement IDisposable on these objects then call dispose on these objects and set references to Nothing and remove handlers. The theory behind the second approach is since the large longer lived objects take longer to cleanup in the GC. So, by cutting the large objects into smaller bite size morsels the garbage collector will processes them faster, thus a performance gain. So I think the basic question is this: Does breaking apart large groups of interconnected objects optimize data for garbage collection or is better to keep them together and rely on the garbage collection algorithms to processes the data for you? I feel this is a case of pre-optimization, but I do not know enough of the GC to know what does help or hinder it.

    Read the article

  • Optimization in Python - do's, don'ts and rules of thumb.

    - by JV
    Well I was reading this post and then I came across a code which was: jokes=range(1000000) domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)] I thought wouldn't it be better to calculate the value of len(jokes) once outside the list comprehension? Well I tried it and timed three codes jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)]' 10000000 loops, best of 3: 0.0352 usec per loop jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);l=len(jokes);domain=[(0,(l*2)-i-1) for i in range(0,l*2)]' 10000000 loops, best of 3: 0.0343 usec per loop jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);l=len(jokes)*2;domain=[(0,l-i-1) for i in range(0,l)]' 10000000 loops, best of 3: 0.0333 usec per loop Observing the marginal difference 2.55% between the first and the second made me think - is the first list comprehension domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)] optimized internally by python? or is 2.55% a big enough optimization (given that the len(jokes)=1000000)? If this is - What are the other implicit/internal optimizations in Python ? What are the developer's rules of thumb for optimization in Python? Edit1: Since most of the answers are "don't optimize, do it later if its slow" and I got some tips and links from Triptych and Ali A for the do's. I will change the question a bit and request for don'ts. Can we have some experiences from people who faced the 'slowness', what was the problem and how it was corrected? Edit2: For those who haven't here is an interesting read Edit3: Incorrect usage of timeit in question please see dF's answer for correct usage and hence timings for the three codes.

    Read the article

  • Database nesting layout confusion

    - by arzon
    I'm no expert in databases and a beginner in Rails, so here goes something which kinda confuses me... Assuming I have three classes as a sample (note that no effort has been made to address any possible Rails reserved words issue in the sample). class File < ActiveRecord::Base has_many :records, :dependent => :destroy accepts_nested_attributes_for :records, :allow_destroy => true end class Record < ActiveRecord::Base belongs_to :file has_many :users, :dependent => :destroy accepts_nested_attributes_for :users, :allow_destroy => true end class User < ActiveRecord::Base belongs_to :record end Upon entering records, the database contents will appear as such. My issue is that if there are a lot of Files for the same Record, there will be duplicate record names. This will also be true if there will be multiple Records for the same user in the the Users table. I was wondering if there is a better way than this so as to have one or more files point to a single Record entry and one or more Records will point to a single User. BTW, the File names are unique. Files table: id name 1 name1 2 name2 3 name3 4 name4 Records table: id file_id record_name record_type 1 1 ForDaisy1 ... 2 2 ForDonald1 ... 3 3 ForDonald2 ... 4 4 ForDaisy1 ... Users table: id record_id username 1 1 Daisy 2 2 Donald 3 3 Donald 4 4 Daisy Is there any way to optimize the database to prevent duplication of entries, or this should really the correct and proper behavior. I spread out the database into different tables to be able to easily add new columns in the future.

    Read the article

  • Should I write more SQL to be more efficient, or less SQL to be less buggy?

    - by RenderIn
    I've been writing a lot of one-off SQL queries to return exactly what a certain page needs and no more. I could reuse existing queries and issue a number of SQL requests linear to the number of records on the page. As an example, I have a query to return People and a query to return Job Details for a person. To return a list of people with their job details I could query once for people and then once for each person to retrieve their job details. I've found that in most cases that solution returns things in a reasonable amount of time, but I don't know how well it will scale in my environment. Instead I've been writing queries to join people + job details, or people + salary history, etc. I'm looking at my models and I see how I could shave off maybe 30% of my code if I were to re-use existing queries. This is a big temptation. Is it a bad thing to go for reuse over efficiency in general or does it all come down to the specific situation? Should I first do it the easy way and then optimize later, or is it best to get the code knocked out while everything is fresh in my mind? Thoughts, experiences?

    Read the article

  • speed string search in PHP

    - by Marc
    Hi! I have a 1.2GB file that contains a one line string. What I need is to search the entire file to find the position of an another string (currently I have a list of strings to search). The way what I'm doing it now is opening the big file and move a pointer throught 4Kb blocks, then moving the pointer X positions back in the file and get 4Kb more. My problem is that a bigger string to search, a bigger time he take to got it. Can you give me some ideas to optimize the script to get better search times? this is my implementation: function busca($inici){ $limit = 4096; $big_one = fopen('big_one.txt','r'); $options = fopen('options.txt','r'); while(!feof($options)){ $search = trim(fgets($options)); $retro = strlen($search);//maybe setting this position absolute? (like 12 or 15) $punter = 0; while(!feof($big_one)){ $ara = fgets($big_one,$limit); $pos = strpos($ara,$search); $ok_pos = $pos + $punter; if($pos !== false){ echo "$pos - $punter - $search : $ok_pos <br>"; break; } $punter += $limit - $retro; fseek($big_one,$punter); } fseek($big_one,0); } } Thanks in advance!

    Read the article

  • Optimizing a shared buffer in a producer/consumer multithreaded environment

    - by Etan
    I have some project where I have a single producer thread which writes events into a buffer, and an additional single consumer thread which takes events from the buffer. My goal is to optimize this thing for a single machine to achieve maximum throughput. Currently, I am using some simple lock-free ring buffer (lock-free is possible since I have only one consumer and one producer thread and therefore the pointers are only updated by a single thread). #define BUF_SIZE 32768 struct buf_t { volatile int writepos; volatile void * buffer[BUF_SIZE]; volatile int readpos;) }; void produce (buf_t *b, void * e) { int next = (b->writepos+1) % BUF_SIZE; while (b->readpos == next); // queue is full. wait b->buffer[b->writepos] = e; b->writepos = next; } void * consume (buf_t *b) { while (b->readpos == b->writepos); // nothing to consume. wait int next = (b->readpos+1) % BUF_SIZE; void * res = b->buffer[b->readpos]; b->readpos = next; return res; } buf_t *alloc () { buf_t *b = (buf_t *)malloc(sizeof(buf_t)); b->writepos = 0; b->readpos = 0; return b; } However, this implementation is not yet fast enough and should be optimized further. I've tried with different BUF_SIZE values and got some speed-up. Additionaly, I've moved writepos before the buffer and readpos after the buffer to ensure that both variables are on different cache lines which resulted also in some speed. What I need is a speedup of about 400 %. Do you have any ideas how I could achieve this using things like padding etc?

    Read the article

  • Ajax heavy JS apps using excessive amounts of memory over time.

    - by Shane Reustle
    I seem to have some pretty large memory leaks in an app that I am working on. The app itself is not very complex. Every 15 seconds, the page requests approx 40kb of JSON from the server, and draws a table on the page using it. It is cheaper to draw the table over because the data is usually always new. I am attaching a few events to the table, approx 5 per line, 30 lines in the table. I used jQuery's .html() method to put the new html into the container and overwrite the existing. I do this specifically so that jQuery's special cleanup functions go in and attempt to detach all events on the elements in the element that it is overwriting. I then also delete the large variables of html once they are sent to the DOM using delete my_var. I have checked for circular references and attached events that are never cleared a few times, but never REALLY dug into it. I was wondering if someone could give me a few pointers on how to optimize a very heavy app like this. I just picked up "High Performance Javascript" by Nicholas Zakas, but didn't have much time to get into it yet. To give an idea on how much memory this is using, after 4~ hours, it is using about 420,000k on chrome, and much more on Firefox or IE. Thanks!

    Read the article

  • Increment a value from AAA to ZZZ with cyclic rotation

    - by www.openidfrance.frfxkim
    Hi all, I need to code a method that increment a string value from AAA to ZZZ with cyclic rotation (next value after ZZZ is AAA) Here is my code: public static string IncrementValue(string value) { if (string.IsNullOrEmpty(value) || value.Length != 3) { string msg = string.Format("Incorrect value ('{0}' is not between AAA and ZZZ)", value); throw new ApplicationException(msg); } if (value == "ZZZ") { return "AAA"; } char pos1 = value[0]; char pos2 = value[1]; char pos3 = value[2]; bool incrementPos2 = false; bool incrementPos1 = false; if (pos3 == 'Z') { pos3 = 'A'; incrementPos2 = true; } else { pos3++; } if (incrementPos2 && pos2 == 'Z') { pos2 = 'A'; incrementPos1 = true; } else { if (incrementPos2) { if (pos2 == 'Z') { pos2 = 'A'; incrementPos1 = true; } pos2++; } } if (incrementPos1) { pos1++; } return pos1.ToString() + pos2.ToString() + pos3.ToString(); } I know this piece of code is quite dirty and not very efficient but I dont know how to do it properly. How is secured this snippet? (this will only run on windows plaform) How can I optimize-it and make it more readable ? Thanks for your comments

    Read the article

  • Query with many CASE statements - optimization

    - by Nemanja Vujacic
    Hi guys, I have one very dirty query that per sure can be optimized because there are so many CASE statements in it! SELECT (CASE pa.KplusTable_Id WHEN 1 THEN sp.sp_id WHEN 2 THEN fw.fw_id WHEN 3 THEN s.sw_Id WHEN 4 THEN id.ia_id END) as Deal_Id, max(CASE pa.KplusTable_Id WHEN 1 THEN sp.Trans_Id WHEN 2 THEN fw.Trans_Id WHEN 3 THEN s.Trans_Id WHEN 4 THEN id.Trans_Id END) as TransId_CurrentMax INTO #MaxRazlicitOdNull FROM #PotencijalniAktuelni pa LEFT JOIN kplus_sp sp (nolock) on sp.sp_id=pa.Deal_Id AND pa.KplusTable_Id=1 LEFT JOIN kplus_fw fw (nolock) on fw.fw_id=pa.Deal_Id AND pa.KplusTable_Id=2 LEFT JOIN dev_sw s (nolock) on s.sw_Id=pa.Deal_Id AND pa.KplusTable_Id=3 LEFT JOIN kplus_ia id (nolock) on id.ia_id=pa.Deal_Id AND pa.KplusTable_Id=4 WHERE isnull(CASE pa.KplusTable_Id WHEN 1 THEN sp.BROJ_TIKETA WHEN 2 THEN fw.BROJ_TIKETA WHEN 3 THEN s.tiket WHEN 4 THEN id.BROJ_TIKETA END, '')<>'' GROUP BY CASE pa.KplusTable_Id WHEN 1 THEN sp.sp_id WHEN 2 THEN fw.fw_id WHEN 3 THEN s.sw_Id WHEN 4 THEN id.ia_id END Because I have same condition couple times, do you have idea how to optimize query, make it simpler and better. All suggestions are welcome! TnX in advance! Nemanja

    Read the article

  • Discussion on SEO best-practices for site development involving php...

    - by Bradley Herman
    Recently in our work, I've started getting some experience with SEO (finally). It's something I've put off for a long time because I've always maintained that SEO is a buzz-word b.s. pseudo-science and more about providing quality, relevant content (assuming proper header tags and the basics are covered). However, sometimes a client doesn't have stellar content yet still demands SEO and high rankings. While it's not how I design sites 100% of the time (as design dictates structure), I typically create a basic template from the design my boss gives me, then I optimize it, and then strip the top and bottom and move those to header.php and footer.php, using the following to bring in the header and footer based on AJAX versus HTML requests: <?php if($_SERVER['HTTP_X_REQUESTED_WITH']==''){ include('includes/header.php'); }?> #content here <?php if($_SERVER['HTTP_X_REQUESTED_WITH']==''){ include('includes/footer.php'); }?> Then, I use jQuery to intercept page requests and I use AJAX to fill in, for example, a #copy div with the new content. This avoids unnecessarily loading all the header and footer info everytime, but still allows users without Java to access pages without any problems. (also to think about, depending on size of content, do the extra http requests added using this method render it more of a server strain versus a single, larger file?) I don't have a really solid understanding of the meta keywords and their SEO significance, but as I recall reading, the keywords, title, and description on a page should match up to the pages content--ie. each page should have slightly different keywords/description while retaining some common ground. What I'm getting at here is trying to foster a discussion on whether my approach is flawed to begin with, if there are things I can do (within reason) that keep the site structure simple but allow for better SEO practices, or if my SEO understandings are wrong. This isn't a question, per say, but hopefully a constructive discussion here that more than just I can learn from. I appreciate any responses and hope to hear from you. Thanks!

    Read the article

  • ControlCollection extension method optimazation

    - by Johan Leino
    Hi, got question regarding an extension method that I have written that looks like this: public static IEnumerable<T> FindControlsOfType<T>(this ControlCollection instance) where T : class { T control; foreach (Control ctrl in instance) { if ((control = ctrl as T) != null) { yield return control; } foreach (T child in FindControlsOfType<T>(ctrl.Controls)) { yield return child; } } } public static IEnumerable<T> FindControlsOfType<T>(this ControlCollection instance, Func<T, bool> match) where T : class { return FindControlsOfType<T>(instance).Where(match); } The idea here is to find all controls that match a specifc criteria (hence the Func<..) in the controls collection. My question is: Does the second method (that has the Func) first call the first method to find all the controls of type T and then performs the where condition or does the "runtime" optimize the call to perform the where condition on the "whole" enumeration (if you get what I mean). secondly, are there any other optimizations that I can do to the code to perform better. An example can look like this: var checkbox = this.Controls.FindControlsOfType<MyCustomCheckBox>( ctrl => ctrl.CustomProperty == "Test" ) .FirstOrDefault();

    Read the article

  • Browser performance when combining image resizing with animated movement

    - by Steve Reichgut
    I have been tasked with the job of converting a Flash animation into HTML. The animation is rather complex and requires the need to move multiple images (9) from location (x,y) to location (x2,y2) while simultaneously increasing the image size from 215px wide to 930px wide. While doing some initial testing of this animation with just 1-2 images, I noticed a lot of choppiness in FF handling of this animation. To try and isolate the problem, I removed the dynamic resizing of the animation and just moved it from point A to point B. What was interesting was that I saw the same behavior when simply moving a 930px image that was resized down to 215px (via the CSS width or inline width properties). When I try the same animation with a different image that is actually 215px wide, it performed smoothly. I then tried the same animation with the original 930px wide image (with no resizing) and it performed well also. This makes me wonder if the browser is having to "resize" the image down to 215px each time it is moved which is causing the choppiness. Is this a correct assumption? If so, is there any other way to optimize the animation to allow for simultaneous image resizing and image movement? Notes: 1) One optimization I have done is to position the images absolutely in order to minimize the reflow process. 2) I have tested the animation using both jQuery and the fX animation framework.

    Read the article

  • Nhibernate: Stop it from joining to a table that is not needed

    - by Aaron
    I have two tables (tbArea, tbPost) that relate to the following classes. class Area { int ID string Name ... } class Post { int ID string Title Area Area ... } These two classes map up with Fluent Nhibernate. Below is the post mapping. public class PostMapping : ClassMap<Post> { public PostMapping() { Cache.NonStrictReadWrite(); this.Table("tbPost"); Id(x => x.ID) .Column("PostID") .GeneratedBy .Identity(); References(x => x.Area) .ForeignKey("AreaID") .Column("AreaID"); ... } } Any time I perform a query on the Post table "where AreaID = 1(any AreaId)", nhibernate will join to the area table. (What Nhibernate generates for a query) SELECT post fields , area fields (automatically added) FROM tbPost p LEFT JOIN tbArea a on p.areaid = a.areaid where p.areaid = 1 I have tried setting Area to LazyLoad, to Fetch.Select, ReadOnly, and any other setting on the reference and still it will always join to Area. I am trying to optimize the backend database queries, and since I don't need the area object loaded just filtered I would like to eliminate the unnecessary join to Area each time I Query post. What configurations do I need to change or mappings to get area to still be related to post in my objects, but not query it when I filter on AreaID?

    Read the article

  • multi-core processing in R on windows XP - via doMC and foreach

    - by Jan
    Hi guys, I'm posting this question to ask for advice on how to optimize the use of multiple processors from R on a Windows XP machine. At the moment I'm creating 4 scripts (each script with e.g. for (i in 1:100) and (i in 101:200), etc) which I run in 4 different R sessions at the same time. This seems to use all the available cpu. I however would like to do this a bit more efficient. One solution could be to use the "doMC" and the "foreach" package but this is not possible in R on a Windows machine. e.g. library("foreach") library("strucchange") library("doMC") # would this be possible on a windows machine? registerDoMC(2) # for a computer with two cores (processors) ## Nile data with one breakpoint: the annual flows drop in 1898 ## because the first Ashwan dam was built data("Nile") plot(Nile) ## F statistics indicate one breakpoint fs.nile <- Fstats(Nile ~ 1) plot(fs.nile) breakpoints(fs.nile) # , hpc = "foreach" --> It would be great to test this. lines(breakpoints(fs.nile)) Any solutions or advice? Thanks, Jan

    Read the article

  • Flex Modules vs RSL

    - by nil
    Hi, I'm a little bit confused about when is better to use Flex Modules or RSL libriaries (in Flex 3.5). My goal is split my project in several unit projects, so I can test and work separately. Let's assume I have a Customer app and Vendor app. I also have a front-end panel with two buttons. Each button launches Customer app or Vendor app. These applications make different things. They share some .as functions and common components, too. I understand that if I make a main project (for user login and to show a first panel) and two modules (customer, vendor) I must have all that components in my Eclipse project, isn't it? Instead of doing modules, should I create SWC for Vendor and other for Customer app and call from main app by using RSL? So, which option is more suitable? What do you advise me? Which are the trade-offs of each option? On the other side, this flex application is integrated with Java through Blaze and ibatis for persistence managment, and hold by a web apache server. I considered also to create independent war files to keep this indpendence, but I thought this do not optimize flex code. I'm right? Thank you. Nil

    Read the article

  • What's the best way to resolve this scope problem?

    - by Peter Stewart
    I'm writing a program in python that uses genetic techniques to optimize expressions. Constructing and evaluating the expression tree is the time consumer as it can happen billions of times per run. So I thought I'd learn enough c++ to write it and then incorporate it in python using cython or ctypes. I've done some searching on stackoverflow and learned a lot. This code compiles, but leaves the pointers dangling. I tried 'this_node = new Node(...' . It didn't seem to work. And I'm not at all sure how I'd delete all the references as there would be hundreds. I'd like to use variables that stay in scope, but maybe that's not the c++ way. What is the c++ way? class Node { public: char *cargo; int depth; Node *left; Node *right; } Node make_tree(int depth) { depth--; if(depth <= 0) { Node tthis_node("value",depth,NULL,NULL); return tthis_node; } else { Node this_node("operator" depth, &make_tree(depth), &make_tree(depth)); return this_node; } };

    Read the article

  • Stream PDF to another local App

    - by Nathan
    Hi, I'm currently trying to optimize a small firefox extension that will grab a pdf off the current document and send it to a port that another local application is listening on. Right now it uses a terrifying hackjob of cache viewer. The way I'm getting it is loading the cache, searching through it using the current URL and grabbing the file and saving it to a temp directory. Then I stream the file in, delete the temp, and send it through the socket. Now, my new design, ideally I'd want to build it from scratch and cut out saving it to the local machine at all, and just stream it through the socket. I've been looking at doing something like, //check page to ensure its a pdf //init in/out streams //stream through sock //flush Now, this would be vastly superior to the 400 line hacked up mess I have now, but I'm new to building FF extensions, and after reading a lot about URIs and the file streaming and such I'm probably more confused than when I started trying to fix this three hours ago. I'm okay with sending things through the sockets and whatnot, I understand that, I'm mainly confused about what multitude of interfaces I want to use. Gah! Thanks! Also, long time reader, first time poster!

    Read the article

  • Coding Practices which enable the compiler/optimizer to make a faster program.

    - by EvilTeach
    Many years ago, C compilers were not particularly smart. As a workaround K&R invented the register keyword, to hint to the compiler, that maybe it would be a good idea to keep this variable in an internal register. They also made the tertiary operator to help generate better code. As time passed, the compilers matured. They became very smart in that their flow analysis allowing them to make better decisions about what values to hold in registers than you could possibly do. The register keyword became unimportant. FORTRAN can be faster than C for some sorts of operations, due to alias issues. In theory with careful coding, one can get around this restriction to enable the optimizer to generate faster code. What coding practices are available that may enable the compiler/optimizer to generate faster code? Identifying the platform and compiler you use, would be appreciated. Why does the technique seem to work? Sample code is encouraged. Here is a related question [Edit] This question is not about the overall process to profile, and optimize. Assume that the program has been written correctly, compiled with full optimization, tested and put into production. There may be constructs in your code that prohibit the optimizer from doing the best job that it can. What can you do to refactor that will remove these prohibitions, and allow the optimizer to generate even faster code? [Edit] Offset related link

    Read the article

  • Python: Most efficient way to concatenate and rearrange files

    - by user300890
    Hi, I am reading from several files, each file is divided into 2 pieces, first a header section of a few thousand lines followed by a body of a few thousand. My problem is I need to concatenate these files into one file where all the headers are on the top followed by the body. Currently I am using two loops; one to pull out all the headers and write them, and the second to write the body of each file (I also include a tmp_count variable to limit the number of lines to be loading into memory before dumping to file). This is pretty slow - about 6min for 13gb file. Can anyone tell me how to optimize this or if there is a faster way to do this in python ? Thanks! Here is my code: def cat_files_sam(final_file_name,work_directory_master,file_count): final_file = open(final_file_name,"w") if len(file_count) > 1: file_count=sort_output_files(file_count) # only for @ headers for bowtie_file in file_count: #print bowtie_file tmp_list = [] tmp_count = 0 for line in open(os.path.join(work_directory_master,bowtie_file)): if line.startswith("@"): if tmp_count == 1000000: final_file.writelines(tmp_list) tmp_list = [] tmp_count = 0 tmp_list.append(line) tmp_count += 1 else: final_file.writelines(tmp_list) break for bowtie_file in file_count: #print bowtie_file tmp_list = [] tmp_count = 0 for line in open(os.path.join(work_directory_master,bowtie_file)): if line.startswith("@"): continue if tmp_count == 1000000: final_file.writelines(tmp_list) tmp_list = [] tmp_count = 0 tmp_list.append(line) tmp_count += 1 final_file.writelines(tmp_list) final_file.close()

    Read the article

  • Beginner Question: For extract a large subset of a table from MySQL, how does Indexing, order of tab

    - by chongman
    Sorry if this is too simple, but thanks in advance for helping. This is for MySQL but might be relevant for other RDMBSs tblA has 4 columns: colA, colB, colC, mydata, A_id It has about 10^9 records, with 10^3 distinct values for colA, colB, colC. tblB has 3 columns: colA, colB, B_id It has about 10^4 records. I want all the records from tblA (except the A_id) that have a match in tblB. In other words, I want to use tblB to describe the subset that I want to extract and then extract those records from tblA. Namely: SELECT a.colA, a.colB, a.colC, a.mydata FROM tblA as a INNER JOIN tblB as b ON a.colA=b.colA a.colB=b.colB ; It's taking a really long time (more than an hour) on a newish computer (4GB, Core2Quad, ubuntu), and I just want to check my understanding of the following optimization steps. ** Suppose this is the only query I will ever run on these tables. So ignore the need to run other queries. Now my questions: 1) What indexes should I create to optimize this query? I think I just need a multiple index on (colA, colB) for both tables. I don't think I need separate indexes for colA and colB. Another stack overflow article (that I can't find) mentioned that when adding new indexes, it is slower when there are existing indexes, so that might be a reason to use the multiple index. 2) Is INNER JOIN correct? I just want results where a match is found. 3) Is it faster if I join (tblA to tblB) or the other way around, (tblB to tblA)? This previous answer says that the optimizer should take care of that. 4) Does the order of the part after ON matter? This previous answer say that the optimizer also takes care of the execution order.

    Read the article

  • real time stock quotes, StreamReader performance optimization

    - by sean717
    I am working on a program that extracts real time quote for 900+ stocks from a website. I use HttpWebRequest to send HTTP request to the site and store the response to a stream and open a stream using the following code: HttpWebResponse response = (HttpWebResponse)request.GetResponse(); Stream stream = response.GetResponseStream (); StreamReader reader = new StreamReader( stream ) the size of the received HTML is large (5000+ lines), so it takes a long time to parse it and extract the price. For 900 files, It takes about 6 mins for parsing and extracting. Which my boss isn't happy with, he told me he'd want the whole process to be done in TWO mins. I've identified the part of the program that takes most of time to finish is parsing and extracting. I've tried to optimize the code to make it faster, the following is what I have now after some optimization: // skip lines at the top for(int i=0;i<1500;++i) reader.ReadLine(); // read the line that contains the price string theLine = reader.ReadLine(); // ... extract the price from the line now it takes about 4 mins to process all the files, there is still a significant gap to what my boss's expecting. So I am wondering, is there other way that I can further speed up the parsing and extracting and have everything done within 2 mins?

    Read the article

  • Optimizing an embedded SELECT query in mySQL

    - by Crazy Serb
    Ok, here's a query that I am running right now on a table that has 45,000 records and is 65MB in size... and is just about to get bigger and bigger (so I gotta think of the future performance as well here): SELECT count(payment_id) as signup_count, sum(amount) as signup_amount FROM payments p WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30' AND completed > 0 AND tm_completed IS NOT NULL AND member_id NOT IN (SELECT p2.member_id FROM payments p2 WHERE p2.completed=1 AND p2.tm_completed < '2009-05-01' AND p2.tm_completed IS NOT NULL GROUP BY p2.member_id) And as you might or might not imagine - it chokes the mysql server to a standstill... What it does is - it simply pulls the number of new users who signed up, have at least one "completed" payment, tm_completed is not empty (as it is only populated for completed payments), and (the embedded Select) that member has never had a "completed" payment before - meaning he's a new member (just because the system does rebills and whatnot, and this is the only way to sort of differentiate between an existing member who just got rebilled and a new member who got billed for the first time). Now, is there any possible way to optimize this query to use less resources or something, and to stop taking my mysql resources down on their knees...? Am I missing any info to clarify this any further? Let me know... EDIT: Here are the indexes already on that table: PRIMARY PRIMARY 46757 payment_id member_id INDEX 23378 member_id payer_id INDEX 11689 payer_id coupon_id INDEX 1 coupon_id tm_added INDEX 46757 tm_added, product_id tm_completed INDEX 46757 tm_completed, product_id

    Read the article

  • How do I enable mod_deflate for PHP files?

    - by DM.
    I have a Liquid Web VPS account, I've made sure that mod_deflate is installed and running/active. I used to gzip my css and js files via PHP, as well as my PHP files themselves... However, I'm now trying to do this via mod_deflate, and it seems to work fine for all files except for PHP files. (Txt files work fine, css, js, static HTML files, just nothing that is generated via a PHP file.) How do I fix this? (I used the "Compress all content" option under "Optimize Website" in cPanel, which creates an .htaccess file in the home directory (not public_html, one level higher than that) with exactly the same text as the "compress everything except images" example on http://httpd.apache.org/docs/2.0/mod/mod_deflate.html) .htaccess file: <IfModule mod_deflate.c> SetOutputFilter DEFLATE <IfModule mod_setenvif.c> # Netscape 4.x has some problems... BrowserMatch ^Mozilla/4 gzip-only-text/html # Netscape 4.06-4.08 have some more problems BrowserMatch ^Mozilla/4\.0[678] no-gzip # MSIE masquerades as Netscape, but it is fine # BrowserMatch \bMSIE !no-gzip !gzip-only-text/html # NOTE: Due to a bug in mod_setenvif up to Apache 2.0.48 # the above regex won't work. You can use the following # workaround to get the desired effect: BrowserMatch \bMSI[E] !no-gzip !gzip-only-text/html # Don't compress images SetEnvIfNoCase Request_URI .(?:gif|jpe?g|png)$ no-gzip dont-vary </IfModule> <IfModule mod_headers.c> # Make sure proxies don't deliver the wrong content Header append Vary User-Agent env=!dont-vary </IfModule> </IfModule>

    Read the article

  • IIS 6+ASP.NET - many temp files generated

    - by moshe_ravid
    I have a ASP.NET + some .NET web-services running on IIS 6 (win 2003 server). The issue is that IIS is generating a lot (!) of files in "c:\WINDOWS\Temp" directory. a lot of files means thousands of files, which get to more than 3G of size so far. The files are generated by this command: C:\WINDOWS\SysWOW64\inetsrv "C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\csc.exe" /t:library /utf8output /R:"C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\Temporary ASP.NET Files\vfagt\113819dd\db0d5802\assembly\dl3\fedc6ef1\006e24d8_3bc9ca01\VfAgentWService.DLL" /R:"C:\WINDOWS\assembly\GAC_MSIL\System.Web.Services\2.0.0.0__b03f5f7f11d50a3a\System.Web.Services.dll" /R:"C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\mscorlib.dll" /R:"C:\WINDOWS\assembly\GAC_MSIL\System.Xml\2.0.0.0__b77a5c561934e089\System.Xml.dll" /R:"C:\WINDOWS\assembly\GAC_MSIL\System\2.0.0.0__b77a5c561934e089\System.dll" /out:"C:\WINDOWS\TEMP\9i_i2bmg.dll" /debug- /optimize+ /nostdlib /D:_DYNAMIC_XMLSERIALIZER_COMPILATION "C:\WINDOWS\TEMP\9i_i2bmg.0.cs" The files in the temp directory are pairs of *.out & *.err, where the *.err file is zero size, and the *.out file contains the compilation output messages. What is causing IIS to generate so many files? How can I prevent it? UPDATE: The problem is that the command i described above (csc.exe) is being executed many (many) times, causing the .out & .err to be generated so many times, until it consumes the disk space. So - my question is: what is causing this command to run so many times? (i don't have that many .aspx & .asmx files in my web app). Thanks, Moe

    Read the article

< Previous Page | 51 52 53 54 55 56 57 58 59 60 61 62  | Next Page >