Search Results

Search found 4136 results on 166 pages for 'micro optimization'.

  • Why does adding Crossover to my Genetic Algorithm give me worse results?

    - by MahlerFive
    I have implemented a Genetic Algorithm to solve the Traveling Salesman Problem (TSP). When I use only mutation, I find better solutions than when I add crossover. I know that normal crossover methods do not work for TSP, so I implemented both the Ordered Crossover and the PMX Crossover methods, and both give bad results.

    Here are the other parameters I'm using:

      - Mutation: Single Swap Mutation or Inverted Subsequence Mutation (as described by Tiendil here), with mutation rates tested between 1% and 25%.
      - Selection: Roulette Wheel Selection
      - Fitness function: 1 / distance of tour
      - Population size: tested 100, 200, and 500. I also run the GA 5 times so that I have a variety of starting populations.
      - Stop condition: 2500 generations

    With the same dataset of 26 points, I usually get results of about 500-600 distance using purely mutation with high mutation rates. When adding crossover, my results are usually in the 800 distance range. The other confusing thing is that I have also implemented a very simple Hill-Climbing algorithm to solve the problem, and when I run that 1000 times (faster than running the GA 5 times) I get results around 410-450 distance. I would expect to get better results using a GA.

    Any ideas as to why my GA is performing worse when I add crossover? And why is it performing much worse than a simple Hill-Climbing algorithm, which should get stuck on local maxima as it has no way of exploring once it finds a local max?
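    For reference, the Ordered Crossover mentioned in the question can be sketched as follows. This is a simplified variant, not the asker's implementation: it copies one random slice from the first parent and fills the remaining slots left to right with the second parent's cities in their original order. The function name and the rng parameter are assumptions made for illustration.

```python
import random


def ordered_crossover(parent_a, parent_b, rng=random):
    """Simplified ordered crossover for permutation-encoded tours.

    A random slice of parent_a is copied into the child unchanged; the
    remaining positions are filled, left to right, with the cities of
    parent_b that are not in the slice, keeping parent_b's order.
    Cities must not be None, which is used here as the "empty" marker.
    """
    n = len(parent_a)
    i, j = sorted(rng.sample(range(n), 2))  # slice boundaries, i < j

    child = [None] * n
    child[i:j + 1] = parent_a[i:j + 1]
    kept = set(parent_a[i:j + 1])

    # Cities still missing, in the order they appear in parent_b.
    fill = iter(city for city in parent_b if city not in kept)
    for k in range(n):
        if child[k] is None:
            child[k] = next(fill)
    return child
```

    Whatever the parents are, the child is always a valid tour (each city exactly once), which makes a quick sanity check possible when crossover results look suspicious.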

    Read the article

  • How to optimize my PostgreSQL DB for prefix search?

    - by asmaier
    I have a table called "nodes" with roughly 1.7 million rows in my PostgreSQL db:

        =#\d nodes
                    Table "public.nodes"
         Column |          Type          | Modifiers
        --------+------------------------+-----------
         id     | integer                | not null
         title  | character varying(256) |
         score  | double precision       |
        Indexes:
            "nodes_pkey" PRIMARY KEY, btree (id)

    I want to use information from that table for autocompletion of a search field, showing the user a list of the ten titles with the highest score that match the input. So I used this query (here searching for all titles starting with "s"):

        =# explain analyze select title,score from nodes where title ilike 's%' order by score desc;
                                        QUERY PLAN
        ----------------------------------------------------------------------------
         Sort  (cost=64177.92..64581.38 rows=161385 width=25) (actual time=4930.334..5047.321 rows=161264 loops=1)
           Sort Key: score
           Sort Method:  external merge  Disk: 5712kB
           ->  Seq Scan on nodes  (cost=0.00..46630.50 rows=161385 width=25) (actual time=0.611..4464.413 rows=161264 loops=1)
                 Filter: ((title)::text ~~* 's%'::text)
         Total runtime: 5260.791 ms
        (6 rows)

    This was much too slow to use for autocomplete. With some information from Using PostgreSQL in Web 2.0 Applications, I was able to improve that with a special index:

        =# create index title_idx on nodes using btree(lower(title) text_pattern_ops);
        =# explain analyze select title,score from nodes where lower(title) like lower('s%') order by score desc limit 10;
                                        QUERY PLAN
        ----------------------------------------------------------------------------
         Limit  (cost=18122.41..18122.43 rows=10 width=25) (actual time=1324.703..1324.708 rows=10 loops=1)
           ->  Sort  (cost=18122.41..18144.60 rows=8876 width=25) (actual time=1324.700..1324.702 rows=10 loops=1)
                 Sort Key: score
                 Sort Method:  top-N heapsort  Memory: 17kB
                 ->  Bitmap Heap Scan on nodes  (cost=243.53..17930.60 rows=8876 width=25) (actual time=96.124..1227.203 rows=161264 loops=1)
                       Filter: (lower((title)::text) ~~ 's%'::text)
                       ->  Bitmap Index Scan on title_idx  (cost=0.00..241.31 rows=8876 width=0) (actual time=90.059..90.059 rows=161264 loops=1)
                             Index Cond: ((lower((title)::text) ~>=~ 's'::text) AND (lower((title)::text) ~<~ 't'::text))
         Total runtime: 1325.085 ms
        (9 rows)

    So this gave me a speedup by a factor of 4. But can this be improved further? What if I want to use '%s%' instead of 's%'? Do I have any chance of getting decent performance with PostgreSQL in that case, too? Or should I try a different solution (Lucene? Sphinx?) for implementing my autocomplete feature?
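    The Index Cond in the second plan shows what the btree index does with a prefix pattern: it turns 's%' into the range 's' <= key < 't'. The following is a small in-memory sketch of that idea in Python, not PostgreSQL code; the function names and the row layout are assumptions made for illustration. A pattern such as '%s%' has no such contiguous range in a sorted index, which is why this trick does not carry over to it.

```python
import bisect
import heapq


def prefix_range(sorted_keys, prefix):
    """Return (lo, hi) so that sorted_keys[lo:hi] all start with prefix.

    Mirrors the range condition in the plan: prefix <= key < upper, where
    upper is the prefix with its last character incremented ('s' -> 't').
    Assumes a non-empty prefix.
    """
    lo = bisect.bisect_left(sorted_keys, prefix)
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect.bisect_left(sorted_keys, upper)
    return lo, hi


def autocomplete(rows, prefix, limit=10):
    """rows: list of (lowercased_title, score) tuples sorted by title."""
    keys = [title for title, _ in rows]
    lo, hi = prefix_range(keys, prefix.lower())
    # Equivalent of ORDER BY score DESC LIMIT n over the matching range.
    return heapq.nlargest(limit, rows[lo:hi], key=lambda row: row[1])
```

    Note that the sketch still has to look at every row in the matching range to find the top scores, just as the plan above visits all 161264 matching rows before the top-N sort.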

    Read the article

  • Any ideas on How to search a 2D array quickly?

    - by Tattat
    I have a 2D array like this, just like a matrix:

        {{1, 2, 4, 5, 3, 6},
         {8, 3, 4, 4, 5, 2},
         {8, 3, 4, 2, 6, 2},
         //code skips...
         ...
        }

    I want to get the positions of all the "4"s. Instead of checking the array elements one by one and returning each position, how can I search it faster / more efficiently? Thanks in advance.
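    If the array is searched many times, one possible approach (an assumption about the use case, since the question does not say how often lookups happen) is to scan it once and build an index from value to positions. A minimal Python sketch, with a hypothetical function name:

```python
from collections import defaultdict


def build_position_index(matrix):
    """Scan the matrix once and map each value to its (row, col) positions."""
    index = defaultdict(list)
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            index[value].append((r, c))
    return index


matrix = [
    [1, 2, 4, 5, 3, 6],
    [8, 3, 4, 4, 5, 2],
    [8, 3, 4, 2, 6, 2],
]
positions_of_4 = build_position_index(matrix)[4]
```

    Building the index still touches every element once, so it only pays off when several lookups follow; for a single lookup on unsorted data, a full scan cannot be avoided.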

    Read the article

  • How to improve my LDAP schema?

    - by asmaier
    Hello, I have an OpenLDAP database that holds some project objects that look like this:

        dn: cn=Proj1,ou=Project,ou=ua,dc=org
        cn: Proj1
        objectClass: top
        objectClass: posixGroup
        member: 001ag
        member: 002ag
        System: ABEL
        System: PCx
        Budget: ABEL:1000000:0.3
        Budget: PCx:300000:0.3

    One can see that the Budget attribute is a ":"-separated string, where the first part holds the name of the system the budget is for, the second part holds the budget itself (which may change every month), and the last part is a conversion factor for the budget of that system. Seeing this, I thought it was bad database design, since attribute values should always be atomic. But how can I improve that in LDAP, so that I can do a direct ldapsearch or a direct ldapmodify of the budget of System "ABEL" instead of writing a script that has to parse and split the ":"-separated string?

    Read the article

  • optimal memory layout for read-only/write memory segments.

    - by aaa
    Hello. Suppose I have two memory segments (of equal size, approximately 1 KB each); one is read-only (after initialization) and the other is read/write. What is the best layout in memory for such segments in terms of memory performance: one allocation with contiguous segments, or two allocations (in general not contiguous)? My primary architecture is 64-bit Intel Linux. My feeling is that the former (cache-friendlier) case is better. Are there circumstances where the second layout is preferred? Thanks.

    Read the article

  • Date arithmetic using integer values

    - by Dave Jarvis
    Problem

    String concatenation is slowing down a query:

        date(extract(YEAR FROM m.taken)||'-1-1') d1,
        date(extract(YEAR FROM m.taken)||'-1-31') d2

    This is realized in code as part of a string, which follows (where the p_ variables are integers):

        date(extract(YEAR FROM m.taken)||''-'||p_month1||'-'||p_day1||''') d1,
        date(extract(YEAR FROM m.taken)||''-'||p_month2||'-'||p_day2||''') d2

    This part of the query runs in 3.2 seconds with the dates, and 1.5 seconds without, leading me to believe there is ample room for improvement.

    Question

    What is a better way to create the date (presumably without concatenation)? Many thanks!

    Read the article

  • Is a red-black tree my ideal data structure?

    - by Hugo van der Sanden
    I have a collection of items (big rationals) that I'll be processing. In each case, processing will consist of removing the smallest item in the collection, doing some work, and then adding 0-2 new items (which will always be larger than the removed item). The collection will be initialised with one item, and work will continue until it is empty. I'm not sure what size the collection is likely to reach, but I'd expect in the range 1M-100M items. I will not need to locate any item other than the smallest.

    I'm currently planning to use a red-black tree, possibly tweaked to keep a pointer to the smallest item. However, I've never used one before, and I'm unsure whether my pattern of use fits its characteristics well.

    1) Is there a danger the pattern of deletion from the left + random insertion will affect performance, e.g. by requiring a significantly higher number of rotations than random deletion would? Or will delete and insert operations still be O(log n) with this pattern of use?

    2) Would some other data structure give me better performance, either because of the deletion pattern or by taking advantage of the fact that I only ever need to find the smallest item?

    Update: glad I asked, the binary heap is clearly a better solution for this case, and as promised turned out to be very easy to implement.

    Hugo
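    The access pattern described (pop the smallest, push 0-2 larger items, repeat until empty) maps directly onto a binary heap, as the update notes. A minimal Python sketch using the standard heapq module follows; the process function and the expand callback are hypothetical names standing in for the "some work" step, which the question does not specify.

```python
import heapq
from fractions import Fraction


def process(start, expand):
    """Repeatedly pop the smallest item and push the items it produces.

    expand(item) stands in for the unspecified work step and must return
    0-2 new items, each larger than the item it was given.
    Returns the items in the order they were removed.
    """
    heap = [start]
    removed = []
    while heap:
        smallest = heapq.heappop(heap)   # O(log n)
        removed.append(smallest)
        for item in expand(smallest):
            heapq.heappush(heap, item)   # O(log n)
    return removed


def example_expand(x):
    # Toy rule for demonstration only: grow until the value reaches 1.
    half = Fraction(1, 2)
    return [x + half, x + 1] if x < 1 else []


order = process(Fraction(0), example_expand)
```

    Because every pushed item is larger than the one just removed, the items come out in non-decreasing order, and no operation other than push and pop-min is ever needed.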

    Read the article

  • WPF, how can I optimize lines and circles drawing ?

    - by Aurélien Ribon
    Hello! I am developing an application where I need to draw a graph on the screen. For this purpose, I use a Canvas and I put Controls on it. An example of such a drawing, as shown in the app, can be found here: http://free0.hiboox.com/images/1610/d82e0b7cc3521071ede601d3542c7bc5.png

    It works fine for simple graphs, but I also want to be able to draw very large graphs (hundreds of nodes). And when I try to draw a very large graph, it takes a LOT of time to render. My problem is that the code is not optimized at all; I just wanted it to work. So far, I have a Canvas on the one hand, and multiple Controls on the other. Circles and lines are listed in collections, and for each item of these collections I use a ControlTemplate defining a red circle, a black circle, a line, etc.

    Here is an example, the definition of a graph circle:

        <!-- STYLE : DISPLAY DATA NODE -->
        <Style TargetType="{x:Type flow.elements:DisplayNode}">
          <Setter Property="Canvas.Left" Value="{Binding X, RelativeSource={RelativeSource Self}}" />
          <Setter Property="Canvas.Top" Value="{Binding Y, RelativeSource={RelativeSource Self}}" />
          <Setter Property="Template">
            <Setter.Value>
              <ControlTemplate TargetType="{x:Type flow.elements:DisplayNode}">
                <!--TEMPLATE-->
                <Grid x:Name="grid" Margin="-30,-30,0,0">
                  <Ellipse x:Name="selectionEllipse" StrokeThickness="0" Width="60" Height="60" Opacity="0" IsHitTestVisible="False">
                    <Ellipse.Fill>
                      <RadialGradientBrush>
                        <GradientStop Color="Black" Offset="0.398" />
                        <GradientStop Offset="1" />
                      </RadialGradientBrush>
                    </Ellipse.Fill>
                  </Ellipse>
                  <Ellipse Stroke="Black" Width="30" Height="30" x:Name="ellipse">
                    <Ellipse.Fill>
                      <LinearGradientBrush EndPoint="0,1">
                        <GradientStop Offset="0" Color="White" />
                        <GradientStop Offset="1.5" Color="LightGray" />
                      </LinearGradientBrush>
                    </Ellipse.Fill>
                  </Ellipse>
                  <TextBlock x:Name="tblock" Text="{Binding NodeName, RelativeSource={RelativeSource Mode=TemplatedParent}}" Foreground="Black" VerticalAlignment="Center" HorizontalAlignment="Center" FontSize="10.667" />
                </Grid>
                <!--TRIGGERS-->
                <ControlTemplate.Triggers>
                  <!--DATAINPUT-->
                  <MultiTrigger>
                    <MultiTrigger.Conditions>
                      <Condition Property="SkinMode" Value="NODETYPE" />
                      <Condition Property="NodeType" Value="DATAINPUT" />
                    </MultiTrigger.Conditions>
                    <Setter TargetName="tblock" Property="Foreground" Value="White" />
                    <Setter TargetName="ellipse" Property="Fill">
                      <Setter.Value>
                        <LinearGradientBrush EndPoint="0,1">
                          <GradientStop Offset="-0.5" Color="White" />
                          <GradientStop Offset="1" Color="Black" />
                        </LinearGradientBrush>
                      </Setter.Value>
                    </Setter>
                  </MultiTrigger>
                  <!--DATAOUTPUT-->
                  <MultiTrigger>
                    <MultiTrigger.Conditions>
                      <Condition Property="SkinMode" Value="NODETYPE" />
                      <Condition Property="NodeType" Value="DATAOUTPUT" />
                    </MultiTrigger.Conditions>
                    <Setter TargetName="tblock" Property="Foreground" Value="White" />
                    <Setter TargetName="ellipse" Property="Fill">
                      <Setter.Value>
                        <LinearGradientBrush EndPoint="0,1">
                          <GradientStop Offset="-0.5" Color="White" />
                          <GradientStop Offset="1" Color="Black" />
                        </LinearGradientBrush>
                      </Setter.Value>
                    </Setter>
                  </MultiTrigger>
                  ....... THERE IS A TOTAL OF 7 MULTITRIGGERS .......
                </ControlTemplate.Triggers>
              </ControlTemplate>
            </Setter.Value>
          </Setter>
        </Style>

    Also, the lines are drawn using the Line control:

        <!-- STYLE : DISPLAY LINK -->
        <Style TargetType="{x:Type flow.elements:DisplayLink}">
          <Setter Property="Template">
            <Setter.Value>
              <ControlTemplate TargetType="{x:Type flow.elements:DisplayLink}">
                <!--TEMPLATE-->
                <Line X1="{Binding X1, RelativeSource={RelativeSource TemplatedParent}}"
                      X2="{Binding X2, RelativeSource={RelativeSource TemplatedParent}}"
                      Y1="{Binding Y1, RelativeSource={RelativeSource TemplatedParent}}"
                      Y2="{Binding Y2, RelativeSource={RelativeSource TemplatedParent}}"
                      Stroke="Gray" StrokeThickness="2" x:Name="line" />
                <!--TRIGGERS-->
                <ControlTemplate.Triggers>
                  <!--BRANCH : ASSERTION-->
                  <MultiTrigger>
                    <MultiTrigger.Conditions>
                      <Condition Property="SkinMode" Value="BRANCHTYPE" />
                      <Condition Property="BranchType" Value="ASSERTION" />
                    </MultiTrigger.Conditions>
                    <Setter TargetName="line" Property="Stroke" Value="#E0E0E0" />
                  </MultiTrigger>
                </ControlTemplate.Triggers>
              </ControlTemplate>
            </Setter.Value>
          </Setter>
        </Style>

    So, I need your advice. How can I drastically improve the rendering performance? Should I define each MultiTrigger circle rendering possibility in its own ControlTemplate instead? Is there a better line drawing technique? Should I open a DrawingContext and draw everything in one control, instead of having hundreds of controls?

    Read the article

  • Custom View - Avoid redrawing when non-interactive

    - by MasterGaurav
    I have a complex custom view: a photo collage. I observe that the view is redrawn whenever any UI interaction happens. How can I avoid a complete redraw of the view (for example, by using a cached UI), especially when I click the "back" button to return to the previous activity, since that also causes the view to be redrawn? While exploring the API and the web, I found a method, getDrawingCache(), but I don't know how to use it effectively. How do I use it effectively? I've had other issues with Custom Views that I outline here.

    Read the article

  • High performance text file parsing in .net

    - by diamandiev
    Here is the situation: I am writing a small program to parse server log files. I tested it with a log file containing several thousand requests (between 10000 and 20000, I don't know exactly). I have to load the log text files into memory so that I can query them, and this is what takes the most resources. The methods that take the most CPU time are these (worst culprits first):

      - string.split: splits the line into an array of values
      - string.contains: checks whether the user agent contains a specific agent string (to determine the browser ID)
      - string.tolower: various purposes
      - streamreader.readline: reads the log file line by line
      - string.startswith: determines whether a line is a column definition line or a line with values

    There were some others that I was able to replace. For example, the dictionary getter was taking lots of resources too, which I had not expected, since it's a dictionary and should have its keys indexed. I replaced it with a multidimensional array and saved some CPU time.

    I am running on a fast dual core, and the total time it takes to load the file I mentioned is about 1 second. That is really bad. Imagine a site that has tens of thousands of visits a day: it's going to take minutes to load the log file. So what are my alternatives, if any? I think this is just a .NET limitation and I can't do much about it.

    Read the article

  • Lazy loading the addthis script? (or lazy loading external js content dependent on already fired eve

    - by Keith Bentrup
    I want to have the addthis widget available for my users, but I want to lazy load it so that my page loads as quickly as possible. However, after trying it via a script tag and then via my lazy loading method, it appears to work only via the script tag. In the obfuscated code, I see something that looks like it depends on the DOMContentLoaded event (at least for Firefox). Since the DOMContentLoaded event has already fired, the widget doesn't render properly.

    What to do? I could just use a script tag (slower)... or could I fire (in a cross-browser way) the DOMContentLoaded (or equivalent) event? I have a feeling this may not be possible, because I believe that (as in jQuery) there are multiple tests for the content-ready event, and so multiple simulated events would have to occur. Nonetheless, this is an interesting problem, because I have now seen a couple of widgets assume that you are including their code via static script tags. It would be nice if they wrote code that was more useful to developers concerned about speed, but until then, is there a workaround? And/or are any of my assumptions wrong?

    Edit: Because the first answer to the question seemed to miss the point of my problem, I wanted to clarify the situation. This is about a specific problem. I'm not looking for yet another lazy load script or a "check if some dependencies are loaded" script. Specifically, this problem:

      - deals with external widgets that you do not have control over and that may or may not be obfuscated
      - involves delaying the load of the external widgets until they are needed, or at least until substantially after everything else has been loaded, including other deferred elements
      - because of how the widget was written, precludes existing, typical lazy loading paradigms

    While it's esoteric, I have seen it happen with a couple of widgets, where the widget developers assume that you're just willing to throw in another script tag at the bottom of the page. I'm looking to save those 500-1000 ms**, though, as numerous studies by Yahoo, Google, and Amazon show it to be important to the user's experience.

    **My testing with hammerhead and personal experience indicate that this will be my savings in this case.

    Read the article

  • Python performance improvement request for winkler

    - by Martlark
    I'm a Python n00b and I'd like some suggestions on how to improve the algorithm, to improve the performance of this method that computes the Jaro-Winkler distance of two names.

        def winklerCompareP(str1, str2):
            """Return approximate string comparator measure (between 0.0 and 1.0)

            USAGE:
              score = winkler(str1, str2)

            ARGUMENTS:
              str1  The first string
              str2  The second string

            DESCRIPTION:
              As described in 'An Application of the Fellegi-Sunter Model of
              Record Linkage to the 1990 U.S. Decennial Census' by William E. Winkler
              and Yves Thibaudeau.

              Based on the 'jaro' string comparator, but modifies it according to
              whether the first few characters are the same or not.
            """

            # Quick check if the strings are the same - - - - - - - - - - - - - - - -
            #
            jaro_winkler_marker_char = chr(1)
            if (str1 == str2):
                return 1.0

            len1 = len(str1)
            len2 = len(str2)
            halflen = max(len1,len2) // 2 - 1

            ass1 = ''  # Characters assigned in str1
            ass2 = ''  # Characters assigned in str2
            #ass1 = ''
            #ass2 = ''
            workstr1 = str1
            workstr2 = str2

            common1 = 0  # Number of common characters
            common2 = 0

            #print "'len1', str1[i], start, end, index, ass1, workstr2, common1"
            # Analyse the first string - - - - - - - - - - - - - - - - - - - - - - - -
            #
            for i in range(len1):
                start = max(0,i-halflen)
                end = min(i+halflen+1,len2)
                index = workstr2.find(str1[i],start,end)
                #print 'len1', str1[i], start, end, index, ass1, workstr2, common1
                if (index > -1):  # Found common character
                    common1 += 1
                    #ass1 += str1[i]
                    ass1 = ass1 + str1[i]
                    workstr2 = workstr2[:index]+jaro_winkler_marker_char+workstr2[index+1:]
            #print "str1 analyse result", ass1, common1

            # Analyse the second string - - - - - - - - - - - - - - - - - - - - - - -
            #
            for i in range(len2):
                start = max(0,i-halflen)
                end = min(i+halflen+1,len1)
                index = workstr1.find(str2[i],start,end)
                #print 'len2', str2[i], start, end, index, ass1, workstr1, common2
                if (index > -1):  # Found common character
                    common2 += 1
                    #ass2 += str2[i]
                    ass2 = ass2 + str2[i]
                    workstr1 = workstr1[:index]+jaro_winkler_marker_char+workstr1[index+1:]

            if (common1 != common2):
                print('Winkler: Wrong common values for strings "%s" and "%s"' % \
                      (str1, str2) + ', common1: %i, common2: %i' % (common1, common2) + \
                      ', common should be the same.')
                common1 = float(common1+common2) / 2.0  ##### This is just a fix #####

            if (common1 == 0):
                return 0.0

            # Compute number of transpositions - - - - - - - - - - - - - - - - - - - -
            #
            transposition = 0
            for i in range(len(ass1)):
                if (ass1[i] != ass2[i]):
                    transposition += 1
            transposition = transposition / 2.0

            # Now compute how many characters are common at beginning - - - - - - - -
            #
            minlen = min(len1,len2)
            for same in range(minlen+1):
                if (str1[:same] != str2[:same]):
                    break
            same -= 1
            if (same > 4):
                same = 4

            common1 = float(common1)
            w = 1./3.*(common1 / float(len1) + common1 / float(len2) + (common1-transposition) / common1)

            wn = w + same*0.1 * (1.0 - w)
            return wn
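    One possible direction, sketched here under stated assumptions rather than as a drop-in replacement: the function above rebuilds strings on every match (slicing and concatenation), which costs time proportional to the string length each time. The textbook Jaro-Winkler formulation below uses boolean match flags instead and makes a single matching pass. It is the standard algorithm, not the asker's exact variant, so results may differ in edge cases; the name jaro_winkler and the prefix_scale parameter are chosen for illustration.

```python
def jaro_winkler(s1, s2, prefix_scale=0.1):
    """Textbook Jaro-Winkler similarity in [0.0, 1.0] using match flags."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0

    # Find common characters within the sliding window.
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order.
    out_of_order = 0
    k = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order / 2.0

    jaro = (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3.0

    # Winkler bonus for a shared prefix of up to 4 characters.
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return jaro + prefix * prefix_scale * (1.0 - jaro)
```

    Whether this is actually faster for short names would need to be measured, for example with the timeit module, since for very short strings the constant overheads dominate.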

    Read the article

  • Is there a way to tell JVM to optimize my code before processing?

    - by Rogach
    I have a method that takes a long time to execute the first time. After several invocations, however, it takes about 30 times less time. So, to make my application respond to user interaction faster, I "warm up" this method (5 times) with some sample data when the application initializes. But this increases the app's start-up time. I have read that JVMs can optimize and compile Java code to native code, thus speeding things up. Is there some way to explicitly tell the JVM that I want this method to be compiled when the application starts up?

    Read the article

  • What is the best algorithm for this problem?

    - by mark
    What is the most efficient algorithm to solve the following problem?

    Given 6 arrays, D1, D2, D3, D4, D5 and D6, each containing 6 numbers like:

        D1[0] = number          D2[0] = number          ......  D6[0] = number
        D1[1] = another number  D2[1] = another number  ......
        ...
        D1[5] = yet another number  ......

    Given a second array ST1, containing 1 number:

        ST1[0] = 6

    Given a third array ans, containing 6 numbers:

        ans[0] = 3, ans[1] = 4, ans[2] = 5, ...... ans[5] = 8

    Using an index into the arrays D1, D2, D3, D4, D5 and D6 that runs from 0 to the number stored in ST1[0] minus one (in this example 6, so from 0 to 6-1), compare each entry of the ans array against each D array.

    My algorithm so far is as follows. I tried to keep everything unlooped as much as possible.

        EML  := ST1[0]   // number contained in ST1[0]
        EML1 := 0        // start index for the D arrays
        while EML1 < EML
            if D1[EML1] = ans[0] goto two
            if D2[EML1] = ans[0] goto two
            if D3[EML1] = ans[0] goto two
            if D4[EML1] = ans[0] goto two
            if D5[EML1] = ans[0] goto two
            if D6[EML1] = ans[0] goto two
            EML1 = EML1 + 1
        return 0         // bad row of numbers, if while ends

        two:
        EML1 := 0        // start index for the D arrays
        while EML1 < EML
            if D1[EML1] = ans[1] goto three
            if D2[EML1] = ans[1] goto three
            if D3[EML1] = ans[1] goto three
            if D4[EML1] = ans[1] goto three
            if D5[EML1] = ans[1] goto three
            if D6[EML1] = ans[1] goto three
            EML1 = EML1 + 1
        return 0

        three:
        EML1 := 0        // start index for the D arrays
        while EML1 < EML
            if D1[EML1] = ans[2] goto four
            if D2[EML1] = ans[2] goto four
            if D3[EML1] = ans[2] goto four
            if D4[EML1] = ans[2] goto four
            if D5[EML1] = ans[2] goto four
            if D6[EML1] = ans[2] goto four
            EML1 = EML1 + 1
        return 0

        four:
        EML1 := 0        // start index for the D arrays
        while EML1 < EML
            if D1[EML1] = ans[3] goto five
            if D2[EML1] = ans[3] goto five
            if D3[EML1] = ans[3] goto five
            if D4[EML1] = ans[3] goto five
            if D5[EML1] = ans[3] goto five
            if D6[EML1] = ans[3] goto five
            EML1 = EML1 + 1
        return 0

        five:
        EML1 := 0        // start index for the D arrays
        while EML1 < EML
            if D1[EML1] = ans[4] goto six
            if D2[EML1] = ans[4] goto six
            if D3[EML1] = ans[4] goto six
            if D4[EML1] = ans[4] goto six
            if D5[EML1] = ans[4] goto six
            if D6[EML1] = ans[4] goto six
            EML1 = EML1 + 1
        return 0

        six:
        EML1 := 0        // start index for the D arrays
        while EML1 < EML
            if D1[EML1] = ans[5] return 1   // good row of numbers
            if D2[EML1] = ans[5] return 1
            if D3[EML1] = ans[5] return 1
            if D4[EML1] = ans[5] return 1
            if D5[EML1] = ans[5] return 1
            if D6[EML1] = ans[5] return 1
            EML1 = EML1 + 1
        return 0

    The language of choice would be pure C.
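    Read as "a row is good when every number in ans occurs somewhere in the first ST1[0] entries of the D arrays", the check can be sketched much more compactly. The Python below only illustrates that reading of the problem (the question asks for C, and the function name is hypothetical); it collects the candidate values once and then tests each answer for membership.

```python
def all_answers_present(d_arrays, limit, ans):
    """Return True if every value in ans occurs in some d[:limit].

    d_arrays: the arrays D1..D6
    limit:    the number stored in ST1[0]
    ans:      the values that must all be found
    """
    seen = set()
    for d in d_arrays:
        seen.update(d[:limit])          # one pass over the D arrays
    return all(value in seen for value in ans)
```

    This touches each D entry once instead of rescanning the arrays for every element of ans. With only 36 values the difference is small, and in C a small lookup table or bitmap could play the role of the set if the range of the numbers is known.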

    Read the article

  • Optimizing Haskell code

    - by Masse
    I'm trying to learn Haskell, and after an article on reddit about Markov text chains, I decided to implement Markov text generation first in Python and now in Haskell. However, I noticed that my Python implementation is way faster than the Haskell version, even though Haskell is compiled to native code. I am wondering what I should do to make the Haskell code run faster. For now I believe it's so much slower because it uses Data.Map instead of hashmaps, but I'm not sure.

    I'll post the Python code and the Haskell code as well. With the same data, Python takes around 3 seconds and Haskell is closer to 16 seconds. It goes without saying that I'll take any constructive criticism :).

        import random
        import re
        import cPickle

        class Markov:
            def __init__(self, filenames):
                self.filenames = filenames
                self.cache = self.train(self.readfiles())
                picklefd = open("dump", "w")
                cPickle.dump(self.cache, picklefd)
                picklefd.close()

            def train(self, text):
                splitted = re.findall(r"(\w+|[.!?',])", text)
                print "Total of %d splitted words" % (len(splitted))
                cache = {}
                for i in xrange(len(splitted)-2):
                    pair = (splitted[i], splitted[i+1])
                    followup = splitted[i+2]
                    if pair in cache:
                        if followup not in cache[pair]:
                            cache[pair][followup] = 1
                        else:
                            cache[pair][followup] += 1
                    else:
                        cache[pair] = {followup: 1}
                return cache

            def readfiles(self):
                data = ""
                for filename in self.filenames:
                    fd = open(filename)
                    data += fd.read()
                    fd.close()
                return data

            def concat(self, words):
                sentence = ""
                for word in words:
                    if word in "'\",?!:;.":
                        sentence = sentence[0:-1] + word + " "
                    else:
                        sentence += word + " "
                return sentence

            def pickword(self, words):
                temp = [(k, words[k]) for k in words]
                results = []
                for (word, n) in temp:
                    results.append(word)
                    if n > 1:
                        for i in xrange(n-1):
                            results.append(word)
                return random.choice(results)

            def gentext(self, words):
                allwords = [k for k in self.cache]
                (first, second) = random.choice(filter(lambda (a,b): a.istitle(), [k for k in self.cache]))
                sentence = [first, second]
                while len(sentence) < words or sentence[-1] != ".":
                    current = (sentence[-2], sentence[-1])
                    if current in self.cache:
                        followup = self.pickword(self.cache[current])
                        sentence.append(followup)
                    else:
                        print "Wasn't able to. Breaking"
                        break
                print self.concat(sentence)

        Markov(["76.txt"])

    And the Haskell version:

        -- module Markov ( train , fox ) where

        import Debug.Trace
        import qualified Data.Map as M
        import qualified System.Random as R
        import qualified Data.ByteString.Char8 as B

        type Database = M.Map (B.ByteString, B.ByteString) (M.Map B.ByteString Int)

        train :: [B.ByteString] -> Database
        train (x:y:[]) = M.empty
        train (x:y:z:xs) =
            let l = train (y:z:xs)
            in M.insertWith' (\new old -> M.insertWith' (+) z 1 old) (x, y) (M.singleton z 1) `seq` l

        main = do
            contents <- B.readFile "76.txt"
            print $ train $ B.words contents

        fox="The quick brown fox jumps over the brown fox who is slow jumps over the brown fox who is dead."
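    To make the data structure being compared concrete: both programs build a map from a word pair to a table of follow-up counts. A compact Python 3 sketch of just that training step is shown below. It is a restatement for illustration, not the asker's code; it uses defaultdict and Counter from the standard library and drops the capture group from the regular expression, which does not change what findall returns here.

```python
import re
from collections import Counter, defaultdict


def train(text):
    """Map each (word1, word2) pair to a Counter of the words that follow it."""
    tokens = re.findall(r"\w+|[.!?',]", text)
    cache = defaultdict(Counter)
    # Slide a window of three tokens over the text.
    for first, second, followup in zip(tokens, tokens[1:], tokens[2:]):
        cache[(first, second)][followup] += 1
    return cache


fox = ("The quick brown fox jumps over the brown fox who is slow "
       "jumps over the brown fox who is dead.")
model = train(fox)
```

    On the fox sentence from the question, the pair ("brown", "fox") is followed once by "jumps" and twice by "who", which is the shape of result the Haskell Database type (a Map of pairs to a Map of counts) is meant to hold as well.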

    Read the article

  • Is there a way to optimize this mysql query...?

    - by SpikETidE
    Hi everyone. Say I have these two tables:

        Table 1 : Hotels

        hotel_id   hotel_name
        1          abc
        2          xyz
        3          efg

        Table 2 : Payments

        payment_id   payment_date   hotel_id   total_amt   comission
        p1           23-03-2010     1          100         10
        p2           23-03-2010     2          50          5
        p3           23-03-2010     2          200         25
        p4           23-03-2010     1          40          2

    Now, I need to get the following details from the two tables:

      - Given a particular date (say, 23-03-2010), the sum of total_amt for each hotel for which a payment has been made on that date.
      - All the rows that have the date 23-03-2010, ordered by hotel name.

    A sample output is as follows:

        +------------+------------+------------+---------------+
        | hotel_name | date       | total_amt  | commission    |
        +------------+------------+------------+---------------+
        | * abc      | 23-03-2010 | 140        | 12            |
        +------------+------------+------------+---------------+
        |+-----------+------------+------------+--------------+|
        || paymt_id  | date       | total_amt  | commission   ||
        |+-----------+------------+------------+--------------+|
        || p1        | 23-03-2010 | 100        | 10           ||
        |+-----------+------------+------------+--------------+|
        || p4        | 23-03-2010 | 40         | 2            ||
        |+-----------+------------+------------+--------------+|
        +------------+------------+------------+---------------+
        | * xyz      | 23-03-2010 | 250        | 30            |
        +------------+------------+------------+---------------+
        |+-----------+------------+------------+--------------+|
        || paymt_id  | date       | total_amt  | commission   ||
        |+-----------+------------+------------+--------------+|
        || p2        | 23-03-2010 | 50         | 5            ||
        |+-----------+------------+------------+--------------+|
        || p3        | 23-03-2010 | 200        | 25           ||
        |+-----------+------------+------------+--------------+|
        +------------------------------------------------------+

    Above is a sample of the table that has to be printed. The idea is first to show the consolidated details of each hotel, and when the '*' next to the hotel name is clicked, the breakdown of the payment details becomes visible. That part can be done with some jQuery, and the table itself can be generated with PHP.

    Right now I am using two separate queries: one to get the sum of the amount and commission grouped by hotel name, and another to get the individual rows for each entry having that date. This is, of course, because grouping the records to calculate sum() returns only one row per hotel, with the sum of the amounts.

    Is there a way to combine these two queries into a single one and do the operation in a more optimized way? I hope I am being clear. Thanks for your time and replies.
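    One way to get by with a single query, offered as a sketch under assumptions rather than as the answer: fetch only the detail rows for the date (joined to the hotel name) and compute the per-hotel sums in application code while building the table. The question uses PHP; the Python below only illustrates the grouping step, and the dictionary keys and the function name are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def summarize(rows):
    """Group detail rows by hotel and attach per-hotel totals.

    rows: dicts with hotel_name, payment_id, total_amt and commission,
    as they might come back from one joined query for a single date.
    """
    rows = sorted(rows, key=itemgetter("hotel_name"))  # groupby needs sorted input
    report = []
    for hotel, group in groupby(rows, key=itemgetter("hotel_name")):
        payments = list(group)
        report.append({
            "hotel_name": hotel,
            "total_amt": sum(p["total_amt"] for p in payments),
            "commission": sum(p["commission"] for p in payments),
            "payments": payments,
        })
    return report


rows = [
    {"hotel_name": "abc", "payment_id": "p1", "total_amt": 100, "commission": 10},
    {"hotel_name": "xyz", "payment_id": "p2", "total_amt": 50, "commission": 5},
    {"hotel_name": "xyz", "payment_id": "p3", "total_amt": 200, "commission": 25},
    {"hotel_name": "abc", "payment_id": "p4", "total_amt": 40, "commission": 2},
]
report = summarize(rows)
```

    With the sample data this yields the totals 140 / 12 for abc and 250 / 30 for xyz, matching the sample output, with each hotel's payment rows attached for the expandable breakdown.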

    Read the article

  • How can I implement the Gale-Shapley stable marriage algorithm in Perl?

    - by srk
    Problem: We have an equal number of men and women. Each man has a preference score toward each woman, and so does each woman toward each man. Each of the men and women has certain interests, and we calculate the preference scores based on those interests.

    Initially we have an input file with x columns. The first column is the person (man/woman) id; ids are simply the numbers 0..n (the first half are men and the second half are women). The remaining x-1 columns hold the interests, which are integers too. Using this n by x-1 matrix, we have come up with an n by n/2 matrix. The new matrix has all men and women as its rows and the scores for the opposite sex in its columns.

    We have to sort the scores in descending order, and we also need to know the id of the person each score relates to after sorting. So here I wanted to use a hash table. Once we have the scores, we need to make up pairs, for which we need to follow some rules.

    My trouble is with the second matrix of n by n/2, which needs to tell me how much preference each man/woman has for each woman/man. I need these scores sorted so that I know who is the first preferred woman/man, the second preferred, and so on, for each man/woman. I hope to get good suggestions on the data structures I should use. I prefer PHP or Perl. Thank you in advance.

    Hey guys, this is not homework. This is a slightly modified version of the stable marriage algorithm. I have a working solution; I am only working on optimizing my code.

    More info: it is very similar to the stable marriage problem, but here we need to calculate the scores based on the interests the people share. I have implemented it the way you see it on the wiki page http://en.wikipedia.org/wiki/Stable_marriage_problem. My problem is not solving the problem: I solved it and can run it. I am just trying to find a better solution, so I am asking for suggestions on the type of data structure to use.

    Conceptually, I tried using an array of hashes, where the array index gives the person id and the hash in it gives the ids and scores (id => score) in sorted order. I initially start with an array of hashes. Then I sort the hashes by value, but I could not store the sorted hashes back in an array. So I just stored the keys after sorting and used these to get the values from my initial unsorted hashes. Can we store the hashes after sorting? Can you suggest a better structure?
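    The structure described at the end (keep the unsorted id => score hash, and store only the sorted keys) is a reasonable layout on its own. A minimal sketch of it follows, in Python rather than Perl or PHP and with hypothetical function names: one list of candidate ids per person, ordered by descending score, plus a rank lookup that lets the matching step compare two candidates in constant time.

```python
def preference_order(scores):
    """scores[p] maps candidate id -> score for person p.

    Returns, for each person, the candidate ids sorted by descending score.
    """
    return [sorted(row, key=row.get, reverse=True) for row in scores]


def rank_table(order):
    """rank[p][candidate] = position of candidate in p's preference list."""
    return [{candidate: rank for rank, candidate in enumerate(prefs)}
            for prefs in order]


# Toy data: persons 0-1 are men, 2-3 are women (ids as in the question).
scores = [
    {2: 0.2, 3: 0.9},
    {2: 0.7, 3: 0.1},
    {0: 0.5, 1: 0.4},
    {0: 0.3, 1: 0.8},
]
order = preference_order(scores)
rank = rank_table(order)
```

    The sorted lists answer "who is the k-th preferred partner", and the rank table answers "does this person prefer a to b" without re-sorting. The original score hashes stay untouched for any later lookup of the score itself.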

    Read the article

  • cheapest way to draw a fullscreen quad

    - by Soubok
    I'm wondering if there is a faster way to draw a full-screen quad in OpenGL:

        NewList();
        PushMatrix();
        LoadIdentity();
        MatrixMode(PROJECTION);
        PushMatrix();
        LoadIdentity();
        Begin(QUADS);
        Vertex(-1,-1,0);
        Vertex(1,-1,0);
        Vertex(1,1,0);
        Vertex(-1,1,0);
        End();
        PopMatrix();
        MatrixMode(MODELVIEW);
        PopMatrix();
        EndList();

    Read the article

  • Improving the speed of php

    - by cast01
    I'm currently working on a website in PHP, and I'm wondering what the best practices/methods are to reduce the time requests take. I've built the site in a modular way, so a page consists of a number of modules, each of which needs to request information. For example, I have a cart module that (if a cart is set) fetches the cart with the id (stored in a session variable) from the database and returns its contents. I have another module that lists categories, and this needs to fetch the categories from the database. My system is built with models, and each model might also make a request; for example, a category model will make a request to get the products in that category.

    Read the article

  • PHP Hashtable array optimisation.

    - by hiprakhar
    I made a PHP app which was taking about 0.0070 sec to execute. Then I added a hashtable array with about 2000 values, and the execution time suddenly went up to about 0.0700 sec, almost 10 times the previous value. I tried commenting out the part where I search inside the hashtable array (while leaving the array defined), but the execution time still remains about 0.0500 sec. The array is something like: $subjectinfo = array( 'TPT753' => 'Industrial Training', 'TPT801' => 'High Polymeric Engineering', 'TPT802' => 'Corrosion Engineering', 'TPT803' => 'Decorative ,Industrial And High Performance Coatings', 'TPT851' => 'Project'); Is there any way to optimize this part? I cannot use a database, as I am running this app on Google App Engine, which still does not support a JDO database for PHP. Some more code from the app: function getsubjectinfo($name) { $subjectinfo = array( 'TPT753' => 'Industrial Training', 'TPT801' => 'High Polymeric Engineering', 'TPT802' => 'Corrosion Engineering', 'TPT803' => 'Decorative ,Industrial And High Performance Coatings', 'TPT851' => 'Project'); $name = str_replace("-", "", $name); $name = str_replace(" ", "", $name); if (isset($subjectinfo["$name"])) return "(".$subjectinfo["$name"].")"; else return ""; } Then I am using the following statement 2-3 times in the app: echo $key." ".$this->getsubjectinfo($key)
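    Since the array literal sits inside getsubjectinfo(), it may be rebuilt on every call; whether that explains the measured slowdown is an assumption that would need profiling. The sketch below shows the general idea of building the table once and reusing it, written in Python purely as a language-neutral illustration (in PHP a static variable or class constant would be the analogous place). Only the five entries shown in the question are reproduced, and the names `SUBJECT_INFO` and `get_subject_info` are hypothetical.

```python
# Built once at import time and shared by every call.
# Hypothetical stand-in for the ~2000-entry table from the question.
SUBJECT_INFO = {
    "TPT753": "Industrial Training",
    "TPT801": "High Polymeric Engineering",
    "TPT802": "Corrosion Engineering",
    "TPT803": "Decorative ,Industrial And High Performance Coatings",
    "TPT851": "Project",
}


def get_subject_info(name):
    """Return the subject title in parentheses, or "" if the code is unknown."""
    # Normalise the code the same way the original does: drop "-" and " ".
    key = name.replace("-", "").replace(" ", "")
    title = SUBJECT_INFO.get(key)
    return f"({title})" if title is not None else ""
```

    Each call is then a single hash lookup, and the cost of constructing the table is paid once per process rather than once per call.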

    Read the article

  • Why is MySQL with InnoDB doing a table scan when key exists and choosing to examine 70 times more ro

    - by andysk
    Hello, I'm troubleshooting a query performance problem. Here's an expected query plan from explain:
    mysql> explain select * from table1 where tdcol between '2010-04-13:00:00' and '2010-04-14 03:16';
    +----+-------------+--------------------+-------+---------------+--------------+---------+------+---------+-------------+
    | id | select_type | table              | type  | possible_keys | key          | key_len | ref  | rows    | Extra       |
    +----+-------------+--------------------+-------+---------------+--------------+---------+------+---------+-------------+
    |  1 | SIMPLE      | table1             | range | tdcol         | tdcol        | 8       | NULL | 5437848 | Using where |
    +----+-------------+--------------------+-------+---------------+--------------+---------+------+---------+-------------+
    1 row in set (0.00 sec)
    That makes sense, since the index named tdcol (KEY tdcol (tdcol)) is used, and about 5M rows should be selected from this query. However, if I query for just one more minute of data, we get this query plan:
    mysql> explain select * from table1 where tdcol between '2010-04-13 00:00' and '2010-04-14 03:17';
    +----+-------------+--------------------+------+---------------+------+---------+------+-----------+-------------+
    | id | select_type | table              | type | possible_keys | key  | key_len | ref  | rows      | Extra       |
    +----+-------------+--------------------+------+---------------+------+---------+------+-----------+-------------+
    |  1 | SIMPLE      | table1             | ALL  | tdcol         | NULL | NULL    | NULL | 381601300 | Using where |
    +----+-------------+--------------------+------+---------------+------+---------+------+-----------+-------------+
    1 row in set (0.00 sec)
    The optimizer believes that the scan will be better, but it's over 70x more rows to examine, so I have a hard time believing that the table scan is better. Also, the 'USE KEY tdcol' syntax does not change the query plan. Thanks in advance for any help, and I'm more than happy to provide more info/answer questions.

    Read the article

  • Faster integer division when denominator is known?

    - by aaa
    Hi, I am working on a GPU device which has very high integer division latency, several hundred cycles, and I am looking to optimize divisions. All divisions are by a denominator from the set { 1,3,6,10 }; the numerator, however, is a positive runtime value, roughly 32000 or less. Due to memory constraints, a lookup table is not an option. Can you think of alternatives? I have thought of computing floating-point inverses and using those to multiply the numerator. Thanks
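    An integer-only alternative to floating-point inverses is to multiply by a precomputed "magic" constant and shift right, which needs no table and no division. The sketch below uses Python only to check the arithmetic; on the GPU the same expression would be one 32-bit multiply and one shift. The constants are ceil(2^shift / d), and the names `MAGIC`, `fast_div` and `verify` are made up for this illustration.

```python
# (multiplier, shift) per divisor, with multiplier = ceil(2**shift / divisor).
MAGIC = {
    1: (1, 0),
    3: (43691, 17),
    6: (43691, 18),   # n // 6 == (n // 3) // 2, so one extra shift bit
    10: (52429, 19),
}


def fast_div(n, d):
    """Compute n // d for d in {1, 3, 6, 10} with a multiply and a shift."""
    multiplier, shift = MAGIC[d]
    return (n * multiplier) >> shift


def verify(limit=65536):
    """Exhaustively compare against true integer division for 0 <= n < limit."""
    return all(fast_div(n, d) == n // d for d in MAGIC for n in range(limit))
```

    For numerators up to 32000 the largest intermediate product is 32000 * 52429 = 1,677,728,000, which fits in a 32-bit register. When the divisor is a compile-time constant, compilers typically apply this transformation on their own, so it is worth checking the generated code first.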

    Read the article

  • will a mysql query run slower if one of the tables involved has no index defined??

    - by lock
    There's this already-populated database which came from another dev. I'm not sure what went on in that dev's mind when he created the tables, but one of our scripts has this query involving 4 tables, and it runs super slow: SELECT a.col_1, a.col_2, a.col_3, a.col_4, a.col_5, a.col_6, a.col_7 FROM a, b, c, d WHERE a.id = b.id AND b.c_id = c.id AND c.id = d.c_id AND a.col_8 = '$col_8' AND d.g_id = '$g_id' AND c.private = '1' NOTE: $col_8 and $g_id are variables from a form. It's only my theory that this is due to tables b and c not having an index, although I'm guessing the dev didn't think it was necessary since those tables only describe relations between a and d: b tells that the data in a belongs to a certain user, and c tells that the user belongs to a group in d. As you can see, there's not even a join or other extensive query functions used, but this query, which returns only around 100 rows, takes 2 minutes to execute. Anyway, my question is simply this post's title: will a MySQL query run slower if one of the tables involved has no index defined?
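    A comma-separated FROM list with equality conditions in WHERE is still a join, so the columns compared there are the natural candidates for indexes. The sketch below shows how a query plan changes once such a column is indexed. It uses SQLite through Python's standard library purely because it runs anywhere; MySQL's planner and its EXPLAIN output differ, and the table `b`, its columns and the index name `idx_b_c_id` are simplified stand-ins rather than the real schema.

```python
import sqlite3


def plan_for(conn, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (id INTEGER, c_id INTEGER)")
conn.executemany(
    "INSERT INTO b VALUES (?, ?)", [(i, i % 50) for i in range(1000)]
)

query = "SELECT id FROM b WHERE c_id = ?"

# Without an index the only option is to read every row of b.
plan_before = plan_for(conn, query, (7,))

# With an index on the compared column the planner can seek directly.
conn.execute("CREATE INDEX idx_b_c_id ON b (c_id)")
plan_after = plan_for(conn, query, (7,))
```

    Printing `plan_before` and `plan_after` shows a full scan in the first case and a search using `idx_b_c_id` in the second. Running EXPLAIN on the original four-table query in MySQL is the equivalent way to see which tables are being scanned in full.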

    Read the article

  • When optimizing database queries, what exactly is the relationship between number of queries and siz

    - by williamjones
    To optimize application speed, everyone always advises to minimize the number of queries an application makes to the database, consolidating them into fewer queries that retrieve more wherever possible. However, this also always comes with the caution that data transferred is still data transferred, and just because you are making fewer queries doesn't make the data transferred free. I'm in a situation where I can over-include on the query in order to cut down the number of queries, and simply remove the unwanted data in the application code. Is there any type of a rule of thumb on how much of a cost there is to each query, to know when to optimize number of queries versus size of queries? I've tried to Google for objective performance analysis data, but surprisingly haven't been able to find anything like that. Clearly this relationship will change for factors such as when the database grows in size, making this somewhat individualized, but surely this is not so individualized that a broad sense of the landscape can't be drawn out? I'm looking for general answers, but for what it's worth, I'm running an application on Heroku.com, which means Ruby on Rails with a Postgres database.
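    The trade-off can be made concrete by counting round trips for the same result fetched two ways. The sketch below is only an illustration using SQLite from Python's standard library, not Rails or Postgres; the `categories` and `products` tables and both function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, category_id INTEGER, name TEXT);
    INSERT INTO categories VALUES (1, 'books'), (2, 'games');
    INSERT INTO products VALUES (1, 1, 'novel'), (2, 1, 'atlas'), (3, 2, 'chess');
""")


def per_category_queries(conn):
    """One query for the categories plus one per category (N+1 round trips)."""
    queries = 1
    cats = conn.execute("SELECT id, name FROM categories ORDER BY id").fetchall()
    result = {}
    for cat_id, cat_name in cats:
        rows = conn.execute(
            "SELECT name FROM products WHERE category_id = ? ORDER BY id",
            (cat_id,),
        ).fetchall()
        queries += 1
        result[cat_name] = [row[0] for row in rows]
    return result, queries


def single_join_query(conn):
    """One round trip; the category name is repeated on every product row."""
    rows = conn.execute(
        "SELECT c.name, p.name FROM categories c "
        "JOIN products p ON p.category_id = c.id ORDER BY c.id, p.id"
    ).fetchall()
    result = {}
    for cat_name, prod_name in rows:
        result.setdefault(cat_name, []).append(prod_name)
    return result, 1
```

    Both functions return the same grouping, but the first issues one query per category while the second pays for its single round trip with repeated category names in the payload. Over a network connection the per-query latency usually dominates until the duplicated data becomes large, so measuring both variants against the real database is the only reliable guide.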

    Read the article
