premature optimization - Page 56

Random Complete System Unresponsiveness Running Mathematical Functions

- by Computer Guru

I have a program that loads a file (anywhere from 10MB to 5GB) a chunk at a time (ReadFile), and for each chunk performs a set of mathematical operations (basically calculates the hash). After calculating the hash, it stores info about the chunk in an STL map (basically <chunkID, hash>) and then writes the chunk itself to another file (WriteFile). That's all it does. This program will cause certain PCs to choke and die. The mouse begins to stutter, the task manager takes 2 min to show, ctrl+alt+del is unresponsive, running programs are slow.... the works. I've done literally everything I can think of to optimize the program, and have triple-checked all objects. What I've done: Tried different (less intensive) hashing algorithms. Switched all allocations to nedmalloc instead of the default new operator Switched from stl::map to unordered_set, found the performance to still be abysmal, so I switched again to Google's dense_hash_map. Converted all objects to store pointers to objects instead of the objects themselves. Caching all Read and Write operations. Instead of reading a 16k chunk of the file and performing the math on it, I read 4MB into a buffer and read 16k chunks from there instead. Same for all write operations - they are coalesced into 4MB blocks before being written to disk. Run extensive profiling with Visual Studio 2010, AMD Code Analyst, and perfmon. Set the thread priority to THREAD_MODE_BACKGROUND_BEGIN Set the thread priority to THREAD_PRIORITY_IDLE Added a Sleep(100) call after every loop. Even after all this, the application still results in a system-wide hang on certain machines under certain circumstances. Perfmon and Process Explorer show minimal CPU usage (with the sleep), no constant reads/writes from disk, few hard pagefaults (and only ~30k pagefaults in the lifetime of the application on a 5GB input file), little virtual memory (never more than 150MB), no leaked handles, no memory leaks. The machines I've tested it on run Windows XP - Windows 7, x86 and x64 versions included. None have less than 2GB RAM, though the problem is always exacerbated under lower memory conditions. I'm at a loss as to what to do next. I don't know what's causing it - I'm torn between CPU or Memory as the culprit. CPU because without the sleep and under different thread priorities the system performances changes noticeably. Memory because there's a huge difference in how often the issue occurs when using unordered_set vs Google's dense_hash_map. What's really weird? Obviously, the NT kernel design is supposed to prevent this sort of behavior from ever occurring (a user-mode application driving the system to this sort of extreme poor performance!?)..... but when I compile the code and run it on OS X or Linux (it's fairly standard C++ throughout) it performs excellently even on poor machines with little RAM and weaker CPUs. What am I supposed to do next? How do I know what the hell it is that Windows is doing behind the scenes that's killing system performance, when all the indicators are that the application itself isn't doing anything extreme? Any advice would be most welcome.

Read the article

What is the best algorithm for this array-comparison problem?

- by mark

What is the most efficient for speed algorithm to solve the following problem? Given 6 arrays, D1,D2,D3,D4,D5 and D6 each containing 6 numbers like: D1[0] = number D2[0] = number ...... D6[0] = number D1[1] = another number D2[1] = another number .... ..... .... ...... .... D1[5] = yet another number .... ...... .... Given a second array ST1, containing 1 number: ST1[0] = 6 Given a third array ans, containing 6 numbers: ans[0] = 3, ans[1] = 4, ans[2] = 5, ......ans[5] = 8 Using as index for the arrays D1,D2,D3,D4,D5 and D6, the number that goes from 0, to the number stored in ST1[0] minus one, in this example 6, so from 0 to 6-1, compare each res array against each D array My algorithm so far is: I tried to keep everything unlooped as much as possible. EML := ST1[0] //number contained in ST1[0] EML1 := 0 //start index for the arrays D While EML1 < EML if D1[ELM1] = ans[0] goto two if D2[ELM1] = ans[0] goto two if D3[ELM1] = ans[0] goto two if D4[ELM1] = ans[0] goto two if D5[ELM1] = ans[0] goto two if D6[ELM1] = ans[0] goto two ELM1 = ELM1 + 1 return 0 //If the ans[0] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers two: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[1] goto three if D2[ELM1] = ans[1] goto three if D3[ELM1] = ans[1] goto three if D4[ELM1] = ans[1] goto three if D5[ELM1] = ans[1] goto three if D6[ELM1] = ans[1] goto three ELM1 = ELM1 + 1 return 0 //If the ans[1] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers three: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[2] goto four if D2[ELM1] = ans[2] goto four if D3[ELM1] = ans[2] goto four if D4[ELM1] = ans[2] goto four if D5[ELM1] = ans[2] goto four if D6[ELM1] = ans[2] goto four ELM1 = ELM1 + 1 return 0 //If the ans[2] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers four: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[3] goto five if D2[ELM1] = ans[3] goto five if D3[ELM1] = ans[3] goto five if D4[ELM1] = ans[3] goto five if D5[ELM1] = ans[3] goto five if D6[ELM1] = ans[3] goto five ELM1 = ELM1 + 1 return 0 //If the ans[3] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers five: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[4] goto six if D2[ELM1] = ans[4] goto six if D3[ELM1] = ans[4] goto six if D4[ELM1] = ans[4] goto six if D5[ELM1] = ans[4] goto six if D6[ELM1] = ans[4] goto six ELM1 = ELM1 + 1 return 0 //If the ans[4] number is not found in either D1[0-6], D2[0-6].... D6[0-6] return 0 which will then exclude ans[0-6] numbers six: EML1 := 0 start index for arrays Ds While EML1 < EML if D1[ELM1] = ans[5] return 1 ////If the ans[1] number is not found in either D1[0-6]..... if D2[ELM1] = ans[5] return 1 which will then include ans[0-6] numbers return 1 if D3[ELM1] = ans[5] return 1 if D4[ELM1] = ans[5] return 1 if D5[ELM1] = ans[5] return 1 if D6[ELM1] = ans[5] return 1 ELM1 = ELM1 + 1 return 0 As language of choice, it would be pure c

Read the article

Is it possible to do A/B testing by page rather than by individual?

- by mojones

Lets say I have a simple ecommerce site that sells 100 different t-shirt designs. I want to do some a/b testing to optimise my sales. Let's say I want to test two different "buy" buttons. Normally, I would use AB testing to randomly assign each visitor to see button A or button B (and try to ensure that that the user experience is consistent by storing that assignment in session, cookies etc). Would it be possible to take a different approach and instead, randomly assign each of my 100 designs to use button A or B, and measure the conversion rate as (number of sales of design n) / (pageviews of design n) This approach would seem to have some advantages; I would not have to worry about keeping the user experience consistent - a given page (e.g. www.example.com/viewdesign?id=6) would always return the same html. If I were to test different prices, it would be far less distressing to the user to see different prices for different designs than different prices for the same design on different computers. I also wonder whether it might be better for SEO - my suspicion is that Google would "prefer" that it always sees the same html when crawling a page. Obviously this approach would only be suitable for a limited number of sites; I was just wondering if anyone has tried it?

Read the article

How does loop address alignment affect the speed on Intel x86_64?

- by Alexander Gololobov

I'm seeing 15% performance degradation of the same C++ code compiled to exactly same machine instructions but located on differently aligned addresses. When my tiny main loop starts at 0x415220 it's faster then when it is at 0x415250. I'm running this on Intel Core2 Duo. I use gcc 4.4.5 on x86_64 Ubuntu. Can anybody explain the cause of slowdown and how I can force gcc to optimally align the loop? Here is the disassembly for both cases with profiler annotation: 415220 576 12.56% |XXXXXXXXXXXXXX 48 c1 eb 08 shr $0x8,%rbx 415224 110 2.40% |XX 0f b6 c3 movzbl %bl,%eax 415227 0.00% | 41 0f b6 04 00 movzbl (%r8,%rax,1),%eax 41522c 40 0.87% | 48 8b 04 c1 mov (%rcx,%rax,8),%rax 415230 806 17.58% |XXXXXXXXXXXXXXXXXXX 4c 63 f8 movslq %eax,%r15 415233 186 4.06% |XXXX 48 c1 e8 20 shr $0x20,%rax 415237 102 2.22% |XX 4c 01 f9 add %r15,%rcx 41523a 414 9.03% |XXXXXXXXXX a8 0f test $0xf,%al 41523c 680 14.83% |XXXXXXXXXXXXXXXX 74 45 je 415283 ::Run(char const*, char const*)+0x4b3 41523e 0.00% | 41 89 c7 mov %eax,%r15d 415241 0.00% | 41 83 e7 01 and $0x1,%r15d 415245 0.00% | 41 83 ff 01 cmp $0x1,%r15d 415249 0.00% | 41 89 c7 mov %eax,%r15d 415250 679 13.05% |XXXXXXXXXXXXXXXX 48 c1 eb 08 shr $0x8,%rbx 415254 124 2.38% |XX 0f b6 c3 movzbl %bl,%eax 415257 0.00% | 41 0f b6 04 00 movzbl (%r8,%rax,1),%eax 41525c 43 0.83% |X 48 8b 04 c1 mov (%rcx,%rax,8),%rax 415260 828 15.91% |XXXXXXXXXXXXXXXXXXX 4c 63 f8 movslq %eax,%r15 415263 388 7.46% |XXXXXXXXX 48 c1 e8 20 shr $0x20,%rax 415267 141 2.71% |XXX 4c 01 f9 add %r15,%rcx 41526a 634 12.18% |XXXXXXXXXXXXXXX a8 0f test $0xf,%al 41526c 749 14.39% |XXXXXXXXXXXXXXXXXX 74 45 je 4152b3 ::Run(char const*, char const*)+0x4c3 41526e 0.00% | 41 89 c7 mov %eax,%r15d 415271 0.00% | 41 83 e7 01 and $0x1,%r15d 415275 0.00% | 41 83 ff 01 cmp $0x1,%r15d 415279 0.00% | 41 89 c7 mov %eax,%r15d

Read the article

Very slow Eclipse 4.2, how to make it more responsive?

- by Laurent

I'm using Eclipse PDT on a rather large PHP project and the IDE is almost unusable. It takes nearly 30 seconds to open a file, and other actions, like selecting a folder in the file explorer, editing some text, etc. are equally slow. I followed various instructions to speed it up but nothing seems to work. This is my current eclipse.ini file. Any idea how I can improve it? -startup plugins/org.eclipse.equinox.launcher_1.3.0.v20120522-1813.jar --launcher.library plugins/org.eclipse.equinox.launcher.win32.win32.x86_1.1.200.v20120522-1813 -showsplash org.eclipse.platform --launcher.XXMaxPermSize 256m --launcher.defaultAction openFile -vmargs -server -Dosgi.requiredJavaVersion=1.7 -Xmn128m -Xms1024m -Xmx1024m -Xss2m -XX:PermSize=128m -XX:MaxPermSize=128m -XX:+UseParallelGC System: Eclipse 4.2.0, Windows 7, 4 GB RAM

Read the article

Jruby rspec to be run parallely

- by Priyank

Hi. Is there something like Spork for Jruby too? We want to parallelize our specs to run faster and pre-load the classes while running the rake task; however we have not been able to do so. Since our project is considerable in size, specs take about 15 minutes to complete and this poses a serious challenge to quick turnaround. Any ideas are more than welcome. Cheers

Read the article

Complicated idea - how to create car racing for my RPG game's players

- by Donator

So, I want to create car racing for my RPG game's players. Player can create race and choose how many participants can participate in race. After race is being created, other people can join it. When the maximum participants are collected, race begins. My idea, when the last participant joins, then instantly choose the winner (who's car is the best, that person wins), but how can I do it? If I choose to pick the winner after the last participant joins, then I have to put many queries in one page (select data from table, then delete the race, then select players' cars' statistics and pick the winner and then again, using mysql, send message to everyone). But this idea is really not optimal and it will lag cruelly for that last person. Maybe you have any ideas how I can avoid lag and make it more optimal. Thank you very much.

Read the article

Optimize date query for large child tables: GiST or GIN?

- by Dave Jarvis

Problem 72 child tables, each having a year index and a station index, are defined as follows: CREATE TABLE climate.measurement_12_013 ( -- Inherited from table climate.measurement_12_013: id bigint NOT NULL DEFAULT nextval('climate.measurement_id_seq'::regclass), -- Inherited from table climate.measurement_12_013: station_id integer NOT NULL, -- Inherited from table climate.measurement_12_013: taken date NOT NULL, -- Inherited from table climate.measurement_12_013: amount numeric(8,2) NOT NULL, -- Inherited from table climate.measurement_12_013: category_id smallint NOT NULL, -- Inherited from table climate.measurement_12_013: flag character varying(1) NOT NULL DEFAULT ' '::character varying, CONSTRAINT measurement_12_013_category_id_check CHECK (category_id = 7), CONSTRAINT measurement_12_013_taken_check CHECK (date_part('month'::text, taken)::integer = 12) ) INHERITS (climate.measurement) CREATE INDEX measurement_12_013_s_idx ON climate.measurement_12_013 USING btree (station_id); CREATE INDEX measurement_12_013_y_idx ON climate.measurement_12_013 USING btree (date_part('year'::text, taken)); (Foreign key constraints to be added later.) The following query runs abysmally slow due to a full table scan: SELECT count(1) AS measurements, avg(m.amount) AS amount FROM climate.measurement m WHERE m.station_id IN ( SELECT s.id FROM climate.station s, climate.city c WHERE -- For one city ... -- c.id = 5182 AND -- Where stations are within an elevation range ... -- s.elevation BETWEEN 0 AND 3000 AND 6371.009 * SQRT( POW(RADIANS(c.latitude_decimal - s.latitude_decimal), 2) + (COS(RADIANS(c.latitude_decimal + s.latitude_decimal) / 2) * POW(RADIANS(c.longitude_decimal - s.longitude_decimal), 2)) ) <= 50 ) AND -- -- Begin extracting the data from the database. -- -- The data before 1900 is shaky; insufficient after 2009. -- extract( YEAR FROM m.taken ) BETWEEN 1900 AND 2009 AND -- Whittled down by category ... -- m.category_id = 1 AND m.taken BETWEEN -- Start date. (extract( YEAR FROM m.taken )||'-01-01')::date AND -- End date. Calculated by checking to see if the end date wraps -- into the next year. If it does, then add 1 to the current year. -- (cast(extract( YEAR FROM m.taken ) + greatest( -1 * sign( (extract( YEAR FROM m.taken )||'-12-31')::date - (extract( YEAR FROM m.taken )||'-01-01')::date ), 0 ) AS text)||'-12-31')::date GROUP BY extract( YEAR FROM m.taken ) The sluggishness comes from this part of the query: m.taken BETWEEN /* Start date. */ (extract( YEAR FROM m.taken )||'-01-01')::date AND /* End date. Calculated by checking to see if the end date wraps into the next year. If it does, then add 1 to the current year. */ (cast(extract( YEAR FROM m.taken ) + greatest( -1 * sign( (extract( YEAR FROM m.taken )||'-12-31')::date - (extract( YEAR FROM m.taken )||'-01-01')::date ), 0 ) AS text)||'-12-31')::date The HashAggregate from the plan shows a cost of 10006220141.11, which is, I suspect, on the astronomically huge side. There is a full table scan on the measurement table (itself having neither data nor indexes) being performed. The table aggregates 237 million rows from its child tables. Question What is the proper way to index the dates to avoid full table scans? Options I have considered: GIN GiST Rewrite the WHERE clause Separate year_taken, month_taken, and day_taken columns to the tables What are your thoughts? Thank you!

Read the article

Memory efficient int-int dict in Python

- by Bolo

Hi, I need a memory efficient int-int dict in Python that would support the following operations in O(log n) time: d[k] = v # replace if present v = d[k] # None or a negative number if not present I need to hold ~250M pairs, so it really has to be tight. Do you happen to know a suitable implementation (Python 2.7)? EDIT Removed impossible requirement and other nonsense. Thanks, Craig and Kylotan! To rephrase. Here's a trivial int-int dictionary with 1M pairs: >>> import random, sys >>> from guppy import hpy >>> h = hpy() >>> h.setrelheap() >>> d = {} >>> for _ in xrange(1000000): ... d[random.randint(0, sys.maxint)] = random.randint(0, sys.maxint) ... >>> h.heap() Partition of a set of 1999530 objects. Total size = 49161112 bytes. Index Count % Size % Cumulative % Kind (class / dict of class) 0 1 0 25165960 51 25165960 51 dict (no owner) 1 1999521 100 23994252 49 49160212 100 int On average, a pair of integers uses 49 bytes. Here's an array of 2M integers: >>> import array, random, sys >>> from guppy import hpy >>> h = hpy() >>> h.setrelheap() >>> a = array.array('i') >>> for _ in xrange(2000000): ... a.append(random.randint(0, sys.maxint)) ... >>> h.heap() Partition of a set of 14 objects. Total size = 8001108 bytes. Index Count % Size % Cumulative % Kind (class / dict of class) 0 1 7 8000028 100 8000028 100 array.array On average, a pair of integers uses 8 bytes. I accept that 8 bytes/pair in a dictionary is rather hard to achieve in general. Rephrased question: is there a memory-efficient implementation of int-int dictionary that uses considerably less than 49 bytes/pair?

Read the article

Optimizing Python code with many attribute and dictionary lookups

- by gotgenes

I have written a program in Python which spends a large amount of time looking up attributes of objects and values from dictionary keys. I would like to know if there's any way I can optimize these lookup times, potentially with a C extension, to reduce the time of execution, or if I need to simply re-implement the program in a compiled language. The program implements some algorithms using a graph. It runs prohibitively slowly on our data sets, so I profiled the code with cProfile using a reduced data set that could actually complete. The vast majority of the time is being burned in one function, and specifically in two statements, generator expressions, within the function: The generator expression at line 202 is neighbors_in_selected_nodes = (neighbor for neighbor in node_neighbors if neighbor in selected_nodes) and the generator expression at line 204 is neighbor_z_scores = (interaction_graph.node[neighbor]['weight'] for neighbor in neighbors_in_selected_nodes) The source code for this function of context provided below. selected_nodes is a set of nodes in the interaction_graph, which is a NetworkX Graph instance. node_neighbors is an iterator from Graph.neighbors_iter(). Graph itself uses dictionaries for storing nodes and edges. Its Graph.node attribute is a dictionary which stores nodes and their attributes (e.g., 'weight') in dictionaries belonging to each node. Each of these lookups should be amortized constant time (i.e., O(1)), however, I am still paying a large penalty for the lookups. Is there some way which I can speed up these lookups (e.g., by writing parts of this as a C extension), or do I need to move the program to a compiled language? Below is the full source code for the function that provides the context; the vast majority of execution time is spent within this function. def calculate_node_z_prime( node, interaction_graph, selected_nodes ): """Calculates a z'-score for a given node. The z'-score is based on the z-scores (weights) of the neighbors of the given node, and proportional to the z-score (weight) of the given node. Specifically, we find the maximum z-score of all neighbors of the given node that are also members of the given set of selected nodes, multiply this z-score by the z-score of the given node, and return this value as the z'-score for the given node. If the given node has no neighbors in the interaction graph, the z'-score is defined as zero. Returns the z'-score as zero or a positive floating point value. :Parameters: - `node`: the node for which to compute the z-prime score - `interaction_graph`: graph containing the gene-gene or gene product-gene product interactions - `selected_nodes`: a `set` of nodes fitting some criterion of interest (e.g., annotated with a term of interest) """ node_neighbors = interaction_graph.neighbors_iter(node) neighbors_in_selected_nodes = (neighbor for neighbor in node_neighbors if neighbor in selected_nodes) neighbor_z_scores = (interaction_graph.node[neighbor]['weight'] for neighbor in neighbors_in_selected_nodes) try: max_z_score = max(neighbor_z_scores) # max() throws a ValueError if its argument has no elements; in this # case, we need to set the max_z_score to zero except ValueError, e: # Check to make certain max() raised this error if 'max()' in e.args[0]: max_z_score = 0 else: raise e z_prime = interaction_graph.node[node]['weight'] * max_z_score return z_prime Here are the top couple of calls according to cProfiler, sorted by time. ncalls tottime percall cumtime percall filename:lineno(function) 156067701 352.313 0.000 642.072 0.000 bpln_contextual.py:204(<genexpr>) 156067701 289.759 0.000 289.759 0.000 bpln_contextual.py:202(<genexpr>) 13963893 174.047 0.000 816.119 0.000 {max} 13963885 69.804 0.000 936.754 0.000 bpln_contextual.py:171(calculate_node_z_prime) 7116883 61.982 0.000 61.982 0.000 {method 'update' of 'set' objects}

Read the article

How can i benchmark method execution time in java?

- by KP65

I have a program which i have myself written in java, but I want to test method execution times and get timings for specific methods. I was wondering if this is possible, by maybe somehow an eclipse plug-in? or maybe inserting some code? Many thanks

Read the article

help optimize sql query

- by msony

I have tracking table tbl_track with id, session_id, created_date fields I need count unique session_id for one day here what i got: select count(0) from ( select distinct session_id from tbl_track where created_date between getdate()-1 and getdate() group by session_id )tbl im feeling that it could be better solution for it

Read the article

Try to fill the GAE datastore but the code consumes to much cpu time. How to optimize this?

- by Neverland

I try to get the list of images in Amazon EC2 inside the Google datastore. I want to realize this with a cron job inside the GAE. class AmazonEC2uswest(db.Model): ami = db.StringProperty(required=True) mani = db.StringProperty() typ = db.StringProperty() arch = db.StringProperty() state = db.StringProperty() owner = db.StringProperty() class CronAMIsAmazonUS_WEST(webapp.RequestHandler): def get(self): aws_access_key_id_admin = "<secret>" aws_secret_access_key_admin = "<secret>" conn_us_west = boto.ec2.connect_to_region('us-west-1', aws_access_key_id=aws_access_key_id_admin, aws_secret_access_key=aws_secret_access_key_admin, is_secure = False) liste_images_us_west = conn_us_west.get_all_images() laenge_liste_images_us_west = len(liste_images_us_west) for i in range(laenge_liste_images_us_west): datastore_uswest_AMIs = AmazonEC2uswest(ami=liste_images_us_west[i].id, mani=str(liste_images_us_west[i].location), typ=liste_images_us_west[i].type, arch=liste_images_us_west[i].architecture, state=liste_images_us_west[i].state, owner=liste_images_us_west[i].ownerId) datastore_uswest_AMIs.put() The problem: Getting the list with get_all_images() lasts only a few seconds. But writing the data to the Google datastore needs way too much CPU time. My IBM T42p (P4M with 2GHz) needs for that piece of code approx. 1 Minute! Is it possible to optimize my code in a way that it needs fewer CPU time?

Read the article

Why doesn't gcc remove this check of a non-volatile variable?

- by Thomas

This question is mostly academic. I ask out of curiosity, not because this poses an actual problem for me. Consider the following incorrect C program. #include <signal.h> #include <stdio.h> static int running = 1; void handler(int u) { running = 0; } int main() { signal(SIGTERM, handler); while (running) ; printf("Bye!\n"); return 0; } This program is incorrect because the handler interrupts the program flow, so running can be modified at any time and should therefore be declared volatile. But let's say the programmer forgot that. gcc 4.3.3, with the -O3 flag, compiles the loop body (after one initial check of the running flag) down to the infinite loop .L7: jmp .L7 which was to be expected. Now we put something trivial inside the while loop, like: while (running) putchar('.'); And suddenly, gcc does not optimize the loop condition anymore! The loop body's assembly now looks like this (again at -O3): .L7: movq stdout(%rip), %rsi movl $46, %edi call _IO_putc movl running(%rip), %eax testl %eax, %eax jne .L7 We see that running is re-loaded from memory each time through the loop; it is not even cached in a register. Apparently gcc now thinks that the value of running could have changed. So why does gcc suddenly decide that it needs to re-check the value of running in this case?

Read the article

c++ optimize array of ints

- by a432511

I have a 2D lookup table of int16_t. int16_t my_array[37][73] = {{**DATA HERE**}} I have a mixture of values that range from just above the range of int8_t to just below the range of int8_t and some of the values repeat themselves. I am trying to reduce the size of this lookup table. What I have done so far is split each int16_t value into two int8_t values to visualize the wasted bytes. int8_t part_1 = original_value >> 4; int8_t part_2 = original_value & 0x0000FFFF; // If the upper 4 bits of the original_value were empty if(part_1 == 0) wasted_bytes_count++; I can easily remove the zero value int8_t that are wasting a byte of space and I can also remove the duplicate values, but my question is how do I do remove those values while retaining the ability to lookup based on the two indices? I contemplated translating this into a 1D array and adding a number following each duplicated value that would represent the number of duplicates that were removed, but I am struggling with how I would then identify what is a lookup value and what is a duplicate count. Also, it is further complicated by stripping out the zero int8_t values that were wasted bytes. EDIT: This array is stored in ROM already. RAM is even more limited than ROM so it is already stored in ROM. EDIT: I am going to post a bounty for this question as soon as I can. I need a complete answer of how to store the information AND retrieve it. It does not need to be a 2D array as long as I can get the same values. EDIT: Adding the actual array below: {150,145,140,135,130,125,120,115,110,105,100,95,90,85,80,75,70,65,60,55,50,45,40,35,30,25,20,15,10,5,0,-4,-9,-14,-19,-24,-29,-34,-39,-44,-49,-54,-59,-64,-69,-74,-79,-84,-89,-94,-99,104,109,114,119,124,129,134,139,144,149,154,159,164,169,174,179,175,170,165,160,155,150}, \ {143,137,131,126,120,115,110,105,100,95,90,85,80,75,71,66,62,57,53,48,44,39,35,31,27,22,18,14,9,5,1,-3,-7,-11,-16,-20,-25,-29,-34,-38,-43,-47,-52,-57,-61,-66,-71,-76,-81,-86,-91,-96,101,107,112,117,123,128,134,140,146,151,157,163,169,175,178,172,166,160,154,148,143}, \ {130,124,118,112,107,101,96,92,87,82,78,74,70,65,61,57,54,50,46,42,38,34,31,27,23,19,16,12,8,4,1,-2,-6,-10,-14,-18,-22,-26,-30,-34,-38,-43,-47,-51,-56,-61,-65,-70,-75,-79,-84,-89,-94,100,105,111,116,122,128,135,141,148,155,162,170,177,174,166,159,151,144,137,130}, \ {111,104,99,94,89,85,81,77,73,70,66,63,60,56,53,50,46,43,40,36,33,30,26,23,20,16,13,10,6,3,0,-3,-6,-9,-13,-16,-20,-24,-28,-32,-36,-40,-44,-48,-52,-57,-61,-65,-70,-74,-79,-84,-88,-93,-98,103,109,115,121,128,135,143,152,162,172,176,165,154,144,134,125,118,111}, \ {85,81,77,74,71,68,65,63,60,58,56,53,51,49,46,43,41,38,35,32,29,26,23,19,16,13,10,7,4,1,-1,-3,-6,-9,-13,-16,-19,-23,-26,-30,-34,-38,-42,-46,-50,-54,-58,-62,-66,-70,-74,-78,-83,-87,-91,-95,100,105,110,117,124,133,144,159,178,160,141,125,112,103,96,90,85}, \ {62,60,58,57,55,54,52,51,50,48,47,46,44,42,41,39,36,34,31,28,25,22,19,16,13,10,7,4,2,0,-3,-5,-8,-10,-13,-16,-19,-22,-26,-29,-33,-37,-41,-45,-49,-53,-56,-60,-64,-67,-70,-74,-77,-80,-83,-86,-89,-91,-94,-97,101,105,111,130,109,84,77,74,71,68,66,64,62}, \ {46,46,45,44,44,43,42,42,41,41,40,39,38,37,36,35,33,31,28,26,23,20,16,13,10,7,4,1,-1,-3,-5,-7,-9,-12,-14,-16,-19,-22,-26,-29,-33,-36,-40,-44,-48,-51,-55,-58,-61,-64,-66,-68,-71,-72,-74,-74,-75,-74,-72,-68,-61,-48,-25,2,22,33,40,43,45,46,47,46,46}, \ {36,36,36,36,36,35,35,35,35,34,34,34,34,33,32,31,30,28,26,23,20,17,14,10,6,3,0,-2,-4,-7,-9,-10,-12,-14,-15,-17,-20,-23,-26,-29,-32,-36,-40,-43,-47,-50,-53,-56,-58,-60,-62,-63,-64,-64,-63,-62,-59,-55,-49,-41,-30,-17,-4,6,15,22,27,31,33,34,35,36,36}, \ {30,30,30,30,30,30,30,29,29,29,29,29,29,29,29,28,27,26,24,21,18,15,11,7,3,0,-3,-6,-9,-11,-12,-14,-15,-16,-17,-19,-21,-23,-26,-29,-32,-35,-39,-42,-45,-48,-51,-53,-55,-56,-57,-57,-56,-55,-53,-49,-44,-38,-31,-23,-14,-6,0,7,13,17,21,24,26,27,29,29,30}, \ {25,25,26,26,26,25,25,25,25,25,25,25,25,26,25,25,24,23,21,19,16,12,8,4,0,-3,-7,-10,-13,-15,-16,-17,-18,-19,-20,-21,-22,-23,-25,-28,-31,-34,-37,-40,-43,-46,-48,-49,-50,-51,-51,-50,-48,-45,-42,-37,-32,-26,-19,-13,-7,-1,3,7,11,14,17,19,21,23,24,25,25}, \ {21,22,22,22,22,22,22,22,22,22,22,22,22,22,22,22,21,20,18,16,13,9,5,1,-3,-7,-11,-14,-17,-18,-20,-21,-21,-22,-22,-22,-23,-23,-25,-27,-29,-32,-35,-37,-40,-42,-44,-45,-45,-45,-44,-42,-40,-36,-32,-27,-22,-17,-12,-7,-3,0,3,7,9,12,14,16,18,19,20,21,21}, \ {18,19,19,19,19,19,19,19,19,19,19,19,19,19,19,19,18,17,16,14,10,7,2,-1,-6,-10,-14,-17,-19,-21,-22,-23,-24,-24,-24,-24,-23,-23,-23,-24,-26,-28,-30,-33,-35,-37,-38,-39,-39,-38,-36,-34,-31,-28,-24,-19,-15,-10,-6,-3,0,1,4,6,8,10,12,14,15,16,17,18,18}, \ {16,16,17,17,17,17,17,17,17,17,17,16,16,16,16,16,16,15,13,11,8,4,0,-4,-9,-13,-16,-19,-21,-23,-24,-25,-25,-25,-25,-24,-23,-21,-20,-20,-21,-22,-24,-26,-28,-30,-31,-32,-31,-30,-29,-27,-24,-21,-17,-13,-9,-6,-3,-1,0,2,4,5,7,9,10,12,13,14,15,16,16}, \ {14,14,14,15,15,15,15,15,15,15,14,14,14,14,14,14,13,12,11,9,5,2,-2,-6,-11,-15,-18,-21,-23,-24,-25,-25,-25,-25,-24,-22,-21,-18,-16,-15,-15,-15,-17,-19,-21,-22,-24,-24,-24,-23,-22,-20,-18,-15,-12,-9,-5,-3,-1,0,1,2,4,5,6,8,9,10,11,12,13,14,14}, \ {12,13,13,13,13,13,13,13,13,13,13,13,12,12,12,12,11,10,9,6,3,0,-4,-8,-12,-16,-19,-21,-23,-24,-24,-24,-24,-23,-22,-20,-17,-15,-12,-10,-9,-9,-10,-12,-13,-15,-17,-17,-18,-17,-16,-15,-13,-11,-8,-5,-3,-1,0,1,1,2,3,4,6,7,8,9,10,11,12,12,12}, \ {11,11,11,11,11,12,12,12,12,12,11,11,11,11,11,10,10,9,7,5,2,-1,-5,-9,-13,-17,-20,-22,-23,-23,-23,-23,-22,-20,-18,-16,-14,-11,-9,-6,-5,-4,-5,-6,-8,-9,-11,-12,-12,-12,-12,-11,-9,-8,-6,-3,-1,0,0,1,1,2,3,4,5,6,7,8,9,10,11,11,11}, \ {10,10,10,10,10,10,10,10,10,10,10,10,10,10,9,9,9,7,6,3,0,-3,-6,-10,-14,-17,-20,-21,-22,-22,-22,-21,-19,-17,-15,-13,-10,-8,-6,-4,-2,-2,-2,-2,-4,-5,-7,-8,-8,-9,-8,-8,-7,-5,-4,-2,0,0,1,1,1,2,2,3,4,5,6,7,8,9,10,10,10}, \ {9,9,9,9,9,9,9,10,10,9,9,9,9,9,9,8,8,6,5,2,0,-4,-7,-11,-15,-17,-19,-21,-21,-21,-20,-18,-16,-14,-12,-10,-8,-6,-4,-2,-1,0,0,0,-1,-2,-4,-5,-5,-6,-6,-5,-5,-4,-3,-1,0,0,1,1,1,1,2,3,3,5,6,7,8,8,9,9,9}, \ {9,9,9,9,9,9,9,9,9,9,9,9,8,8,8,8,7,5,4,1,-1,-5,-8,-12,-15,-17,-19,-20,-20,-19,-18,-16,-14,-11,-9,-7,-5,-4,-2,-1,0,0,1,1,0,0,-2,-3,-3,-4,-4,-4,-3,-3,-2,-1,0,0,0,0,0,1,1,2,3,4,5,6,7,8,8,9,9}, \ {9,9,9,8,8,8,9,9,9,9,9,8,8,8,8,7,6,5,3,0,-2,-5,-9,-12,-15,-17,-18,-19,-19,-18,-16,-14,-12,-9,-7,-5,-4,-2,-1,0,0,1,1,1,1,0,0,-1,-2,-2,-3,-3,-2,-2,-1,-1,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,8,9}, \ {8,8,8,8,8,8,9,9,9,9,9,9,8,8,8,7,6,4,2,0,-3,-6,-9,-12,-15,-17,-18,-18,-17,-16,-14,-12,-10,-8,-6,-4,-2,-1,0,0,1,2,2,2,2,1,0,0,-1,-1,-1,-2,-2,-1,-1,0,0,0,0,0,0,0,0,0,1,2,3,4,5,6,7,8,8}, \ {8,8,8,8,9,9,9,9,9,9,9,9,9,8,8,7,5,3,1,-1,-4,-7,-10,-13,-15,-16,-17,-17,-16,-15,-13,-11,-9,-6,-5,-3,-2,0,0,0,1,2,2,2,2,1,1,0,0,0,-1,-1,-1,-1,-1,0,0,0,0,-1,-1,-1,-1,-1,0,0,1,3,4,5,7,7,8}, \ {8,8,9,9,9,9,10,10,10,10,10,10,10,9,8,7,5,3,0,-2,-5,-8,-11,-13,-15,-16,-16,-16,-15,-13,-12,-10,-8,-6,-4,-2,-1,0,0,1,2,2,3,3,2,2,1,0,0,0,0,0,0,0,0,0,0,-1,-1,-2,-2,-2,-2,-2,-1,0,0,1,3,4,6,7,8}, \ {7,8,9,9,9,10,10,11,11,11,11,11,10,10,9,7,5,3,0,-2,-6,-9,-11,-13,-15,-16,-16,-15,-14,-13,-11,-9,-7,-5,-3,-2,0,0,1,1,2,3,3,3,3,2,2,1,1,0,0,0,0,0,0,0,-1,-1,-2,-3,-3,-4,-4,-4,-3,-2,-1,0,1,3,5,6,7}, \ {6,8,9,9,10,11,11,12,12,12,12,12,11,11,9,7,5,2,0,-3,-7,-10,-12,-14,-15,-16,-15,-15,-13,-12,-10,-8,-7,-5,-3,-1,0,0,1,2,2,3,3,4,3,3,3,2,2,1,1,1,0,0,0,0,-1,-2,-3,-4,-4,-5,-5,-5,-5,-4,-2,-1,0,2,3,5,6}, \ {6,7,8,10,11,12,12,13,13,14,14,13,13,11,10,8,5,2,0,-4,-8,-11,-13,-15,-16,-16,-16,-15,-13,-12,-10,-8,-6,-5,-3,-1,0,0,1,2,3,3,4,4,4,4,4,3,3,3,2,2,1,1,0,0,-1,-2,-3,-5,-6,-7,-7,-7,-6,-5,-4,-3,-1,0,2,4,6}, \ {5,7,8,10,11,12,13,14,15,15,15,14,14,12,11,8,5,2,-1,-5,-9,-12,-14,-16,-17,-17,-16,-15,-14,-12,-11,-9,-7,-5,-3,-1,0,0,1,2,3,4,4,5,5,5,5,5,5,4,4,3,3,2,1,0,-1,-2,-4,-6,-7,-8,-8,-8,-8,-7,-6,-4,-2,0,1,3,5}, \ {4,6,8,10,12,13,14,15,16,16,16,16,15,13,11,9,5,2,-2,-6,-10,-13,-16,-17,-18,-18,-17,-16,-15,-13,-11,-9,-7,-5,-4,-2,0,0,1,3,3,4,5,6,6,7,7,7,7,7,6,5,4,3,2,0,-1,-3,-5,-7,-8,-9,-10,-10,-10,-9,-7,-5,-4,-1,0,2,4}, \ {4,6,8,10,12,14,15,16,17,18,18,17,16,15,12,9,5,1,-3,-8,-12,-15,-18,-19,-20,-20,-19,-18,-16,-15,-13,-11,-8,-6,-4,-2,-1,0,1,3,4,5,6,7,8,9,9,9,9,9,9,8,7,5,3,1,-1,-3,-6,-8,-10,-11,-12,-12,-11,-10,-9,-7,-5,-2,0,1,4}, \ {4,6,8,11,13,15,16,18,19,19,19,19,18,16,13,10,5,0,-5,-10,-15,-18,-21,-22,-23,-22,-22,-20,-18,-17,-14,-12,-10,-8,-5,-3,-1,0,1,3,5,6,8,9,10,11,12,12,13,12,12,11,9,7,5,2,0,-3,-6,-9,-11,-12,-13,-13,-12,-11,-10,-8,-6,-3,-1,1,4}, \ {3,6,9,11,14,16,17,19,20,21,21,21,19,17,14,10,4,-1,-8,-14,-19,-22,-25,-26,-26,-26,-25,-23,-21,-19,-17,-14,-12,-9,-7,-4,-2,0,1,3,5,7,9,11,13,14,15,16,16,16,16,15,13,10,7,4,0,-3,-7,-10,-12,-14,-15,-14,-14,-12,-11,-9,-6,-4,-1,1,3}, \ {4,6,9,12,14,17,19,21,22,23,23,23,21,19,15,9,2,-5,-13,-20,-25,-28,-30,-31,-31,-30,-29,-27,-25,-22,-20,-17,-14,-11,-9,-6,-3,0,1,4,6,9,11,13,15,17,19,20,21,21,21,20,18,15,11,6,2,-2,-7,-11,-13,-15,-16,-16,-15,-13,-11,-9,-7,-4,-1,1,4}, \ {4,7,10,13,15,18,20,22,24,25,25,25,23,20,15,7,-2,-12,-22,-29,-34,-37,-38,-38,-37,-36,-34,-31,-29,-26,-23,-20,-17,-13,-10,-7,-4,-1,2,5,8,11,13,16,18,21,23,24,26,26,26,26,24,21,17,12,5,0,-6,-10,-14,-16,-16,-16,-15,-14,-12,-10,-7,-4,-1,1,4}, \ {4,7,10,13,16,19,22,24,26,27,27,26,24,19,11,-1,-15,-28,-37,-43,-46,-47,-47,-45,-44,-41,-39,-36,-32,-29,-26,-22,-19,-15,-11,-8,-4,-1,2,5,9,12,15,19,22,24,27,29,31,33,33,33,32,30,26,21,14,6,0,-6,-11,-14,-15,-16,-15,-14,-12,-9,-7,-4,-1,1,4}, \ {6,9,12,15,18,21,23,25,27,28,27,24,17,4,-14,-34,-49,-56,-60,-60,-60,-58,-56,-53,-50,-47,-43,-40,-36,-32,-28,-25,-21,-17,-13,-9,-5,-1,2,6,10,14,17,21,24,28,31,34,37,39,41,42,43,43,41,38,33,25,17,8,0,-4,-8,-10,-10,-10,-8,-7,-4,-2,0,3,6}, \ {22,24,26,28,30,32,33,31,23,-18,-81,-96,-99,-98,-95,-93,-89,-86,-82,-78,-74,-70,-66,-62,-57,-53,-49,-44,-40,-36,-32,-27,-23,-19,-14,-10,-6,-1,2,6,10,15,19,23,27,31,35,38,42,45,49,52,55,57,60,61,63,63,62,61,57,53,47,40,33,28,23,21,19,19,19,20,22}, \ {168,173,178,176,171,166,161,156,151,146,141,136,131,126,121,116,111,106,101,-96,-91,-86,-81,-76,-71,-66,-61,-56,-51,-46,-41,-36,-31,-26,-21,-16,-11,-6,-1,3,8,13,18,23,28,33,38,43,48,53,58,63,68,73,78,83,88,93,98,103,108,113,118,123,128,133,138,143,148,153,158,163,168}, \ Thanks for your time.

Read the article

When to use certain optimizations such as -fwhole-program and -fprofile-generate with several shared libraries

- by James

Probably a simple answer; I get quite confused with the language used in the GCC documentation for some of these flags! Anyway, I have three libraries and a programme which uses all these three. I compile each of my libraries seperately with individual (potentially) different sets of warning flags. However, I compile all three libraries with the same set of optimisation flags. I then compile my main programme linking in these three libraries with its own set of warning flags and the same optimisation flags used during the libraries' compilation. 1) Do I have to compile the libraries with optimisation flags present or can I just use these flags when compiling the final programme and linking to the libraries? If the latter, will it then optimise all or just some (presumably that which is called) of the code in these libraries? 2) I would like to use -fwhole-program -flto -fuse-linker-plugin and the linker plugin gold. At which stage do I compile with these on ... just the final compilation or do these flags need to be present during the compilation of the libraries? 3) Pretty much the same as 2) however with, -fprofile-generate -fprofile-arcs and -fprofile-use. I understand one first runs a programme with generate, and then with use. However, do I have to compile each of the libraries with generate/use etc. or just the final programme? And if it is just the last programme, when I then compeil with -fprofile-use will it also optimise the libraries functionality? Many thanks, James

Read the article

Does the .NET CLR Really Optimize for the Current Processor

- by dewald

When I read about the performance of JITted languages like C# or Java, authors usually say that they should/could theoretically outperform many native-compiled applications. The theory being that native applications are usually just compiled for a processor family (like x86), so the compiler cannot make certain optimizations as they may not truly be optimizations on all processors. On the other hand, the CLR can make processor-specific optimizations during the JIT process. Does anyone know if Microsoft's (or Mono's) CLR actually performs processor-specific optimizations during the JIT process? If so, what kind of optimizations?

Read the article

How to use constraint programming for optimizing shopping baskets?

- by tangens

I have a list of items I want to buy. The items are offered by different shops and different prices. The shops have individual delivery costs. I'm looking for an optimal shopping strategy (and a java library supporting it) to purchase all of the items with a minimal total price. Example: Item1 is offered at Shop1 for $100, at Shop2 for $111. Item2 is offered at Shop1 for $90, at Shop2 for $85. Delivery cost of Shop1: $10 if total order < $150; $0 otherwise Delivery cost of Shop2: $5 if total order < $50; $0 otherwise If I buy Item1 and Item2 at Shop1 the total cost is $100 + $90 +$0 = $190. If I buy Item1 and Item2 at Shop2 the total cost is $111 + $85 +$0 = $196. If I buy Item1 at Shop1 and Item2 at Shop2 the total cost is $100 + $10 + $85 + $0 = 195. I get the minimal price if I order Item1 and Item2 at Shop1: $190 What I tried so far I asked another question before that led me to the field of constraint programming. I had a look at cream and choco, but I did not figure out how to create a model to solve my problem. | shop1 | shop2 | shop3 | ... ----------------------------------------- item1 | p11 | p12 | p13 | item2 | p21 | p22 | p23 | . | | | | . | | | | ----------------------------------------- shipping | s1 | s2 | s3 | limit | l1 | l2 | l3 | ----------------------------------------- total | t1 | t2 | t3 | ----------------------------------------- My idea was to define these constraints: each price "p xy" is defined in the domain (0, c) where c is the price of the item in this shop only one price in a line should be non zero if one or more items are bought from one shop and the sum of the prices is lower than limit, then add shipping cost to the total cost shop total cost is the sum of the prices of all items in a shop total cost is the sum of all shop totals The objective is "total cost". I want to minimize this. In cream I wasn't able to express the "if then" constraint for conditional shipping costs. In choco these constraints exist, but even for 5 items and 10 shops the program was running for 10 minutes without finding a solution. Question How should I express my constraints to make this problem solvable for a constraint programming solver?

Read the article

Does Delphi really handle dynamic classes better than static?

- by John

Hello, I was told more than once that Delphi handles dynamic classes better than static.Thereby using the following: type Tsomeclass=class(TObject) private procedure proc1; public someint:integer; procedure proc2; end; var someclass:TSomeclass; implementation ... initialization someclass:=TSomeclass.Create; finalization someclass.Free; rather than type Tsomeclass=class private class procedure proc1; public var someint:integer; class procedure proc2; end; 90% of the classes in the project I'm working on have and need only one instance.Do I really have to use the first way for using those classes? Is it better optimized,handled by Delphi? Sorry,I have no arguments to backup this hypothesis,but I want an expert's opinion. Thanks in advance!

Read the article

How to optimise MySQL query containing a subquery?

- by aidan

Read the article

PHP Increasing write to page speed.

- by Frederico

I'm currently writing out xml and have done the following: header ("content-type: text/xml"); header ("content-length: ".strlen($xml)); $xml being the xml to be written out. I'm near about 1.8 megs of text (which I found via firebug), it seems as the writing is taking more time than the script to run.. is there a way to increase this write speed? Thank you in advance.

Read the article

Is it better to echo javascript in raw format with php, or echo a script include that has been minif

- by Scarface

Hey guys quick question, I am currently echoing a lot of javascript that is based conditionally on login status and other variables. I was wondering if it would be better to simply echo the script include like <script type="text/javascript" src="javascript/openlogin.js"></script> that has been run through a minifying program and been gzipped or to echo the full script in raw format. The latter suggestion is messier to me but it reduces http requests while the latter would probably be smaller but take more cpu? Just wondering what some other people think. Thanks in advance for any advice.

Read the article

speed up wamp server + drupal on windows vista

- by Andrew Welch

Hi, My localhost performance with drupal six is pretty slow. I found a solution to add a # before the :: localhost line of the system32/etc/hosts file but this was something I had already done and didn't help much. does anyone know of any other optimisations that might work? tHanks Andy

Read the article

Can anyone recommend a decent tool for optimizing images other than Photoshop

- by toomanyairmiles

Can anyone recommend a decent tool for optimising images other than adobe photoshop, the gimp etc? I'm looking to optimise images for the web preferably online and free. Basically I have a client who can't install additional software on their work PC but needs to optimise photographs and other images for their website and is presently uploading 1 or 2 Mb files. On a personal level I'm interested to see what other people are using...

Read the article

F# - Facebook Hacker Cup - Double Squares

- by Jacob

I'm working on strengthening my F#-fu and decided to tackle the Facebook Hacker Cup Double Squares problem. I'm having some problems with the run-time and was wondering if anyone could help me figure out why it is so much slower than my C# equivalent. There's a good description from another post; Source: Facebook Hacker Cup Qualification Round 2011 A double-square number is an integer X which can be expressed as the sum of two perfect squares. For example, 10 is a double-square because 10 = 3^2 + 1^2. Given X, how can we determine the number of ways in which it can be written as the sum of two squares? For example, 10 can only be written as 3^2 + 1^2 (we don't count 1^2 + 3^2 as being different). On the other hand, 25 can be written as 5^2 + 0^2 or as 4^2 + 3^2. You need to solve this problem for 0 = X = 2,147,483,647. Examples: 10 = 1 25 = 2 3 = 0 0 = 1 1 = 1 My basic strategy (which I'm open to critique on) is to; Create a dictionary (for memoize) of the input numbers initialzed to 0 Get the largest number (LN) and pass it to count/memo function Get the LN square root as int Calculate squares for all numbers 0 to LN and store in dict Sum squares for non repeat combinations of numbers from 0 to LN If sum is in memo dict, add 1 to memo Finally, output the counts of the original numbers. Here is the F# code (See code changes at bottom) I've written that I believe corresponds to this strategy (Runtime: ~8:10); open System open System.Collections.Generic open System.IO /// Get a sequence of values let rec range min max = seq { for num in [min .. max] do yield num } /// Get a sequence starting from 0 and going to max let rec zeroRange max = range 0 max /// Find the maximum number in a list with a starting accumulator (acc) let rec maxNum acc = function | [] -> acc | p::tail when p > acc -> maxNum p tail | p::tail -> maxNum acc tail /// A helper for finding max that sets the accumulator to 0 let rec findMax nums = maxNum 0 nums /// Build a collection of combinations; ie [1,2,3] = (1,1), (1,2), (1,3), (2,2), (2,3), (3,3) let rec combos range = seq { let count = ref 0 for inner in range do for outer in Seq.skip !count range do yield (inner, outer) count := !count + 1 } let rec squares nums = let dict = new Dictionary<int, int>() for s in nums do dict.[s] <- (s * s) dict /// Counts the number of possible double squares for a given number and keeps track of other counts that are provided in the memo dict. let rec countDoubleSquares (num: int) (memo: Dictionary<int, int>) = // The highest relevent square is the square root because it squared plus 0 squared is the top most possibility let maxSquare = System.Math.Sqrt((float)num) // Our relevant squares are 0 to the highest possible square; note the cast to int which shouldn't hurt. let relSquares = range 0 ((int)maxSquare) // calculate the squares up front; let calcSquares = squares relSquares // Build up our square combinations; ie [1,2,3] = (1,1), (1,2), (1,3), (2,2), (2,3), (3,3) for (sq1, sq2) in combos relSquares do let v = calcSquares.[sq1] + calcSquares.[sq2] // Memoize our relevant results if memo.ContainsKey(v) then memo.[v] <- memo.[v] + 1 // return our count for the num passed in memo.[num] // Read our numbers from file. //let lines = File.ReadAllLines("test2.txt") //let nums = [ for line in Seq.skip 1 lines -> Int32.Parse(line) ] // Optionally, read them from straight array let nums = [1740798996; 1257431873; 2147483643; 602519112; 858320077; 1048039120; 415485223; 874566596; 1022907856; 65; 421330820; 1041493518; 5; 1328649093; 1941554117; 4225; 2082925; 0; 1; 3] // Initialize our memoize dictionary let memo = new Dictionary<int, int>() for num in nums do memo.[num] <- 0 // Get the largest number in our set, all other numbers will be memoized along the way let maxN = findMax nums // Do the memoize let maxCount = countDoubleSquares maxN memo // Output our results. for num in nums do printfn "%i" memo.[num] // Have a little pause for when we debug let line = Console.Read() And here is my version in C# (Runtime: ~1:40: using System; using System.Collections.Generic; using System.Diagnostics; using System.IO; using System.Linq; using System.Text; namespace FBHack_DoubleSquares { public class TestInput { public int NumCases { get; set; } public List<int> Nums { get; set; } public TestInput() { Nums = new List<int>(); } public int MaxNum() { return Nums.Max(); } } class Program { static void Main(string[] args) { // Read input from file. //TestInput input = ReadTestInput("live.txt"); // As example, load straight. TestInput input = new TestInput { NumCases = 20, Nums = new List<int> { 1740798996, 1257431873, 2147483643, 602519112, 858320077, 1048039120, 415485223, 874566596, 1022907856, 65, 421330820, 1041493518, 5, 1328649093, 1941554117, 4225, 2082925, 0, 1, 3, } }; var maxNum = input.MaxNum(); Dictionary<int, int> memo = new Dictionary<int, int>(); foreach (var num in input.Nums) { if (!memo.ContainsKey(num)) memo.Add(num, 0); } DoMemoize(maxNum, memo); StringBuilder sb = new StringBuilder(); foreach (var num in input.Nums) { //Console.WriteLine(memo[num]); sb.AppendLine(memo[num].ToString()); } Console.Write(sb.ToString()); var blah = Console.Read(); //File.WriteAllText("out.txt", sb.ToString()); } private static int DoMemoize(int num, Dictionary<int, int> memo) { var highSquare = (int)Math.Floor(Math.Sqrt(num)); var squares = CreateSquareLookup(highSquare); var relSquares = squares.Keys.ToList(); Debug.WriteLine("Starting - " + num.ToString()); Debug.WriteLine("RelSquares.Count = {0}", relSquares.Count); int sum = 0; var index = 0; foreach (var square in relSquares) { foreach (var inner in relSquares.Skip(index)) { sum = squares[square] + squares[inner]; if (memo.ContainsKey(sum)) memo[sum]++; } index++; } if (memo.ContainsKey(num)) return memo[num]; return 0; } private static TestInput ReadTestInput(string fileName) { var lines = File.ReadAllLines(fileName); var input = new TestInput(); input.NumCases = int.Parse(lines[0]); foreach (var lin in lines.Skip(1)) { input.Nums.Add(int.Parse(lin)); } return input; } public static Dictionary<int, int> CreateSquareLookup(int maxNum) { var dict = new Dictionary<int, int>(); int square; foreach (var num in Enumerable.Range(0, maxNum)) { square = num * num; dict[num] = square; } return dict; } } } Thanks for taking a look. UPDATE Changing the combos function slightly will result in a pretty big performance boost (from 8 min to 3:45): /// Old and Busted... let rec combosOld range = seq { let rangeCache = Seq.cache range let count = ref 0 for inner in rangeCache do for outer in Seq.skip !count rangeCache do yield (inner, outer) count := !count + 1 } /// The New Hotness... let rec combos maxNum = seq { for i in 0..maxNum do for j in i..maxNum do yield i,j }

Search Results

Search found 3512 results on 141 pages for 'premature optimization'.

Page 56/141 | < Previous Page | 52 53 54 55 56 57 58 59 60 61 62 63 | Next Page >

- by Computer Guru

- by mark

- by mojones

- by Alexander Gololobov

- by Laurent

- by Priyank

- by Donator

- by Dave Jarvis

- by Bolo

- by gotgenes

- by KP65

- by msony

- by Neverland

- by Thomas

- by a432511

- by James

- by dewald

- by tangens

- by John

- by aidan

- by Frederico

- by Scarface

- by Andrew Welch

- by toomanyairmiles

- by Jacob

< Previous Page | 52 53 54 55 56 57 58 59 60 61 62 63 | Next Page >