Search Results

Search found 1933 results on 78 pages for 'genetic algorithms'.

Page 7/78 | < Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >

  • Looking for algorithms regarding scaling and moving

    - by user1806687
    I've been bashing my head for the past couple of weeks trying to find algorithms that would help me accomplish, on first look very easy task. So, I got this one object currently made out of 5 cuboids (2 sides, 1 top, 1 bottom, 1 back), this is just for an example, later on there will be whole range of different set ups. I have included three pictures of this object(as said this is just for an example). Now, the thing is when the user scales the whole object this is what should happen: X scale: top and bottom cuboids should get scaled by a scale factor, sides should get moved so they are positioned just like they were before(in this case at both ends of top and bottom cuboids), back should get scaled so it fits like before(if I simply scale it by a scale factor it will leave gaps on each side). Y scale: sides should get scaled by a scale factor, top and bottom cuboid should get moved, and back should also get scaled. Z scale: sides, top and bottom cuboids should get scaled, back should get moved. Here is an image of the example object (a thick walled box, with one face missing, where each wall is made by a cuboid): Front of the object: Hope you can help,

    Read the article

  • Merge sort versus quick sort performance

    - by Giorgio
    I have implemented merge sort and quick sort using C (GCC 4.4.3 on Ubuntu 10.04 running on a 4 GB RAM laptop with an Intel DUO CPU at 2GHz) and I wanted to compare the performance of the two algorithms. The prototypes of the sorting functions are: void merge_sort(const char **lines, int start, int end); void quick_sort(const char **lines, int start, int end); i.e. both take an array of pointers to strings and sort the elements with index i : start <= i <= end. I have produced some files containing random strings with length on average 4.5 characters. The test files range from 100 lines to 10000000 lines. I was a bit surprised by the results because, even though I know that merge sort has complexity O(n log(n)) while quick sort is O(n^2), I have often read that on average quick sort should be as fast as merge sort. However, my results are the following. Up to 10000 strings, both algorithms perform equally well. For 10000 strings, both require about 0.007 seconds. For 100000 strings, merge sort is slightly faster with 0.095 s against 0.121 s. For 1000000 strings merge sort takes 1.287 s against 5.233 s of quick sort. For 5000000 strings merge sort takes 7.582 s against 118.240 s of quick sort. For 10000000 strings merge sort takes 16.305 s against 1202.918 s of quick sort. So my question is: are my results as expected, meaning that quick sort is comparable in speed to merge sort for small inputs but, as the size of the input data grows, the fact that its complexity is quadratic will become evident? Here is a sketch of what I did. In the merge sort implementation, the partitioning consists in calling merge sort recursively, i.e. merge_sort(lines, start, (start + end) / 2); merge_sort(lines, 1 + (start + end) / 2, end); Merging of the two sorted sub-array is performed by reading the data from the array lines and writing it to a global temporary array of pointers (this global array is allocate only once). After each merge the pointers are copied back to the original array. So the strings are stored once but I need twice as much memory for the pointers. For quick sort, the partition function chooses the last element of the array to sort as the pivot and scans the previous elements in one loop. After it has produced a partition of the type start ... {elements <= pivot} ... pivotIndex ... {elements > pivot} ... end it calls itself recursively: quick_sort(lines, start, pivotIndex - 1); quick_sort(lines, pivotIndex + 1, end); Note that this quick sort implementation sorts the array in-place and does not require additional memory, therefore it is more memory efficient than the merge sort implementation. So my question is: is there a better way to implement quick sort that is worthwhile trying out? If I improve the quick sort implementation and perform more tests on different data sets (computing the average of the running times on different data sets) can I expect a better performance of quick sort wrt merge sort? EDIT Thank you for your answers. My implementation is in-place and is based on the pseudo-code I have found on wikipedia in Section In-place version: function partition(array, 'left', 'right', 'pivotIndex') where I choose the last element in the range to be sorted as a pivot, i.e. pivotIndex := right. I have checked the code over and over again and it seems correct to me. In order to rule out the case that I am using the wrong implementation I have uploaded the source code on github (in case you would like to take a look at it). Your answers seem to suggest that I am using the wrong test data. I will look into it and try out different test data sets. I will report as soon as I have some results.

    Read the article

  • How to prepare for a programming competition? Graphs, Stacks, Trees, oh my! [closed]

    - by Simucal
    Last semester I attended ACM's (Association for Computing Machinery) bi-annual programming competition at a local University. My University sent 2 teams of 3 people and we competed amongst other schools in the mid-west. We got our butts kicked. You are given a packet with about 11 problems (1 problem per page) and you have 4 hours to solve as many as you can. They'll run your program you submit against a set of data and your output must match theirs exactly. In fact, the judging is automated for the most part. In any case.. I went there fairly confident in my programming skills and I left there feeling drained and weak. It was a terribly humbling experience. In 4 hours my team of 3 people completed only one of the problems. The top team completed 4 of them and took 1st place. The problems they asked were like no problems I have ever had to answer before. I later learned that in order to solve them some of them effectively you have to use graphs/graph algorithms, trees, stacks. Some of them were simply "greedy" algo's. My question is, how can I better prepare for this semesters programming competition so I don't leave there feeling like a complete moron? What tips do you have for me to be able to answer these problems that involve graphs, trees, various "well known" algorithms? How can I easily identify the algorithm we should implement for a given problem? I have yet to take Algorithm Design in school so I just feel a little out of my element. Here are some examples of the questions asked at the competitions: ACM Problem Sets Update: Just wanted to update this since the latest competition is over. My team placed 1st for our small region (about 6-7 universities with between 1-5 teams each school) and ~15th for the midwest! So, it is a marked improvement over last years performance for sure. We also had no graduate students on our team and after reviewing the rules we found out that many teams had several! So, that would be a pretty big advantage in my own opinion. Problems this semester ranged from about 1-2 "easy" problems (ie bit manipulation, string manipulation) to hard (graph problems involving fairly complex math and network flow problems). We were able to solve 4 problems in our 5 hours. Just wanted to thank everyone for the resources they provided here, we used them for our weekly team practices and it definitely helped! Some quick tips that I have that aren't suggested below: When you are seated at your computer before the competition starts, quickly type out various data structures that you might need that you won't have access to in your languages libraries. I typed out a Graph data-structure complete with floyd-warshall and dijkstra's algorithm before the competition began. We ended up using it in our 2nd problem that we solved and this is the main reason why we solved this problem before anyone else in the midwest. We had it ready to go from the beginning. Similarly, type out the code to read in a file since this will be required for every problem. Save this answer "template" someplace so you can quickly copy/paste it to your IDE at the beginning of each problem. There are no rules on programming anything before the competition starts so get any boilerplate code out the way. We found it useful to have one person who is on permanent whiteboard duty. This is usually the person who is best at math and at working out solutions to get a head start on future problems you will be doing. One person is on permanent programming duty. Your fastest/most skilled "programmer" (most familiar with the language). This will save debugging time also. The last person has several roles between assessing the packet of problems for the next "easiest" problem, helping the person on the whiteboard work out solutions and helping the person programming work out bugs/issues. This person needs to be flexible and be able to switch between roles easily.

    Read the article

  • Approach for packing 2D shapes while minimizing total enclosing area

    - by Dennis
    Not sure on my tags for this question, but in short .... I need to solve a problem of packing industrial parts into crates while minimizing total containing area. These parts are motors, or pumps, or custom-made components, and they have quite unusual shapes. For some, it may be possible to assume that a part === rectangular cuboid, but some are not so simple, i.e. they assume a shape more of that of a hammer or letter T. With those, (assuming 2D shape), by alternating direction of top & bottom, one can pack more objects into the same space, than if all tops were in the same direction. Crude example below with letter "T"-shaped parts: ***** xxxxx ***** x ***** *** ooo * x vs * x vs * x vs * x o * x * xxxxx * x * x o xxxxx xxx Right now we are solving the problem by something like this: using CAD software, make actual models of how things fit in crate boxes make estimates of actual crate dimensions & write them into Excel file (1) is crazy amount of work and as the result we have just a limited amount of possible entries in (2), the Excel file. The good things is that programming this is relatively easy. Given a combination of products to go into crates, we do a lookup, and if entry exists in the Excel (or Database), we bring it out. If it doesn't, we say "sorry, no data!". I don't necessarily want to go full force on making up some crazy algorithm that given geometrical part description can align, rotate, and figure out best part packing into a crate, given its shape, but maybe I do.. Question Well, here is my question: assuming that I can represent my parts as 2D (to be determined how), and that some parts look like letter T, and some parts look like rectangles, which algorithm can I use to give me a good estimate on the dimensions of the encompassing area, while ensuring that the parts are packed in a minimal possible area, to minimize crating/shipping costs? Are there approximation algorithms? Seeing how this can get complex, is there an existing library I could use? My thought / Approach My naive approach would be to define a way to describe position of parts, and place the first part, compute total enclosing area & dimensions. Then place 2nd part in 0 degree orientation, repeat, place it at 180 degree orientation, repeat (for my case I don't think 90 degree rotations will be meaningful due to long lengths of parts). Proceed using brute force "tacking on" other parts to the enclosing area until all parts are processed. I may have to shift some parts a tad (see 3rd pictorial example above with letters T). This adds a layer of 2D complexity rather than 1D. I am not sure how to approach this. One idea I have is genetic algorithms, but I think those will take up too much processing power and time. I will need to look out for shape collisions, as well as adding extra padding space, since we are talking about real parts with irregularities rather than perfect imaginary blocks. I'm afraid this can get geometrically messy fairly fast, and I'd rather keep things simple, if I can. But what if the best (practical) solution is to pack things into different crate boxes rather than just one? This can get a bit more tricky. There is human element involved as well, i.e. like parts can go into same box and are thus a constraint to be considered. Some parts that are not the same are sometimes grouped together for shipping and can be considered as a common grouped item. Sometimes customers want things shipped their way, which adds human element to constraints. so there will have to be some customization.

    Read the article

  • Find points whose pairwise distances approximate a given distance matrix

    - by Stephan Kolassa
    Problem. I have a symmetric distance matrix with entries between zero and one, like this one: D = ( 0.0 0.4 0.0 0.5 ) ( 0.4 0.0 0.2 1.0 ) ( 0.0 0.2 0.0 0.7 ) ( 0.5 1.0 0.7 0.0 ) I would like to find points in the plane that have (approximately) the pairwise distances given in D. I understand that this will usually not be possible with strictly correct distances, so I would be happy with a "good" approximation. My matrices are smallish, no more than 10x10, so performance is not an issue. Question. Does anyone know of an algorithm to do this? Background. I have sets of probability densities between which I calculate Hellinger distances, which I would like to visualize as above. Each set contains no more than 10 densities (see above), but I have a couple of hundred sets. What I did so far. I did consider posting at math.SE, but looking at what gets tagged as "geometry" there, it seems like this kind of computational geometry question would be more on-topic here. If the community thinks this should be migrated, please go ahead. This looks like a straightforward problem in computational geometry, and I would assume that anyone involved in clustering might be interested in such a visualization, but I haven't been able to google anything. One simple approach would be to randomly plonk down points and perturb them until the distance matrix is close to D, e.g., using Simulated Annealing, or run a Genetic Algorithm. I have to admit that I haven't tried that yet, hoping for a smarter way. One specific operationalization of a "good" approximation in the sense above is Problem 4 in the Open Problems section here, with k=2. Now, while finding an algorithm that is guaranteed to find the minimum l1-distance between D and the resulting distance matrix may be an open question, it still seems possible that there at least is some approximation to this optimal solution. If I don't get an answer here, I'll mail the gentleman who posed that problem and ask whether he knows of any approximation algorithm (and post any answer I get to that here).

    Read the article

  • Recommend an algorithms exercise book?

    - by Parappa
    I have a little book called Problems on Algorithms by Ian Parberry which is chock full of exercises related to the study of algorithms. Can anybody recommend similar books? What I am not looking for are recommendations of good books related to algorithms or the theory of computation. Introduction to Algorithms is a good one, and of course there's the Knuth stuff. Ideally I want to know of any books that are light on instructional material and heavy on sample problems. In a nutshell, exercise books. Preferably dedicated to algorithms rather than general logic or other math problems. By the way, the Parberry book does not seem to be in print, but it is available as a PDF dowload.

    Read the article

  • programming practices starting

    - by Tamim Ad Dari
    I have taken my major as computer science and Engineering and I am really confused at this moment. My first course was about learning C and C++ and I learned the basics of those. Now I am really confused what to do next. Some says I should practice algorithms and do contests in ACM-ICPC for now. Others tell me to start software development. But As I started digging its really a vast topic and there are many aspects of these, like web design, web-development, iOS-development, android... etc many things. And I am really confused about what should I do just now. Any advice for me to start with?

    Read the article

  • Why hill climbing is called anytime algorithm?

    - by crucified soul
    From wikipedia, Anytime algorithm In computer science an anytime algorithm is an algorithm that can return a valid solution to a problem even if it's interrupted at any time before it ends. The algorithm is expected to find better and better solutions the more time it keeps running. Hill climbing Hill climbing can often produce a better result than other algorithms when the amount of time available to perform a search is limited, such as with real-time systems. It is an anytime algorithm: it can return a valid solution even if it's interrupted at any time before it ends. Hill climbing algorithm can stuck into local optima or ridge, after that even if it runs infinite time, the result won't be any better. Then, why hill climbing is called anytime algorithm?

    Read the article

  • Programmaticaly finding the Landau notation (Big O or Theta notation) of an algorithm?

    - by Julien L
    I'm used to search for the Landau (Big O, Theta...) notation of my algorithms by hand to make sure they are as optimized as they can be, but when the functions are getting really big and complex, it's taking way too much time to do it by hand. it's also prone to human errors. I spent some time on Codility (coding/algo exercises), and noticed they will give you the Landau notation for your submitted solution (both in Time and Memory usage). I was wondering how they do that... How would you do it? Is there another way besides Lexical Analysis or parsing of the code? PS: This question concerns mainly PHP and or JavaScript, but I'm opened to any language and theory.

    Read the article

  • Reverse horizontal and vertical for a HTML table

    - by porton
    There is a two-dimensional array describing a HTML table. Each element of the array consists of: the cell content rowspan colspan Every row of this two dimensional array corresponds to <td> cells of a <tr> of the table which my software should generate. I need to "reverse" the array (interchange vertical and horizontal direction). Insofar I considered algorithm based on this idea: make a rectangular matrix of the size of the table and store in every element of this matrix the corresponding index of the element of the above mentioned array. (Note that two elements of the matrix may be identical due rowspan/colspan.) Then I could use this matrix to calculate rowspan/colspan for the inverted table. But this idea seems bad for me. Any other algorithms? Note that I program in PHP.

    Read the article

  • Random number generation algorithm for human brains?

    - by Magnus Wolffelt
    Are you aware of, or have you devised, any practical, simple-to-learn "in-head" algorithms that let humans generate (somewhat "true") random numbers? By "in-head" I mean.. preferrably without any external tools or devices. Also, a high output (many random numbers per minute) is desirable. Asked this on SO but it didn't get much interest. Maybe this is better suited for programmers.. :) I'm genuinely curious about anything that people might have come up with on this problem.

    Read the article

  • I'm a CS student, and honestly, I don't understand Knuth's books

    - by Raymond Ho
    I stumbled upon this quote from Bill Gates: "You should definitely send me a resume if you can read the whole thing." He was talking about The Art of Programming books. So I was pretty curious and want to read it all. But honestly, I don't understand it. I'm really not that intellectual. So this should be the reason why I can't understand it, but I am eager to learn. I'm currently reading Volume 1 about fundamental algorithms. Are there any books out there that are friendly for novices/slow people like me, which would help to build up my knowledge so that I can read Knuth's book with ease in the future?

    Read the article

  • Shortest Common Superstring: find shortest string that contains all given string fragments

    - by occulus
    Given some string fragments, I would like to find the shortest possible single string ("output string") that contains all the fragments. Fragments can overlap each other in the output string. Example: For the string fragments: BCDA AGF ABC The following output string contains all fragments, and was made by naive appending: BCDAAGFABC However this output string is better (shorter), as it employs overlaps: ABCDAGF ^ ABC ^ BCDA ^ AGF I'm looking for algorithms for this problem. It's not absolutely important to find the strictly shortest output string, but the shorter the better. I'm looking for an algorithm better than the obvious naive one that would try appending all permutations of the input fragments and removing overlaps (which would appear to be NP-Complete). I've started work on a solution and it's proving quite interesting; I'd like to see what other people might come up with. I'll add my work-in-progress to this question in a while.

    Read the article

  • About insertion sort and especially why it's said that copy is much faster than swap?

    - by Software Engeneering Learner
    From Lafore's "Data Structures and Algorithms in Java": (about insertion sort (which uses copy + shift instead of swap (used in bubble and selection sort))) However, a copy isn’t as time-consuming as a swap, so for random data this algo- rithm runs twice as fast as the bubble sort and faster than the selectionsort. Also author doesn't mention how time consuming shift is. From my POV copy is the simplest pointer assignment operation. While swap is 3x pointer assignment operations. Which doesn't take much time. Also shift of N elemtns is Nx pointer assignment operations. Please correct me if I'm wrong. Please explain, why what author says is true? I don't understand.

    Read the article

  • Algorithm to calculate trajectories from vector field

    - by cheeesus
    I have a two-dimensional vector field, i.e., for each point (x, y) I have a vector (u, v), whereas u and v are functions of x and y. This vector field canonically defines a set of trajectories, i.e. a set of paths a particle would take if it follows along the vector field. In the following image, the vector field is depicted in red, and there are four trajectories which are partly visible, depicted in dark red: I need an algorithm which efficiently calculates some trajectories for a given vector field. The trajectories must satisfy some kind of minimum denseness in the plane (for every point in the plane we must have a 'nearby' trajectory), or some other condition to get a reasonable set of trajectories. I could not find anything useful on Google on this, and Stackexchange doesn't seem to handle the topic either. Before I start devising such an algorithm by myself: Are there any known algorithms for this problem? What is their name, for which keywords do I have to search?

    Read the article

  • algorithm to print the digits in the correct order

    - by Aga Waw
    I've been trying to write an algorithm that will print separately the digits from an integer. I have to write it in Pseudocode. I know how to write an algorithm that reverse the digits. digi(n): while n != 0: x = n % 10 n = n // 10 print (x) But I don't know how to write an algorithm to print the digits in the correct order. f.eg. the input is integer 123467 and the output is: 1 2 3 4 6 7 The numbers will be input from the user, and we cannot convert them to a string. I just need help gettin started on writing algorithms. Thanks

    Read the article

  • Algorithm development in jobs

    - by dbeacham
    I have a mathematics background but also consider career in some form of software development. In particular I'm interested in finding out what sort of industries are most likely to have more algorithm development/mathematical and logical problem solving slant rather than pure application development etc. Obviously, I'm assuming that some subset of the canonical data structures and associated algorithms (trees, lists, hash tables, sets, maps with search, insert, traversals etc.) are mostly going to be present in software development. However, where am I more likely to encounter problems of more discrete maths nature (combinatorial, graph theory, sets, strings, ...) explicitly or more likely in disguise. Any pointers much appreciated (including possible open source projects that I could use for my further search for applications and also possibly contribute to).

    Read the article

  • Suggestion for a Non-CSE developer

    - by Md.lbrahim
    Due to financial problems, I couldn't go for CSE in my country and had to settle for a BCIS honors degree. Now, after quite some time, when I want to go for a higher position in software development then I get asked about algorithms and basics that I have missed back in uni. This is affecting my chances of getting selected and I cannot afford that any longer. My question would be that what you would suggest smn like me to do in order to cover the 'basics' without any university or educational institute e.g. books,learn C++,etc? Any suggestion (including -ve) is welcomed and appreciated.

    Read the article

  • Possibility Program for number of pieces

    - by Brad
    I would like to put a program together to calculate the number of 60' pieces would be needed from a list of shorter pieces. For example, I sell rebar cut to length from our standard length of 60'-0". Now the length the customer requires are as follows: 343 pc @ 12.5' 35 pc @ 13' 10 pc @ 15' 63 pc @ 15.5'....... There are 56 total lengths ranging from 12.5' to 30.58' The idea is to limit the amount of waste from the 60' piece. The input from the user would be: number of differnt lengths Length of piece to cut from count of different lengths The result would be the number of prime pieces needed to fulfill the order. What well-known algorithms exist that could help me solve this problem?

    Read the article

  • Code bacteria: evolving mathematical behavior

    - by Stefano Borini
    It would not be my intention to put a link on my blog, but I don't have any other method to clarify what I really mean. The article is quite long, and it's in three parts (1,2,3), but if you are curious, it's worth the reading. A long time ago (5 years, at least) I programmed a python program which generated "mathematical bacteria". These bacteria are python objects with a simple opcode-based genetic code. You can feed them with a number and they return a number, according to the execution of their code. I generate their genetic codes at random, and apply an environmental selection to those objects producing a result similar to a predefined expected value. Then I let them duplicate, introduce mutations, and evolve them. The result is quite interesting, as their genetic code basically learns how to solve simple equations, even for values different for the training dataset. Now, this thing is just a toy. I had time to waste and I wanted to satisfy my curiosity. however, I assume that something, in terms of research, has been made... I am reinventing the wheel here, I hope. Are you aware of more serious attempts at creating in-silico bacteria like the one I programmed? Please note that this is not really "genetic algorithms". Genetic algorithms is when you use evolution/selection to improve a vector of parameters against a given scoring function. This is kind of different. I optimize the code, not the parameters, against a given scoring function.

    Read the article

  • Parallelism in .NET – Introduction

    - by Reed
    Parallel programming is something that every professional developer should understand, but is rarely discussed or taught in detail in a formal manner.  Software users are no longer content with applications that lock up the user interface regularly, or take large amounts of time to process data unnecessarily.  Modern development requires the use of parallelism.  There is no longer any excuses for us as developers. Learning to write parallel software is challenging.  It requires more than reading that one chapter on parallelism in our programming language book of choice… Today’s systems are no longer getting faster with each generation; in many cases, newer computers are actually slower than previous generation systems.  Modern hardware is shifting towards conservation of power, with processing scalability coming from having multiple computer cores, not faster and faster CPUs.  Our CPU frequencies no longer double on a regular basis, but Moore’s Law is still holding strong.  Now, however, instead of scaling transistors in order to make processors faster, hardware manufacturers are scaling the transistors in order to add more discrete hardware processing threads to the system. This changes how we should think about software.  In order to take advantage of modern systems, we need to redesign and rewrite our algorithms to work in parallel.  As with any design domain, it helps tremendously to have a common language, as well as a common set of patterns and tools. For .NET developers, this is an exciting time for parallel programming.  Version 4 of the .NET Framework is adding the Task Parallel Library.  This has been back-ported to .NET 3.5sp1 as part of the Reactive Extensions for .NET, and is available for use today in both .NET 3.5 and .NET 4.0 beta. In order to fully utilize the Task Parallel Library and parallelism, both in .NET 4 and previous versions, we need to understand the proper terminology.  For this series, I will provide an introduction to some of the basic concepts in parallelism, and relate them to the tools available in .NET.

    Read the article

  • Parallelism in .NET – Part 4, Imperative Data Parallelism: Aggregation

    - by Reed
    In the article on simple data parallelism, I described how to perform an operation on an entire collection of elements in parallel.  Often, this is not adequate, as the parallel operation is going to be performing some form of aggregation. Simple examples of this might include taking the sum of the results of processing a function on each element in the collection, or finding the minimum of the collection given some criteria.  This can be done using the techniques described in simple data parallelism, however, special care needs to be taken into account to synchronize the shared data appropriately.  The Task Parallel Library has tools to assist in this synchronization. The main issue with aggregation when parallelizing a routine is that you need to handle synchronization of data.  Since multiple threads will need to write to a shared portion of data.  Suppose, for example, that we wanted to parallelize a simple loop that looked for the minimum value within a dataset: double min = double.MaxValue; foreach(var item in collection) { double value = item.PerformComputation(); min = System.Math.Min(min, value); } .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; } This seems like a good candidate for parallelization, but there is a problem here.  If we just wrap this into a call to Parallel.ForEach, we’ll introduce a critical race condition, and get the wrong answer.  Let’s look at what happens here: // Buggy code! Do not use! double min = double.MaxValue; Parallel.ForEach(collection, item => { double value = item.PerformComputation(); min = System.Math.Min(min, value); }); This code has a fatal flaw: min will be checked, then set, by multiple threads simultaneously.  Two threads may perform the check at the same time, and set the wrong value for min.  Say we get a value of 1 in thread 1, and a value of 2 in thread 2, and these two elements are the first two to run.  If both hit the min check line at the same time, both will determine that min should change, to 1 and 2 respectively.  If element 1 happens to set the variable first, then element 2 sets the min variable, we’ll detect a min value of 2 instead of 1.  This can lead to wrong answers. Unfortunately, fixing this, with the Parallel.ForEach call we’re using, would require adding locking.  We would need to rewrite this like: // Safe, but slow double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach(collection, item => { double value = item.PerformComputation(); lock(syncObject) min = System.Math.Min(min, value); }); This will potentially add a huge amount of overhead to our calculation.  Since we can potentially block while waiting on the lock for every single iteration, we will most likely slow this down to where it is actually quite a bit slower than our serial implementation.  The problem is the lock statement – any time you use lock(object), you’re almost assuring reduced performance in a parallel situation.  This leads to two observations I’ll make: When parallelizing a routine, try to avoid locks. That being said: Always add any and all required synchronization to avoid race conditions. These two observations tend to be opposing forces – we often need to synchronize our algorithms, but we also want to avoid the synchronization when possible.  Looking at our routine, there is no way to directly avoid this lock, since each element is potentially being run on a separate thread, and this lock is necessary in order for our routine to function correctly every time. However, this isn’t the only way to design this routine to implement this algorithm.  Realize that, although our collection may have thousands or even millions of elements, we have a limited number of Processing Elements (PE).  Processing Element is the standard term for a hardware element which can process and execute instructions.  This typically is a core in your processor, but many modern systems have multiple hardware execution threads per core.  The Task Parallel Library will not execute the work for each item in the collection as a separate work item. Instead, when Parallel.ForEach executes, it will partition the collection into larger “chunks” which get processed on different threads via the ThreadPool.  This helps reduce the threading overhead, and help the overall speed.  In general, the Parallel class will only use one thread per PE in the system. Given the fact that there are typically fewer threads than work items, we can rethink our algorithm design.  We can parallelize our algorithm more effectively by approaching it differently.  Because the basic aggregation we are doing here (Min) is communitive, we do not need to perform this in a given order.  We knew this to be true already – otherwise, we wouldn’t have been able to parallelize this routine in the first place.  With this in mind, we can treat each thread’s work independently, allowing each thread to serially process many elements with no locking, then, after all the threads are complete, “merge” together the results. This can be accomplished via a different set of overloads in the Parallel class: Parallel.ForEach<TSource,TLocal>.  The idea behind these overloads is to allow each thread to begin by initializing some local state (TLocal).  The thread will then process an entire set of items in the source collection, providing that state to the delegate which processes an individual item.  Finally, at the end, a separate delegate is run which allows you to handle merging that local state into your final results. To rewriting our routine using Parallel.ForEach<TSource,TLocal>, we need to provide three delegates instead of one.  The most basic version of this function is declared as: public static ParallelLoopResult ForEach<TSource, TLocal>( IEnumerable<TSource> source, Func<TLocal> localInit, Func<TSource, ParallelLoopState, TLocal, TLocal> body, Action<TLocal> localFinally ) The first delegate (the localInit argument) is defined as Func<TLocal>.  This delegate initializes our local state.  It should return some object we can use to track the results of a single thread’s operations. The second delegate (the body argument) is where our main processing occurs, although now, instead of being an Action<T>, we actually provide a Func<TSource, ParallelLoopState, TLocal, TLocal> delegate.  This delegate will receive three arguments: our original element from the collection (TSource), a ParallelLoopState which we can use for early termination, and the instance of our local state we created (TLocal).  It should do whatever processing you wish to occur per element, then return the value of the local state after processing is completed. The third delegate (the localFinally argument) is defined as Action<TLocal>.  This delegate is passed our local state after it’s been processed by all of the elements this thread will handle.  This is where you can merge your final results together.  This may require synchronization, but now, instead of synchronizing once per element (potentially millions of times), you’ll only have to synchronize once per thread, which is an ideal situation. Now that I’ve explained how this works, lets look at the code: // Safe, and fast! double min = double.MaxValue; // Make a "lock" object object syncObject = new object(); Parallel.ForEach( collection, // First, we provide a local state initialization delegate. () => double.MaxValue, // Next, we supply the body, which takes the original item, loop state, // and local state, and returns a new local state (item, loopState, localState) => { double value = item.PerformComputation(); return System.Math.Min(localState, value); }, // Finally, we provide an Action<TLocal>, to "merge" results together localState => { // This requires locking, but it's only once per used thread lock(syncObj) min = System.Math.Min(min, localState); } ); Although this is a bit more complicated than the previous version, it is now both thread-safe, and has minimal locking.  This same approach can be used by Parallel.For, although now, it’s Parallel.For<TLocal>.  When working with Parallel.For<TLocal>, you use the same triplet of delegates, with the same purpose and results. Also, many times, you can completely avoid locking by using a method of the Interlocked class to perform the final aggregation in an atomic operation.  The MSDN example demonstrating this same technique using Parallel.For uses the Interlocked class instead of a lock, since they are doing a sum operation on a long variable, which is possible via Interlocked.Add. By taking advantage of local state, we can use the Parallel class methods to parallelize algorithms such as aggregation, which, at first, may seem like poor candidates for parallelization.  Doing so requires careful consideration, and often requires a slight redesign of the algorithm, but the performance gains can be significant if handled in a way to avoid excessive synchronization.

    Read the article

  • How does Comparison Sites work?

    - by Vijay
    Need your thinking on how does these Comparision Sites actually work. Sites like Junglee.com policybazaar.com and there are many like these which provides comaprision of products , fares etc. grabbed from different websites. I had read a little about it and what i found is-: These sites uses Feeds of the sites data. These sites uses APIs of the sites which are actually provided by those sites. And for some sites which do not have any of these two posibility then the Comparision sites uses web-crawler to crawl their data. This is what i have found out. If you think there is more things to it please do give your own views. But i want to know these for my learning purpose and a little for curiosity- how does they actually matches the crawled data , feeds, and other so that there is no duplicacy. What is the process or algorithms for it. And where should i go to learn these concepts. References for books , articles or anything else.

    Read the article

  • Parsing scripts that use curly braces

    - by Keikoku
    To get an idea of what I'm doing, I am writing a python parser that will parse directx .x text files. The problem I have deals with how the files are formatted. Although I'm writing it in python, I'm looking for general algorithms for dealing with this sort of parsing. .x files define data using templates. The format of a template is template_name { [some_data] } The goal I have is to parse the file line-by-line and whenever I come across a template, I will deal with it accordingly. My initial approach was to check if a line contains an opening or closing brace. If it's an open brace, then I will check what the template name is. Now the catch here is that the open brace doesn't have to occur on the same line as the template name. It could just as well be template_name { [some_data] } So if I were to use my "open brace exists" criteria, it won't work for any files that use the latter format. A lot of languages also use curly braces (though I'm not sure when people would be parsing the scripts themselves), so I was wondering if anyone knows how to accurately get the template name (or in some other languages, it could just as well be a function name, though there aren't any keywords to look for)

    Read the article

  • How do graphics programmers deal with rendering vertices that don't change the image?

    - by canisrufus
    So, the title is a little awkward. I'll give some background, and then ask my question. Background: I work as a web GIS application developer, but in my spare time I've been playing with map rendering and improving data interchange formats. I work only in 2D space. One interesting issue I've encountered is that when you're rendering a polygon at a small scale (zoomed way out), many of the vertices are redundant. An extreme case would be that you have a polygon with 500,000 vertices that only takes up a single pixel. If you're sending this data to the browser, it would make sense to omit ~499,999 of those vertices. One way we achieve that is by rendering an image on a server and and sending it as a PNG: voila, it's a point. Sometimes, though, we want data sent to the browser where it can be rendered with SVG (or canvas, or webgl) so that it can be interactive. The problem: It turns out that, using modern geographic data sets, it's very easy to overload SVG's rendering abilities. In an effort to cope with those limitations, I'm trying to figure out how to visually losslessly reduce a data set for a given scale and map extent (and, if necessary, for a known map pixel width and height). I got a great reduction in data size just using the Douglas-Peucker algorithm, and I believe I was able to get it to keep the polygons true to within one pixel. Unfortunately, Douglas-Peucker doesn't preserve topology, so it changed how borders between polygons got rendered. I couldn't readily find other algorithms to try out and adapt to the purpose, but I don't have much CS/algorithm background and might not recognize them if I saw them.

    Read the article

< Previous Page | 3 4 5 6 7 8 9 10 11 12 13 14  | Next Page >