Search Results

Search found 1889 results on 76 pages for 'evolutionary algorithms'.

Page 69/76 | < Previous Page | 65 66 67 68 69 70 71 72 73 74 75 76 | Next Page >

How I May Have Taken A Wrong Path in Programming

- by Ygam

I am in a major stump right now. I am a BSIT graduate, but I only started actual programming less than a year ago. I observed that I have the following attitude in programming: I tend to be more of a purist, scorning unelegant approaches to solving problems using code I tend to look at anything in a large scale, planning everything before I start coding, either in simple flowcharts or complex UML charts I have a really strong impulse on refactoring my code, even if I miss deadlines or prolong development times I am obsessed with good directory structures, file naming conventions, class, method, and variable naming conventions I tend to always want to study something new, even, as I said, at the cost of missing deadlines I tend to see software development as something to engineer, to architect; that is, seeing how things relate to each other and how blocks of code can interact (I am a huge fan of loose coupling) i.e the OOP thinking I tend to combine OOP and procedural coding whenever I see fit I want my code to execute fast (thus the elegant approaches and refactoring) This bothers me because I see my colleagues doing much better the other way around (aside from the fact that they started programming since our first year in college). By the other way around I mean, they fire up coding, gets the job done much faster because they don't have to really look at how clean their codes are or how elegant their algorithms are, they don't bother with OOP however big their projects are, they mostly use web APIs, piece them together and voila! Working code! CLients are happy, they get paid fast, at the expense of a really unmaintainable or hard-to-read code that lacks structure and conventions, or slow executions of certain actions (which the common reasoning against would be that internet connections are much faster these days, hardware is more powerful). The excuse I often receive is clients don't care about how you write the code, but they do care about how long you deliver it. If it works then all is good. Now, did my "purist" approach to programming may have been the wrong way to start programming? Should I just dump these purist concepts and just code the hell up because I have seen it: clients don't really care how beautifully coded it is?

Read the article
Programming texts and reference material for my Kindle DX, creating the ultimate reference device?

- by mwilliams

(Revisiting this topic with the release of the Kindle DX) Having owned both generation Kindle readers and now getting a Kindle DX; I'm very excited for true PDF handling on an e-ink device! An image of _Why's book on my Kindle (from my iPhone). This gives me a device capable of storing hundreds of thousands of pages that are full text search capable in the form factor of a magazine. What references (preferably PDF to preserve things such as code samples) would you recommend? Ultimately I would like reference material for every modern and applicable programming language (C, C++, Objective-C, Python, Ruby, Java, .NET (C#, Visual Basic, ASP.NET), Erlang, SQL references) as well as general programming texts and frameworks (algorithms, design patterns, theory, Rails, Django, Cocoa, ORMs, etc) and anything else that could be thought of. With so many developers here using such a wide array of languages, as a professional in your particular field, what books or references would you recommend to me for my Kindle? Creative Commons material a plus (translate that to free) as well as the material being in the PDF file format. File size is not an issue. If this turns out to be a success, I will update with a follow-up with a compiled list generated from all of the answers. Thanks for the assistance and contributing! UPDATE I have been using the Kindle DX a lot now for technical books. Check out this blog post I did for high resolution photos of different material: http://www.matthewdavidwilliams.com/2009/06/12/technical-document-pdfs-on-the-kindle-dx/

Read the article
Move camera to fit 3D scene

- by Burre

Hi there. I'm looking for an algorithm to fit a bounding box inside a viewport (in my case a DirectX scene). I know about algorithms for centering a bounding sphere in a orthographic camera but would need the same for a bounding box and a perspective camera. I have most of the data: I have the up-vector for the camera I have the center point of the bounding box I have the look-at vector (direction and distance) from the camera point to the box center I have projected the points on a plane perpendicular to the camera and retrieved the coefficients describing how much the max/min X and Y coords are within or outside the viewing plane. Problems I have: Center of the bounding box isn't necessarily in the center of the viewport (that is, it's bounding rectangle after projection). Since the field of view "skew" the projection (see http://en.wikipedia.org/wiki/File:Perspective-foreshortening.svg) I cannot simply use the coefficients as a scale factor to move the camera because it will overshoot/undershoot the desired camera position How do I find the camera position so that it fills the viewport as pixel perfect as possible (exception being if the aspect ratio is far from 1.0, it only needs to fill one of the screen axis)? I've tried some other things: Using a bounding sphere and Tangent to find a scale factor to move the camera. This doesn't work well, because, it doesn't take into account the perspective projection, and secondly spheres are bad bounding volumes for my use because I have a lot of flat and long geometries. Iterating calls to the function to get a smaller and smaller error in the camera position. This has worked somewhat, but I can sometimes run into weird edge cases where the camera position overshoots too much and the error factor increases. Also, when doing this I didn't recenter the model based on the position of the bounding rectangle. I couldn't find a solid, robust way to do that reliably. Help please!

Read the article
Classical task-scheduling assignment

- by Bruno

I am working on a flight scheduling app (disclaimer: it's for a college project, so no code answers, please). Please read this question w/ a quantum of attention before answering as it has a lot of peculiarities :( First, some terminology issues: You have planes and flights, and you have to pair them up. For simplicity's sake, we'll assume that a plane is free as soon as the flight using it prior lands. Flights are seen as tasks: They have a duration They have dependencies They have an expected date/time for beginning Planes can be seen as resources to be used by tasks (or flights, in our terminology). Flights have a specific type of plane needed. e.g. flight 200 needs a plane of type B. Planes obviously are of one and only one specific type, e.g., Plane Airforce One is of type C. A "project" is the set of all the flights by an airline in a given time period. The functionality required is: Finding the shortest possible duration for a said project The earliest and latest possible start for a task (flight) The critical tasks, with basis on provided data, complete with identifiers of preceding tasks. Automatically pair up flights and planes, so as to get all flights paired up with a plane. (Note: the duration of flights is fixed) Get a Gantt diagram with the projects scheduling, in which all flights begin as early as possible, showing all previously referred data graphically (dependencies, time info, etc.) So the questions is: How in the world do I achieve this? Particularly: We are required to use a graph. What do the graph's edges and nodes respectively symbolise? Are we required to discard tasks to achieve the critical tasks set? If you could also recommend some algorithms for us to look up, that'd be great.

Read the article
finding N contiguous zero bits in an integer to the left of the MSB from another

- by James Morris

First we find the MSB of the first integer, and then try to find a region of N contiguous zero bits within the second number which is to the left of the MSB from the first integer. Here is the C code for my solution: typedef unsigned int t; unsigned const t_bits = sizeof(t) * CHAR_BIT; _Bool test_fit_within_left_of_msb( unsigned width, t val1, t val2, unsigned* offset_result) { unsigned offbit = 0; unsigned msb = 0; t mask; t b; while(val1 >>= 1) ++msb; while(offbit + width < t_bits - msb) { mask = (((t)1 << width) - 1) << (t_bits - width - offbit); b = val2 & mask; if (!b) { *offset_result = offbit; return true; } if (offbit++) /* this conditional bothers me! */ b <<= offbit - 1; while(b <<= 1) offbit++; } return false; } Aside from faster ways of finding the MSB of the first integer, the commented test for a zero offbit seems a bit extraneous, but necessary to skip the highest bit of type t if it is set. I have also implemented similar algorithms but working to the right of the MSB of the first number, so they don't require this seemingly extra condition. How can I get rid of this extra condition, or even, are there far more optimal solutions?

Read the article
Best way to implement plugin framework - are DLLs the only way (C/C++ project)?

- by Microkernel

Introduction: I am currently developing a document classifier software in C/C++ and I will be using Naive-Bayesian model for classification. But I wanted the users to use any algorithm that they want(or I want in the future), hence I went to separate the algorithm part in the architecture as a plugin that will be attached to the main app @ app start-up. Hence any user can write his own algorithm as a plugin and use it with my app. Problem Statement: The way I am intending to develop this is to have each of the algorithms that user wants to use to be made into a DLL file and put into a specific directory. And at the start, my app will search for all the DLLs in that directory and load them. My Questions: (1) What if a malicious code is made as a DLL (and that will have same functions mandated by plugin framework) and put into my plugins directory? In that case, my app will think that its a plugin and picks it and calls its functions, so the malicious code can easily bring down my entire app down (In the worst case could make my app as a malicious code launcher!!!). (2) Is using DLLs the only way available to implement plugin design pattern? (Not only for the fear of malicious plugin, but its a generic question out of curiosity :) ) (3) I think a lot of softwares are written with plugin model for extendability, if so, how do they defend against such attacks? (4) In general what do you think about my decision to use plugin model for extendability (do you think I should look at any other alternatives?) Thank you -MicroKernel :)

Read the article
Fast path cache generation for a connected node graph

- by Sukasa

I'm trying to get a faster pathfinding mechanism in place in a game I'm working on for a connected node graph. The nodes are classed into two types, "Networks" and "Routers." In this picture, the blue circles represent routers and the grey rectangles networks. Each network keeps a list of which routers it is connected to, and vice-versa. Routers cannot connect directly to other routers, and networks cannot connect directly to other networks. Networks list which routers they're connected to Routers do the same I need to get an algorithm that will map out a path, measured in the number of networks crossed, for each possible source and destination network excluding paths where the source and destination are the same network. I have one right now, however it is unusably slow, taking about two seconds to map the paths, which becomes incredibly noticeable for all connected players. The current algorithm is a depth-first brute-force search (It was thrown together in about an hour to just get the path caching working) which returns an array of networks in the order they are traversed, which explains why it's so slow. Are there any algorithms that are more efficient? As a side note, while these example graphs have four networks, the in-practice graphs have 55 networks and about 20 routers in use. Paths which are not possible also can occur, and as well at any time the network/router graph topography can change, requiring the path cache to be rebuilt. What approach/algorithm would likely provide the best results for this type of a graph?

Read the article
How to optimize shopping carts for minimal prices?

- by tangens

I have a list of items I want to buy. The items are offered by different shops and different prices. The shops have individual delivery costs. I'm looking for an optimal shopping strategy (and a java library supporting it) to purchase all of the items with a minimal total price. Example: Item1 is offered at Shop1 for $100, at Shop2 for $111. Item2 is offered at Shop1 for $90, at Shop2 for $85. Delivery cost of Shop1: $10 if total order < $150; $0 otherwise Delivery cost of Shop2: $5 if total order < $50; $0 otherwise If I buy Item1 and Item2 at Shop1 the total cost is $100 + $90 +$0 = $190. If I buy Item1 and Item2 at Shop2 the total cost is $111 + $85 +$0 = $196. If I buy Item1 at Shop1 and Item2 at Shop2 the total cost is $100 + $10 + $85 + $0 = 195. I get the minimal price if I order Item1 at Shop1 and Item2 at Shop2: $195 Question I need some hints which algorithms may help me to solve optimization problems of this kind for number of items about 100 and number of shops about 20. I already looked at apache-math and its optimization package, but I have no idea what algorithm to look for.

Read the article
Three most critical programming concepts

- by Todd

I know this has probably been asked in one form or fashion but I wanted to pose it once again within the context of my situation (and probably others here @ SO). I made a career change to Software Engineering some time ago without having an undergrad or grad degree in CS. I've supplemented my undergrad and grad studies in business with programming courses (VB, Java,C, C#) but never performed academic coursework in the other related disciplines (algorithms, design patterns, discrete math, etc.)...just mostly self-study. I know there are several of you who have either performed interviews and/or made hiring decisions. Given recent trends in demand, what would you say are the three most essential Comp Sci concepts that a developer should have a solid grasp of outside of language syntax? For example, I've seen blog posts of the "Absolute minimum X that every programmer must know" variety...that's what I'm looking for. Again if it's truly a redundancy please feel free to close; my feelings won't be hurt. (Closest ones I could find were http://stackoverflow.com/questions/164048/basic-programming-algorithmic-concepts- which was geared towards a true beginner, and http://stackoverflow.com/questions/648595/essential-areas-of-knowledge-which I didn't feel was concrete enough). Thanks in advance all! T.

Read the article
How to find whole graph coverage path in dynamic state-flow diagram?

- by joseph

Hello, As I've been researching algorithms for path finding in graph, I found interesting problem. Definition of situation: 1)State diagram can have p states, and s Boolean Fields, and z Int Fields 2)Every state can have q ingoing and r outgoing transitions, and h Int fields (h belongs to z - see above) 3)Every transition can have only 1 event, and only 1 action 4)every action can change n Boolean Fields, and x Int Fields 5)every event can have one trigger from combination of any count of Boolean Fields in diagram 6)Transition can be in OPEN/CLOSED form. If the transition is open/closed depends on trigger2 compounded from 0..c Boolean fields. 7) I KNOW algorithm for finding shortest paths from state A to state B. 8) I KNOW algorithm for finding path that covers all states and transitions of whole state diagram, if all transitions are OPEN. Now, what is the goal: I need to find shortest path that covers all states and transitions in dynamically changing state diagram described above. When an action changes some int field, the algorithm should go through all states that have changed int field. The algorithm should also be able to open and close transition (by going through transitions that open and close another transitions by action) in the way that the founded path will be shortest and covers all transitions and states. Any idea how to solve it? I will be really pleased for ANY idea. Thanks for answers.

Read the article
Extracting DCT coefficients from encoded images and video

- by misha

Is there a way to easily extract the DCT coefficients (and quantization parameters) from encoded images and video? Any decoder software must be using them to decode block-DCT encoded images and video. So I'm pretty sure the decoder knows what they are. Is there a way to expose them to whomever is using the decoder? I'm implementing some video quality assessment algorithms that work directly in the DCT domain. Currently, the majority of my code uses OpenCV, so it would be great if anyone knows of a solution using that framework. I don't mind using other libraries (perhaps libjpeg, but that seems to be for still images only), but my primary concern is to do as little format-specific work as possible (I don't want to reinvent the wheel and write my own decoders). I want to be able to open any video/image (H.264, MPEG, JPEG, etc) that OpenCV can open, and if it's block DCT-encoded, to get the DCT coefficients. In the worst case, I know that I can write up my own block DCT code, run the decompressed frames/images through it and then I'd be back in the DCT domain. That's hardly an elegant solution, and I hope I can do better. Presently, I use the fairly common OpenCV boilerplate to open images: IplImage *image = cvLoadImage(filename); // Run quality assessment metric The code I'm using for video is equally trivial: CvCapture *capture = cvCaptureFromAVI(filename); while (cvGrabFrame(capture)) { IplImage *frame = cvRetrieveFrame(capture); // Run quality assessment metric on frame } cvReleaseCapture(&capture); In both cases, I get a 3-channel IplImage in BGR format. Is there any way I can get the DCT coefficients as well?

Read the article
Tips for submitting a library to Boost?

- by AraK

Hi everyone, Summer is coming, and a group of friends and I are getting ready for it :) We decided to build a compile-time Arbitrary precision Unsigned Integers. We would like to provide a set of integers algorithms(functions) with the library. We have seen a number of requests for such a library(SoC2010, C++0x Standard Library wishlist). Also, a regular run-time bigint is requested usually with that, but we don't want to go into the hassle of memory management. The idea came to me from a library called TTMath, unfortunately this library works only on specific platforms because Assembly was used extensively in the library. We would like to write a standard library, depending on the C++ standard library and Boost. Also, we would like to use the available C++0x facilities in current compilers like user-defined literals and others. This would technically make the library non-standard for a while, but we think that it is a matter of time the new standards will be official. Your hints on the whole process including design, implementation, documentation, maintainable of the library are more than welcom. We are a group of students and fresh graduates who are looking for something interesting in the summer, but we see that Boost is full of gurus and we don't want to forget something too obvious. We are communicating on-line, so there is no shared white-boards :( Thanks,

Read the article
Why is my quick sort so slow?

- by user513075

Hello, I am practicing writing sorting algorithms as part of some interview preparation, and I am wondering if anybody can help me spot why this quick sort is not very fast? It appears to have the correct runtime complexity, but it is slower than my merge sort by a constant factor of about 2. I would also appreciate any comments that would improve my code that don't necessarily answer the question. Thanks a lot for your help! Please don't hesitate to let me know if I have made any etiquette mistakes. This is my first question here. private class QuickSort implements Sort { @Override public int[] sortItems(int[] ts) { List<Integer> toSort = new ArrayList<Integer>(); for (int i : ts) { toSort.add(i); } toSort = partition(toSort); int[] ret = new int[ts.length]; for (int i = 0; i < toSort.size(); i++) { ret[i] = toSort.get(i); } return ret; } private List<Integer> partition(List<Integer> toSort) { if (toSort.size() <= 1) return toSort; int pivotIndex = myRandom.nextInt(toSort.size()); Integer pivot = toSort.get(pivotIndex); toSort.remove(pivotIndex); List<Integer> left = new ArrayList<Integer>(); List<Integer> right = new ArrayList<Integer>(); for (int i : toSort) { if (i > pivot) right.add(i); else left.add(i); } left = partition(left); right = partition(right); left.add(pivot); left.addAll(right); return left; } }

Read the article
What's the most trivial function that would benfit from being computed on a GPU?

- by hanDerPeder

Hi. I'm just starting out learning OpenCL. I'm trying to get a feel for what performance gains to expect when moving functions/algorithms to the GPU. The most basic kernel given in most tutorials is a kernel that takes two arrays of numbers and sums the value at the corresponding indexes and adds them to a third array, like so: __kernel void add(__global float *a, __global float *b, __global float *answer) { int gid = get_global_id(0); answer[gid] = a[gid] + b[gid]; } __kernel void sub(__global float* n, __global float* answer) { int gid = get_global_id(0); answer[gid] = n[gid] - 2; } __kernel void ranksort(__global const float *a, __global float *answer) { int gid = get_global_id(0); int gSize = get_global_size(0); int x = 0; for(int i = 0; i < gSize; i++){ if(a[gid] > a[i]) x++; } answer[x] = a[gid]; } I am assuming that you could never justify computing this on the GPU, the memory transfer would out weight the time it would take computing this on the CPU by magnitudes (I might be wrong about this, hence this question). What I am wondering is what would be the most trivial example where you would expect significant speedup when using a OpenCL kernel instead of the CPU?

Read the article
How to make a plot from summaryRprof?

- by ThorDivDev

This is a question for an university assignment. I was given three algorithms to calculate the GCD that I already did. My problem is getting the Rprof results to a plot so I can compare them side by side. From what little understanding I have about Rprof, summaryRprof and plot is that Rprof is used like this: Rprof() #To start #functions here Rprof(NULL) #TO end summaryRprof() # to print results I understand that plot has many different types of inputs, x and y values and something called a data frame which I assume is a fancy word for table. and to draw different lines and things I need to use this: http://www.harding.edu/fmccown/r/ what I cant figure out is how to get the summaryRprof results to the plot() function. > Rprof(filename="RProfOut2.out", interval=0.0001) > gcdBruteForce(10000, 33) [1] 1 > gcdEuclid(10000, 33) [1] 1 > gcdPrimeFact(10000, 33) [1] 1 > Rprof(NULL) > summaryRprof() ?????plot???? I have been reading on stack overflow that and other sites that I can also try to use profr and proftools although I am not very clear on the usage. The only graph I have been able to make is one using plot(system.time(gcdFunction(10,100))) As always any help is appreciated.

Read the article
How string accepting interface should look like?

- by ybungalobill

Hello, This is a follow up of this question. Suppose I write a C++ interface that accepts or returns a const string. I can use a const char* zero-terminated string: void f(const char* str); // (1) The other way would be to use an std::string: void f(const string& str); // (2) It's also possible to write an overload and accept both: void f(const char* str); // (3) void f(const string& str); Or even a template in conjunction with boost string algorithms: template<class Range> void f(const Range& str); // (4) My thoughts are: (1) is not C++ish and may be less efficient when subsequent operations may need to know the string length. (2) is bad because now f("long very long C string"); invokes a construction of std::string which involves a heap allocation. If f uses that string just to pass it to some low-level interface that expects a C-string (like fopen) then it is just a waste of resources. (3) causes code duplication. Although one f can call the other depending on what is the most efficient implementation. However we can't overload based on return type, like in case of std::exception::what() that returns a const char*. (4) doesn't work with separate compilation and may cause even larger code bloat. Choosing between (1) and (2) based on what's needed by the implementation is, well, leaking an implementation detail to the interface. The question is: what is the preffered way? Is there any single guideline I can follow? What's your experience?

Read the article
What are the benefits of `while(condition) { //work }` and `do { //work } while(condition)`?

- by Shaharyar

I found myself confronted with an interview question where the goal was to write a sorting algorithm that sorts an array of unsorted int values: int[] unsortedArray = { 9, 6, 3, 1, 5, 8, 4, 2, 7, 0 }; Now I googled and found out that there are so many sorting algorithms out there! Finally I could motivate myself to dig into Bubble Sort because it seemed pretty simple to start with. I read the sample code and came to a solution looking like this: static int[] BubbleSort(ref int[] array) { long lastItemLocation = array.Length - 1; int temp; bool swapped; do { swapped = false; for (int itemLocationCounter = 0; itemLocationCounter < lastItemLocation; itemLocationCounter++) { if (array[itemLocationCounter] > array[itemLocationCounter + 1]) { temp = array[itemLocationCounter]; array[itemLocationCounter] = array[itemLocationCounter + 1]; array[itemLocationCounter + 1] = temp; swapped = true; } } } while (swapped); return array; } I clearly see that this is a situation where the do { //work } while(cond) statement is a great help to be and prevents the use of another helper variable. But is this the only case that this is more useful or do you know any other application where this condition has been used?

Read the article
How can I implement a splay tree that performs the zig operation last, not first?

- by Jakob

For my Algorithms & Data Structures class, I've been tasked with implementing a splay tree in Haskell. My algorithm for the splay operation is as follows: If the node to be splayed is the root, the unaltered tree is returned. If the node to be splayed is one level from the root, a zig operation is performed and the resulting tree is returned. If the node to be splayed is two or more levels from the root, a zig-zig or zig-zag operation is performed on the result of splaying the subtree starting at that node, and the resulting tree is returned. This is valid according to my teacher. However, the Wikipedia description of a splay tree says the zig step "will be done only as the last step in a splay operation" whereas in my algorithm it is the first step in a splay operation. I want to implement a splay tree that performs the zig operation last instead of first, but I'm not sure how it would best be done. It seems to me that such an algorithm would become more complex, seeing as how one needs to find the node to be splayed before it can be determined whether a zig operation should be performed or not. How can I implement this in Haskell (or some other functional language)?

Read the article
Concept: Information Into Memory Location.

- by Richeve S. Bebedor

I am having troubles conceptualizing an algorithm to be used to transform any information or data into a specific appropriate and reasonable memory location in any data structure that I will be devising. To give you an idea, I have a JPanel object instance and I created another Container type object instance of any subtype (note this is in Java because I love this language), then I collected those instances into a data structure not specifically just for those instances but also applicable to any type of object. Now my procedure for fetching those data again is to extract the object specific features similar in category to all object in that data structure and transform it into a integer data memory location (specifically as much as possible) or any type of data that will pertain to this transformation. And I can already access that memory location without further sorting or applications of O(n) time complex algorithms (which I think preferable but I wanted to do my own way XD). The data structure is of any type either binary tree, linked list, arrays or sets (and the like XD). What is important is I don't need to have successive comparing and analysis of data just to locate information in big structures. To give you a technical idea, I have to an array DS that contains JLabel object instance with a specific name "HelloWorld". But array DS contains other types of object (in multitude). Now this JLabel object has a location in the array at index [124324] (which is if you do any type of searching algorithm just to arrive at that location is conceivably slow because added to it the data structure used was an array *note please disregard the efficiency of the data structure to be used I just want to explain to you my concept XD). Now I want to equate "HelloWorld" to 124324 by using a conceptually made function applicable to all data types. So that I can do a direct search by doing this DS[extractLocation("HelloWorld")] just to get that JLabel instance. I know this may sound crazy but I want to test my concept of non-sorting feature extracting search algorithm for any data structure wherein my main problem is how to transform information to be stored into memory location of where it was stored.

Read the article
Algorithm to determine if array contains n...n+m?

- by Kyle Cronin

I saw this question on Reddit, and there were no positive solutions presented, and I thought it would be a perfect question to ask here. This was in a thread about interview questions: Write a method that takes an int array of size m, and returns (True/False) if the array consists of the numbers n...n+m-1, all numbers in that range and only numbers in that range. The array is not guaranteed to be sorted. (For instance, {2,3,4} would return true. {1,3,1} would return false, {1,2,4} would return false. The problem I had with this one is that my interviewer kept asking me to optimize (faster O(n), less memory, etc), to the point where he claimed you could do it in one pass of the array using a constant amount of memory. Never figured that one out. Along with your solutions please indicate if they assume that the array contains unique items. Also indicate if your solution assumes the sequence starts at 1. (I've modified the question slightly to allow cases where it goes 2, 3, 4...) edit: I am now of the opinion that there does not exist a linear in time and constant in space algorithm that handles duplicates. Can anyone verify this? The duplicate problem boils down to testing to see if the array contains duplicates in O(n) time, O(1) space. If this can be done you can simply test first and if there are no duplicates run the algorithms posted. So can you test for dupes in O(n) time O(1) space?

Read the article
Meaning of NEXT in Linked List creation in perl

- by seleniumnewbie

So I am trying to learn Linked Lists using Perl. I am reading "Mastering Algorithms with Perl" by Job Orwant. In the book he explains how to create a linked list I understand most of it, but I just simply fail to understand the command/index/key NEXT in the second last line of the code snippet. $list=undef; $tail=\$list; foreach (1..5){ my $node = [undef, $_ * $_]; $$tail = $node; $tail = \${$node->[NEXT]}; # The NEXT on this line? } What is he trying to do there? Isn $node a scalar, which stores the address of the unnamed array. Also even if we are de-referencing $node, should we not refer to the individual elements by an index number example (0,1). If we do use "NEXT" as a key, is $node a reference to a hash? I am very confused. Something in plain English will be highly appreciated.

Read the article
Type-safe generic data structures in plain-old C?

- by Bradford Larsen

I have done far more C++ programming than "plain old C" programming. One thing I sorely miss when programming in plain C is type-safe generic data structures, which are provided in C++ via templates. For sake of concreteness, consider a generic singly linked list. In C++, it is a simple matter to define your own template class, and then instantiate it for the types you need. In C, I can think of a few ways of implementing a generic singly linked list: Write the linked list type(s) and supporting procedures once, using void pointers to go around the type system. Write preprocessor macros taking the necessary type names, etc, to generate a type-specific version of the data structure and supporting procedures. Use a more sophisticated, stand-alone tool to generate the code for the types you need. I don't like option 1, as it is subverts the type system, and would likely have worse performance than a specialized type-specific implementation. Using a uniform representation of the data structure for all types, and casting to/from void pointers, so far as I can see, necessitates an indirection that would be avoided by an implementation specialized for the element type. Option 2 doesn't require any extra tools, but it feels somewhat clunky, and could give bad compiler errors when used improperly. Option 3 could give better compiler error messages than option 2, as the specialized data structure code would reside in expanded form that could be opened in an editor and inspected by the programmer (as opposed to code generated by preprocessor macros). However, this option is the most heavyweight, a sort of "poor-man's templates". I have used this approach before, using a simple sed script to specialize a "templated" version of some C code. I would like to program my future "low-level" projects in C rather than C++, but have been frightened by the thought of rewriting common data structures for each specific type. What experience do people have with this issue? Are there good libraries of generic data structures and algorithms in C that do not go with Option 1 (i.e. casting to and from void pointers, which sacrifices type safety and adds a level of indirection)?

Read the article
finding long repeated substrings in a massive string

- by Will

I naively imagined that I could build a suffix trie where I keep a visit-count for each node, and then the deepest nodes with counts greater than one are the result set I'm looking for. I have a really really long string (hundreds of megabytes). I have about 1 GB of RAM. This is why building a suffix trie with counting data is too inefficient space-wise to work for me. To quote Wikipedia's Suffix tree: storing a string's suffix tree typically requires significantly more space than storing the string itself. The large amount of information in each edge and node makes the suffix tree very expensive, consuming about ten to twenty times the memory size of the source text in good implementations. The suffix array reduces this requirement to a factor of four, and researchers have continued to find smaller indexing structures. And that was wikipedia's comments on the tree, not trie. How can I find long repeated sequences in such a large amount of data, and in a reasonable amount of time (e.g. less than an hour on a modern desktop machine)? (Some wikipedia links to avoid people posting them as the 'answer': Algorithms on strings and especially Longest repeated substring problem ;-) )

Read the article
Optimize Duplicate Detection

- by Dave Jarvis

Background This is an optimization problem. Oracle Forms XML files have elements such as: <Trigger TriggerName="name" TriggerText="SELECT * FROM DUAL" ... /> Where the TriggerText is arbitrary SQL code. Each SQL statement has been extracted into uniquely named files such as: sql/module=DIAL_ACCESS+trigger=KEY-LISTVAL+filename=d_access.fmb.sql sql/module=REP_PAT_SEEN+trigger=KEY-LISTVAL+filename=rep_pat_seen.fmb.sql I wrote a script to generate a list of exact duplicates using a brute force approach. Problem There are 37,497 files to compare against each other; it takes 8 minutes to compare one file against all the others. Logically, if A = B and A = C, then there is no need to check if B = C. So the problem is: how do you eliminate the redundant comparisons? The script will complete in approximately 208 days. Script Source Code The comparison script is as follows: #!/bin/bash echo Loading directory ... for i in $(find sql/ -type f -name \*.sql); do echo Comparing $i ... for j in $(find sql/ -type f -name \*.sql); do if [ "$i" = "$j" ]; then continue; fi # Case insensitive compare, ignore spaces diff -IEbwBaq $i $j > /dev/null # 0 = no difference (i.e., duplicate code) if [ $? = 0 ]; then echo $i :: $j >> clones.txt fi done done Question How would you optimize the script so that checking for cloned code is a few orders of magnitude faster? System Constraints Using a quad-core CPU with an SSD; trying to avoid using cloud services if possible. The system is a Windows-based machine with Cygwin installed -- algorithms or solutions in other languages are welcome. Thank you!

Read the article
Java: multi-threaded maps: how do the implementations compare?

- by user346629

I'm looking for a good hash map implementation. Specifically, one that's good for creating a large number of maps, most of them small. So memory is an issue. It should be thread-safe (though losing the odd put might be an OK compromise in return for better performance), and fast for both get and put. And I'd also like the moon on a stick, please, with a side-order of justice. The options I know are: HashMap. Disastrously un-thread safe. ConcurrentHashMap. My first choice, but this has a hefty memory footprint - about 2k per instance. Collections.sychronizedMap(HashMap). That's working OK for me, but I'm sure there must be faster alternatives. Trove or Colt - I think neither of these are thread-safe, but perhaps the code could be adapted to be thread safe. Any others? Any advice on what beats what when? Any really good new hash map algorithms that Java could use an implementation of? Thanks in advance for your input!

Read the article

< Previous Page | 65 66 67 68 69 70 71 72 73 74 75 76 | Next Page >