Search Results

Search found 5070 results on 203 pages for 'algorithm'.

Page 82/203 | < Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89  | Next Page >

  • Time Complexities of recursive algorithms

    - by Peter
    Whenever I see a recursive solution, or I write recursive code for a problem, it is really difficult for me to figure out the time complexity, in most of the cases I just say its exponential? How is it exponential actually? How people say it is 2^n, when it is n!, when it is n^n or n^k. I have some questions in mind, let say find all permutations of a string (O(n!)) find all sequences which sum up to k in an array (exponential, how exactly do I calculate). Find all subsets of size k whose sum is 0 (will k come somewhere in complexity , it should come right?). Can any1 help me how to calculate the exact complexity of such questions, I am able to wrote code for them , but its hard understanding the exact time complexity.

    Read the article

  • algorithm q: Fuzzy matching of structured data

    - by user86432
    I have a fairly small corpus of structured records sitting in a database. Given a tiny fraction of the information contained in a single record, submitted via a web form (so structured in the same way as the table schema), (let us call it the test record) I need to quickly draw up a list of the records that are the most likely matches for the test record, as well as provide a confidence estimate of how closely the search terms match a record. The primary purpose of this search is to discover whether someone is attempting to input a record that is duplicate to one in the corpus. There is a reasonable chance that the test record will be a dupe, and a reasonable chance the test record will not be a dupe. The records are about 12000 bytes wide and the total count of records is about 150,000. There are 110 columns in the table schema and 95% of searches will be on the top 5% most commonly searched columns. The data is stuff like names, addresses, telephone numbers, and other industry specific numbers. In both the corpus and the test record it is entered by hand and is semistructured within an individual field. You might at first blush say "weight the columns by hand and match word tokens within them", but it's not so easy. I thought so too: if I get a telephone number I thought that would indicate a perfect match. The problem is that there isn't a single field in the form whose token frequency does not vary by orders of magnitude. A telephone number might appear 100 times in the corpus or 1 time in the corpus. The same goes for any other field. This makes weighting at the field level impractical. I need a more fine-grained approach to get decent matching. My initial plan was to create a hash of hashes, top level being the fieldname. Then I would select all of the information from the corpus for a given field, attempt to clean up the data contained in it, and tokenize the sanitized data, hashing the tokens at the second level, with the tokens as keys and frequency as value. I would use the frequency count as a weight: the higher the frequency of a token in the reference corpus, the less weight I attach to that token if it is found in the test record. My first question is for the statisticians in the room: how would I use the frequency as a weight? Is there a precise mathematical relationship between n, the number of records, f(t), the frequency with which a token t appeared in the corpus, the probability o that a record is an original and not a duplicate, and the probability p that the test record is really a record x given the test and x contain the same t in the same field? How about the relationship for multiple token matches across multiple fields? Since I sincerely doubt that there is, is there anything that gets me close but is better than a completely arbitrary hack full of magic factors? Barring that, has anyone got a way to do this? I'm especially keen on other suggestions that do not involve maintaining another table in the database, such as a token frequency lookup table :). This is my first post on StackOverflow, thanks in advance for any replies you may see fit to give.

    Read the article

  • Sync Algorithms

    - by Kristopher Johnson
    Are there any good references out there for sync algorithms? I'm interested in algorithms that synchronize the following kinds of data between multiple users: calendars documents lists and outlines I'm not just looking for synchronization of contents of directories a la rsync; I am interested in merging the data within individual files.

    Read the article

  • Base X string encoding

    - by Paul Stone
    I'm looking for a routine that will encode a string (stream of bytes) into an arbitrary base/alphabet (like base64 encoding but I get to choose the alphabet). I've seen a few routines that do base X encoding for a number, but not for a string.

    Read the article

  • How can I group an array of rectangles into "Islands" of connected regions?

    - by Eric
    The problem I have an array of java.awt.Rectangles. For those who are not familiar with this class, the important piece of information is that they provide an .intersects(Rectangle b) function. I would like to write a function that takes this array of Rectangles, and breaks it up into groups of connected rectangles. Lets say for example, that these are my rectangles (constructor takes the arguments x, y, width,height): Rectangle[] rects = new Rectangle[] { new Rectangle(0, 0, 4, 2), //A new Rectangle(1, 1, 2, 4), //B new Rectangle(0, 4, 8, 2), //C new Rectangle(6, 0, 2, 2) //D } A quick drawing shows that A intersects B and B intersects C. D intersects nothing. A tediously drawn piece of ascii art does the job too: +-------+ +---+ ¦A+---+ ¦ ¦ D ¦ +-+---+-+ +---+ ¦ B ¦ +-+---+---------+ ¦ +---+ C ¦ +---------------+ Therefore, the output of my function should be: new Rectangle[][]{ new Rectangle[] {A,B,C}, new Rectangle[] {D} } The failed code This was my attempt at solving the problem: public List<Rectangle> getIntersections(ArrayList<Rectangle> list, Rectangle r) { List<Rectangle> intersections = new ArrayList<Rectangle>(); for(Rectangle rect : list) { if(r.intersects(rect)) { list.remove(rect); intersections.add(rect); intersections.addAll(getIntersections(list, rect)); } } return intersections; } public List<List<Rectangle>> mergeIntersectingRects(Rectangle... rectArray) { List<Rectangle> allRects = new ArrayList<Rectangle>(rectArray); List<List<Rectangle>> groups = new ArrayList<ArrayList<Rectangle>>(); for(Rectangle rect : allRects) { allRects.remove(rect); ArrayList<Rectangle> group = getIntersections(allRects, rect); group.add(rect); groups.add(group); } return groups; } Unfortunately, there seems to be an infinite recursion loop going on here. My uneducated guess would be that java does not like me doing this: for(Rectangle rect : allRects) { allRects.remove(rect); //... } Can anyone shed some light on the issue?

    Read the article

  • Count Occurence of Needle String in Haystack String, most optimally?

    - by Taranfx
    The Problem is simple Find "ABC" in "ABCDSGDABCSAGAABCCCCAAABAABC" Here is the solution I propose, I'm looking for any solutions that might be better than this one. public static void main(String[] args) { String haystack = "ABCDSGDABCSAGAABCCCCAAABAABC"; String needle = "ABC"; char [] needl = needle.toCharArray(); int needleLen = needle.length(); int found=0; char hay[] = haystack.toCharArray(); int index =0; int chMatched =0; for (int i=0; i<hay.length; i++){ if (index >= needleLen || chMatched==0) index=0; System.out.print("\nchar-->"+hay[i] + ", with->"+needl[index]); if(hay[i] == needl[index]){ chMatched++; System.out.println(", matched"); }else { chMatched=0; index=0; if(hay[i] == needl[index]){ chMatched++; System.out.print("\nchar->"+hay[i] + ", with->"+needl[index]); System.out.print(", matched"); }else continue; } if(chMatched == needleLen){ found++; System.out.println("found. Total ->"+found); } index++; } System.out.println("Result Found-->"+found); } It took me a while creating this one. Can someone suggest a better solution (if any) P.S. Drop the sysouts if they look messy to you.

    Read the article

  • simple plot algorithm with autoscale

    - by adrin
    I need to implement a simple plotting component in C#(WPF to be more precise). What i have is a collection of data samples containing time (X axis) and a value (both double types). I have a drawing canvas of a fixed size (Width x Height) and a DrawLine method/function that can draw on it. The problem I am facing now is how do I draw the plot so that it is autoscaled? In other words how do I map the samples I have to actual pixels on my Width x Height canvas?

    Read the article

  • how to elegantly duplicate a graph (neural network)

    - by macias
    I have a graph (network) which consists of layers, which contains nodes (neurons). I would like to write a procedure to duplicate entire graph in most elegant way possible -- i.e. with minimal or no overhead added to the structure of the node or layer. Or yet in other words -- the procedure could be complex, but the complexity should not "leak" to structures. They should be no complex just because they are copyable. I wrote the code in C#, so far it looks like this: neuron has additional field -- copy_of which is pointer the the neuron which base copied from, this is my additional overhead neuron has parameterless method Clone() neuron has method Reconnect() -- which exchanges connection from "source" neuron (parameter) to "target" neuron (parameter) layer has parameterless method Clone() -- it simply call Clone() for all neurons network has parameterless method Clone() -- it calls Clone() for every layer and then it iterates over all neurons and creates mappings neuron=copy_of and then calls Reconnect to exchange all the "wiring" I hope my approach is clear. The question is -- is there more elegant method, I particularly don't like keeping extra pointer in neuron class just in case of being copied! I would like to gather the data in one point (network's Clone) and then dispose it completely (Clone method cannot have an argument though).

    Read the article

  • List of Big-O for PHP functions?

    - by Kendall Hopkins
    After using PHP for a while now, I've noticed that not all PHP built in functions as fast as expected. Consider the below two possible implementations of a function that finds if a number is prime using a cached array of primes. //very slow for large $prime_array $prime_array = array( 2, 3, 5, 7, 11, 13, .... 104729, ... ); $result_array = array(); foreach( $array_of_number => $number ) { $result_array[$number] = in_array( $number, $large_prime_array ); } //still decent performance for large $prime_array $prime_array => array( 2 => NULL, 3 => NULL, 5 => NULL, 7 => NULL, 11 => NULL, 13 => NULL, .... 104729 => NULL, ... ); foreach( $array_of_number => $number ) { $result_array[$number] = array_key_exists( $number, $large_prime_array ); } This is because in_array is implemented with a linear search O(n) which will linearly slow down as $prime_array grows. Where the array_key_exists function is implemented with a hash lookup O(1) which will not slow down unless the hash table gets extremely populated (in which case it's only O(logn)). So far I've had to discover the big-O's via trial and error, and occasionally looking at the source code. Now for the question... I was wondering if there was a list of the theoretical (or practical) big O times for all* the PHP built in functions. *or at least the interesting ones For example find it very hard to predict what the big O of functions listed because the possible implementation depends on unknown core data structures of PHP: array_merge, array_merge_recursive, array_reverse, array_intersect, array_combine, str_replace (with array inputs), etc.

    Read the article

  • How to sort a list so that managers are always ahead of their subordinates (How do I do a topologica

    - by James Black
    I am working on a project using Groovy, and I would like to take an array of employees, so that no manager follows their subordinates in the array. The reason being that I need to add people to a database and I would prefer not to do it in two passes. So, I basically have: <employees> <employee> <employeeid>12</employeeid> <manager>3</manager> </employee> <employee> <employeeid>1</employeeid> <manager></manager> </employee> <employee> <employeeid>3</employeeid> <manager>1</manager> </employee> </employees> So, it should be sorted as such: employeeid = 1 employeeid = 3 employeeid = 12 The first person should have a null for managers. I am thinking about a binary tree representation, but I expect it will be very unbalanced, and I am not certain the best way to do this using Groovy properly. Is there a way to do this that isn't going to involve using nested loops?

    Read the article

  • Methodologies or algorithms for filling in missing data

    - by tbone
    I am dealing with datasets with missing data and need to be able to fill forward, backward, and gaps. So, for example, if I have data from Jan 1, 2000 to Dec 31, 2010, and some days are missing, when a user requests a timespan that begins before, ends after, or encompasses the missing data points, I need to "fill in" these missing values. Is there a proper term to refer to this concept of filling in data? Imputation is one term, don't know if it is "the" term for it though. I presume there are multiple algorithms & methodologies for filling in missing data (use last measured, using median/average/moving average, etc between 2 known numbers, etc. Anyone know the proper term for this problem, any online resources on this topic, or ideally links to open source implementations of some algorithms (C# preferably, but any language would be useful)

    Read the article

  • Client-server synchronization pattern / algorithm?

    - by tm_lv
    I have a feeling that there must be client-server synchronization patterns out there. But i totally failed to google up one. Situation is quite simple - server is the central node, that multiple clients connect to and manipulate same data. Data can be split in atoms, in case of conflict, whatever is on server, has priority (to avoid getting user into conflict solving). Partial synchronization is preferred due to potentially large amounts of data. Are there any patterns / good practices for such situation, or if you don't know of any - what would be your approach? Below is how i now think to solve it: Parallel to data, a modification journal will be held, having all transactions timestamped. When client connects, it receives all changes since last check, in consolidated form (server goes through lists and removes additions that are followed by deletions, merges updates for each atom, etc.). Et voila, we are up to date. Alternative would be keeping modification date for each record, and instead of performing data deletes, just mark them as deleted. Any thoughts?

    Read the article

  • question bitonic sequence

    - by davit-datuashvili
    A sequence is bitonic if it monotonically increases and then monotonically de- creases, or if it can be circularly shifted to monotonically increase and then monotonically decrease. For example the sequences 1, 4, 6, 8, 3, -2 , 9, 2, -4, -10, -5 , and 1, 2, 3, 4 are bitonic, but 1, 3, 12, 4, 2, 10 is not bitonic. please help me to determine if given sequence is bitonic?

    Read the article

  • How to find largest common sub-tree in the given two binary search trees?

    - by Bhushan
    Two BSTs (Binary Search Trees) are given. How to find largest common sub-tree in the given two binary trees? EDIT 1: Here is what I have thought: Let, r1 = current node of 1st tree r2 = current node of 2nd tree There are some of the cases I think we need to consider: Case 1 : r1.data < r2.data 2 subproblems to solve: first, check r1 and r2.left second, check r1.right and r2 Case 2 : r1.data > r2.data 2 subproblems to solve: - first, check r1.left and r2 - second, check r1 and r2.right Case 3 : r1.data == r2.data Again, 2 cases to consider here: (a) current node is part of largest common BST compute common subtree size rooted at r1 and r2 (b)current node is NOT part of largest common BST 2 subproblems to solve: first, solve r1.left and r2.left second, solve r1.right and r2.right I can think of the cases we need to check, but I am not able to code it, as of now. And it is NOT a homework problem. Does it look like?

    Read the article

  • Javascript Number Random Scrambler

    - by stjowa
    Hi, I need a Javascript random number scrambler for my website. Seems simple, but I can not figure out how to do it. Can anyone help me out? I have the following array of numbers: 1 2 3 4 5 6 7 8 9 I would like to be able to have these numbers scrambled randomly. Like the following: 3 6 4 2 9 5 1 8 7 or 4 1 7 3 5 9 2 6 8 So, specifically, I would like a function that takes in an array of numbers (1 - n) and then returns that same array of numbers - scrambled randomly with different calls to the function. Maybe a noob function, but can't seem to figure it out. Thanks!

    Read the article

  • Java algorithm for normalizing audio

    - by Marty Pitt
    I'm trying to normalize an audio file of speech. Specifically, where an audio file contains peaks in volume, I'm trying to level it out, so the quiet sections are louder, and the peaks are quieter. I know very little about audio manipulation, beyond what I've learnt from working on this task. Also, my math is embarrassingly weak. I've done some research, and the Xuggle site provides a sample which shows reducing the volume using the following code: (full version here) @Override public void onAudioSamples(IAudioSamplesEvent event) { // get the raw audio byes and adjust it's value ShortBuffer buffer = event.getAudioSamples().getByteBuffer().asShortBuffer(); for (int i = 0; i < buffer.limit(); ++i) buffer.put(i, (short)(buffer.get(i) * mVolume)); super.onAudioSamples(event); } Here, they modify the bytes in getAudioSamples() by a constant of mVolume. Building on this approach, I've attempted a normalisation modifies the bytes in getAudioSamples() to a normalised value, considering the max/min in the file. (See below for details). I have a simple filter to leave "silence" alone (ie., anything below a value). I'm finding that the output file is very noisy (ie., the quality is seriously degraded). I assume that the error is either in my normalisation algorithim, or the way I manipulate the bytes. However, I'm unsure of where to go next. Here's an abridged version of what I'm currently doing. Step 1: Find peaks in file: Reads the full audio file, and finds this highest and lowest values of buffer.get() for all AudioSamples @Override public void onAudioSamples(IAudioSamplesEvent event) { IAudioSamples audioSamples = event.getAudioSamples(); ShortBuffer buffer = audioSamples.getByteBuffer().asShortBuffer(); short min = Short.MAX_VALUE; short max = Short.MIN_VALUE; for (int i = 0; i < buffer.limit(); ++i) { short value = buffer.get(i); min = (short) Math.min(min, value); max = (short) Math.max(max, value); } // assign of min/max ommitted for brevity. super.onAudioSamples(event); } Step 2: Normalize all values: In a loop similar to step1, replace the buffer with normalized values, calling: buffer.put(i, normalize(buffer.get(i)); public short normalize(short value) { if (isBackgroundNoise(value)) return value; short rawMin = // min from step1 short rawMax = // max from step1 short targetRangeMin = 1000; short targetRangeMax = 8000; int abs = Math.abs(value); double a = (abs - rawMin) * (targetRangeMax - targetRangeMin); double b = (rawMax - rawMin); double result = targetRangeMin + ( a/b ); // Copy the sign of value to result. result = Math.copySign(result,value); return (short) result; } Questions: Is this a valid approach for attempting to normalize an audio file? Is my math in normalize() valid? Why would this cause the file to become noisy, where a similar approach in the demo code doesn't?

    Read the article

  • Way to store a large dictionary with low memory footprint + fast lookups (on Android)

    - by BobbyJim
    I'm developing an android word game app that needs a large (~250,000 word dictionary) available. I need: reasonably fast look ups e.g. constant time preferable, need to do maybe 200 lookups a second on occasion to solve a word puzzle and maybe 20 lookups within 0.2 second more often to check words the user just spelled. EDIT: Lookups are typically asking "Is in the dictionary?". I'd like to support up to two wildcards in the word as well, but this is easy enough by just generating all possible letters the wildcards could have been and checking the generated words (i.e. 26 * 26 lookups for a word with two wildcards). as it's a mobile app, using as little memory as possible and requiring only a small initial download for the dictionary data is top priority. My first naive attempts used Java's HashMap class, which caused an out of memory exception. I've looked into using the SQL lite databases available on android, but this seems like overkill. What's a good way to do what I need?

    Read the article

  • Naive Bayesian classification (spam filtering) - Doubt in one calculation? Which one is right? Plz c

    - by Microkernel
    Hi guys, I am implementing Naive Bayesian classifier for spam filtering. I have doubt on some calculation. Please clarify me what to do. Here is my question. In this method, you have to calculate P(S|W) - Probability that Message is spam given word W occurs in it. P(W|S) - Probability that word W occurs in a spam message. P(W|H) - Probability that word W occurs in a Ham message. So to calculate P(W|S), should I do (1) (Number of times W occuring in spam)/(total number of times W occurs in all the messages) OR (2) (Number of times word W occurs in Spam)/(Total number of words in the spam message) So, to calculate P(W|S), should I do (1) or (2)? (I thought it to be (2), but I am not sure, so plz clarify me) I am refering http://en.wikipedia.org/wiki/Bayesian_spam_filtering for the info by the way. I got to complete the implementation by this weekend :( Thanks and regards, MicroKernel :) @sth: Hmm... Shouldn't repeated occurrence of word 'W' increase a message's spam score? In the your approach it wouldn't, right?. Lets take a scenario and discuss... Lets say, we have 100 training messages, out of which 50 are spam and 50 are Ham. and say word_count of each message = 100. And lets say, in spam messages word W occurs 5 times in each message and word W occurs 1 time in Ham message. So total number of times W occuring in all the spam message = 5*50 = 250 times. And total number of times W occuring in all Ham messages = 1*50 = 50 times. Total occurance of W in all of the training messages = (250+50) = 300 times. So, in this scenario, how do u calculate P(W|S) and P(W|H) ? Naturally we should expect, P(W|S) P(W|H)??? right. Please share your thought...

    Read the article

  • Position elements without overlap

    - by eWolf
    I have a number of rectangular elements that I want to position in a 2D space. I calculate an ideal position for each element. Now my problem is that many elements overlap as very often the ideal positions are concentrated in one region. I want to avoid overlap as much as possible (doesn't have to be perfect, though). How can I do this? I've heard physics simulations are suitable for this - is that correct? And can anyone provide an example/tutorial? By the way: I'm using XNA, if you know any .NET library that does exactly this job - tell me!

    Read the article

  • Display relative time in hour, day, month and year

    - by JohnJohnGa
    I wrote a function toBeautyString(epoch) : String which given a epoch, return a string which will display the relative time from now in hour and minute For instance: // epoch: 1346140800 -> Tue, 28 Aug 2012 05:00:00 GMT // and now: 1346313600 -> Thu, 30 Aug 2012 08:00:00 GMT toBeautyString(1346140800) -> "2 days and 3 hours ago" I want now to extend this function to month and year, so it will be able to print: 2 years, 1 month, 3 days and 1 hour ago Only with epoch without any external libraries. The purpose of this function is to give to the user a better way to visualize the time in the past. I found this: Calculating relative time but the granularity is not enough.

    Read the article

  • How to make disconnected closed curves connected by adding a shortest path using MATLAB?

    - by user198729
    bwlabel can be used to get disconnected objects in an image: [L Ne] = bwlabel(image); I want to make the objects(But my target is only the contours(closed curve) of these objects) connected by adding a shortest path where necessary. How do I approach this? UPDATE Or how to dilate the closed curves so that they get connected? How to calculate the shortest path between two disconnected closed curves?

    Read the article

  • Javascript algorithm that calculates week number in Fiscal Year

    - by ForeignerBR
    Hi, I have been looking for a Javascript algorithms that gives me the week number of a given Date object within a custom fiscal year. The fiscal year of my company starts on 1 September and ends on 31 August. Say today happens to be September 1st and I pass in a newly instanced Date object to this function; I would expect it to return 1. Hopefully someone will be able to help me with it. thanks, fbr

    Read the article

  • Fast dot product for a very special case

    - by psihodelia
    Given a vector X of size L, where every scalar element of X is from a binary set {0,1}, it is to find a dot product z=dot(X,Y) if vector Y of size L consists of the integer-valued elements. I suggest, there must exist a very fast way to do it. Let's say we have L=4; X[L]={1, 0, 0, 1}; Y[L]={-4, 2, 1, 0} and we have to find z=X[0]*Y[0] + X[1]*Y[1] + X[2]*Y[2] + X[3]*Y[3] (which in this case will give us -4). It is obvious that X can be represented using binary digits, e.g. an integer type int32 for L=32. Then, all what we have to do is to find a dot product of this integer with an array of 32 integers. Do you have any idea or suggestions how to do it very fast?

    Read the article

< Previous Page | 78 79 80 81 82 83 84 85 86 87 88 89  | Next Page >