Search Results

Search found 7220 results on 289 pages for 'graph algorithm'.

Page 77/289 | < Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84 | Next Page >

ClassNotFoundException error in implementing Bayesian algorithm in Apache Mahout on Hadoop

- by Shweta

Hi, I have a problem in executing the Bayesian algorithm in Mahout. I built it with Maven and the job file is in target directory. When run from terminal using hadoop, I'm getting the ClassNotFoundException error. What should be done? $HADOOP_HOME/bin/hadoop jar mahout-core-0.3-SNAPSHOT.job org.apache.mahout.classifier.bayes.mapreduce.bayes.bayesdriver -i test -o output Exception in thread "main" java.lang.ClassNotFoundException: org.apache.mahout.classifier.bayes.mapreduce.bayes.bayesdriver at java.net.URLClassLoader$1.run(URLClassLoader.java:200) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:188) at java.lang.ClassLoader.loadClass(ClassLoader.java:307) at java.lang.ClassLoader.loadClass(ClassLoader.java:252) at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:247) at org.apache.hadoop.util.RunJar.main(RunJar.java:149)

Read the article
How to speed up calculation of length of longest common substring?

- by eSKay

I have two very large strings and I am trying to find out their Longest Common Substring. One way is using suffix trees (supposed to have a very good complexity, though a complex implementation), and the another is the dynamic programming method (both are mentioned on the Wikipedia page linked above). Using dynamic programming The problem is that the dynamic programming method has a huge running time (complexity is O(n*m), where n and m are lengths of the two strings). What I want to know (before jumping to implement suffix trees): Is it possible to speed up the algorithm if I only want to know the length of the common substring (and not the common substring itself)?

Read the article
linear interpolation on 8bit microcontroller

- by JB

I need to do a linear interpolation over time between two values on an 8 bit PIC microcontroller (Specifically 16F627A but that shouldn't matter) using PIC assembly language. Although I'm looking for an algorithm here as much as actual code. I need to take an 8 bit starting value, an 8 bit ending value and a position between the two (Currently represented as an 8 bit number 0-255 where 0 means the output should be the starting value and 255 means it should be the final value but that can change if there is a better way to represent this) and calculate the interpolated value. Now PIC doesn't have a divide instruction so I could code up a general purpose divide routine and effectivly calculate (B-A)/(x/255)+A at each step but I feel there is probably a much better way to do this on a microcontroller than the way I'd do it on a PC in c++ Has anyone got any suggestions for implementing this efficiently on this hardware?

Read the article
algorithm to combinatorics

- by peiska

I am trying to solve a combinatorics problem, it seems easy, but i am having some trouble with it. If i have at most X tables, and N persons to sit on the tables, Each table can have 1 to N seating places, and I can only sit persons in one side of a rectangular table( so the order how people sit matters). I want to make a code that can calculate all the distributions of seating places from 1 up to K tables. For example, if I have 12 persons and 1 table i have 479001600 ways of seating persons( thats easy to calculate I've used Factorial of 12). But if I have 12 persons and 3 tables i have 4390848000 ways of seating persons. I've tried different solutions but i was not able to find the correct one. I've tried to divided the 12 in 3, then o use factorial of the result (it didnt work), i've tried to use 12! * 3( it didn't work too). Can some one give me a tip in a algorithm that i can use?

Read the article
Ranking based string matching algorithm..for Midi Music

- by Taha

i am working on midi music project. What i am trying to do is:- matching the Instrument midi track with the similar instrument midi track... for example Flute track in a some midi music is matched against the Flute track in some other music midi file... After matching ,the results should come ranking wise according to their similarity.. Like 1) track1 2) track2 3) track3 I have this sort of string coming from my midi music .. F4/0.01282051282051282E4/0.01282051282051282Eb4/0.01282051282051282 D4/0.01282051282051282C#4/0.01282051282051282C4/0.01282051282051282 Which ranking algorithm with good metrics should i use for such data ? Thanking you in anticipation!

Read the article
reservoir sampling problem

- by eSKay

This MSDN article proves the correctness of Reservoir Sampling algorithm as follows: Base case is trivial. For the k+1st case, the probability a given element i with position <= k is in R is s/k. The probability i is replaced is the probability k+1st element is chosen multiplied by i being chosen to be replaced, which is: s/(k+1) * 1/s = 1/(k+1), and prob that i is not replaced is k/k+1. So any given element's probability of lasting after k+1 rounds is: (chosen in k steps, and not removed in k steps) = s/k * k/(k+1), which is s/(k+1). So, when k+1 = n, any element is present with probability s/n. about step 3: What are the k+1 rounds mentioned? What is chosen in k steps, and not removed in k steps? Why are we only calculating this probability for elements that were already in R after the first s steps?

Read the article
Cross-platform (microcontroller-PC) algorithm development

- by Kyr

Hello people! I was asked to develop a algorithm for network application on C. This project will be developed on Linux for PC and then it will be transferred to a more portable platform, something that will include a microcontroller. There are many microcontroller/companies out there that provide very nice and large libraries for TCP/IP. This software will hold statistics on the network performance. The whole idea of a cross platform (uC - PC) seems rubbish to me cause eventually the code should be written in a more platform specific way for the microcontroller, but I am not expert to judge anyway. Is there any clever way of doing this or is there a anyone that did this before? My brainstorming has "Wrapper library" and "Matlab"... Any ideas? Thx!

Read the article
Print "1 followed by googolplex number of zeros" [closed]

- by Rajan

Assuming we are not concerned about running time of the program (which is practically infinite for human mortals), we want to print out in base 10, the exact value of 10^(googolplex), one digit at a time (mostly zeros). Describe an algorithm (which can be coded on current day computers), or write a program to do this. Since we cannot practically check the output, so we will rely on collective opinion on the correctness of the program. NOTE : I do not know the solution, or whether a solution exists or not. The problem is my own invention. To those readers who think this is not a CS question... kindly reconsider. This is difficult and bit theoretical but definitely CS.

Read the article
Distributing points over a surface within boundries

- by vise

I'm interested in a way (algorithm) of distributing a predefined number of points over a 4 sided surface like a square. The main issue is that each point has got to have a minimum and maximum proximity to each other (random between two predefined values). Basically the distance of any two points should not be closer than let's say 2, and a further than 3. My code will be implemented in ruby (the points are locations, the surface is a map), but any ideas or snippets are definitely welcomed as all my ideas include a fair amount of brute force.

Read the article
Are there any worse sorting algorithms than Bogosort (a.k.a Monkey Sort)?

- by womp

My co-workers took me back in time to my University days with a discussion of sorting algorithms this morning. We reminisced about our favorites like StupidSort, and one of us was sure we had seen a sort algorithm that was O(n!). That got me started looking around for the "worst" sorting algorithms I could find. We postulated that a completely random sort would be pretty bad (i.e. randomize the elements - is it in order? no? randomize again), and I looked around and found out that it's apparently called BogoSort, or Monkey Sort, or sometimes just Random Sort. Monkey Sort appears to have a worst case performance of O(∞), a best case performance of O(n), and an average performance of O(n * n!). Are there any named algorithms that have worse average performance than O(n * n!)? Or are just sillier than Monkey Sort in general?

Read the article
algorithm for python itertools.permutations

- by zaharpopov

Can someone please explain algorithm for itertools.permutations routine in Python standard lib 2.6? I see its code in the documentation but don't undestand why it work? Thanks Code is: def permutations(iterable, r=None): # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC # permutations(range(3)) --> 012 021 102 120 201 210 pool = tuple(iterable) n = len(pool) r = n if r is None else r if r > n: return indices = range(n) cycles = range(n, n-r, -1) yield tuple(pool[i] for i in indices[:r]) while n: for i in reversed(range(r)): cycles[i] -= 1 if cycles[i] == 0: indices[i:] = indices[i+1:] + indices[i:i+1] cycles[i] = n - i else: j = cycles[i] indices[i], indices[-j] = indices[-j], indices[i] yield tuple(pool[i] for i in indices[:r]) break else: return

Read the article
synchronizing audio over a network

- by sharkin

I'm in startup of designing a client/server audio system which can stream audio arbitrarily over a network. One central server pumps out an audio stream and x number of clients receives the audio data and plays it. So far no magic needed and I have even got this scenario to work with VLC media player out of the box. However, the tricky part seems to be synchronizing the audio playback so that all clients are in audible synch (actual latency can be allowed as long as it is perceived to be in sync by a human listener). My question is if there's any known method or algorithm to use for these types of synchronization problems (video is probably solved the same way). My own initial thoughts centers around synchronizing clocks between physical machines and thereby creating a virtual "main timer" and somehow aligning audio data packets against it. Some products already solving the problem: http://www.sonos.com http://netchorus.com/ Any pointers are most welcome. Thanks. PS: This related question seem to have died long ago.

Read the article
string matching algorithms used by lucene

- by iamrohitbanga

i want to know the string matching algorithms used by Apache Lucene. i have been going through the index file format used by lucene given here. it seems that lucene stores all words occurring in the text as is with their frequency of occurrence in each document. but as far as i know that for efficient string matching it would need to preprocess the words occurring in the Documents. example: search for "iamrohitbanga is a user of stackoverflow" (use fuzzy matching) in some documents. it is possible that there is a document containing the string "rohit banga" to find that the substrings rohit and banga are present in the search string, it would use some efficient substring matching. i want to know which algorithm it is. also if it does some preprocessing which function call in the java api triggers it.

Read the article
Algorithm for parsing a flat tree into a non-flat tree

- by Chad Johnson

I have the following flat tree: id name parent_id is_directory =========================================================== 50 app 0 1 31 controllers 50 1 11 application_controller.rb 31 0 46 models 50 1 12 test_controller.rb 31 0 31 test.rb 46 0 and I am trying to figure out an algorithm for getting this into the following tree structuree: [{ id: 50, name: app, is_directory: true children: [{ id: 31, name: controllers, is_directory: true, children: [{ id: 11, name: application_controller.rb is_directory: false },{ id: 12, name: test_controller.rb, is_directory: false }], },{ id: 46, name: models, is_directory: true, children: [{ id: 31, name: test.rb, is_directory: false }] }] }] Can someone point me in the right direction? I'm looking for steps (eg. build an associative array; loop through the array looking for x; etc.).

Read the article
Software to Tune/Calibrate Properties for Heuristic Algorithms

- by Karussell

Today I read that there is a software called WinCalibra (scroll a bit down) which can take a text file with properties as input. This program can then optimize the input properties based on the output values of your algorithm. See this paper or the user documentation for more information (see link above; sadly doc is a zipped exe). Do you know other software which can do the same which runs under Linux? (preferable Open Source) EDIT: Since I need this for a java application I will now invest my research in java libraries like jgap. Other ideas and links would be appreciated!

Read the article
image archive VS image strip

- by DevA

Hi, i've noticed that plenty of games / applications (very common on mobile builds) pack numerous images into an image strip. I figured that the advantages in this are making the program more tidy (file system - wise) and reducing (un)installation time. During the runtime of the application, the entire image strip is allocated and copied from FS to RAM. On the contrary, images can be stored in an image archive and unpacked during runtime to a number of image structures in RAM. The way I see it, the image strip approach is less efficient because of worse caching performance and because that even if the optimal rectangle packing algorithm is used, there will be empty spaces between the stored images in the strip, causing a waste of RAM. What are the advantages in using an image strip over using an image archive file?

Read the article
Question on multi-probe Local Sensitive Hashing

- by Yijinsei

Hey guys sorry to be asking this kind noob question, but because I really need some guidance on how to use Multi probe LSH pretty urgently, so I did not do much research myself. I realize there is a lib call LSHKIT available that implemented that algorithm, but I have trouble trying to figure out how to use it. Right now, I have a few thousand feature vector 296 dimension, each representing an image. The vector is used to query an user input image, to retrieve the most similar image. The method I used to derive the distance between vector is euclidean distance. I know this might be a rather noob question, but do you guys have knowledge on how should i implement multi probe LSH? I am really very grateful to any answer or response.

Read the article
How to find a binary logarithm very fast? (O(1) at best)

- by psihodelia

Is there any very fast method to find a binary logarithm of an integer number? For example, given a number x=52656145834278593348959013841835216159447547700274555627155488768 such algorithm must find y=log(x,2) which is 215. x is always a power of 2. The problem seems to be really simple. All what is required is to find the position of the most significant 1 bit. There is a well-known method FloorLog, but it is not very fast especially for the very long multi-words integers. What is the fastest method?

Read the article
How do people prove the correctness of Computer Vision methods?

- by solvingPuzzles

I'd like to pose a few abstract questions about computer vision research. I haven't quite been able to answer these questions by searching the web and reading papers. How does someone know whether a computer vision algorithm is correct? How do we define "correct" in the context of computer vision? Do formal proofs play a role in understanding the correctness of computer vision algorithms? A bit of background: I'm about to start my PhD in Computer Science. I enjoy designing fast parallel algorithms and proving the correctness of these algorithms. I've also used OpenCV from some class projects, though I don't have much formal training in computer vision. I've been approached by a potential thesis advisor who works on designing faster and more scalable algorithms for computer vision (e.g. fast image segmentation). I'm trying to understand the common practices in solving computer vision problems.

Read the article
Generate 10-digit number using a phone keypad

- by srikfreak

Given a phone keypad as shown below: 1 2 3 4 5 6 7 8 9 0 How many different 10-digit numbers can be formed starting from 1? The constraint is that the movement from 1 digit to the next is similar to the movement of the Knight in a chess game. For eg. if we are at 1 then the next digit can be either 6 or 8 if we are at 6 then the next digit can be 1, 7 or 0. Repetition of digits are allowed - 1616161616 is a valid number. Is there a polynomial time algorithm which solves this problem? The problem requires us to just give the count of 10-digit numbers and not necessarily list the numbers.

Read the article
Popularity Algorithm - SQL / Django

- by RadiantHex

Hi folks, I've been looking into popularity algorithms used on sites such as Reddit, Digg and even Stackoverflow. Reddit algorithm: t = (time of entry post) - (Dec 8, 2005) x = upvotes - downvotes y = {1 if x > 0, 0 if x = 0, -1 if x < 0) z = {1 if x < 0, otherwise x} log(z) + (y * t)/45000 I have always performed simple ordering within SQL, I'm wondering how I should deal with such ordering. Should it be used to define a table, or could I build an SQL with the ordering within the formula (without hindering performance)? I am also wondering, if it is possible to use multiple ordering algorithms in different occasions, without incurring into performance problems. I'm using Django and PostgreSQL. Help would be much appreciated! ^^

Read the article
How to judge the relative efficiency of algorithms given runtimes as functions of 'n'?

- by Lopa

Consider two algorithms A and B which solve the same problem, and have time complexities (in terms of the number of elementary operations they perform) given respectively by a(n) = 9n+6 b(n) = 2(n^2)+1 (i) Which algorithm is the best asymptotically? (ii) Which is the best for small input sizes n, and for what values of n is this the case? (You may assume where necessary that n0.) i think its 9n+6. guys could you please help me with whether its right or wrong?? and whats the answer for part b. what exactly do they want?

Read the article
New to AVL tree implementation.

- by nn

I am writing a sliding window compression algorithm (LZ77) that searches for phrases in a "moving" dictionary. So far I have written a BST where each node is stored in an array and it's index in the array is also the value of the starting position in the window itself. I am now looking at transforming the BST to an AVL tree. I am a little confused at the sample implementations I have seen. Some only appear to store the balance factors whereas others store the height of each tree. Are there any performance advantage/disadvantages of storing the height and/or balance factor for each node? Apologies if this is a very simple question, but I'm still not visualizing how I want to restructure my BST to implement height balancing. Thanks.

Read the article
What's the best way to normalize scores for ranking things?

- by beagleguy

hi all, I'm curious how to do normalizing of numbers for a ranking algorithm let's say I want to rank a link based on importance and I have two columns to work with so a table would look like url | comments | views now I want to rank comments higher than views so I would first think to do comments*3 or something to weight it, however if there is a large view number like 40,000 and only 4 comments then the comments weight gets dropped out. So I'm thinking I have to normalize those scores down to a more equal playing field before I can weight them. Any ideas or pointers to how that's usually done? thanks

Read the article
How to calculate order (big O) for more complex algorithms (ie quicksort)

- by bangoker

I know there are quite a bunch of questions about big O notation, I have already checked Plain english explanation of Big O , Big O, how do you calculate/approximate it?, and Big O Notation Homework--Code Fragment Algorithm Analysis?, to name a few. I know by "intuition" how to calculate it for n, n^2, n! and so, however I am completely lost on how to calculate it for algorithms that are log n , n log n, n log log n and so. What I mean is, I know that Quick Sort is n log n (on average).. but, why? Same thing for merge/comb, etc. Could anybody explain me in a not to math-y way how do you calculate this? The main reason is that Im about to have a big interview and I'm pretty sure they'll ask for this kind of stuff. I have researched for a few days now, and everybody seem to have either an explanation of why bubble sort is n^2 or the (for me) unreadable explanation a la wikipedia Thanks!

Read the article

< Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84 | Next Page >