Search Results

Search found 1933 results on 78 pages for 'genetic algorithms'.

Page 69/78 | < Previous Page | 65 66 67 68 69 70 71 72 73 74 75 76  | Next Page >

  • Classification: Dealing with Abstain/Rejected Class

    - by abner.ayala
    I am asking for your input and/help on a classification problem. If anyone have any references that I can read to help me solve my problem even better. I have a classification problem of four discrete and very well separated classes. However my input is continuous and has a high frequency (50Hz), since its a real-time problem. The circles represent the clusters of the classes, the blue line the decision boundary and Class 5 equals the (neutral/resting do nothing class). This class is the rejected class. However the problem is that when I move from one class to the other I activate a lot of false positives in the transition movements, since the movement is clearly non-linear. For example, every time I move from class 5 (neutral class) to 1 I first see a lot of 3's before getting to the 1 class. Ideally, I will want my decision boundary to look like the one in the picture below where the rejected class is Class =5. Has a higher decision boundary than the others classes to avoid misclassification during transition. I am currently implementing my algorithm in Matlab using naive bayes, kNN, and SVMs optimized algorithms using Matlab. Question: What is the best/common way to handle abstain/rejected classes classes? Should I use (fuzzy logic, loss function, should I include resting cluster in the training)?

    Read the article

  • Have I taken a wrong path in programming by being excessively worried about code elegance and style?

    - by Ygam
    I am in a major stump right now. I am a BSIT graduate, but I only started actual programming less than a year ago. I observed that I have the following attitude in programming: I tend to be more of a purist, scorning unelegant approaches to solving problems using code I tend to look at anything in a large scale, planning everything before I start coding, either in simple flowcharts or complex UML charts I have a really strong impulse on refactoring my code, even if I miss deadlines or prolong development times I am obsessed with good directory structures, file naming conventions, class, method, and variable naming conventions I tend to always want to study something new, even, as I said, at the cost of missing deadlines I tend to see software development as something to engineer, to architect; that is, seeing how things relate to each other and how blocks of code can interact (I am a huge fan of loose coupling) i.e the OOP thinking I tend to combine OOP and procedural coding whenever I see fit I want my code to execute fast (thus the elegant approaches and refactoring) This bothers me because I see my colleagues doing much better the other way around (aside from the fact that they started programming since our first year in college). By the other way around I mean, they fire up coding, gets the job done much faster because they don't have to really look at how clean their codes are or how elegant their algorithms are, they don't bother with OOP however big their projects are, they mostly use web APIs, piece them together and voila! Working code! CLients are happy, they get paid fast, at the expense of a really unmaintainable or hard-to-read code that lacks structure and conventions, or slow executions of certain actions (which the common reasoning against would be that internet connections are much faster these days, hardware is more powerful). The excuse I often receive is clients don't care about how you write the code, but they do care about how long you deliver it. If it works then all is good. Now, did my "purist" approach to programming may have been the wrong way to start programming? Should I just dump these purist concepts and just code the hell up because I have seen it: clients don't really care how beautifully coded it is?

    Read the article

  • Using game of life or other virtual environment for artificial (intelligence) life simulation? [clos

    - by Berlin Brown
    One of my interests in AI focuses not so much on data but more on biologic computing. This includes neural networks, mapping the brain, cellular-automata, virtual life and environments. Described below is an exciting project that includes develop a virtual environment for bots to evolve in. "Polyworld is a cross-platform (Linux, Mac OS X) program written by Larry Yaeger to evolve Artificial Intelligence through natural selection and evolutionary algorithms." http://en.wikipedia.org/wiki/Polyworld " Polyworld is a promising project for studying virtual life but it still is far from creating an "intelligent autonomous" agent. Here is my question, in theory, what parameters would you use create an AI environment? Possibly a brain environment? Possibly multiple self contained life organisms that have their own "brain" or life structures. I would like a create a spin on the game of life simulation. What if you have a 64x64 game of life grid. But instead of one grid, you might have N number of grids. The N number of grids are your "life force" If all of the game of life entities die in a particular grid then that entire grid dies. A group of "grids" makes up a life form. I don't have an immediate goal. First, I want to simulate an environment and visualize what is going on in the environment with OpenGL and see if there are any interesting properties to the environment. I then want to add "scarce resources" and see if the AI environment can manage resources adequately.

    Read the article

  • Log 2 N generic comparison tree

    - by Morano88
    Hey! I'm working on an algorithm for Redundant Binary Representation (RBR) where every two bits represent a digit. I designed the comparator that takes 4 bits and gives out 2 bits. I want to make the comparison in log 2 n so If I have X and Y .. I compare every 2 bits of X with every 2 bits of Y. This is smooth if the number of bits of X or Y equals n where (n = 2^X) i.e n = 2,4,8,16,32,... etc. Like this : However, If my input let us say is 6 or 10 .. then it becomes not smooth and I have to take into account some odd situations like this : I have a shallow experience in algorithms .. is there a generic way to do it .. so at the end I get only 2 bits no matter what the input is ? I just need hints or pseudo-code. If my question is not appropriate here .. so feel free to flag it or tell me to remove it. I'm using VHDL by the way!

    Read the article

  • Automatic tracking algorithm

    - by nico
    Hi everyone, I'm trying to write a simple tracking routine to track some points on a movie. Essentially I have a series of 100-frames-long movies, showing some bright spots on dark background. I have ~100-150 spots per frame, and they move over the course of the movie. I would like to track them, so I'm looking for some efficient (but possibly not overkilling to implement) routine to do that. A few more infos: the spots are a few (es. 5x5) pixels in size the movement are not big. A spot generally does not move more than 5-10 pixels from its original position. The movements are generally smooth. the "shape" of these spots is generally fixed, they don't grow or shrink BUT they become less bright as the movie progresses. the spots don't move in a particular direction. They can move right and then left and then right again the user will select a region around each spot and then this region will be tracked, so I do not need to automatically find the points. As the videos are b/w, I though I should rely on brigthness. For instance I thought I could move around the region and calculate the correlation of the region's area in the previous frame with that in the various positions in the next frame. I understand that this is a quite naïve solution, but do you think it may work? Does anyone know specific algorithms that do this? It doesn't need to be superfast, as long as it is accurate I'm happy. Thank you nico

    Read the article

  • Practical rules for premature optimization

    - by DougW
    It seems that the phrase "Premature Optimization" is the buzz-word of the day. For some reason, iphone programmers in particular seem to think of avoiding premature optimization as a pro-active goal, rather than the natural result of simply avoiding distraction. The problem is, the term is beginning to be applied more and more to cases that are completely inappropriate. For example, I've seen a growing number of people say not to worry about the complexity of an algorithm, because that's premature optimization (eg http://stackoverflow.com/questions/2190275/help-sorting-an-nsarray-across-two-properties-with-nssortdescriptor/2191720#2191720). Frankly, I think this is just laziness, and appalling to disciplined computer science. But it has occurred to me that maybe considering the complexity and performance of algorithms is going the way of assembly loop unrolling, and other optimization techniques that are now considered unnecessary. What do you think? Are we at the point now where deciding between an O(n^n) and O(n!) complexity algorithm is irrelevant? What about O(n) vs O(n*n)? What do you consider "premature optimization"? What practical rules do you use to consciously or unconsciously avoid it? This is a bit vague, but I'm curious to hear other peoples' opinions on the topic.

    Read the article

  • Simplifying Testing through design considerations while utilizing dependency injection

    - by Adam Driscoll
    We are a few months into a green-field project to rework the Logic and Business layers of our product. By utilizing MEF (dependency injection) we have achieved high levels of code coverage and I believe that we have a pretty solid product. As we have been working through some of the more complex logic I have found it increasingly difficult to unit test. We are utilizing the CompositionContainer to query for types required by these complex algorithms. My unit tests are sometimes difficult to follow due to the lengthy mock object setup process that must take place, just right, to allow for certain circumstances to be verified. My unit tests often take me longer to write than the code that I'm trying to test. I realize this is not only an issue with dependency injection but with design as a whole. Is poor method design or lack of composition to blame for my overly complex tests? I've tried base classing tests, creating commonly used mock objects and ensuring that I utilize the container as much as possible to ease this issue but my tests always end up quite complex and hard to debug. What are some tips that you've seen to keep such tests concise, readable, and effective?

    Read the article

  • How to set WS-SecurityPolicy in an inbound CXF service in Mule?

    - by Brakara
    When configuring the service for handling UsernameToken and signatures, it's setup like this: <service name="serviceName"> <inbound> <cxf:inbound-endpoint address="someUrl" protocolConnector="httpsConnector" > <cxf:inInterceptors> <spring:bean class="org.apache.cxf.binding.soap.saaj.SAAJInInterceptor" /> <spring:bean class="org.apache.cxf.ws.security.wss4j.WSS4JInInterceptor"> <spring:constructor-arg> <spring:map> <spring:entry key="action" value="UsernameToken Timestamp Signature" /> <spring:entry key="passwordCallbackRef" value-ref="serverCallback" /> <spring:entry key="signaturePropFile" value="wssecurity.properties" /> </spring:map> </spring:constructor-arg> </spring:bean> </cxf:inInterceptors> </cxf:inbound-endpoint> </inbound> </service> But how is it possible to create a policy of what algorithms that are allowed, and what parts of the message that should be signed?

    Read the article

  • Why Stream/lazy val implementation using is faster than ListBuffer one

    - by anrizal
    I coded the following implementation of lazy sieve algorithms using Stream and lazy val below : def primes(): Stream[Int] = { lazy val ps = 2 #:: sieve(3) def sieve(p: Int): Stream[Int] = { p #:: sieve( Stream.from(p + 2, 2). find(i=> ps.takeWhile(j => j * j <= i). forall(i % _ > 0)).get) } ps } and the following implementation using (mutable) ListBuffer: import scala.collection.mutable.ListBuffer def primes(): Stream[Int] = { def sieve(p: Int, ps: ListBuffer[Int]): Stream[Int] = { p #:: { val nextprime = Stream.from(p + 2, 2). find(i=> ps.takeWhile(j => j * j <= i). forall(i % _ > 0)).get sieve(nextprime, ps += nextprime) } } sieve(3, ListBuffer(3))} When I did primes().takeWhile(_ < 1000000).size , the first implementation is 3 times faster than the second one. What's the explanation for this ? I edited the second version: it should have been sieve(3, ListBuffer(3)) instead of sieve(3, ListBuffer()) .

    Read the article

  • Divide and conquer of large objects for GC performance

    - by Aperion
    At my work we're discussing different approaches to cleaning up a large amount of managed ~50-100MB memory.There are two approaches on the table (read: two senior devs can't agree) and not having the experience the rest of the team is unsure of what approach is more desirable, performance or maintainability. The data being collected is many small items, ~30000 which in turn contains other items, all objects are managed. There is a lot of references between these objects including event handlers but not to outside objects. We'll call this large group of objects and references as a single entity called a blob. Approach #1: Make sure all references to objects in the blob are severed and let the GC handle the blob and all the connections. Approach #2: Implement IDisposable on these objects then call dispose on these objects and set references to Nothing and remove handlers. The theory behind the second approach is since the large longer lived objects take longer to cleanup in the GC. So, by cutting the large objects into smaller bite size morsels the garbage collector will processes them faster, thus a performance gain. So I think the basic question is this: Does breaking apart large groups of interconnected objects optimize data for garbage collection or is better to keep them together and rely on the garbage collection algorithms to processes the data for you? I feel this is a case of pre-optimization, but I do not know enough of the GC to know what does help or hinder it.

    Read the article

  • Generating easy-to-remember random identifiers

    - by Carl Seleborg
    Hi all, As all developers do, we constantly deal with some kind of identifiers as part of our daily work. Most of the time, it's about bugs or support tickets. Our software, upon detecting a bug, creates a package that has a name formatted from a timestamp and a version number, which is a cheap way of creating reasonably unique identifiers to avoid mixing packages up. Example: "Bug Report 20101214 174856 6.4b2". My brain just isn't that good at remembering numbers. What I would love to have is a simple way of generating alpha-numeric identifiers that are easy to remember. Examples would be "azil3", "ulmops", "fel2way", etc. I just made these up, but they are much easier to recognize when you see many of them at once. I know of algorithms that perform trigram analysis on text (say you feed them a whole book in German) and that can generate strings that look and feel like German words. This requires lots of data, though, and makes it slightly less suitable for embedding in an application just for this purpose. Do you know of anything else? Thanks! Carl

    Read the article

  • The Immerman-Szelepcsenyi Theorem

    - by Daniel Lorch
    In the Immerman-Szelepcsenyi Theorem, two algorithms are specified that use non-determinisim. There is a rather lengthy algorithm using "inductive counting", which determines the number of reachable configurations for a given non-deterministic turing machine. The algorithm looks like this: Let m_{i+1}=0 For all configurations C Let b=0, r=0 For all configurations D Guess a path from I to D in at most i steps If found Let r=r+1 If D=C or D goes to C in 1 step Let b=1 If r<m_i halt and reject Let m_{i+1}=m_{i+1}+b I is the starting configuration. m_i is the number of configurations reachable from the starting configuration in i steps. This algorithm only calculates the "next step", i.e. m_i+1 from m_i. This seems pretty reasonable, but since we have nondeterminisim, why don't we just write: Let m_i = 0 For all configurations C Guess a path from I to C in at most i steps If found m_i = m_i + 1 What is wrong with this algorithm? I am using nondeterminism to guess a path from I to C, and I verify reachability I am iterating through the list of ALL configurations, so I am sure to not miss any configuration I respect space bounds I can generate a certificate (the list of reachable configs) I believe I have a misunderstanding of the "power" of non-determinisim, but I can't figure out where to look next. I am stuck on this for quite a while and I would really appreciate any help.

    Read the article

  • Need some ignition in Embedded Systems

    - by Rahul
    I'm very much interested in building applications for Embedded Devices. I'm in my 3rd year Electrical Engineering and I'm passionate about coding, algorithms, Linux OS, etc. And also by Googling I found out that Linux OS is one of the best OSes for Embedded devices(may be/may not be). I want to work for companies which work on mobile applications. I'm a newbie/naive to this domain & my skills include C/C++ & MySQL. I need help to get started in the domain of Embedded Systems; like how/where to start off, Hardware prerequisites, necessary programming skills, also what kind of Embedded Applications etc. I've heard of ARM, firmware, PIC Micorcontrollers; but I don't know anything & just need proper introduction about them. Thanx. P.S: I'm currently reading Bjarne Struotsup's lecture in C++ at Texas A&M University, and one chapter in it describes about Embedded Systems Programming.

    Read the article

  • finding ALL cycles in a huge sparse matrix

    - by Andy
    Hi there, First of all I'm quite a Java beginner, so I'm not sure if this is even possible! Basically I have a huge (3+million) data source of relational data (i.e. A is friends with B+C+D, B is friends with D+G+Z (but not A - i.e. unmutual) etc.) and I want to find every cycle within this (not necessarily connected) directed graph. I've found this thread (http://stackoverflow.com/questions/546655/finding-all-cycles-in-graph/549402#549402) which has pointed me to Donald Johnson's (elementary) cycle-finding algorithm which, superficially at least, looks like it'll do what I'm after (I'm going to try when I'm back at work on Tuesday - thought it wouldn't hurt to ask in the meanwhile!). I had a quick scan through the code of the Java implementation of Johnson's algorithm (in that thread) and it looks like a matrix of relations is the first step, so I guess my questions are: a) Is Java capable of handling a 3+million*3+million matrix? (was planning on representing A-friends-with-B by a binary sparse matrix) b) Do I need to find every connected subgraph as my first problem, or will cycle-finding algorithms handle disjoint data? c) Is this actually an appropriate solution for the problem? My understanding of "elementary" cycles is that in the graph below, rather than picking out A-B-C-D-E-F it'll pick out A-B-F, B-C-D etc. but that's not the end of the world given the task. E / \ D---F / \ / \ C---B---A d) If necessary, I can simplify the problem by enforcing mutuality in relations - i.e. A-friends-with-B <== B-friends-with-A, and if really necessary I can maybe cut down the data size, but realistically it is always going to be around the 1mil mark. z) Is this a P or NP task?! Am I biting off more than I can chew? Thanks all, any help appreciated! Andy

    Read the article

  • Select all points in a matrix within 30m of another point

    - by pinnacler
    So if you look at my other posts, it's no surprise I'm building a robot that can collect data in a forest, and stick it on a map. We have algorithms that can detect tree centers and trunk diameters and can stick them on a cartesian XY plane. We're planning to use certain 'key' trees as natural landmarks for localizing the robot, using triangulation and trilateration among other methods, but programming this and keeping data straight and efficient is getting difficult using just Matlab. Is there a technique for sub-setting an array or matrix of points? Say I have 1000 trees stored over 1km (1000m), is there a way to say, select only points within 30m radius of my current location and work only with those? I would just use a GIS, but I'm doing this in Matlab and I'm unaware of any GIS plugins for Matlab. I forgot to mention, this code is going online, meaning it's going on a robot for real-time execution. I don't know if, as the map grows to several miles, using a different data structure will help or if calculating every distance to a random point is what a spatial database is going to do anyway. I'm thinking of mirroring two arrays, one sorted by X and the other by Y. Then bubble sorting to determine the 30m range in that. I do the same for both arrays, X and Y, and then have a third cross link table that will select the individual values. But I don't know, what that's called, how to program that and I'm sure someone already has so I don't want to reinvent the wheel. Cartesian Plane GIS

    Read the article

  • Webcam video stream processing.

    - by vikramtheone
    Hi Guys, I'm working with an image processing project, my final goal is to detect features on a real time video and finally track those features. I will be working with an Embedded Processor Platform called Freescale's i.MX515, it is a 32-bit media processor running on Ubuntu 9.04. Right now I'm working on the algorithms to locate the features, so, I'm using still images. When I'm satisfied with the results I will have to start using a video stream and I don't want to make use of a video file as a source stream, because then I will have to worry about video decoders then. Instead I would like to plug in a USB Wecam to the embedded platform (It has USB ports on it), directly take the frames as they are captured and send it to my application. I will take care to buy a webcam which will be supported in Linux (Device driver). But my question is will I be able to capture the incoming video stream from the webcam and send it to my application? Will I be able to configure the webcam and DMA to write the incoming frames in a particular memory location whose pointer I can simply pass to my application? (Confused!!!) I hope I could convey my doubts, can anyone guide me with what steps that I have to take to achieve all of these easily? Do you foresee any impossibility here? Help!!! Regards Vikram

    Read the article

  • Web application architecture, and application servers?

    - by seanieb
    Hi, I'm building a web application, and I need to use an architecture that allows me to run it on two servers. The application scrapes information from other sites periodically, and on input from the end user. To do this I'm using Php+curl to scrape the information, Php or python to parse it and store the results in a MySQLDB. Then I will use Python to run some algorithms on the data, this will happen both periodically and on input from the end user. I'm going to cache some of the results in the MySQL DB and sometimes if it is specific to the user, skip storing the data and serve it to the user. I'm think of using Php for the website front end on a separate web server, running the Php spider, MySQL DB and python on another server. As you can see I'm fairly clueless. I'm familiar with using Php, MySQL and the basics of Python, but bringing this all together using something more complex than a cron job is new to me. How do go about implementing this? What frame work(s) should I use? Is MVC a good architecture for this? (I'm new to MVC, architectures etc.) Is Cakephp a good solution? If so will I be able to control and monitor the Python code using it?

    Read the article

  • Have any software engineers gotten math degrees later in their careers?

    - by vin
    I have a bachelors in computer science and worked the last 12 years as a software engineer. I'm bored with doing general development work, so I want to specialize. I'm thinking about getting a master's degree in math so I can build math models and write algorithms to implement them. I'm unsure what type of work I'd do (financial, gaming, graphics, science, research, etc) but I'm open minded. I would need to refresh my undergrad math skills (which are old and faded), but I loved algebra and calculus. I've been working with couple statisticians so I've been finding myself more interested in statistics. Since I'm a parent supporting a household, I would have to continue working while studying. Have any software engineers taken this route? (Specifically, going from BS in comp sci to MS in math.) If so, what advice do you have for coursework, financing, and getting a job that combines programming with advanced math? How abundant are these kinds of jobs? I'm not sure where one starts. Also, how do you hop from a BS to an MS in a different subject?

    Read the article

  • authentication question (security code generation logic)

    - by Stick it to THE MAN
    I have a security number generator device, small enough to go on a key-ring, which has a six digit LCD display and a button. After I have entered my account name and password on an online form, I press the button on the security device and enter the security code number which is displayed. I get a different number every time I press the button and the number generator has a serial number on the back which I had to input during the account set-up procedure. I would like to incorporate similar functionality in my website. As far as I understand, these are the main components: Generate a unique N digit aplha-numeric sequence during registration and assign to user (permanently) Allow user to generate an N (or M?) digit aplha-numeric sequence remotely For now, I dont care about the hardware side, I am only interested in knowing how I may choose a suitable algorithm that will allow the user to generate an N (or M?) long aplha-numeric sequence - presumably, using his unique ID as a seed Identify the user from the number generated in step 2 (which decryption method is the most robust to do this?) I have the following questions: Have I identified all the steps required in such an authentication system?, if not please point out what I have missed and why it is important What are the most robust encryption/decryption algorithms I can use for steps 1 through 3 (preferably using 64bits)?

    Read the article

  • Automated Legal Processing

    - by Chris S
    Will it ever be possible to make legal systems quantifiable enough to process with computer algorithms? What technologies would have to be in place before this is possible? Are there any existing technologies that are already trying to accomplish this? Out of curiosity, I downloaded the text for laws in my local municipality, and tried applying some simple NLP tricks to extract rules from sentences. I had mixed results. Some sentences were very explicit (e.g. "Cars may not be left in the park overnight"), but other sentences seemed hopelessly vague (e.g. "The council's purpose is to ensure the well-being of the community"). I apologize if this is too open-ended a topic, but I've often wondered what society would look like if legal systems were based on less ambiguous language. Lawyers, and the legal process in general, are so expensive because they have to manually process a complex set of rules codified in ambiguous legal texts. If this system could be represented in software, this huge expense could potentially be eliminated, making the legal system more accessible for everyone.

    Read the article

  • PHP's openssl_sign generates different signature than SSCrypto's sign

    - by pascalj
    I'm writing an OS X client for a software that is written in PHP. This software uses a simple RPC interface to receive and execute commands. The RPC client has to sign the commands he sends to ensure that no MITM can modify any of them. However, as the server was not accepting the signatures I sent from my OS X client, I started investigating and found out that PHP's openssl_sign function generates a different signature for a given private key/data combination than the Objective-C SSCrypto framework (which is only a wrapper for the openssl lib): SSCrypto *crypto = [[SSCrypto alloc] initWithPrivateKey:self.localPrivKey]; NSData *shaed = [self sha1:@"hello"]; [crypto setClearTextWithData:shaed]; NSData *data = [crypto sign]; generates a signature like CtbkSxvqNZ+mAN... The PHP code openssl_sign("hello", $signature, $privateKey); generates a signature like 6u0d2qjFiMbZ+... (For my certain key, of course. base64 encoded) I'm not quite shure why this is happening and I unsuccessfully experimented with different hash-algorithms. As the PHP documentation states SHA1 is used by default. So why do these two functions generate different signatures and how can I get my Objective-C part to generate a signature that PHPs openssl_verify will accept? Note: I double checked that the keys and the data is correct!

    Read the article

  • Why i am getting NullPointerException for this btree method??

    - by user306540
    hi, i am writing code for btree algorithms. i am getting NullPointerException . why???? please somebody help me...! public void insertNonFull(BPlusNode root,BPlusNode parent,String key) { int i=0; BPlusNode child=new BPlusNode(); BPlusNode node=parent; while(true) { i=node.numKeys-1; if(node.leaf) { while(i>=0 && key.compareTo(node.keys[i])<0) { node.keys[i+1]=node.keys[i]; i--; } node.keys[i+1]=key; node.numKeys=node.numKeys+1; } else { while(i>=0 && key.compareTo(node.keys[i])<0) { i--; } } i++; child=node.pointers[i]; if(child!=null && child.numKeys==7) { splitChild(root,node,i,child); if(key.compareTo(node.keys[i])>0) { i++; } } node=node.pointers[i]; } }

    Read the article

  • Ideas Related to Subset Sum with 2,3 and more integers

    - by rolandbishop
    I've been struggling with this problem just like everyone else and I'm quite sure there has been more than enough posts to explain this problem. However in terms of understanding it fully, I wanted to share my thoughts and get more efficient solutions from all the great people in here related to Subset Sum problem. I've searched it over the Internet and there is actually a lot sources but I'm really willing to re-implement an algorithm or finding my own in order to understand fully. The key thing I'm struggling with is the efficiency considering the set size will be large. (I do not have a limit, just conceptually large). The two phases I'm trying to implement ideas on is finding two numbers that are equal to given integer T, finding three numbers and eventually K numbers. Some ideas I've though; For the two integer part I'm thing basically sorting the array O(nlogn) and for each element in the array searching for its negative value. (i.e if the array element is 3 searching for -3). Maybe a hash table inclusion could be better, providing a O(1) indexing the element? For the three or more integers I've found an amazing blog post;http://www.skorks.com/2011/02/algorithms-a-dropbox-challenge-and-dynamic-programming/. However even the author itself states that it is not applicable for large numbers. So I was for 2 and 3 and more integers what ideas could be applied for the subset problem. I'm struggling with setting up a dynamic programming method that will be efficient for the large inputs as well.

    Read the article

  • Machine leaning algorithm for data classification.

    - by twk
    Hi all, I'm looking for some guidance about which techniques/algorithms I should research to solve the following problem. I've currently got an algorithm that clusters similar-sounding mp3s using acoustic fingerprinting. In each cluster, I have all the different metadata (song/artist/album) for each file. For that cluster, I'd like to pick the "best" song/artist/album metadata that matches an existing row in my database, or if there is no best match, decide to insert a new row. For a cluster, there is generally some correct metadata, but individual files have many types of problems: Artist/songs are completely misnamed, or just slightly mispelled the artist/song/album is missing, but the rest of the information is there the song is actually a live recording, but only some of the files in the cluster are labeled as such. there may be very little metadata, in some cases just the file name, which might be artist - song.mp3, or artist - album - song.mp3, or another variation A simple voting algorithm works fairly well, but I'd like to have something I can train on a large set of data that might pick up more nuances than what I've got right now. Any links to papers or similar projects would be greatly appreciated. Thanks!

    Read the article

  • GNU Make - Dependencies on non program code

    - by Tim Post
    A requirement for a program I am writing is that it must be able to trust a configuration file. To accomplish this, I am using several kinds of hashing algorithms to generate a hash of the file at compile time, this produces a header with the hashes as constants. Dependencies for this are pretty straight forward, my program depends on config_hash.h, which has a target that produces it. The makefile looks something like this : config_hash.h: $(SH) genhash config/config_file.cfg > $(srcdir)/config_hash.h $(PROGRAM): config_hash.h $(PROGRAM_DEPS) $(CC) ... ... ... I'm using the -M option to gcc, which is great for dealing with dependencies. If my header changes, my program is rebuilt. My problem is, I need to be able to tell if the config file has changed, so that config_hash.h is re-generated. I'm not quite sure how explain that kind of dependency to GNU make. I've tried listing config/config_file.cfg as a dependency for config_hash.h, and providing a .PHONY target for config_file.cfg without success. Obviously, I can't rely on the -M switch to gcc to help me here, since the config file is not a part of any object code. Any suggestions? Unfortunately, I can't post much of the Makefile, or I would have just posted the whole thing.

    Read the article

< Previous Page | 65 66 67 68 69 70 71 72 73 74 75 76  | Next Page >