Search Results

Search found 7490 results on 300 pages for 'algorithm analysis'.

Page 77/300 | < Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >

  • sample java code for approximate string matching or boyer-moore extended for approximate string matc

    - by Dolphin
    Hi I need to find 1.mismatch(incorrectly played notes), 2.insertion(additional played), & 3.deletion (missed notes), in a music piece (e.g. note pitches [string values] stored in a table) against a reference music piece. This is either possible through exact string matching algorithms or dynamic programming/ approximate string matching algos. However I realised that approximate string matching is more appropriate for my problem due to identifying mismatch, insertion, deletion of notes. Or an extended version of Boyer-moore to support approx. string matching. Is there any link for sample java code I can try out approximate string matching? I find complex explanations and equations - but I hope I could do well with some sample code and simple explanations. Or can I find any sample java code on boyer-moore extended for approx. string matching? I understand the boyer-moore concept, but having troubles with adjusting it to support approx. string matching (i.e. to support mismatch, insertion, deletion). Also what is the most efficient approx. string matching algorithm (like boyer-moore in exact string matching algo)? Greatly appreciate any insight/ suggestions. Many thanks in advance

    Read the article

  • Average of two strings in alphabetical order

    - by Bemmu
    Suppose you take the strings 'a' and 'z' and list all the strings that come between them in alphabetical order: ['a','b','c' ... 'x','y','z']. Take the midpoint of this list and you find 'm'. So this is kind of like taking an average of those two strings. You could extend it to strings with more than one character, for example the midpoint between 'aa' and 'zz' would be found in the middle of the list ['aa', 'ab', 'ac' ... 'zx', 'zy', 'zz']. Might there be a Python method somewhere that does this? If not, even knowing the name of the algorithm would help. I began making my own routine that simply goes through both strings and finds midpoint of the first differing letter, which seemed to work great in that 'aa' and 'az' midpoint was 'am', but then it fails on 'cat', 'doggie' midpoint which it thinks is 'c'. Rather than invent a method I thought it better to ask. I tried Googling for "binary search string midpoint" etc. but without knowing the name of what I am trying to do here I had little luck.

    Read the article

  • How to code a URL shortener?

    - by marco92w
    I want to create a URL shortener service where you can write a long URL into an input field and the service shortens the URL to "http://www.example.org/abcdef". Instead of "abcdef" there can be any other string with six characters containing a-z, A-Z and 0-9. That makes 56 trillion possible strings. My approach: I have a database table with three columns: id, integer, auto-increment long, string, the long URL the user entered short, string, the shortened URL (or just the six characters) I would then insert the long URL into the table. Then I would select the auto-increment value for "id" and build a hash of it. This hash should then be inserted as "short". But what sort of hash should I build? Hash algorithms like MD5 create too long strings. I don't use these algorithms, I think. A self-built algorithm will work, too. My idea: For "http://www.google.de/" I get the auto-increment id 239472. Then I do the following steps: short = ''; if divisible by 2, add "a"+the result to short if divisible by 3, add "b"+the result to short ... until I have divisors for a-z and A-Z. That could be repeated until the number isn't divisible any more. Do you think this is a good approach? Do you have a better idea?

    Read the article

  • Autodetect Presence of CSV Headers in a File

    - by banzaimonkey
    Short question: How do I automatically detect whether a CSV file has headers in the first row? Details: I've written a small CSV parsing engine that places the data into an object that I can access as (approximately) an in-memory database. The original code was written to parse third-party CSV with a predictable format, but I'd like to be able to use this code more generally. I'm trying to figure out a reliable way to automatically detect the presence of CSV headers, so the script can decide whether to use the first row of the CSV file as keys / column names or start parsing data immediately. Since all I need is a boolean test, I could easily specify an argument after inspecting the CSV file myself, but I'd rather not have to (go go automation). I imagine I'd have to parse the first 3 to ? rows of the CSV file and look for a pattern of some sort to compare against the headers. I'm having nightmares of three particularly bad cases in which: The headers include numeric data for some reason The first few rows (or large portions of the CSV) are null There headers and data look too similar to tell them apart If I can get a "best guess" and have the parser fail with an error or spit out a warning if it can't decide, that's OK. If this is something that's going to be tremendously expensive in terms of time or computation (and take more time than it's supposed to save me) I'll happily scrap the idea and go back to working on "important things". I'm working with PHP, but this strikes me as more of an algorithmic / computational question than something that's implementation-specific. If there's a simple algorithm I can use, great. If you can point me to some relevant theory / discussion, that'd be great, too. If there's a giant library that does natural language processing or 300 different kinds of parsing, I'm not interested.

    Read the article

  • Please Help Me with my Homework Problem in C++

    - by sil3nt
    Hey there, this is part of a question i got in class, im at the final stretch but this has become a major problem. In it im given a certain value which is called the "gold value" and it is 40.5, this value changes in input. and i have these constants const int RUBIES_PER_DIAMOND = 5; // relative values. * const int EMERALDS_PER_RUBY = 2; const int GOLDS_PER_EMERALDS = 5; const int SILVERS_PER_GOLD = 4; const int COPPERS_PER_SILVER = 5; const int DIAMOND_VALUE = 50; // gold values. * const int RUBY_VALUE = 10; const int EMERALD_VALUE = 5; const float SILVER_VALUE = 0.25; const float COPPER_VALUE = 0.05; which means that basically for every diamond there are 5 rubies, and for every ruby there are 2 emeralds. So on and so forth. and the "gold value" for every diamond for example is 50 (diamond value = 50) this is how much one diamond is worth in golds. my problem is converting 40.5 into these diamonds and ruby values. I know the answer is 4rubies and 2silvers but how do i write the algorithm for this so that it gives the best estimate for every goldvalue that comes along??

    Read the article

  • Algorithms for finding a numerical record in a list of ordered numbers

    - by Ankur
    I have a list of incomplete ordered numbers. I want to find a particular number with as few steps as possible. Are there any improvements on this algorithm, I assume you can count the set size without difficulty - it will be stored and updated every time a new item is added. Your object is to get your cursor over the value x The first number (smallest) is s, and the last number (greatest) is g. Take the midpoint m1 of the set: calculate is x < m1, If yes then s <= x < m1 If no then m1 < x <= g If m1 = x then you're done. Keep repeating till you find x. Basically dividing the set into two parts with each iteration till you hit x. The purpose is to retrieve a numerical id from a very large table to then find the associated other records. I would imagine this is the most trivial kind of indexing available, are there improvements?

    Read the article

  • Programmer Puzzle: Encoding a chess board state throughout a game

    - by Andrew Rollings
    Not strictly a question, more of a puzzle... Over the years, I've been involved in a few technical interviews of new employees. Other than asking the standard "do you know X technology" questions, I've also tried to get a feel for how they approach problems. Typically, I'd send them the question by email the day before the interview, and expect them to come up with a solution by the following day. Often the results would be quite interesting - wrong, but interesting - and the person would still get my recommendation if they could explain why they took a particular approach. So I thought I'd throw one of my questions out there for the Stack Overflow audience. Question: What is the most space-efficient way you can think of to encode the state of a chess game (or subset thereof)? That is, given a chess board with the pieces arranged legally, encode both this initial state and all subsequent legal moves taken by the players in the game. No code required for the answer, just a description of the algorithm you would use. EDIT: As one of the posters has pointed out, I didn't consider the time interval between moves. Feel free to account for that too as an optional extra :) EDIT2: Just for additional clarification... Remember, the encoder/decoder is rule-aware. The only things that really need to be stored are the player's choices - anything else can be assumed to be known by the encoder/decoder. EDIT3: It's going to be difficult to pick a winner here :) Lots of great answers!

    Read the article

  • Fuzzy Search on Material Descriptions including numerical sizes & general descriptions of material t

    - by Kyle
    We're looking to provide a fuzzy search on an electrical materials database (i.e. conduit, cable, etc.). The problem is that, because of a lack of consistency across all material types, we could not split sizes into separate fields from the text description because some materials are rated by things other than size. I've attempted a combination of a full text search & a SQL CLR implementation of the Levenshtein search algorithm (for assistance in ranking), but my results are a little funky (i.e. they are not sorting correctly due to improper ranking). For example, if the search term is "3/4" ABCD Conduit", I'll might get back several irrelevant results in the following order: 1/2" Conduit 1/4" X 3/4" Cable 1/4" Cable Ties 3/4" DFC Conduit Tees 3/4" ABCD Conduit 3/4" Conduit I believe I've nailed the problem down to the fact that these two search algorithms do not factor in the relevance of punctuation & numeric. That is, in such a search, I'd expect the size to take precedence over any fuzzy match on the rest of the description, but my results don't reflect that. My question is: Can anyone recommend better search algorithms or different approaches that may be better suited for searching a combination of alphanumerics & punctuation characters?

    Read the article

  • Finding subsets that can be completed to tuples without duplicates

    - by Jules
    We have a collection of sets A_1,..,A_n. The goal is to find new sets for each of the old sets. newA_i = {a_i in A_i such that there exist (a_1,..,a_n) in (A1,..,An) with no a_k = a_j for all k and j} So in words this says that we remove all the elements from A_i that can't be used to form a tuple (a_1,..a_n) from the sets (A_1,..,A_n) such that the tuple doesn't contain duplicates. My question is how to compute these new sets quickly. If you just implement this definition by generating all possible v's this will take exponential time. Do you know a better algorithm? Edit: here's an example. Take A_1 = {1,2,3,4} A_2 = {2}. Now the new sets look like this: newA_1 = {1,3,4} newA_2 = {2} The 2 has been removed from A_1 because if you choose it the tuple will always be (2,2) which is invalid because it contains duplicates. On the other hand 1,3,4 are valid because (1,2), (3,2) and (4,2) are valid tuples. Another example: A_1 = {1,2,3} A_2 = {1,4,5} A_3 = {2,4,5} A_4 = {1,2,3} A_5 = {1,2,3} Now the new sets are: newA_1 = {1,2,3} newA_2 = {4,5} newA_3 = {4,5} newA_4 = {1,2,3} newA_5 = {1,2,3} The 1 and 2 are removed from sets 2 and 3 because if you choose the 1 or 2 from these sets you'll only have 2 values left for sets 1, 4 and 5, so you will always have duplicates in tuples that look like (_,1,_,_,_) or like (_,_,2,_,_).

    Read the article

  • A function where small changes in input always result in large changes in output

    - by snowlord
    I would like an algorithm for a function that takes n integers and returns one integer. For small changes in the input, the resulting integer should vary greatly. Even though I've taken a number of courses in math, I have not used that knowledge very much and now I need some help... An important property of this function should be that if it is used with coordinate pairs as input and the result is plotted (as a grayscale value for example) on an image, any repeating patterns should only be visible if the image is very big. I have experimented with various algorithms for pseudo-random numbers with little success and finally it struck me that md5 almost meets my criteria, except that it is not for numbers (at least not from what I know). That resulted in something like this Python prototype (for n = 2, it could easily be changed to take a list of integers of course): import hashlib def uniqnum(x, y): return int(hashlib.md5(str(x) + ',' + str(y)).hexdigest()[-6:], 16) But obviously it feels wrong to go over strings when both input and output are integers. What would be a good replacement for this implementation (in pseudo-code, python, or whatever language)?

    Read the article

  • Count all lists of adjacent nodes stored in an array.

    - by Ben Brodie
    There are many naive approaches to this problem, but I'm looking for a good solution. Here is the problem (will be implemented in Java): You have a function foo(int a, int b) that returns true if 'a' is "adjacent" to 'b' and false if 'a' is not adjacent to 'b'. You have an array such as this [1,4,5,9,3,2,6,15,89,11,24], but in reality has a very long length, like 120, and will be repeated over and over (its a genetic algorithm fitness function) which is why efficiency is important. I want a function that returns the length of each possible 'list' of adjacencies in the array, but not including the 'lists' which simply subsets of a larger list. For example, if foo(1,4) - true, foo(4,5) - true, foo(5,9)- false, foo(9,3) & foo(3,2) & foo(2,6), foo(6,15) - true, then there are 'lists' (1,4,5) and (9,3,2,6), so length 3 and 4. I don't want it to return (3,2,6), though, because this is simply a subset of 9,3,2,6. Thanks.

    Read the article

  • How to find minimum cut-sets for several subgraphs of a graph of degrees 2 to 4

    - by Tore
    I have a problem, Im trying to make A* searches through a grid based game like pacman or sokoban, but i need to find "enclosures". What do i mean by enclosures? subgraphs with as few cut edges as possible given a maximum size and minimum size for number of vertices that act as soft constraints. Alternatively you could say i am looking to find bridges between subgraphs, but its generally the same problem. Given a game that looks like this, what i want to do is find enclosures so that i can properly find entrances to them and thus get a good heuristic for reaching vertices inside these enclosures. So what i want is to find these colored regions on any given map. The reason for me bothering to do this and not just staying content with the performance of a simple manhattan distance heuristic is that an enclosure heuristic can give more optimal results and i would not have to actually do the A* to get some proper distance calculations and also for later adding competitive blocking of opponents within these enclosures when playing sokoban type games. Also the enclosure heuristic can be used for a minimax approach to finding goal vertices more properly. Do you know of a good algorithm for solving this problem or have any suggestions in things i should explore?

    Read the article

  • Random Pairings that don't Repeat

    - by Andrew Robinson
    This little project / problem came out of left field for me. Hoping someone can help me here. I have some rough ideas but I am sure (or at least I hope) a simple, fairly efficient solution exists. Thanks in advance.... pseudo code is fine. I generally work in .NET / C# if that sheds any light on your solution. Given: A pool of n individuals that will be meeting on a regular basis. I need to form pairs that have not previously meet. The pool of individuals will slowly change over time. For the purposes of pairing, (A & B) and (B & A) constitute the same pair. The history of previous pairings is maintained. For the purpose of the problem, assume an even number of individuals. For each meeting (collection of pairs) and individual will only pair up once. Is there an algorithm that will allow us to form these pairs? Ideally something better than just ordering the pairs in a random order, generating pairings and then checking against the history of previous pairings. In general, randomness within the pairing is ok.

    Read the article

  • Dealing with Imprecise Drawing in CAD Drawing

    - by Graviton
    I have a CAD application, that allows user to draw lines and polygons and all that. One thorny problem that I face is user drawing can be highly imprecise, for example, a user might want to draw two rectangles that are connected to each other. Hence there should be one line shared by two rectangles. However, it's easy for user to, instead of draw a line, draw two lines that are very close to each other, so close to each other that when look from the screen, you would be mistaken that they are the same line, except that they aren't when you zoom in a little bit. My application would require user to properly draw the lines ( or my preprocessing must be able to do auto correction), or else my internal algorithm would not be able to process the inputs correctly. What is the best strategy to combat this kind of problem? I am thinking about rounding the point coordinates to a certain degree of precision, but although I can't exactly pinpoint the problem of this approach, but I feel that this is not the correct way of doing things, that this will introduce a new set of problem. Any idea?

    Read the article

  • Log 2 N generic comparison tree

    - by Morano88
    Hey! I'm working on an algorithm for Redundant Binary Representation (RBR) where every two bits represent a digit. I designed the comparator that takes 4 bits and gives out 2 bits. I want to make the comparison in log 2 n so If I have X and Y .. I compare every 2 bits of X with every 2 bits of Y. This is smooth if the number of bits of X or Y equals n where (n = 2^X) i.e n = 2,4,8,16,32,... etc. Like this : However, If my input let us say is 6 or 10 .. then it becomes not smooth and I have to take into account some odd situations like this : I have a shallow experience in algorithms .. is there a generic way to do it .. so at the end I get only 2 bits no matter what the input is ? I just need hints or pseudo-code. If my question is not appropriate here .. so feel free to flag it or tell me to remove it. I'm using VHDL by the way!

    Read the article

  • sort outer array based on values in inner array, javascript

    - by ptrn
    I have an array with arrays in it, where I want to sort the outer arrays based on values in a specific column in the inner. I bet that sounded more than a bit confusing, so I'll skip straight to an example. Initial data: var data = [ [ "row_1-col1", "2-row_1-col2", "c-row_1-coln" ], [ "row_2-col1", "1-row_2-col2", "b-row_2-coln" ], [ "row_m-col1", "3-row_m-col2", "a-row_m-coln" ] ]; Sort data, based on column with index 1 data.sortFuncOfSomeKind(1); where the object then would look like this; var data = [ [ "row_2-col1", "1-row_2-col2", "b-row_2-coln" ], [ "row_1-col1", "2-row_1-col2", "c-row_1-coln" ], [ "row_m-col1", "3-row_m-col2", "a-row_m-coln" ] ]; Sort data, based on column with index 2 data.sortFuncOfSomeKind(2); where the object then would look like this; var data = [ [ "row_m-col1", "3-row_m-col2", "a-row_m-coln" ], [ "row_2-col1", "1-row_2-col2", "b-row_2-coln" ], [ "row_1-col1", "2-row_1-col2", "c-row_1-coln" ] ]; The big Q Is there an existing solution to this that you know of, or would I have to write one myself? If so, which would be the easiest sort algorithm to use? QuickSort? _L

    Read the article

  • How should I generate the partitions / pairs for the Chinese Postman problem?

    - by Simucal
    I'm working on a program for class that involves solving the Chinese Postman problem. Our assignment only requires us to write a program to solve it for a hard-coded graph but I'm attempting to solve it for the general case on my own. The part that is giving me trouble is generating the partitions of pairings for the odd vertices. For example, if I had the following labeled odd verticies in a graph: 1 2 3 4 5 6 I need to find all the possible pairings / partitions I can make with these vertices. I've figured out I'll have i paritions given: n = num of odd verticies k = n / 2 i = ((2k)(2k-1)(2k-2)...(k+1))/2 So, given the 6 odd verticies above, we will know that we need to generate i = 15 partitions. The 15 partions would look like: 1 2 3 4 5 6 1 2 3 5 4 6 1 2 3 6 4 5 ... 1 6 ... Then, for each partition, I take each pair and find the shortest distance between them and sum them for that partition. The partition with the total smallest distance between its pairs is selected, and I then double all the edges between the shortest path between the odd vertices (found in the selected partition). These represent the edges the postman will have to walk twice. At first I thought I had worked out an appropriate algorithm for generating these partitions / pairs but it is flawed. I found it wasn't a simple permutation/combination problem. Does anyone who has studied this problem before have any tips that can help point me in the right direction for generating these partitions?

    Read the article

  • Fastest way to modify a decimal-keyed table in MySQL?

    - by javanix
    I am dealing with a MySQL table here that is keyed in a somewhat unfortunate way. Instead of using an auto increment table as a key, it uses a column of decimals to preserve order (presumably so its not too difficult to insert new rows while preserving a primary key and order). Before I go through and redo this table to something more sane, I need to figure out how to rekey it without breaking everything. What I would like to do is something that takes a list of doubles (the current keys) and outputs a list of integers (which can be cast down to doubles for rekeying). For example, input {1.00, 2.00, 2.50, 2.60, 3.00} would give output {1, 2, 3, 4, 5). Since this is a database, I also need to be able to update the rows nicely: UPDATE table SET `key`='3.00' WHERE `key`='2.50'; Can anyone think of a speedy algorithm to do this? My current thought is to read all of the doubles into a vector, take the size of the vector, and output a new vector with values from 1 => doubleVector.size. This seems pretty slow, since you wouldn't want to read every value into the vector if, for instance, only the last n/100 elements needed to be modified. I think there is probably something I can do in place, since only values after the first non-integer double need to be modified, but I can't for the life of me figure anything out that would let me update in place as well. For instance, setting 2.60 to 3.00 the first time you see 2.50 in the original key list would result in an error, since the key value 3.00 is already used for the table.

    Read the article

  • Ranking/ weighing search result

    - by biso
    I am trying to build an application that has a smart adaptive search engine (lets say for cars). If I search for for 4x4 then the DB will return all the 4x4 cars I have (100 cars) - but as time goes by and I start checking out cars, liking them, commenting on them, etc the order of the search result should be the different. That means 1 month later when searching for 4x4, I should get the same result set ordered differently as per my previous interaction with the site. If I was mainly liking and commenting on German cars, BMW should be on the top and Land cruiser should be further down. This ranking should be based on attributes that I captureduring user interaction (eg: car origin, user age, user location, car type[4x4, coupe, hatchback], price range). So for each car result I get, I will be weighing it based on how well it is performing on the 5 attributes above. I intend to use the DB just as a repository and do the ranking and the thinking on the server. My question is, what kind of algorithm should I be using to weigh/rank my search result? Thanks.

    Read the article

  • Fastest way to convert a list of doubles to a unique list of integers?

    - by javanix
    I am dealing with a MySQL table here that is keyed in a somewhat unfortunate way. Instead of using an auto increment table as a key, it uses a column of decimals to preserve order (presumably so its not too difficult to insert new rows while preserving a primary key and order). Before I go through and redo this table to something more sane, I need to figure out how to rekey it without breaking everything. What I would like to do is something that takes a list of doubles (the current keys) and outputs a list of integers (which can be cast down to doubles for rekeying). For example, input {1.00, 2.00, 2.50, 2.60, 3.00} would give output {1, 2, 3, 4, 5). Since this is a database, I also need to be able to update the rows nicely: UPDATE table SET `key`='3.00' WHERE `key`='2.50'; Can anyone think of a speedy algorithm to do this? My current thought is to read all of the doubles into a vector, take the size of the vector, and output a new vector with values from 1 => doubleVector.size. This seems pretty slow, since you wouldn't want to read every value into the vector if, for instance, only the last n/100 elements needed to be modified. I think there is probably something I can do in place, since only values after the first non-integer double need to be modified, but I can't for the life of me figure anything out that would let me update in place as well. For instance, setting 2.60 to 3.00 the first time you see 2.50 in the original key list would result in an error, since the key value 3.00 is already used for the table.

    Read the article

  • Does anyone knows the algorithm how Facebook detects the images when adding a link

    - by Edelcom
    When you add a link to your Facebook page, after some processing, Facebook presents you a next/prev button to choose an image linked to the url your are inserting. Obviously, Facebook reads the html-page and displays the images found on the url you insert. Does anyone knows what algorithm Facebook uses to decide what images to show ? If I insert a link to : http://www.staplijst.be/lachende-wandelaars-aalter-aktivia-003.asp, only 11 images are detected. The one I want, the one at the top right corner, is not included in the list. If I insert a link to http://www.staplijst.be/stichting-kennedymars-rijsbergen-zundert-nederland-knblo-nl-81996.asp, 19 images are displayed (including the one I want (the one at the right top corner of the text area). Both pages are build using asp code but are functionally the same. I thought that it has something to do with the image size, but can't find any deciding factor there. I will investigate some furhter, because if I know what Facebook is looking for, I can make sure that the correct images are included on the page (since they are dynamic pages build with classic asp). But if anyone has any idea ? Help would be appreciated.

    Read the article

  • how to implement a really efficient bitvector sorting in python

    - by xiao
    Hello guys! Actually this is an interesting topic from programming pearls, sorting 10 digits telephone numbers in a limited memory with an efficient algorithm. You can find the whole story here What I am interested in is just how fast the implementation could be in python. I have done a naive implementation with the module bitvector. The code is as following: from BitVector import BitVector import timeit import random import time import sys def sort(input_li): return sorted(input_li) def vec_sort(input_li): bv = BitVector( size = len(input_li) ) for i in input_li: bv[i] = 1 res_li = [] for i in range(len(bv)): if bv[i]: res_li.append(i) return res_li if __name__ == "__main__": test_data = range(int(sys.argv[1])) print 'test_data size is:', sys.argv[1] random.shuffle(test_data) start = time.time() sort(test_data) elapsed = (time.time() - start) print "sort function takes " + str(elapsed) start = time.time() vec_sort(test_data) elapsed = (time.time() - start) print "sort function takes " + str(elapsed) start = time.time() vec_sort(test_data) elapsed = (time.time() - start) print "vec_sort function takes " + str(elapsed) I have tested from array size 100 to 10,000,000 in my macbook(2GHz Intel Core 2 Duo 2GB SDRAM), the result is as following: test_data size is: 1000 sort function takes 0.000274896621704 vec_sort function takes 0.00383687019348 test_data size is: 10000 sort function takes 0.00380706787109 vec_sort function takes 0.0371489524841 test_data size is: 100000 sort function takes 0.0520560741425 vec_sort function takes 0.374383926392 test_data size is: 1000000 sort function takes 0.867373943329 vec_sort function takes 3.80475401878 test_data size is: 10000000 sort function takes 12.9204008579 vec_sort function takes 38.8053860664 What disappoints me is that even when the test_data size is 100,000,000, the sort function is still faster than vec_sort. Is there any way to accelerate the vec_sort function?

    Read the article

  • Good PHP / MYSQL hashing solution for large number of text values

    - by Dave
    Short descriptio: Need hashing algorithm solution in php for large number of text values. Long description. PRODUCT_OWNER_TABLE serial_number (auto_inc), product_name, owner_id OWNER_TABLE owner_id (auto_inc), owener_name I need to maintain a database of 200000 unique products and their owners (AND all subsequent changes to ownership). Each product has one owner, but an owner may have MANY different products. Owner names are "Adam Smith", "John Reeves", etc, just text values (quite likely to be unicode as well). I want to optimize the database design, so what i was thinking was, every week when i run this script, it fetchs the owner of a proudct, then checks against a table i suppose similar to PRODUCT_OWNER_TABLE, fetching the owner_id. It then looks up owner_id in OWNER_TABLE. If it matches, then its the same, so it moves on. The problem is when its different... To optimize the database, i think i should be checking against the other "owner_name" entries in OWNER_TABLE to see if that value exists there. If it does, then i should use that owner_id. If it doesnt, then i should add another entry. Note that there is nothing special about the "name". as long as i maintain the correct linkagaes AND make the OWNER_TABLE "read-only, append-new" type table - I should be able create a historical archive of ownership. I need to do this check for 200000 entries, with i dont know how many unique owner names (~50000?). I think i need a hashing solution - the OWNER_TABLE wont be sorted, so search algos wont be optimal. programming language is PHP. database is MYSQL.

    Read the article

  • Find min. "join" operations for sequence

    - by utyle
    Let's say, we have a list/an array of positive integers x1, x2, ... , xn. We can do a join operation on this sequence, that means that we can replace two elements that are next to each other with one element, which is sum of these elements. For example: - array/list: [1;2;3;4;5;6] we can join 2 and 3, and replace them with 5; we can join 5 and 6, and replace them with 11; we cannot join 2 and 4; we cannot join 1 and 3 etc. Main problem is to find minimum join operations for given sequence, after which this sequence will be sorted in increasing order. Note: empty and one-element sequences are sorted in increasing order. Basic examples: for [4; 6; 5; 3; 9] solution is 1 (we join 5 and 3) for [1; 3; 6; 5] solution is also 1 (we join 6 and 5) What I am looking for, is an algorithm that solve this problem. It could be in pseudocode, C, C++, PHP, OCaml or similar (I mean: I woluld understand solution, if You wrote solution in one of these languages). I would appreciate Your help.

    Read the article

  • Reversing permutation of an array in Java efficiently

    - by HansDampf
    Okay, here is my problem: Im implementing an algorithm in Java and part of it will be following: The Question is to how to do what I will explain now in an efficient way. given: array a of length n integer array perm, which is a permutation of [1..n] now I want to shuffle the array a, using the order determined by array perm, i.e. a=[a,b,c,d], perm=[2,3,4,1] ------ shuffledA[b,c,d,a], I figured out I can do that by iterating over the array with: shuffledA[i]=a[perm[i-1]], (-1 because the permutation indexes in perm start with 1 not 0) Now I want to do some operations on shuffledA... And now I want to do the reverse the shuffle operation. This is where I am not sure how to do it. Note that a can hold an item more than once, i.e. a=[a,a,a,a] If that was not the case, I could iterate perm, and find the corresponding indexes to the values. Now I thought that using a Hashmap instead of the the perm array will help. But I am not sure if this is the best way to do.

    Read the article

< Previous Page | 73 74 75 76 77 78 79 80 81 82 83 84  | Next Page >