genetic algorithm - Page 23

Python: (sampling with replacement): efficient algorithm to extract the set of DISSIMILAR N-tuples from a set

- by Homunculus Reticulli

I have a set of items, from which I want to select DISSIMILAR tuples (more on the definition of dissimilar touples later). The set could contain potentially several thousand items, although typically, it would contain only a few hundreds. I am trying to write a generic algorithm that will allow me to select N items to form an N-tuple, from the original set. The new set of selected N-tuples should be DISSIMILAR. A N-tuple A is said to be DISSIMILAR to another N-tuple B if and only if: Every pair (2-tuple) that occurs in A DOES NOT appear in B Note: For this algorithm, A 2-tuple (pair) is considered SIMILAR/IDENTICAL if it contains the same elements, i.e. (x,y) is considered the same as (y,x). This is a (possible variation on the) classic Urn Problem. A trivial (pseudocode) implementation of this algorithm would be something along the lines of def fetch_unique_tuples(original_set, tuple_size): while True: # randomly select [tuple_size] items from the set to create first set # create a key or hash from the N elements and store in a set # store selected N-tuple in a container if end_condition_met: break I don't think this is the most efficient way of doing this - and though I am no algorithm theorist, I suspect that the time for this algorithm to run is NOT O(n) - in fact, its probably more likely to be O(n!). I am wondering if there is a more efficient way of implementing such an algo, and preferably, reducing the time to O(n). Actually, as Mark Byers pointed out there is a second variable m, which is the size of the number of elements being selected. This (i.e. m) will typically be between 2 and 5. Regarding examples, here would be a typical (albeit shortened) example: original_list = ['CAGG', 'CTTC', 'ACCT', 'TGCA', 'CCTG', 'CAAA', 'TGCC', 'ACTT', 'TAAT', 'CTTG', 'CGGC', 'GGCC', 'TCCT', 'ATCC', 'ACAG', 'TGAA', 'TTTG', 'ACAA', 'TGTC', 'TGGA', 'CTGC', 'GCTC', 'AGGA', 'TGCT', 'GCGC', 'GCGG', 'AAAG', 'GCTG', 'GCCG', 'ACCA', 'CTCC', 'CACG', 'CATA', 'GGGA', 'CGAG', 'CCCC', 'GGTG', 'AAGT', 'CCAC', 'AACA', 'AATA', 'CGAC', 'GGAA', 'TACC', 'AGTT', 'GTGG', 'CGCA', 'GGGG', 'GAGA', 'AGCC', 'ACCG', 'CCAT', 'AGAC', 'GGGT', 'CAGC', 'GATG', 'TTCG'] # Select 3-tuples from the original list should produce a list (or set) similar to: [('CAGG', 'CTTC', 'ACCT') ('CAGG', 'TGCA', 'CCTG') ('CAGG', 'CAAA', 'TGCC') ('CAGG', 'ACTT', 'ACCT') ('CAGG', 'CTTG', 'CGGC') .... ('CTTC', 'TGCA', 'CAAA') ] [[Edit]] Actually, in constructing the example output, I have realized that the earlier definition I gave for UNIQUENESS was incorrect. I have updated my definition and have introduced a new metric of DISSIMILARITY instead, as a result of this finding.

Search Results

Search found 5119 results on 205 pages for 'genetic algorithm'.

Page 23/205 | < Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30 | Next Page >

- by Homunculus Reticulli

- by Arfan M

- by Valentino Ru

- by Valmond

- by Qwertie

- by Rafael Almeida

- by Alexandre P. Levasseur

- by Tom

- by Saeed Neamati

- by user142050

- by Zove Games

- by astinx

- by AGeek

- by kumar

- by Hanseh

- by S.Mark

- by spacker_lechuck

- by The Rook

- by CSharperWithJava

- by psihodelia

- by jett

- by Jeroen

- by Dave

< Previous Page | 19 20 21 22 23 24 25 26 27 28 29 30 | Next Page >