Searching integer sequences
- by David Gibson
I have a fairly complex search problem that I've managed to reduce to the following description. I've been googling but haven't been able to find an algorithm that seems to fit my problem cleanly. In particular the need to skip arbitrary integers. Maybe someone here can point me to something?
Take a sequence of integers A, for example (1 2 3 4)
Take various sequences of integers and test if any of them match A such that.
A contains all of the integers in the tested sequence
The ordering of the integers in the tested sequence are the same in A
We don't care about any integers in A that are not in the test sequence
We want all matching test sequences, not just the first.
An example
A = (1 2 3 4)
B = (1 3)
C = (1 3 4)
D = (3 1)
E = (1 2 5)
B matches A
C matches A
D does not match A as the ordering is different
E does not match A as it contains an integer not in A
I hope that this explanation is clear enough. The best I've managed to do is to contruct a tree of the test sequences and iterate over A. The need to be able to skip integers leads to a lot of unsuccessful search paths.
Thanks
Reading some suggestions I feel that I have to clarify a couple of points that I left too vague.
Repeated numbers are allowed, in fact this is quite important as it allows a single test sequence to match A is multiple ways
A = (1234356), B = (236), matches could be either -23---6 or -2--3-6
I expect there to be a very large number of test sequences, in the thousands at least and sequence A will tend to have a max length of maybe 20. Thus simply trying to match each test sequence one by one by iterating becomes extremely inefficient.
Sorry if this wasn't clear.