Combinatorial optimisation of a distance metric
- by Jose
I have a set of trajectories, made up of points along the trajectory, and with the coordinates associated with each point. I store these in a 3d array ( trajectory, point, param). I want to find the set of r trajectories that have the maximum accumulated distance between the possible pairwise combinations of these trajectories. My first attempt, which I think is working looks like this:
max_dist = 0
for h in itertools.combinations ( xrange(num_traj), r):
for (m,l) in itertools.combinations (h, 2):
accum = 0.
for ( i, j ) in itertools.izip ( range(k), range(k) ):
A = [ (my_mat[m, i, z] - my_mat[l, j, z])**2 \
for z in xrange(k) ]
A = numpy.array( numpy.sqrt (A) ).sum()
accum += A
if max_dist < accum:
selected_trajectories = h
This takes forever, as num_traj can be around 500-1000, and r can be around 5-20. k is arbitrary, but can typically be up to 50.
Trying to be super-clever, I have put everything into two nested list comprehensions, making heavy use of itertools:
chunk = [[ numpy.sqrt((my_mat[m, i, :] - my_mat[l, j, :])**2).sum() \
for ((m,l),i,j) in \
itertools.product ( itertools.combinations(h,2), range(k), range(k)) ]\
for h in itertools.combinations(range(num_traj), r) ]
Apart from being quite unreadable (!!!), it is also taking a long time. Can anyone suggest any ways to improve on this?