Finding k elements of length-n list that sum to less than t in O(nlogk) time
- by tresbot
This is from Programming Pearls ed. 2, Column 2, Problem 8:
Given a set of n real numbers, a real number t, and an integer k, how quickly can you determine whether there exists a k-element subset of the set that sums to at most t?
One easy solution is to sort and sum the first k elements, which is our best hope to find such a sum. However, in the solutions section Bentley alludes to a solution that takes nlog(k) time, though he gives no hints for how to find it. I've been struggling with this; one thought I had was to go through the list and add all the elements less than t/k (in O(n) time); say there are m1 < k such elements, and they sum to s1 < t. Then we are left needing k - m1 elements, so we can scan through the list again in O(n) time looking for all elements less than (t - s1)/(k - m1). Add in again, to get s2 and m2, then again if m2 < k, look for all elements less than (t - s2)/(k - m2). So:
def kSubsetSumUnderT(inList, k, t):
outList = []
s = 0
m = 0
while len(outList) < k:
toJoin = [i for i in inList where i < (t - s)/(k - m)]
if len(toJoin):
if len(toJoin) >= k - m:
toJoin.sort()
if(s0 + sum(toJoin[0:(k - m - 1)]) < t:
return True
return False
outList = outList + toJoin
s += sum(toJoin)
m += len(toJoin)
else:
return False
My intuition is that this might be the O(nlog(k)) algorithm, but I am having a hard time proving it to myself. Thoughts?