Problem Hey folks. I'm looking for some advice on python performance. Some background on my problem:
Given:
A mesh of nodes of size (x,y) each with a value (0...255) starting at 0
A list of N input coordinates each at a specified location within the range (0...x, 0...y)
Increment the value of the node at the input coordinate and the node's neighbors within range Z up to a maximum of 255. Neighbors beyond the mesh edge are ignored. (No wrapping)
BASE CASE: A mesh of size 1024x1024 nodes, with 400 input coordinates and a range Z of 75 nodes.
Processing should be O(x*y*Z*N). I expect x, y and Z to remain roughly around the values in the base case, but the number of input coordinates N could increase up to 100,000. My goal is to minimize processing time.
Current results I have 2 current implementations: f1, f2
Running speed on my 2.26 GHz Intel Core 2 Duo with Python 2.6.1:
f1: 2.9s
f2: 1.8s
f1 is the initial naive implementation: three nested for loops.
f2 is replaces the inner for loop with a list comprehension.
Code is included below for your perusal.
Question How can I further reduce the processing time? I'd prefer sub-1.0s for the test parameters.
Please, keep the recommendations to native Python. I know I can move to a third-party package such as numpy, but I'm trying to avoid any third party packages. Also, I've generated random input coordinates, and simplified the definition of the node value updates to keep our discussion simple. The specifics have to change slightly and are outside the scope of my question.
thanks much!
f1 is the initial naive implementation: three nested for loops. 2.9s
def f1(x,y,n,z):
rows = []
for i in range(x):
rows.append([0 for i in xrange(y)])
for i in range(n):
inputX, inputY = (int(x*random.random()), int(y*random.random()))
topleft = (inputX - z, inputY - z)
for i in xrange(max(0, topleft[0]), min(topleft[0]+(z*2), x)):
for j in xrange(max(0, topleft[1]), min(topleft[1]+(z*2), y)):
if rows[i][j] <= 255: rows[i][j] += 1
f2 is replaces the inner for loop with a list comprehension. 1.8s
def f2(x,y,n,z):
rows = []
for i in range(x):
rows.append([0 for i in xrange(y)])
for i in range(n):
inputX, inputY = (int(x*random.random()), int(y*random.random()))
topleft = (inputX - z, inputY - z)
for i in xrange(max(0, topleft[0]), min(topleft[0]+(z*2), x)):
l = max(0, topleft[1])
r = min(topleft[1]+(z*2), y)
rows[i][l:r] = [j+1 for j in rows[i][l:r] if j < 255]