Optimization in Python - do's, don'ts and rules of thumb.
- by JV
I was reading this post and came across the following code:
jokes=range(1000000)
domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)]
I wondered: wouldn't it be better to calculate the value of len(jokes) once, outside the list comprehension?
So I tried it and timed three versions:
jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)]'
10000000 loops, best of 3: 0.0352 usec per loop
jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);l=len(jokes);domain=[(0,(l*2)-i-1) for i in range(0,l*2)]'
10000000 loops, best of 3: 0.0343 usec per loop
jv@Pioneer:~$ python -m timeit -s 'jokes=range(1000000);l=len(jokes)*2;domain=[(0,l-i-1) for i in range(0,l)]'
10000000 loops, best of 3: 0.0333 usec per loop
The marginal difference of 2.55% between the first and the second made me wonder: is the first list comprehension
domain=[(0,(len(jokes)*2)-i-1) for i in range(0,len(jokes)*2)]
optimized internally by Python? Or is 2.55% a big enough optimization (given that len(jokes)=1000000)?
If it is, what are the other implicit/internal optimizations in Python?
What are the developer's rules of thumb for optimization in Python?
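One way to check whether len(jokes) really is re-evaluated is to count the calls. The sketch below assumes Python 3 and uses a hypothetical wrapper class, CountingSeq, which is not part of the original code, with a smaller list so the numbers are easy to read. The iterable of the for clause, range(0, len(jokes)*2), is evaluated once, while the element expression is evaluated again for every item.

```python
class CountingSeq:
    """Hypothetical helper: wraps a list and counts calls to len()."""

    def __init__(self, items):
        self.items = list(items)
        self.len_calls = 0

    def __len__(self):
        self.len_calls += 1
        return len(self.items)


jokes = CountingSeq(range(1000))

# Original form: len(jokes) appears in the element expression and in range().
domain = [(0, (len(jokes) * 2) - i - 1) for i in range(0, len(jokes) * 2)]
calls_original = jokes.len_calls
print(calls_original)  # 2001: once for range(), then once per element

# Hoisted form: len(jokes) is computed once, before the comprehension.
jokes.len_calls = 0
l = len(jokes) * 2
domain_hoisted = [(0, l - i - 1) for i in range(0, l)]
calls_hoisted = jokes.len_calls
print(calls_hoisted)  # 1

print(domain == domain_hoisted)  # True: both forms build the same list
```

So the call is not hoisted automatically in this experiment. Whether the saved calls matter in practice is a separate question that only a correct timing can answer.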
Edit1: Since most of the answers are "don't optimize, do it later if it's slow", and I got some tips and links from Triptych and Ali A for the do's, I will change the question a bit and ask for the don'ts.
Can we hear from people who actually ran into the 'slowness': what was the problem, and how was it corrected?
Edit2: For those who haven't seen it, here is an interesting read
Edit3: The usage of timeit in the question is incorrect; please see dF's answer for the correct usage and the resulting timings for the three versions.
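For context on the mistake: in the commands above everything was passed to -s, which is the setup code. Setup is run but not timed, so timeit measured only its default statement, pass, which is why all three results sit around 0.03 usec. Below is a minimal sketch of a corrected measurement using the timeit module from a script. It is not dF's answer, just an illustration under my own assumptions: Python 3, a list of 10000 items instead of 1000000 to keep the run short, and 3 repeats of 20 executions each.

```python
import timeit

# Setup is executed once per repeat and is NOT included in the timing.
# The list is smaller than the original 1000000 only to keep this quick.
setup = "jokes = list(range(10000))"

# The three versions from the question, now passed as the timed statement.
versions = {
    "len() in comprehension":
        "domain = [(0, (len(jokes) * 2) - i - 1) for i in range(0, len(jokes) * 2)]",
    "len() hoisted":
        "l = len(jokes); domain = [(0, (l * 2) - i - 1) for i in range(0, l * 2)]",
    "len() * 2 hoisted":
        "l = len(jokes) * 2; domain = [(0, l - i - 1) for i in range(0, l)]",
}

results = {}
for name, stmt in versions.items():
    # Take the best of 3 repeats; each repeat runs the statement 20 times.
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=20)) / 20
    results[name] = best
    print("%-24s %.1f usec per loop" % (name, best * 1e6))
```

The equivalent on the command line is to keep only jokes=range(1000000) in -s and pass the comprehension as the final argument, so that it becomes the timed statement.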