Vectorizatoin of index operation for a scipy.sparse matrix
- by celil
The following code runs too slowly even though everything seems to be vectorized.
from numpy import *
from scipy.sparse import *
n = 100000;
i = xrange(n); j = xrange(n);
data = ones(n);
A=csr_matrix((data,(i,j)));
x = A[i,j]
The problem seems to be that the indexing operation is implemented as a python function, and invoking A[i,j] results in the following profiling output
500033 function calls in 8.718 CPU seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
100000 7.933 0.000 8.156 0.000 csr.py:265(_get_single_element)
1 0.271 0.271 8.705 8.705 csr.py:177(__getitem__)
(...)
Namely, the python function _get_single_element gets called 100000 times which is really inefficient. Why isn't this implemented in pure C? Does anybody know of a way of getting around this limitation, and speeding up the above code?
Should I be using a different sparse matrix type?