Vectorization of index operation for a scipy.sparse matrix

Posted by celil on Stack Overflow
Published on 2010-03-08T20:19:50Z Indexed on 2010/03/08 20:21 UTC

The following code runs too slowly even though everything seems to be vectorized.

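import numpy as np
from scipy.sparse import csr_matrix

n = 100000
i = np.arange(n)
j = np.arange(n)
data = np.ones(n)

# Build an n-by-n CSR matrix with ones on the diagonal.
A = csr_matrix((data, (i, j)))

# Look up the n entries A[i[k], j[k]]; this is the slow step.
x = A[i, j]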

The problem seems to be that the indexing operation is implemented as a Python function, so invoking A[i,j] produces the following profiling output:

         500033 function calls in 8.718 CPU seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   100000    7.933    0.000    8.156    0.000 csr.py:265(_get_single_element)
        1    0.271    0.271    8.705    8.705 csr.py:177(__getitem__)
(...)

Namely, the Python function _get_single_element is called 100000 times, once per requested element, which is very inefficient. Why isn't this implemented in pure C? Does anybody know a way to get around this limitation and speed up the above code? Should I be using a different sparse matrix type?
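For reference, here is a minimal sketch of one possible workaround, not something taken from SciPy itself. It avoids the per-element Python call by encoding every stored (row, col) pair as a single integer key, sorting the keys once, and then locating all requested pairs with one call to np.searchsorted. The helper name lookup_entries is hypothetical, and the sketch assumes non-negative indices and a matrix small enough that rows * n_cols fits in a 64-bit integer. When only the diagonal is needed, as in the example above, A.diagonal() is a simpler shortcut.

```python
import numpy as np
from scipy.sparse import csr_matrix


def lookup_entries(A, rows, cols):
    """Return the values A[rows[k], cols[k]] as a dense 1-D array.

    Hypothetical helper, not a SciPy function. Entries that are not
    stored in the matrix come back as 0.
    """
    coo = A.tocoo(copy=True)
    coo.sum_duplicates()  # merge repeated (row, col) pairs, if any
    n_cols = coo.shape[1]

    rows = np.asarray(rows, dtype=np.int64)
    cols = np.asarray(cols, dtype=np.int64)
    out = np.zeros(len(rows), dtype=coo.dtype)
    if coo.nnz == 0:
        return out

    # Encode each stored (row, col) pair as one integer key and sort.
    stored_keys = coo.row.astype(np.int64) * n_cols + coo.col
    order = np.argsort(stored_keys)
    stored_keys = stored_keys[order]
    stored_vals = coo.data[order]

    # Binary-search all query keys at once instead of one call per element.
    query_keys = rows * n_cols + cols
    pos = np.searchsorted(stored_keys, query_keys)
    pos = np.minimum(pos, len(stored_keys) - 1)  # keep positions in range
    found = stored_keys[pos] == query_keys
    out[found] = stored_vals[pos[found]]
    return out


n = 100000
i = np.arange(n)
j = np.arange(n)
A = csr_matrix((np.ones(n), (i, j)))

x = lookup_entries(A, i, j)  # arbitrary (i, j) pairs
d = A.diagonal()             # shortcut when only the diagonal is needed
```

The sort costs O(nnz log nnz) on every call, so if many lookups are made against the same matrix it may be worth caching the sorted keys and values rather than rebuilding them each time.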

© Stack Overflow or respective owner
