Best way in Python to determine all possible intersections in a matrix?

Posted by ssweens on Stack Overflow
Published on 2010-06-12T07:51:38Z

I have a matrix (a list of lists) with unique words as the column headings and document ids as the row headings. Each value is 1 if the word exists in that particular document and 0 if it does not.

What I'd like to know is how to determine all the possible combinations of words and documents where two or more words are shared by two or more documents.

So something like:

[[Docid_3, Docid_5], ['word1', 'word17', 'word23']], [[Docid_3, Docid_9, Docid_334], ['word2', 'word7', 'word23', 'word68', 'word982']], and so on for each possible combination.

I would love a solution that provides the complete set of combinations, and one that yields only the combinations that are not a subset of another. In the example above, that would exclude [[Docid_3, Docid_5], ['word1', 'word17']], since it's a complete subset of the first entry.
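To make the request concrete, here is a minimal brute-force sketch of what is being asked for, not a definitive solution. It assumes the matrix rows line up with a list of document ids and the columns with a list of words. The function names `common_words` and `maximal_only` and the sample data are made up for illustration. It checks every group of two or more documents, so the work grows exponentially with the number of documents and it is only practical for small matrices.

```python
from itertools import combinations


def common_words(matrix, words, doc_ids, min_docs=2, min_words=2):
    """Map each group of documents to the set of words they all share."""
    # Turn each 0/1 row into the set of words present in that document.
    doc_words = {
        doc: {word for word, flag in zip(words, row) if flag}
        for doc, row in zip(doc_ids, matrix)
    }
    shared = {}
    for size in range(min_docs, len(doc_ids) + 1):
        for group in combinations(doc_ids, size):
            common = set.intersection(*(doc_words[doc] for doc in group))
            if len(common) >= min_words:
                shared[group] = common
    return shared


def maximal_only(shared):
    """Keep entries whose documents and words are not both contained in another entry."""
    return {
        docs: wds
        for docs, wds in shared.items()
        if not any(
            other_docs != docs
            and set(docs) <= set(other_docs)
            and wds <= other_wds
            for other_docs, other_wds in shared.items()
        )
    }


# Hypothetical sample data: 3 documents, 4 words.
words = ['word1', 'word2', 'word3', 'word4']
doc_ids = ['Docid_1', 'Docid_2', 'Docid_3']
matrix = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
]

shared = common_words(matrix, words, doc_ids)
maximal = maximal_only(shared)
for docs, wds in maximal.items():
    print(list(docs), sorted(wds))
```

Each entry in `shared` holds the largest word set for that group of documents, so the complete set of combinations would also include every subset of two or more of those words, which `itertools.combinations` can expand if it is really needed. `maximal_only` covers the second request by dropping any entry that is fully contained in another.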

I feel like there is an elegant solution that just isn't coming to mind and the beer isn't helping.

Thanks.
