python histogram one-liner

Posted by mykhal on Stack Overflow
Published on 2010-05-20T01:15:39Z Indexed on 2010/05/20 1:20 UTC

there are many ways to code a histogram in Python.

by histogram, i mean a function that counts the objects in an iterable and returns a count table (i.e. a dict). e.g.:

>>> L = 'abracadabra'
>>> histogram(L)
{'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2}

it can be written like this:

def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d

..however, there are far fewer ways to do this in a single expression.

if we had "dict comprehensions" in python, we would write:

>>> { x: L.count(x) for x in set(L) }

but we don't have them, so we have to write:

>>> dict([(x, L.count(x)) for x in set(L)])

this approach is still readable, but it is not efficient: L.count(x) walks through L once for every distinct element. and because L is traversed more than once, it won't work for single-use generators. the function should also work when given gen(), where:

def gen():
    for x in L:
        yield x
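
a minimal sketch of why the count-based expression breaks on such a generator. it is written for Python 3 and reuses L and gen() from above; the names counts, distinct, leftover and has_count are only for illustration:

```python
L = 'abracadabra'

def gen():
    for x in L:
        yield x

# on a sequence this works: L can be walked as often as needed
counts = dict([(x, L.count(x)) for x in set(L)])

g = gen()
distinct = set(g)    # this first pass consumes the generator
leftover = list(g)   # nothing is left for a second pass

# a generator object has no .count method either
has_count = hasattr(gen(), 'count')
```

so a one-liner that should accept gen() has to build the whole table in a single pass.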

we can go with reduce (R.I.P.):

>>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong!

oops, this does not work: written as a keyword argument, the key is the literal string 'x', not the value of x :(
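
a small sketch of what the keyword form actually produces, next to one possible workaround. assumptions: Python 3, with reduce imported from functools; the ** unpacking variant is my own illustration, not something from the post, and it only works when every key is a string:

```python
from functools import reduce  # reduce is not a builtin in Python 3

L = 'abracadabra'

# keyword syntax: the key is always the string 'x', whatever x holds
wrong = reduce(lambda d, x: dict(d, x=d.get(x, 0) + 1), L, {})

# unpacking a one-item dict uses the value of x as the key,
# but dict() only accepts string keys this way
fixed = reduce(lambda d, x: dict(d, **{x: d.get(x, 0) + 1}), L, {})
```

here wrong ends up as {'x': 1}, because no character is ever stored under its own key, so d.get(x, 0) is always 0.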

i ended up with:

>>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {})

(in py3k, we would have to write list(d.items()) instead of d.items(), and reduce is no longer a builtin there; it has to be imported from functools)
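
for reference, a sketch of how the same expression might look in Python 3, assuming reduce is imported from functools and fed gen() from above (the name histogram_py3 is only illustrative):

```python
from functools import reduce  # where reduce lives in Python 3

L = 'abracadabra'

def gen():
    for x in L:
        yield x

# d.items() is a view in Python 3, so it is wrapped in list()
# before the new (key, count) pair is appended
histogram_py3 = reduce(
    lambda d, x: dict(list(d.items()) + [(x, d.get(x, 0) + 1)]),
    gen(),
    {},
)
```

this walks the input only once, so it works for a generator, but it still copies the whole dict for every element.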

please beat me with a better, more readable one-liner! ;)

© Stack Overflow or respective owner
