python histogram one-liner
- by mykhal
there are many ways to code a histogram in Python.
by histogram, i mean a function that counts the objects in an iterable and returns a count table (i.e. a dict). e.g.:
>>> L = 'abracadabra'
>>> histogram(L)
{'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2}
it can be written like this:
def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d
..however, there are far fewer ways to do this in a single expression.
if we had "dict comprehensions" in python, we would write:
>>> { x: L.count(x) for x in set(L) }
but we don't have them, so we have to write:
>>> dict([(x, L.count(x)) for x in set(L)])
this approach may still be readable, but it is not efficient: L.count(x) walks through L again for every distinct item. it also won't work for single-use generators, which can be iterated only once (and have no count method). the function should also work on gen(), where:
def gen():
    for x in L:
        yield x
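to make the generator problem concrete, here is a small sketch, assuming python 3. count_histogram is just a hypothetical name for the count-based expression above, and the loop uses d.get instead of the if/else, which is equivalent:

```python
L = 'abracadabra'

def gen():
    for x in L:
        yield x

def histogram(iterable):
    # single pass over the input, so a generator is fine
    d = {}
    for x in iterable:
        d[x] = d.get(x, 0) + 1
    return d

def count_histogram(seq):
    # hypothetical wrapper around the count-based expression;
    # it needs an object with a .count() method that can be re-read
    return dict([(x, seq.count(x)) for x in set(seq)])

loop_result = histogram(gen())

try:
    # set(...) consumes the generator, then .count() is missing
    count_result = count_histogram(gen())
except AttributeError:
    count_result = None
```

on a string or list both functions agree; on gen() only the loop version gives a result.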
we can go with reduce (R.I.P.):
>>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong!
oops, this does not work: the keyword argument x= always creates the literal key 'x', not the key stored in the variable x :(
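a minimal sketch of what goes wrong, assuming python 3 (the sample values of d and x are made up for illustration). unpacking a one-item dict with ** does pass the real key, but it is only suitable when the keys are strings, so it is not a general fix:

```python
d = {'a': 1}
x = 'b'

# keyword syntax: the key is the literal name 'x'
wrong = dict(d, x=d.get(x, 0) + 1)

# ** unpacking: the key is the value of x (string keys only)
unpacked = dict(d, **{x: d.get(x, 0) + 1})

print(wrong)     # {'a': 1, 'x': 1}
print(unpacked)  # {'a': 1, 'b': 1}
```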
i ended with:
>>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {})
(in py3k, we would have to write list(d.items()) instead of d.items(), and reduce is no longer a builtin there: it has to be imported from functools)
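for the record, a sketch of how that expression could look in python 3, assuming reduce is imported from functools (it is not a builtin there). it is fed gen() here to show that a single pass is enough:

```python
from functools import reduce

L = 'abracadabra'

def gen():
    for x in L:
        yield x

# each step builds a new dict from the old items plus the updated pair;
# when the key already exists, the later pair overrides the old count
hist = reduce(
    lambda d, x: dict(list(d.items()) + [(x, d.get(x, 0) + 1)]),
    gen(),
    {},
)
```

note that every step copies the whole dict, so this is a one-liner for fun rather than for speed.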
please beat me with a better, more readable one-liner! ;)