Pig: Count number of keys in a map
Posted by Donald Miner on Stack Overflow
Published on 2012-12-05T22:12:58Z
I'd like to count the number of keys in a map in Pig. I could write a UDF to do this, but I was hoping there would be an easier way.
data = LOAD 'hbase://MARS1'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'A:*', '-loadKey true -caching=100000')
    AS (id:bytearray, A_map:map[]);
In the code above, I basically want to build a histogram: for each id, how many items that key has in column family A.
On the off chance it would work, I tried c = FOREACH data GENERATE id, COUNT(A_map); but, unsurprisingly, it didn't, since COUNT expects a bag rather than a map.
Or, perhaps someone can suggest a better way to do this entirely. If I can't figure this out soon I'll just write a Java MapReduce job or a Pig UDF.
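One possibility that avoids a UDF is Pig's built-in SIZE function, which the Pig documentation describes as returning the number of key/value pairs when given a map. The sketch below reuses the LOAD statement from the question unchanged. The alias `counts` and the field name `num_keys` are made up for illustration, and it assumes a Pig release whose SIZE accepts map arguments.

```pig
-- Same load as in the question: one row per HBase row key,
-- with every column in family A collected into a map.
data = LOAD 'hbase://MARS1'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'A:*', '-loadKey true -caching=100000')
    AS (id:bytearray, A_map:map[]);

-- SIZE on a map yields its number of key/value pairs, i.e. the
-- number of columns found in family A for that row key.
-- 'counts' and 'num_keys' are illustrative names, not from the question.
counts = FOREACH data GENERATE id, SIZE(A_map) AS num_keys;

DUMP counts;
```

If this works on your version, a further GROUP on `num_keys` followed by COUNT would turn the per-id counts into a histogram of how many ids have each number of columns.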
© Stack Overflow or respective owner