How do I concatenate many files into one inside Hadoop, with no mapping or reduction?
Posted by Leonard on Stack Overflow
Published on 2010-04-08T22:12:09Z
Tags: hadoop
I'm trying to combine multiple files from multiple input directories into a single file, for various odd reasons I won't go into. My initial try was to write a 'null' mapper and reducer that just copied input to output, but that failed. My latest try is:
vcm_hadoop lester jar /vcm/home/apps/hadoop/contrib/streaming/hadoop-*-streaming.jar -input /cruncher/201004/08/17/00 -output /lcuffcat9 -mapper /bin/cat -reducer NONE
but I end up with multiple output files anyway. Anybody know how I can coax everything into a single output file?
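For context, -reducer NONE makes the job map-only, so Hadoop writes one part file per map task; forcing the job through a single reduce task is one way to get exactly one output file. Below is a minimal sketch along those lines, assuming the stock streaming jar and the old mapred.* option names; the wrapper script and paths are simply the ones from the command above:

vcm_hadoop lester jar /vcm/home/apps/hadoop/contrib/streaming/hadoop-*-streaming.jar \
    -D mapred.reduce.tasks=1 \
    -input /cruncher/201004/08/17/00 \
    -output /lcuffcat9 \
    -mapper /bin/cat \
    -reducer /bin/cat

Note that routing everything through the shuffle re-sorts lines by their leading key field, so the original line order is not preserved. If a local copy is acceptable, hadoop fs -getmerge /cruncher/201004/08/17/00 merged.txt concatenates all files under a directory into one local file without running a job at all.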