Hadoop Map Reduce job never finishes
Posted by rohanbk on Stack Overflow
Published on 2010-06-11T03:57:18Z
I am running a Hadoop Map Reduce job using Hadoop Streaming, with Python scripts for the mapper and reducer. Both the map and reduce phases run until they reach 100%, but the job never finishes. I know that Hadoop terminates a job when something fails, but in this case both phases reach 100% and the job just never ends. Has anyone else encountered anything similar?
Also, how can I debug my program to find out where things are going wrong? If I use a smaller input file and run the pipeline locally with something like:
$> cat input_file | mapper.py | sort | reduce.py >> output_file
everything works perfectly fine. However, when I run the same scripts through Hadoop, the job hangs as described above.
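One possible way to get more visibility into a streaming job is to have the scripts write progress information to stderr. Hadoop Streaming treats stderr lines of the form reporter:status:<message> and reporter:counter:<group>,<counter>,<amount> as status and counter updates, and other stderr output ends up in the task logs. The sketch below is not the mapper from this post: the word-count logic, the function name run_mapper, the report_every parameter, and the counter names Debug and LinesRead are all assumptions made for illustration. Whether this pinpoints the hang described here is not certain; it only shows how far each task got.

```python
import sys


def run_mapper(lines, out=sys.stdout, err=sys.stderr, report_every=10000):
    """Hypothetical word-count mapper that reports progress to Hadoop Streaming."""
    for count, line in enumerate(lines, start=1):
        for word in line.split():
            # Normal mapper output: tab-separated key and value on stdout.
            out.write(f"{word}\t1\n")
        if count % report_every == 0:
            # Hadoop Streaming interprets these stderr lines as a status
            # update and a counter increment (group and name are made up).
            err.write(f"reporter:status:processed {count} lines\n")
            err.write(f"reporter:counter:Debug,LinesRead,{report_every}\n")
            err.flush()


if __name__ == "__main__":
    run_mapper(sys.stdin)
```

The same pattern can be used in the reducer. Because the diagnostics go to stderr, the local test with cat, sort, and a redirect to output_file still produces the same output file.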
© Stack Overflow or respective owner