Configuring Hadoop logging to avoid too many log files

Posted by Eric Wendelin on Stack Overflow. Published 2010-04-16.

I'm having a problem with Hadoop producing too many log files in $HADOOP_LOG_DIR/userlogs (the ext3 filesystem allows only about 32,000 subdirectories per directory), which looks like the same problem described in this question: http://stackoverflow.com/questions/2091287/error-in-hadoop-mapreduce

My question is: does anyone know how to configure Hadoop to roll the log directory, or otherwise prevent this? I'm trying to avoid just setting the "mapred.userlog.retain.hours" and/or "mapred.userlog.limit.kb" properties (shown below for reference), because I want to actually keep the log files.
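
For reference, here is roughly what those two properties would look like in mapred-site.xml. The values are just illustrative, not a recommendation, since either setting throws logs away:

<!-- mapred-site.xml: the knobs I'd rather not rely on, since they discard logs -->
<property>
  <name>mapred.userlog.retain.hours</name>
  <value>24</value>  <!-- hours to keep task userlogs before they are deleted -->
</property>
<property>
  <name>mapred.userlog.limit.kb</name>
  <value>1024</value>  <!-- per-task cap on userlog size, in KB -->
</property>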

I was also hoping to configure this in log4j.properties (a sketch of what I had in mind follows), but looking at the Hadoop 0.20.2 source, it writes the task userlogs directly to log files instead of actually going through log4j. Perhaps I don't fully understand how it's using log4j.
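
For what it's worth, here is the kind of thing I was hoping to be able to do in conf/log4j.properties. This is only a sketch based on the stock Hadoop log4j.properties and the standard log4j DailyRollingFileAppender; it rolls the daemon logs, but as far as I can tell it has no effect on the per-attempt userlogs directories, which is exactly my problem:

# Sketch only: standard daily-rolling appender along the lines of the stock Hadoop config
log4j.rootLogger=INFO,DRFA
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n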

Any suggestions or clarifications would be greatly appreciated.

