PIG doesn't read my custom InputFormat
Posted by Simon Guo on Stack Overflow
Published on 2012-12-18T23:00:43Z
I have a custom MyInputFormat that is supposed to handle the record-boundary problem for multi-line inputs. I return MyInputFormat from my UDF load function, as follows:
public class EccUDFLogLoader extends LoadFunc {
    // Other LoadFunc methods omitted.
    @Override
    public InputFormat getInputFormat() {
        System.out.println("I am in getInputFormat function");
        return new MyInputFormat();
    }
}

public class MyInputFormat extends TextInputFormat {
    public RecordReader createRecordReader(InputSplit inputSplit, JobConf jobConf) throws IOException {
        System.out.println("I am in createRecordReader");
        // MyRecordReader is supposed to handle the record boundary
        return new MyRecordReader((FileSplit) inputSplit, jobConf);
    }
}
For each mapper, it prints "I am in getInputFormat function" but not "I am in createRecordReader". Can anyone give me a hint on how to hook up my custom MyInputFormat to Pig's UDF loader? Many thanks.
I am using PIG on Amazon EMR.
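One possible explanation, offered as an assumption rather than a confirmed diagnosis: Pig's LoadFunc works with the newer org.apache.hadoop.mapreduce API, where TextInputFormat declares createRecordReader(InputSplit, TaskAttemptContext). A method that takes a JobConf instead would be a separate overload that the framework never calls, which would match the output described above. The sketch below reproduces that effect in plain Java using hypothetical stand-in classes (Split, TaskContext, OldJobConf, BaseTextFormat and the rest); none of them are real Hadoop or Pig types.

```java
// Hypothetical stand-ins for the Hadoop types; not the real API.
class Split {}
class TaskContext {}   // plays the role of TaskAttemptContext (new API)
class OldJobConf {}    // plays the role of JobConf (old API)

class BaseTextFormat {
    // The only signature the framework ever calls.
    public String createRecordReader(Split split, TaskContext context) {
        return "default line reader";
    }
}

class OverloadingFormat extends BaseTextFormat {
    // Different parameter type, so this is a new overload, not an override.
    // Putting @Override on it would fail to compile, which exposes the mismatch.
    public String createRecordReader(Split split, OldJobConf conf) {
        return "custom reader";
    }
}

class OverridingFormat extends BaseTextFormat {
    // Same parameter types as the base method, so this one really overrides it.
    @Override
    public String createRecordReader(Split split, TaskContext context) {
        return "custom reader";
    }
}

public class OverloadDemo {
    // Mimics the framework: it only knows the base-class signature.
    static String frameworkCall(BaseTextFormat format) {
        return format.createRecordReader(new Split(), new TaskContext());
    }

    public static void main(String[] args) {
        System.out.println(frameworkCall(new OverloadingFormat())); // default line reader
        System.out.println(frameworkCall(new OverridingFormat()));  // custom reader
    }
}
```

If this is indeed the cause, annotating createRecordReader in MyInputFormat with @Override should make the compiler report the signature mismatch, and switching the second parameter to TaskAttemptContext (with MyRecordReader adapted to the new-API RecordReader) would be the change to try.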
© Stack Overflow or respective owner