Hadoop/MapReduce: Reading and writing classes generated from DDL
Posted
by Dave
on Stack Overflow
See other posts from Stack Overflow
or by Dave
Published on 2010-05-16T21:48:47Z
Indexed on
2010/05/16
21:50 UTC
Read the original article
Hit count: 319
Hi,
Can someone walk me though the basic work-flow of reading and writing data with classes generated from DDL?
I have defined some struct-like records using DDL. For example:
class Customer {
ustring FirstName;
ustring LastName;
ustring CardNo;
long LastPurchase;
}
I've compiled this to get a Customer class and included it into my project. I can easily see how to use this as input and output for mappers and reducers (the generated class implements Writable), but not how to read and write it to file.
The JavaDoc for the org.apache.hadoop.record package talks about serializing these records in Binary, CSV or XML format. How do I actually do that? Say my reducer produces IntWritable keys and Customer values. What OutputFormat do I use to write the result in CSV format? What InputFormat would I use to read the resulting files in later, if I wanted to perform analysis over them?
© Stack Overflow or respective owner