Hadoop/MapReduce: Reading and writing classes generated from DDL
- by Dave
Hi,
Can someone walk me through the basic workflow of reading and writing data with classes generated from DDL?
I have defined some struct-like records using DDL. For example:
    class Customer {
        ustring FirstName;
        ustring LastName;
        ustring CardNo;
        long LastPurchase;
    }
I've compiled this to get a Customer class and included it in my project. I can easily see how to use it as input and output for mappers and reducers (the generated class implements Writable), but not how to read it from or write it to a file.
The JavaDoc for the org.apache.hadoop.record package talks about serializing these records in Binary, CSV or XML format. How do I actually do that? Say my reducer produces IntWritable keys and Customer values. What OutputFormat do I use to write the result in CSV format? And what InputFormat would I use to read the resulting files back in later, if I wanted to run further analysis over them?
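To make the question concrete, here is roughly the round trip I'm after, sketched with plain java.io rather than the Hadoop classes. CustomerSketch is a hypothetical hand-written stand-in for the generated Customer class: it only mirrors the write/readFields shape of Writable, and the CSV layout (comma-separated, no quoting or escaping) is my own assumption, not necessarily what org.apache.hadoop.record produces.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical stand-in for the generated Customer class.
// This is NOT the code the record compiler emits.
class CustomerSketch {
    String firstName = "";
    String lastName = "";
    String cardNo = "";
    long lastPurchase;

    CustomerSketch() {}

    CustomerSketch(String firstName, String lastName, String cardNo, long lastPurchase) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.cardNo = cardNo;
        this.lastPurchase = lastPurchase;
    }

    // Binary form: fields are written in declaration order.
    void write(DataOutput out) throws IOException {
        out.writeUTF(firstName);
        out.writeUTF(lastName);
        out.writeUTF(cardNo);
        out.writeLong(lastPurchase);
    }

    // Reads the fields back in the same order write() produced them.
    void readFields(DataInput in) throws IOException {
        firstName = in.readUTF();
        lastName = in.readUTF();
        cardNo = in.readUTF();
        lastPurchase = in.readLong();
    }

    // Naive CSV form, one record per line.
    // Assumes no field contains a comma or a newline.
    String toCsv() {
        return firstName + "," + lastName + "," + cardNo + "," + lastPurchase;
    }

    // Parses a line produced by toCsv(); -1 keeps trailing empty fields.
    static CustomerSketch fromCsv(String line) {
        String[] f = line.split(",", -1);
        return new CustomerSketch(f[0], f[1], f[2], Long.parseLong(f[3]));
    }

    public static void main(String[] args) throws IOException {
        CustomerSketch original = new CustomerSketch("Ada", "Lovelace", "4111-0000", 1234567890L);

        // Binary round trip through an in-memory buffer.
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        original.write(new DataOutputStream(buf));
        CustomerSketch fromBinary = new CustomerSketch();
        fromBinary.readFields(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(fromBinary.toCsv());

        // CSV round trip through a single text line.
        CustomerSketch fromText = CustomerSketch.fromCsv(original.toCsv());
        System.out.println(fromText.toCsv());
    }
}
```

What I can't work out is which OutputFormat and InputFormat would do the equivalent of toCsv() and fromCsv() for the real generated class, so that the files written by one job can be fed straight into the next.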