Hadoop/MapReduce: Reading and writing classes generated from DDL

Posted by Dave on Stack Overflow See other posts from Stack Overflow or by Dave
Published on 2010-05-16T21:48:47Z Indexed on 2010/05/16 21:50 UTC
Read the original article Hit count: 311


Hi,

Can someone walk me through the basic work-flow of reading and writing data with classes generated from DDL?

I have defined some struct-like records using DDL. For example:

  class Customer {
     ustring FirstName;
     ustring LastName;
     ustring CardNo;
     long LastPurchase;
  }

I've compiled this to get a Customer class and included it in my project. I can easily see how to use this as input and output for mappers and reducers (the generated class implements Writable), but not how to read it from and write it to a file.

The JavaDoc for the org.apache.hadoop.record package talks about serializing these records in Binary, CSV or XML format. How do I actually do that? Say my reducer produces IntWritable keys and Customer values. What OutputFormat do I use to write the result in CSV format? And what InputFormat would I use to read the resulting files back in later, if I wanted to run further analysis over them?
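For concreteness, here is my untested guess at how direct stream serialization would look, pieced together from that JavaDoc — I'm assuming rcc generated the usual getXxx/setXxx accessors on Customer, and that the generated class inherits serialize(RecordOutput) and deserialize(RecordInput) from org.apache.hadoop.record.Record:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.hadoop.record.CsvRecordInput;
import org.apache.hadoop.record.CsvRecordOutput;

public class CustomerCsvDemo {
    public static void main(String[] args) throws Exception {
        // Populate the generated record (accessor names assumed
        // from the DDL field names -- adjust to what rcc emitted).
        Customer c = new Customer();
        c.setFirstName("Alice");
        c.setLastName("Smith");
        c.setCardNo("4111-1111-1111-1111");
        c.setLastPurchase(1274045327L);

        // Serialize to CSV text via the RecordOutput implementation
        // for that format.
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        c.serialize(new CsvRecordOutput(sink));

        // Read it back with the matching RecordInput implementation.
        Customer copy = new Customer();
        copy.deserialize(new CsvRecordInput(
                new ByteArrayInputStream(sink.toByteArray())));

        System.out.println(copy.getFirstName() + " " + copy.getLastPurchase());
    }
}
```

The JavaDoc suggests BinaryRecordOutput/BinaryRecordInput and XmlRecordOutput/XmlRecordInput are the analogous pairs for the other two formats. But even if this is right for a raw stream, it still doesn't tell me which OutputFormat/InputFormat plugs this into a MapReduce job.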

© Stack Overflow or respective owner
