Read from one large file and write to many (tens, hundreds, or thousands) files in Java?
- by Rudiger
I have a large-ish file (4-5 GB compressed) of small messages that I wish to parse into approximately 6,000 files by message type. Messages are small; anywhere from 5 to 50 bytes depending on the type.
Each message starts with a fixed-size type field (a 6-byte key). If I read a message of type '000001', I want to write append its payload to 000001.dat, etc. The input file contains a mixture of messages; I want N homogeneous output files, where each output file contains only the messages of a given type.
What's an efficient a fast way of writing these messages to so many individual files? I'd like to use as much memory and processing power to get it done as fast as possible. I can write compressed or uncompressed files to the disk.
I'm thinking of using a hashmap with a message type key and an outputstream value, but I'm sure there's a better way to do it.
Thanks!