restrictedinfinity - Developer IT

Search Results



Hadoop: Processing large serialized objects

- by restrictedinfinity

I am developing an application that processes (and merges) several large Java serialized objects (on the order of gigabytes each) using the Hadoop framework. Hadoop distributes the blocks of a file across different hosts. But since deserialization requires all the blocks to be present on a single host, this is going to hurt performance drastically. How can I deal with this situation, where the blocks cannot be processed individually, unlike text files?
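To make the constraint concrete, here is a minimal plain-Java sketch (no Hadoop involved) of why a block taken from the middle of a serialized object cannot be deserialized on its own. The class name `MidStreamDemo` and the sample payload are made up for illustration; the point is only that `ObjectInputStream` expects the stream header at the very start of its input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not part of any Hadoop API.
final class MidStreamDemo {

    // Serialize a list of 'count' integers into a single Java serialization stream.
    static byte[] serialize(int count) {
        List<Integer> payload = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            payload.add(i);
        }
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns true if the bytes from 'offset' onward can be deserialized by themselves.
    static boolean readableFrom(byte[] stream, int offset) {
        byte[] slice = Arrays.copyOfRange(stream, offset, stream.length);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(slice))) {
            in.readObject();
            return true;
        } catch (IOException | ClassNotFoundException e) {
            // A slice that does not begin with the stream header is rejected here.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] stream = serialize(1000);
        System.out.println("from start:  " + readableFrom(stream, 0));
        System.out.println("from middle: " + readableFrom(stream, stream.length / 2));
    }
}
```

Reading from offset 0 succeeds, while reading from the midpoint fails, which is the same position a mapper would be in if it were handed only an interior block of the file.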


Hadoop: Mapping binary files

- by restrictedinfinity

Typically, the input file can be partially read and processed by the Mapper function (as with text files). Is there anything that can be done to handle binaries (say images or serialized objects), which require all the blocks to be on the same host before processing can start?
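As a rough illustration of the difference being asked about, the sketch below models how a file is divided among mappers when it is splittable versus when it is not. It is a simplified model, not Hadoop's actual split computation: the class `SplitSketch`, the method `computeSplits`, and the 2 GB / 64 MB figures are all assumptions chosen for the example. In Hadoop itself the corresponding hook is the `isSplitable` method of `FileInputFormat`; an input format that returns false there hands each file to one mapper as a single split, and blocks that are not local to that mapper's host are read over the network.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of split planning; it ignores Hadoop's real split-size rules.
final class SplitSketch {

    // Returns {offset, length} pairs, one per mapper input.
    static List<long[]> computeSplits(long fileSize, long splitSize, boolean splittable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> splits = new ArrayList<>();
        if (!splittable) {
            // The whole file goes to one mapper, regardless of how many blocks it spans.
            splits.add(new long[] {0L, fileSize});
            return splits;
        }
        // One split per block-sized range; the last one may be shorter.
        for (long offset = 0; offset < fileSize; offset += splitSize) {
            splits.add(new long[] {offset, Math.min(splitSize, fileSize - offset)});
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileSize = 2L * 1024 * 1024 * 1024; // assumed 2 GB input file
        long blockSize = 64L * 1024 * 1024;      // assumed 64 MB block size
        System.out.println("splittable:     " + computeSplits(fileSize, blockSize, true).size());
        System.out.println("not splittable: " + computeSplits(fileSize, blockSize, false).size());
    }
}
```

With these assumed sizes the splittable case yields 32 mapper inputs and the non-splittable case yields 1, which is the trade-off in the question: correctness for binary formats at the cost of parallelism and data locality within a single file.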