Hadoop: Processing large serialized objects

Posted by restrictedinfinity on Stack Overflow
Published on 2010-06-10T06:28:22Z

I am working on an application that processes (and merges) several large Java serialized objects (on the order of GBs each) using the Hadoop framework. Hadoop distributes the blocks of a file across different hosts, but deserialization requires all of the blocks to be present on a single host, which will hurt performance drastically. How can I deal with this situation, where the different blocks of a file cannot be processed individually, unlike text files?
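
For example, if each serialized object sat in its own file, I could imagine a custom FileInputFormat that refuses to split the file, so that a single mapper receives the whole object. This is only a rough sketch; the class names WholeObjectInputFormat and WholeFileRecordReader are placeholders of mine, not existing Hadoop classes:

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.BytesWritable;
    import org.apache.hadoop.io.IOUtils;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.mapreduce.InputSplit;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.RecordReader;
    import org.apache.hadoop.mapreduce.TaskAttemptContext;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.FileSplit;

    // Placeholder input format: treats each file as a single, unsplittable record
    // so that one mapper sees the serialized object in one piece.
    public class WholeObjectInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            // A Java-serialized object can only be deserialized as a whole,
            // so never let Hadoop split the file between mappers.
            return false;
        }

        @Override
        public RecordReader<NullWritable, BytesWritable> createRecordReader(
                InputSplit split, TaskAttemptContext context) {
            return new WholeFileRecordReader();
        }

        // Reads the entire (unsplit) file into a single BytesWritable value.
        public static class WholeFileRecordReader extends RecordReader<NullWritable, BytesWritable> {
            private FileSplit fileSplit;
            private Configuration conf;
            private final BytesWritable value = new BytesWritable();
            private boolean processed = false;

            @Override
            public void initialize(InputSplit split, TaskAttemptContext context) {
                this.fileSplit = (FileSplit) split;
                this.conf = context.getConfiguration();
            }

            @Override
            public boolean nextKeyValue() throws IOException {
                if (processed) {
                    return false;
                }
                // Pull the whole file's bytes onto this node; the mapper can then
                // deserialize them with an ObjectInputStream.
                byte[] contents = new byte[(int) fileSplit.getLength()];
                Path file = fileSplit.getPath();
                FileSystem fs = file.getFileSystem(conf);
                FSDataInputStream in = null;
                try {
                    in = fs.open(file);
                    IOUtils.readFully(in, contents, 0, contents.length);
                    value.set(contents, 0, contents.length);
                } finally {
                    IOUtils.closeStream(in);
                }
                processed = true;
                return true;
            }

            @Override
            public NullWritable getCurrentKey() { return NullWritable.get(); }

            @Override
            public BytesWritable getCurrentValue() { return value; }

            @Override
            public float getProgress() { return processed ? 1.0f : 0.0f; }

            @Override
            public void close() { }
        }
    }

But this still drags every block of a multi-GB file onto one node before deserialization can start, and a byte[] or BytesWritable caps out well below multi-GB sizes anyway, so it sidesteps the splitting problem rather than the data-locality one.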

© Stack Overflow or respective owner
