Hadoop: Iterative MapReduce Performance
Posted by S.N on Stack Overflow
Published on 2010-04-18T13:02:54Z
Indexed on
2010/04/18
13:13 UTC
Is it correct to say that parallel computation with iterative MapReduce is justified only when the training data set is too large to process with a non-parallel implementation of the same logic?
I am aware that there is overhead for starting MapReduce jobs. Because each iteration is typically launched as a separate job, this overhead is paid once per iteration and can dominate the overall execution time when a large number of iterations is required.
I can imagine that, in many cases, sequential computation is faster than parallel computation with iterative MapReduce as long as the data set fits in memory.
Is handling data that is too large for a single machine the only benefit of iterative MapReduce? If not, what other benefits are there?
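To make the trade-off I have in mind concrete, here is a rough back-of-the-envelope cost model in Python. It is only a sketch under simplifying assumptions of my own: every iteration is one MapReduce job, the per-iteration work parallelizes perfectly across workers, and shuffle and disk I/O costs are ignored. The function names and the numbers (30 s job start-up, 10 workers) are hypothetical and not measured on a real cluster.

```python
def sequential_time(iterations, work_per_iter_s):
    """Total time when each iteration runs in memory on a single machine."""
    return iterations * work_per_iter_s


def mapreduce_time(iterations, work_per_iter_s, workers, job_overhead_s):
    """Total time when each iteration is a separate MapReduce job.

    Assumes ideal linear speedup (work / workers) and a fixed start-up
    cost per job; real jobs also pay for shuffle and disk I/O.
    """
    return iterations * (job_overhead_s + work_per_iter_s / workers)


def break_even_work(workers, job_overhead_s):
    """Per-iteration work (seconds) at which both approaches take equal time.

    Solves w = overhead + w / workers for w.
    """
    if workers < 2:
        raise ValueError("need at least 2 workers for any speedup")
    return job_overhead_s * workers / (workers - 1)


if __name__ == "__main__":
    # Hypothetical numbers: 100 iterations, 30 s start-up per job, 10 workers.
    for work in (5.0, 600.0):
        seq = sequential_time(100, work)
        mr = mapreduce_time(100, work, workers=10, job_overhead_s=30.0)
        print(f"work/iter={work:>6.1f}s  sequential={seq:>8.1f}s  mapreduce={mr:>8.1f}s")
    print("break-even work per iteration:", break_even_work(10, 30.0), "s")
```

Under these assumptions the number of iterations cancels out of the comparison: what matters is whether the work in a single iteration is large relative to the per-job start-up cost.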
© Stack Overflow or respective owner