Hadoop: Iterative MapReduce Performance
- by S.N
Is it correct to say that parallel computation with iterative MapReduce is justified only when the training data is too large for a non-parallel implementation of the same logic?
I am aware that there is overhead for starting MapReduce jobs.
This can be critical for overall execution time when a large number of iterations is required.
I can imagine that, in many cases, sequential computation is faster than parallel computation with iterative MapReduce as long as the data set fits in memory.
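To make the trade-off concrete, here is a rough cost model in Python. It is only a sketch under simplifying assumptions that are mine, not measurements: one MapReduce job is launched per iteration, the per-iteration work divides perfectly across workers, and shuffle and disk I/O are ignored. The function names and the numbers used below (30 s startup, 10 workers, 100 iterations) are hypothetical.

```python
def sequential_time(iterations, per_iter_compute_s):
    """Total time when every iteration runs on a single machine."""
    return iterations * per_iter_compute_s


def mapreduce_time(iterations, per_iter_compute_s, workers, job_startup_s):
    """Total time when each iteration is a separate MapReduce job.

    Assumes ideal speedup (compute / workers) plus a fixed startup
    cost paid once per iteration.
    """
    return iterations * (job_startup_s + per_iter_compute_s / workers)


def break_even_compute(workers, job_startup_s):
    """Per-iteration compute time at which both approaches take equally long.

    Solves c = s + c / w for c, which gives c = s * w / (w - 1).
    Requires workers > 1.
    """
    return job_startup_s * workers / (workers - 1)


if __name__ == "__main__":
    # Hypothetical numbers: 100 iterations, 10 workers, 30 s job startup.
    for compute_s in (5.0, 600.0):
        seq = sequential_time(100, compute_s)
        mr = mapreduce_time(100, compute_s, workers=10, job_startup_s=30.0)
        print(f"compute={compute_s}s  sequential={seq}s  mapreduce={mr}s")
    print("break-even per-iteration compute:", break_even_compute(10, 30.0))
```

Under these assumptions, the sequential version wins whenever one iteration takes less than roughly the job startup time (about 33 s here), regardless of how many iterations are run. The number of iterations only scales the size of the gap.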
Is the ability to handle data that does not fit in memory the only benefit of using iterative MapReduce?
If not, what other benefits are there?