Combining two sets of input on Hadoop
Posted by aeolist on Stack Overflow
Published on 2010-04-27T23:01:43Z
Indexed on 2010/04/27 23:03 UTC
I have a rather simple Hadoop question, which I'll try to present with an example.
Say you have a list of strings and a large file, and you want each mapper to process a piece of the file together with one of the strings, as in a grep-like program.
How are you supposed to do that? I am under the impression that the number of mappers is determined by the InputSplits produced. I could run successive jobs, one for each string, but that seems kind of... messy?
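To make the setup concrete, here is a rough plain-Python sketch of one alternative to running a job per string: every mapper receives the whole list of strings, scans its piece of the file once, and tags each match with the string that matched. This is only a model of the idea and does not use any Hadoop API. The names `make_splits`, `map_split` and `run_job` are made up for illustration, and the fixed `split_size` merely stands in for however the real input splits get computed. It also assumes the list of strings is small enough to hand to every mapper.

```python
from collections import defaultdict


def make_splits(lines, split_size):
    """Cut the input into fixed-size pieces, standing in for input splits."""
    return [lines[i:i + split_size] for i in range(0, len(lines), split_size)]


def map_split(split, patterns):
    """One 'mapper': scan its piece of the file once, testing every string."""
    for line in split:
        for pattern in patterns:
            if pattern in line:
                # Key the output by the search string so results can be grouped.
                yield pattern, line


def run_job(lines, patterns, split_size=2):
    """Run one mapper per split and group the matches by search string."""
    matches = defaultdict(list)
    for split in make_splits(lines, split_size):
        for pattern, line in map_split(split, patterns):
            matches[pattern].append(line)
    return dict(matches)


if __name__ == "__main__":
    sample = ["error: disk full", "all good", "warning: low memory", "error: timeout"]
    print(run_job(sample, ["error", "warning"]))
```

In this model the number of mappers still follows the number of splits, and the list of strings only changes how much work each mapper does per line.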
© Stack Overflow or respective owner