Combining two sets of input in Hadoop

Posted by aeolist on Stack Overflow
Published on 2010-04-27T23:01:43Z Indexed on 2010/04/27 23:03 UTC

Filed under: hadoop | multiple | inputs | map | mapper

I have a fairly simple Hadoop question, which I'll try to present with an example.

Say you have a list of strings and a large file, and you want each mapper to process one piece of the file and one of the strings, in a grep-like program.

How are you supposed to do that? I am under the impression that the number of mappers is determined by the InputSplits produced. I could run successive jobs, one for each string, but that seems kind of... messy?
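
To make the scenario concrete, here is a minimal sketch in plain Python that only simulates the map phase; it is not Hadoop code and not an answer from the original thread. It assumes one possible layout: instead of pairing each mapper with a single string, every mapper receives the whole list of strings and checks each line of its split against all of them, emitting the matching string as the key. The names `grep_mapper`, `run_job`, `splits`, and `patterns` are hypothetical. In a real job the list would have to be shipped to the mappers somehow, for example through the job configuration or the distributed cache.

```python
from collections import defaultdict


def grep_mapper(lines, patterns):
    """Map step for one input split: emit (pattern, line) for every match."""
    for line in lines:
        for pattern in patterns:
            if pattern in line:
                yield pattern, line


def run_job(splits, patterns):
    """Stand-in for the framework: one mapper call per split, then group by key."""
    grouped = defaultdict(list)
    for split in splits:
        for pattern, line in grep_mapper(split, patterns):
            grouped[pattern].append(line)
    return dict(grouped)


# Two hypothetical splits of the large file, and the list of strings to search for.
splits = [
    ["error: disk full", "ok"],
    ["warning: low memory", "error: timeout"],
]
patterns = ["error", "warning"]

result = run_job(splits, patterns)
for pattern, lines in sorted(result.items()):
    print(pattern, lines)
```

With this layout the number of mappers still follows the number of splits, and a single job covers every string, so there is no need to run one job per string.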

© Stack Overflow or respective owner
