Implications of Multiple JobTracker nodes in a Hadoop cluster?
Posted by Jim Dennis on Server Fault
Published on 2012-08-28T18:33:08Z
Indexed on 2012/08/30 21:40 UTC
I get the impression that one can, potentially, have multiple JobTracker nodes configured to share the same set of MR (TaskTracker) nodes. I know that, conventionally, all the nodes in a Hadoop cluster should have the same set of configuration files (under /etc/hadoop/conf/, at least for the Cloudera Distribution of Hadoop (CDH)). Can we define multiple JobTrackers in mapred-site.xml? Something like:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt01.mydomain.not:8021</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt02.mydomain.not:8021</value>
  </property>
  ...
</configuration>
Or is there some other allowed syntax for this?
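To make the concern with the duplicated property concrete, here is a minimal Python sketch. It assumes (and this is only an assumption, not something confirmed for Hadoop here) that the file is read into a flat name-to-value map in which a later definition overwrites an earlier one. The `load_properties` helper is hypothetical and the host names are the placeholders from the example above.

```python
import xml.etree.ElementTree as ET

# Fragment mirroring the example above; host names are placeholders.
MAPRED_SITE = """
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt01.mydomain.not:8021</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>jt02.mydomain.not:8021</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Hypothetical loader: returns (flat, duplicates).

    flat       -- name -> value map; a later entry overwrites an earlier one
                  (ASSUMPTION about how such a file might be loaded)
    duplicates -- name -> list of all values, for names defined more than once
    """
    flat = {}
    seen = {}
    root = ET.fromstring(xml_text.strip())
    for prop in root.findall("property"):
        name = prop.findtext("name").strip()
        value = prop.findtext("value").strip()
        seen.setdefault(name, []).append(value)
        flat[name] = value  # assumed "last definition wins" behaviour
    duplicates = {n: vals for n, vals in seen.items() if len(vals) > 1}
    return flat, duplicates


flat, duplicates = load_properties(MAPRED_SITE)
print("effective value:", flat["mapred.job.tracker"])
print("defined more than once:", duplicates)
```

If the real loader behaves like this flat map, only one of the two addresses would survive, so repeating the property would not by itself give the TaskTrackers two JobTrackers to report to. Whether that is what actually happens is part of what I am asking.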
What are the implications of doing this? Does each JobTracker get information about the load on each TaskTracker node? In other words, can the two JobTrackers coordinate their scheduling across the TT nodes based only on the gossip information from the TTs, or would they need to talk to one another?
Is this documented anywhere?