Reverse and Forward DNS set up correctly but sometimes MapReduce job fails

Posted by phodamentals on Server Fault See other posts from Server Fault or by phodamentals
Published on 2013-09-19T16:56:28Z Indexed on 2013/10/24 21:57 UTC
Read the original article Hit count: 360

Filed under:

Ever since we switched over our cluster to communicate via private interfaces and created a DNS server with correct forward and reverse lookup zones, we get this message before the M/R job runs:

ERROR org.apache.hadoop.hbase.mapreduce.TableInputFormatBase - Cannot resolve the host name for /192.168.3.9 because of javax.naming.NameNotFoundException: DNS name not found [response code 3]; remaining name '9.3.168.192.in-addr.arpa'

A dig and nslookup both show that the reverse and forward look-ups both get good responses with no errors from within the cluster.

Shortly after these messages, the job runs...but every once in awhile we get a NPE:

Exception in thread "main" java.lang.NullPointerException INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.net.DNS.reverseDns(DNS.java:93) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.reverseDNS(TableInputFormatBase.java:219) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.getSplits(TableInputFormatBase.java:184) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:1063) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:1080) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapred.JobClient.access$600(JobClient.java:174) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:992) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:945) INFO app.insights.search.SearchIndexUpdater - at java.security.AccessController.doPrivileged(Native Method) INFO app.insights.search.SearchIndexUpdater - at javax.security.auth.Subject.doAs(Subject.java:415) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:945) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapreduce.Job.submit(Job.java:566) INFO app.insights.search.SearchIndexUpdater - at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:596) INFO app.insights.search.SearchIndexUpdater - at app.insights.search.correlator.comments.CommentCorrelator.main(CommentCorrelator.java:72

Does anyone else who has set-up a CDH Hadoop cluster on a private network w/DNS server get this?

CDH 4.3.1 with MR1 2.0.0 and HBase 0.94.6

© Server Fault or respective owner

Related posts about hadoop