Posting Nutch data into a BASIC auth secured Solr instance
- by mlathe
Hi. I've secured a Solr instance using BASIC auth, roughly as shown here:
http://blog.comtaste.com/2009/02/securing_your_solr_server_on_t.html
Now I'm trying to update my batch processes to push data into the authenticated instance. The ones using "curl" are easy, but I also have a Nutch crawl that uses the "solrindex" command to push data into Solr. When I run that, I get this error:
2010-02-22 12:09:28,226 INFO auth.AuthChallengeProcessor - basic authentication scheme selected
2010-02-22 12:09:28,229 INFO httpclient.HttpMethodDirector - No credentials available for BASIC 'Tomcat Manager Application'@ninja:5500
2010-02-22 12:09:28,236 WARN mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Unauthorized
Unauthorized
request: http://ninja:5500/solr/foo/update?wt=javabin&version=2.2
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:343)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
    at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
    at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:69)
    at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:170)
2010-02-22 12:09:29,134 FATAL solr.SolrIndexer - SolrIndexer: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232)
    at org.apache.nutch.indexer.solr.SolrIndexer.indexSolr(SolrIndexer.java:73)
    at org.apache.nutch.indexer.solr.SolrIndexer.run(SolrIndexer.java:95)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.indexer.solr.SolrIndexer.main(SolrIndexer.java:104)
Apparently Nutch uses SolrJ to push the content, and after going through the SolrJ code, it looks like it uses commons-httpclient without providing a way to set the credentials.
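For context, what the "No credentials available for BASIC" message is asking for is just an Authorization header on the update request. Here is a minimal JDK-only sketch of that header, not SolrJ or commons-httpclient code. The class name, the helper names, and the "user"/"pass" values are placeholders made up for illustration, and it assumes Java 8+ for java.util.Base64.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper class, for illustration only.
public class BasicAuthSketch {

    // Builds the Authorization header value for HTTP BASIC auth:
    // "Basic " followed by base64("user:password").
    static String basicAuthHeader(String user, String password) {
        String pair = user + ":" + password;
        return "Basic " + Base64.getEncoder()
                .encodeToString(pair.getBytes(StandardCharsets.UTF_8));
    }

    // Prepares (but does not send) a request that carries the header up front,
    // so the server never has to issue a 401 challenge first.
    static HttpURLConnection openWithBasicAuth(String url, String user, String password)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestProperty("Authorization", basicAuthHeader(user, password));
        return conn;
    }

    public static void main(String[] args) {
        // Placeholder credentials; prints the header value only, no network call.
        System.out.println(basicAuthHeader("user", "pass"));
    }
}
```

This is roughly what curl does when given a user and password. The open question is how to get the HTTP client that Nutch creates through SolrJ to attach the same header.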
Here are my questions:
1. Is this possible to do, i.e. push from Nutch into a BASIC auth secured Solr instance?
2. Is it possible to tell commons-httpclient about a credential without explicitly calling _httpclient.getState().setCredentials(...)?
3. Any other ideas? One idea I had was to use an IP-filtering Valve for just the "update" Solr web services. That would mean update calls could only be made from certain nodes.
Thanks