Search Results

Search found 631 results on 26 pages for 'couchdb lucene'.

Page 13/26 | < Previous Page | 9 10 11 12 13 14 15 16 17 18 19 20  | Next Page >

  • How can I search on a list of values using Solr/Lucene?

    - by Mike
    Given the following query: (field:value1 OR field:value2 OR field:value3 OR ... OR field:value50) Can this be broken down into something less verbose? Basically I have hundreds of category IDs, and I need to search for items under large groups of category IDs (20-50 at a time). In MySQL, I'd just use field IN(value1, value2, value3) rather than (field = value1 OR field = value2 etc...). Is there a simpler way for Solr/Lucene?
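One way to shorten the query in the question is Lucene's field grouping syntax, `field:(a OR b OR c)`, which applies one field name to a whole parenthesised group. The sketch below only builds the query strings. The helper names and the `category_id` field are hypothetical, and the `{!terms}` form assumes a Solr release recent enough to ship the terms query parser, so check your version before relying on it.

```python
def grouped_field_query(field, values):
    """Build a Lucene query string of the form field:("v1" OR "v2" ...)."""
    if not values:
        raise ValueError("at least one value is required")
    quoted = []
    for value in values:
        # Escape backslashes and double quotes so each value stays one phrase.
        text = str(value).replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"{}"'.format(text))
    return "{}:({})".format(field, " OR ".join(quoted))


def terms_filter(field, values):
    """Build a Solr terms-parser filter, e.g. {!terms f=field}v1,v2,v3."""
    return "{{!terms f={}}}{}".format(field, ",".join(str(v) for v in values))


if __name__ == "__main__":
    category_ids = [101, 102, 103]  # hypothetical category IDs
    print(grouped_field_query("category_id", category_ids))
    print(terms_filter("category_id", category_ids))
```

The grouped form is still a boolean OR internally, so it mainly saves typing; the terms-parser form is usually passed as a filter query (`fq`) rather than as the main query.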


  • Lucene Search for documents that have a particular field?

    - by RP
    Lucene.Net: Is there a way to query for documents that contain a particular field? Let's say some of my documents have a field 'foo' and some do not. I want to find all documents that have the field 'foo', regardless of what the value of foo is. How do I do this? Is it some sort of TermQuery?


  • OutOfMemoryError: Java heap space error when starting Solr

    - by Hamid
    Hi, I started indexing database articles with Solr, but after adding about 58 million articles (about 113 GB on disk) I get the error below in the Tomcat log.
    Note 1: I have already set the initial memory pool to 256 MB and the maximum memory pool to 1400 MB for the Tomcat server.
    Note 2: I can still post or search articles, but I must wait over 3 minutes for a response.

    8-apr-2010 14:27:07 org.apache.solr.common.SolrException log
    SEVERE: java.lang.OutOfMemoryError: Java heap space
        at org.apache.lucene.util.PriorityQueue.initialize(PriorityQueue.java:89)
        at org.apache.lucene.search.HitQueue.<init>(HitQueue.java:67)
        at org.apache.lucene.search.TopScoreDocCollector.<init>(TopScoreDocCollector.java:113)
        at org.apache.lucene.search.TopScoreDocCollector.<init>(TopScoreDocCollector.java:37)
        at org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.<init>(TopScoreDocCollector.java:42)
        at org.apache.lucene.search.TopScoreDocCollector$InOrderTopScoreDocCollector.<init>(TopScoreDocCollector.java:40)
        at org.apache.lucene.search.TopScoreDocCollector.create(TopScoreDocCollector.java:100)
        at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:979)
        at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
        at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
        at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182)
        at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859)
        at org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:574)
        at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1527)
        at java.lang.Thread.run(Unknown Source)

    What is the problem? Do you have any suggestions? Thanks in advance.


  • Strange Map Reduce Behavior in CouchDB. Rereduce?

    - by Tony
    I have a map/reduce issue with CouchDB (both functions are shown below). When I run it with group_level=2 (exact) I get accurate output:

        {"rows":[
        {"key":["2011-01-11","staff-1"],"value":{"total":895.72,"count":2,"services":6,"services_ignored":6,"services_liked":0,"services_disliked":0,"services_disliked_avg":0,"Revise":{"total":275.72,"count":1},"Review":{"total":620,"count":1}}},
        {"key":["2011-01-11","staff-2"],"value":{"total":8461.689999999999,"count":2,"services":41,"services_ignored":37,"services_liked":4,"services_disliked":0,"services_disliked_avg":0,"Revise":{"total":4432.4,"count":1},"Review":{"total":4029.29,"count":1}}},
        {"key":["2011-01-11","staff-3"],"value":{"total":2100.72,"count":1,"services":10,"services_ignored":4,"services_liked":3,"services_disliked":3,"services_disliked_avg":2.3333333333333335,"Revise":{"total":2100.72,"count":1}}},

    However, changing to group_level=1, so that the values for all the different staff keys are grouped by date, no longer gives accurate output (notice that the total is correct but all the others are wrong):

        {"rows":[
        {"key":["2011-01-11"],"value":{"total":11458.130000000001,"count":2,"services":0,"services_ignored":0,"services_liked":0,"services_disliked":0,"services_disliked_avg":0,"None":{"total":11458.130000000001,"count":2}}},

    My only theory is that this has something to do with rereduce, which I have not yet learned. Should I explore that option, or am I missing something else here?

    This is the Map function:

        function(doc) {
          if(doc.doc_type == 'Feedback') {
            emit([doc.date.split('T')[0], doc.staff_id], doc);
          }
        }

    And this is the Reduce:

        function(keys, vals) {
          // sum all key points by status: total, count, services (liked, rejected, ignored)
          var ret = {
            'total':0,
            'count':0,
            'services': 0,
            'services_ignored': 0,
            'services_liked': 0,
            'services_disliked': 0,
            'services_disliked_avg': 0,
          };
          var total_disliked_score = 0;

          // handle status
          function handle_status(doc) {
            if(!doc.status || doc.status == '' || doc.status == undefined) {
              status = 'None';
            } else if (doc.status == 'Declined') {
              status = 'Rejected';
            } else {
              status = doc.status;
            }
            if(!ret[status]) ret[status] = {'total':0, 'count':0};
            ret[status]['total'] += doc.total;
            ret[status]['count'] += 1;
          };

          // handle likes / dislikes
          function handle_services(services) {
            ret.services += services.length;
            for(var a in services) {
              if (services[a].user_likes == 10) {
                ret.services_liked += 1;
              } else if (services[a].user_likes >= 1) {
                ret.services_disliked += 1;
                total_disliked_score += services[a].user_likes;
                if (total_disliked_score >= ret.services_disliked) {
                  ret.services_disliked_avg = total_disliked_score / ret.services_disliked;
                }
              } else {
                ret.services_ignored += 1;
              }
            }
          }

          // loop thru docs
          for(var i in vals) {
            // increment the total $
            ret.total += vals[i].total;
            ret.count += 1;
            // update totals and sums for the status of this route
            handle_status(vals[i]);
            // do the likes / dislikes stats
            if(vals[i].groups) {
              for(var ii in vals[i].groups) {
                if(vals[i].groups[ii].services) {
                  handle_services(vals[i].groups[ii].services);
                }
              }
            }
            // handle deleted services
            if(vals[i].hidden_services) {
              if (vals[i].hidden_services) {
                handle_services(vals[i].hidden_services);
              }
            }
          }
          return ret;
        }
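CouchDB passes a third argument, `rereduce`, to every reduce function. When it is true, the values are outputs of earlier reduce calls, not the documents emitted by the map. The reduce function above ignores that argument and treats every value as a document, which would plausibly explain why only `total` survives at the coarser grouping: intermediate results have a `total` key but no `status` or `groups`. The sketch below models the two branches in Python with a reduced set of counters; the function name and sample documents are hypothetical, and a real fix would be written in the view's own JavaScript.

```python
def reduce_feedback(keys, values, rereduce=False):
    """Toy model of a CouchDB reduce that handles both reduce and rereduce."""
    out = {"total": 0.0, "count": 0, "services": 0}
    for value in values:
        if rereduce:
            # value is an earlier reduce result: merge its counters as-is.
            out["total"] += value["total"]
            out["count"] += value["count"]
            out["services"] += value["services"]
        else:
            # value is a mapped document: derive the counters from its fields.
            out["total"] += value["total"]
            out["count"] += 1
            for group in value.get("groups", []):
                out["services"] += len(group.get("services", []))
    return out


if __name__ == "__main__":
    docs = [
        {"total": 100, "groups": [{"services": [1, 2]}]},
        {"total": 250, "groups": [{"services": [3]}]},
        {"total": 75, "groups": []},
    ]
    partials = [reduce_feedback(None, docs[:2]), reduce_feedback(None, docs[2:])]
    # Reducing everything at once and merging partial results should agree.
    print(reduce_feedback(None, docs))
    print(reduce_feedback(None, partials, rereduce=True))
```

The property to aim for is that merging partial results gives the same answer as reducing all documents in one call, because the database decides how the input is split.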


  • CouchDB basics for PHP developers

    IBM developerWorks: "If you're a typical PHP developer, it doesn't take a thorough review of past projects to pick out a telling pattern: In most (if not all) cases, you're probably getting PHP to talk to a database back end for all that dynamic data goodness; in 99 percent of those instances, you're using MySQL."


  • Lucene.NET and searching on multiple fields with specific values...

    - by Kieron
    Hi, I've created an index with various bits of data for each document I've added; each document can differ in its field names. Later on, when I come to search the index, I need to query it with exact field/value pairs, for example: FieldName1 = X AND FieldName2 = Y AND FieldName3 = Z. What's the best way of constructing this using Lucene.NET? What analyser is best to use for this exact-match type? Upon retrieving a match, I only need one specific field to be returned (which I add to each document); should this be the only one stored? Later on I'll need to support keyword searching (so a field can have a list of values and I'll need to do a partial match). The fields and values come from a Dictionary<string, string>. It's not user input; it's constructed from code. Thanks, Kieron


  • How do you boost term relevance in Sql Server Full Text Search like you can in Lucene?

    - by Snives
    I'm doing a typical full-text search using CONTAINSTABLE with 'ISABOUT(term1,term2,term3)', and although it supports term weighting, that's not what I need. I need the ability to boost the relevancy of terms contained in certain portions of text. For example, it is customary for meta tags or the page title to be weighted differently from body text when searching web pages. Although I'm not dealing with web pages, I do seek the same functionality. In Lucene it's called Document Field Level Boosting. How would one natively do this in SQL Server Full Text Search?


  • Lucene: Fastest way to return the document occurrence of a phrase?

    - by dont say the kid's name
    Hi guys, I am trying to use Lucene (actually PyLucene!) to find out how many documents contain my exact phrase. My code currently looks like this, but it runs rather slowly. Does anyone know a faster way to return document counts?

        phraseList = ["some phrase 1", "some phrase 2"]  # etc, a list of phrases...
        countsearcher = IndexSearcher(SimpleFSDirectory(File(STORE_DIR)), True)
        analyzer = StandardAnalyzer(Version.LUCENE_CURRENT)
        for phrase in phraseList:
            query = QueryParser(Version.LUCENE_CURRENT, "contents", analyzer).parse("\"" + phrase + "\"")
            scoreDocs = countsearcher.search(query, 200).scoreDocs
            print "count is: " + str(len(scoreDocs))


  • How to keep Lucene index synchronized with Mysql database?

    - by ?????
    I am trying to use Lucene to build full-text search into my application, which needs to build its index from my MySQL database. What I am wondering is how to keep this index synchronized with the database. I came up with two ways: 1) add extra code to the business logic to update the search index, which couples the two tightly; 2) run a separate task to rebuild the index periodically. Do you have any other approaches, and which do you think is the best? Any comments would be appreciated. Thanks in advance!


  • Caching Authentication Data

    - by PartlyCloudy
    Hi, I'm currently implementing a REST web service using CouchDB and RESTlet. The RESTlet layer is mainly for authentication and some minor filtering of the JSON data served by CouchDB: Clients <= HTTP = [ RESTlet <= HTTP = CouchDB ]. I'm also using CouchDB to store user login data, because I don't want to add an additional database server for that purpose. Thus, each request to my service causes two CouchDB requests made by RESTlet (auth data + the "real" request). To keep the service as efficient as possible, I want to reduce the number of requests, in this case the redundant requests for login data. My idea now is to provide a cache (i.e. an LRU cache via LinkedHashMap) within my RESTlet application that caches login data, because HTTP caching will probably not be enough. But how do I invalidate the cached data once a user changes the password, for instance? Thanks to REST, the application might run on several servers in parallel, and I don't want to create a central instance just to cache login data. Currently, I save requested auth data in the cache and try to authenticate new requests against it. If an authentication fails or there is no entry available, I dispatch a GET request to my CouchDB storage to obtain the actual auth data. So in the worst case, users who have changed their data will perhaps still be able to log in with their old credentials. How can I deal with that? Or what is a good strategy to keep the cache(s) up to date in general? Thanks in advance.
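One common way to bound the staleness described in the question, without a central cache, is to give each cached login entry a time-to-live so that an old password stops working after at most that interval. The sketch below is a minimal in-process LRU cache with a TTL, written in Python rather than the Java `LinkedHashMap` the question mentions. The class name, the 60-second default, and the injectable `clock` parameter are all assumptions made for illustration, not part of RESTlet or CouchDB.

```python
import time
from collections import OrderedDict


class AuthCache:
    """Small LRU cache whose entries expire after ttl_seconds."""

    def __init__(self, max_entries=1000, ttl_seconds=60.0, clock=time.monotonic):
        self._entries = OrderedDict()  # user -> (stored_at, auth_data)
        self._max_entries = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, user):
        item = self._entries.get(user)
        if item is None:
            return None
        stored_at, auth_data = item
        if self._clock() - stored_at > self._ttl:
            # Too old: drop it so the caller re-fetches from the database.
            del self._entries[user]
            return None
        self._entries.move_to_end(user)  # mark as most recently used
        return auth_data

    def put(self, user, auth_data):
        self._entries[user] = (self._clock(), auth_data)
        self._entries.move_to_end(user)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used

    def invalidate(self, user):
        """Drop an entry, e.g. when this server handles a password change."""
        self._entries.pop(user, None)
```

A caller would treat a `None` result as a cache miss and fetch the login document from the database, as the question already does on failure. The TTL trades a bounded window of stale credentials for fewer database requests.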


  • Zend_Search_Lucene vs SOLR

    - by spacemonkey
    Hi, I recently stumbled upon the Zend port of the Lucene project. I have a little experience with Solr, so I would like to know the difference between the two, especially in terms of performance and installation. As far as I know, Solr requires a servlet container such as Tomcat running on the web host in order to work; what about the Zend Lucene library? I am also a bit confused about what "being implemented on top of Lucene" means.


  • How to structure an index for type ahead for extremely large dataset using Lucene or similar?

    - by Pete
    I have a dataset of 200 million+ records and am looking to build a dedicated backend to power a type-ahead solution. Lucene is of interest given its popularity and license type, but I'm open to other open source suggestions as well. I am looking for advice, tales from the trenches, or even better, direct instruction on what I will need in terms of hardware and software structure.

    Requirements:

    Must have:
    - The ability to do starts-with substring matching (I type in 'st' and it should match 'Stephen').
    - The ability to return results very quickly; I'd say 500 ms is an upper bound.

    Nice to have:
    - The ability to feed relevance information into the indexing process, so that, for example, more popular terms are returned ahead of others and not just alphabetically, aka Google style.
    - In-word substring matching, so that, for example, 'st' would match 'bestseller'.

    Note: This index will be used purely for type-ahead and does not need to serve standard search queries. I am not worried about getting advice on how to set up the front end or AJAX, as long as the index can be queried as a service or directly via Java code. Upvotes for any useful information that gets me closer to an enterprise-level type-ahead solution.
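To make the two main requirements concrete (starts-with matching, with more popular terms first), here is a toy in-memory sketch that keeps terms in a sorted list and finds the prefix range by binary search. It is not a Lucene index and is not sized for 200 million records; the function names and the sample popularity scores are hypothetical.

```python
import bisect


def build_index(terms_with_popularity):
    """Return a list of (lowercased term, original term, popularity), sorted by key."""
    return sorted((term.lower(), term, score) for term, score in terms_with_popularity)


def suggest(index, prefix, limit=5):
    """Return up to `limit` terms starting with `prefix`, most popular first."""
    key = prefix.lower()
    # First position whose lowercased term is >= the prefix.
    pos = bisect.bisect_left(index, (key,))
    matches = []
    while pos < len(index) and index[pos][0].startswith(key):
        matches.append(index[pos])
        pos += 1
    matches.sort(key=lambda entry: entry[2], reverse=True)
    return [original for _, original, _ in matches[:limit]]


if __name__ == "__main__":
    index = build_index([("Stephen", 50), ("Steve", 80), ("Stan", 10), ("bestseller", 99)])
    print(suggest(index, "st"))
```

This version scans every match for a prefix before ranking, so very short prefixes get expensive on large data; a production system would typically precompute the top suggestions per prefix instead. In-word matching such as 'st' in 'bestseller' is deliberately not covered here.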


  • How to search Multiple Sites using Lucene Search engine API?

    - by Wael Salman
    I hope that someone can help me as soon as possible :-) I would like to know how to search multiple sites using Lucene (all sites are in one index). I have succeeded in searching one website and in indexing multiple sites, but I am not able to search all websites. Consider this method that I have:

        private void PerformSearch()
        {
            DateTime start = DateTime.Now;

            //Create the Searcher object
            string strIndexDir = Server.MapPath("index") + @"\" + mstrURL;
            IndexSearcher objSearcher = new IndexSearcher(strIndexDir);

            //Parse the query, "text" is the default field to search
            Query objQuery = QueryParser.Parse(mstrQuery, "text", new StandardAnalyzer());

            //Create the result DataTable
            mobjDTResults.Columns.Add("title", typeof(string));
            mobjDTResults.Columns.Add("path", typeof(string));
            mobjDTResults.Columns.Add("score", typeof(string));
            mobjDTResults.Columns.Add("sample", typeof(string));
            mobjDTResults.Columns.Add("explain", typeof(string));

            //Perform search and get hit count
            Hits objHits = objSearcher.Search(objQuery);
            mintTotal = objHits.Length();

            //Create Highlighter
            QueryHighlightExtractor highlighter = new QueryHighlightExtractor(objQuery, new StandardAnalyzer(), "<B>", "</B>");

            //Initialize "Start At" variable
            mintStartAt = GetStartAt();

            //How many items we should show?
            int intResultsCt = GetSmallerOf(mintTotal, mintMaxResults + mintStartAt);

            //Loop through results and display
            for (int intCt = mintStartAt; intCt < intResultsCt; intCt++)
            {
                //Get the document from results index
                Document doc = objHits.Doc(intCt);

                //Get the document's ID and set the cache location
                string strID = doc.Get("id");
                string strLocation = "";
                if (mstrURL.Substring(0,3) == "www")
                    strLocation = Server.MapPath("cache") + @"\" + mstrURL + @"\" + strID + ".htm";
                else
                    strLocation = doc.Get("path") + doc.Get("filename");

                //Load the HTML page from cache
                string strPlainText;
                using (StreamReader sr = new StreamReader(strLocation, System.Text.Encoding.Default))
                {
                    strPlainText = ParseHTML(sr.ReadToEnd());
                }

                //Add result to results datagrid
                DataRow row = mobjDTResults.NewRow();
                if (mstrURL.Substring(0,3) == "www")
                    row["title"] = doc.Get("title");
                else
                    row["title"] = doc.Get("filename");
                row["path"] = doc.Get("path");
                row["score"] = String.Format("{0:f}", (objHits.Score(intCt) * 100)) + "%";
                row["sample"] = highlighter.GetBestFragments(strPlainText, 200, 2, "...");
                Explanation objExplain = objSearcher.Explain(objQuery, intCt);
                row["explain"] = objExplain.ToHtml();
                mobjDTResults.Rows.Add(row);
            }
            objSearcher.Close();

            //Finalize results information
            mTsDuration = DateTime.Now - start;
            mintFromItem = mintStartAt + 1;
            mintToItem = GetSmallerOf(mintStartAt + mintMaxResults, mintTotal);
        }

    As you can see, I use the site URL 'mstrURL' when I create the search object:

        string strIndexDir = Server.MapPath("index") + @"\" + mstrURL;

    How can I do the same when I want to search multiple sites? I am using the code from http://www.keylimetie.com/blog/2005/8/4/lucenenet/

