Search Results

Search found 631 results on 26 pages for 'couchdb lucene'.

  • Remove results below a certain score threshold in Solr/Lucene?

    - by snickernet
    Hi guys, is there built-in functionality in Solr/Lucene to filter out results that fall below a certain score threshold? For example, if I provide a score threshold of 0.2, all documents with a score below 0.2 would be removed from my results. My intuition is that this is possible by customizing Solr or Lucene. Could you point me in the right direction on how to do this? Thanks in advance!

  • In CouchDB, how to get documents limited on value in related document? In terms of SQL, how to make WHERE on JOINed table

    - by amorfis
    Crossposting from [email protected]

    Assume we have two kinds of documents in CouchDB, Person and Car:

        Person: _id, firstname, surname, position, salary
        Car: _id, person_id, reg_number, brand

    So there is a one-to-many relationship: one person can have many cars. I can construct a map function that returns every person and his/her cars next to each other. In that case the keys are the arrays [person.id, 0] and [car.person_id, 1]. What I can't do is limit this view to owners of a specific brand only, e.g. if I need the salaries of Ferrari owners.
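    The collated view described in the question can be sketched roughly as follows. This is only an illustration under stated assumptions: `emit` is a local stand-in for the function CouchDB's view server provides, the sample documents are made up, and documents are told apart by the presence of the `person_id` and `salary` fields listed above. The final sort is a crude approximation of view collation for these particular keys.

```javascript
// Hypothetical stand-in for CouchDB's emit(); a real view server supplies it.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }

// Map function: each person gets key [id, 0], each car gets [person_id, 1],
// so a person row sorts directly before that person's car rows.
function map(doc) {
  if ("person_id" in doc) {
    emit([doc.person_id, 1], { brand: doc.brand });
  } else if ("salary" in doc) {
    emit([doc._id, 0], { surname: doc.surname, salary: doc.salary });
  }
}

// Made-up sample documents, deliberately out of order.
const docs = [
  { _id: "c2", person_id: "p2", reg_number: "XYZ 2", brand: "Fiat" },
  { _id: "p1", firstname: "Ann", surname: "Rossi", position: "CEO", salary: 5000 },
  { _id: "c1", person_id: "p1", reg_number: "ABC 1", brand: "Ferrari" },
  { _id: "p2", firstname: "Bob", surname: "Kowalski", position: "Clerk", salary: 3000 },
];
docs.forEach(map);

// Approximate the view ordering: compare the id first, then the 0/1 marker.
rows.sort((a, b) =>
  a.key[0] < b.key[0] ? -1 : a.key[0] > b.key[0] ? 1 : a.key[1] - b.key[1]);

console.log(rows.map((r) => JSON.stringify(r.key)).join(" "));
```

    The sketch only reproduces the person-next-to-car ordering. It does not solve the brand filter the question asks about, because each row still sees a single document.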

  • couchdb: one database per account vs all in one database w. a namespace / property

    - by thruflo
    I'm modelling a document generation system in couchdb. It semi-automates the production of proposal and presentation documents from manageable document fragments. Much like, say, Basecamp, it breaks down very simply into self-contained data per 'account'. Each account has multiple users, projects, documents, etc. However, nothing should be shared between accounts. I can see two ways of doing this:

      * one couchdb database per account
      * use a namespace / property to identify the account

    It seems to me that the first approach is conceptually sound and potentially has security and partitioning advantages. However, it also seems to restrict some cross-database data querying (that I don't have a use case for now but you never know...) and to make updating views potentially require an awful lot of writes. Does anyone experienced with this kind of decision have any advice?

  • Can't open Evolution CouchDB address book

    - by Amanda
        Unable to open address book
        This address book cannot be opened. This either means that an incorrect URI was entered, or the server is unreachable.

    I tried the solution (and suggestions) in "Evolution has no access to couchdb" but that isn't working for me. I tried stopping desktopcouch-service and deleting my access keys, and now the error I get says:

        Unable to open address book
        This address book cannot be opened. This either means that an incorrect URI was entered, or the server is unreachable.
        Detailed error message: Address Book does not exist

    Do I need to create my address book anew?

  • does lucene search function work in large size document?

    - by shaon-fan
    Hi there, I have a problem when searching with Lucene. Indexing works well even for huge documents such as a .pst file (the Outlook mail store): it builds an index that includes all the information in the .pst. The only problem is that the index is sometimes very large and contains a great many words. When I search with Lucene, it seems to process only the front part of the index; if a word occurs in the back part, it is not found and the result has no hits. But when I crudely split the index into several parts while debugging and search each part, it works. So I want to know how to split the index, and what size limit applies to searching. Cheers, waiting for replies.

    Update: following what Coady said, I set the maximum field length to 2^31-1, but the search result still doesn't include what I want. Put simply, I convert the document to a string array to analyze it; one document has 79680 words, including spaces and symbols. When I search for a certain word, it returns a count of only 300, although there are actually more than 300 occurrences. Likewise, when I search for a word in the back part of the document, it isn't found.

        // set the length
        idexwriter.SetMaxFieldLength(2147483647);

        // search
        IndexSearcher searcher = new IndexSearcher(Program.Parameters["INDEX_LOCATION"].ToString());
        Hits hits = searcher.Search(query);

    This is my code, the same as everyone else's. I found the problem when I needed to count the hits for every word in a document, and that is also how I found that words in the back part of a document can't be searched. Please help: is there a searcher length setting somewhere? How did you deal with this problem?

  • Apache Lucene: Is Relevance Score Always Between 0 and 1?

    - by Eamorr
    Greetings, I have the following Apache Lucene snippet that's giving me some nice results:

        int numHits=100;
        int resultsPerPage=100;
        IndexSearcher searcher=new IndexSearcher(reader);
        TopScoreDocCollector collector=TopScoreDocCollector.create(numHits,true);
        Query q=parser.parse(queryString);
        searcher.search(q,collector);
        ScoreDoc[] hits=collector.topDocs(0*resultsPerPage,resultsPerPage).scoreDocs;
        Results r=new Results();
        r.length=hits.length;
        for(int i=0;i<hits.length;i++){
            Document doc=searcher.doc(hits[i].doc);
            double distanceKm=getGreatCircleDistance(lucene2double(doc.get("lat")), lucene2double(doc.get("lng")), Double.parseDouble(userLat), Double.parseDouble(userLng));
            double newRelevance=((1/distanceKm)*Math.log(hits[i].score)/Math.log(2))*(0-1);
            System.out.println(hits[i].doc+"\t"+hits[i].score+"\t"+doc.get("content")+"\t"+"Km="+distanceKm+"\trlvnc="+String.valueOf(newRelevance));
        }

    What I want to know, is hits[i].score always between 0 and 1? It seems that way, but I can't be sure. I've even checked the Lucene documentation (class ScoreDocs) to no avail. You'll see I'm calculating the log of the "newRelevance" value, which is based on hits[i].score. I need hits[i].score to be between 0 and 1, because if it is below zero, I'll get an error; above 1 and the sign will change from negative to positive. I hope some Lucene expert out there can offer me some insight. Many thanks,
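    To make the sign concern concrete, here is the same arithmetic as a small standalone sketch. It is written in JavaScript rather than Java, the function name `newRelevance` and the sample inputs are illustrative only, and it says nothing about what range Lucene scores actually take. It just shows how the formula behaves for scores inside (0, 1), above 1, and below 0.

```javascript
// Same expression as in the snippet: ((1/distanceKm) * ln(score) / ln(2)) * (0 - 1),
// i.e. -(1/distanceKm) * log2(score).
function newRelevance(score, distanceKm) {
  return ((1 / distanceKm) * Math.log(score) / Math.log(2)) * (0 - 1);
}

// Illustrative inputs, all at a distance of 2 km.
const inside = newRelevance(0.5, 2);    // log2(0.5) is negative, so the result is positive
const above = newRelevance(4, 2);       // log2(4) is positive, so the result is negative
const negative = newRelevance(-0.5, 2); // the log of a negative number is NaN

console.log(inside, above, negative);
```

    Note that in this sketch a negative score produces NaN rather than an exception; Java's `Math.log` is likewise documented to return NaN for arguments below zero, so a guard on the score may be needed instead of relying on an error being thrown.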

  • Using Lucene to index private data, should I have a separate index for each user or a single index

    - by Nathan Bayles
    I am developing an Azure based website and I want to provide search capabilities using Lucene. (structured json objects would be indexed and stored in Lucene and other content such as Word documents, etc. would be indexed in lucene but stored in blob storage) I want the search to be secure, such that one user would never see a document belonging to another user. I want to allow ad-hoc searches as typed by the user. Lastly, I want to query programmatically to return predefined sets of data, such as "all notes for user X". I think I understand how to add properties to each document to achieve these 3 objectives. (I am listing them here so if anyone is kind enough to answer, they will have better idea of what I am trying to do) My questions revolve around performance and security. Can I improve document security by having a separate index for each user, or is including the user's ID as a parameter in each search sufficient? Can I improve indexing speed and total throughput of the system by having a separate index for each user? My thinking is that having separate indexes would allow me to scale the system by having multiple index writers (perhaps even on different server instances) working at the same time, each on their own index. Any insight would be greatly appreciated. Regards, Nate

  • Best practices for combining Lucene.NET and a relational database?

    - by FlySwat
    I'm working on a project where I will have a LOT of data, and it will be searchable by several forms that are very efficiently expressed as SQL queries, but it also needs to be searched via natural language processing. My plan is to build an index using Lucene for this form of search. My question is that if I do this and perform a search, Lucene will return the IDs of matching documents in the index, and I then have to look up these entities in the relational database. This could be done in two ways (that I can think of so far):

      1. N queries (horrible).
      2. Pass all the IDs to a stored procedure at once (perhaps as a comma-delimited parameter). This has the downside of being limited to the max parameter size, and the slow performance of a UDF to split the string into a temporary table.

    I'm almost tempted to mirror everything into Lucene's index, so that I can periodically generate the index from the backing store, but only need to access it for the frontend. Advice?

  • Can documents indexed with Solr on JDK6 be retrieved using only lucene api on JDK1.4?

    - by huynhjl
    My runtime environment is still on JDK1.4 but I like the Solr features related to how documents are ingested and indexed. Would I be able to index my documents using Solr offline on a recent version of the JDK, copy the index over and use it in my runtime environment with an older version of the JDK? Version wise, Solr 1.4.0 uses Apache Lucene 2.9.1 which is JDK1.4 compatible. (but Solr itself requires JDK5). Assuming what I'm trying to do is even possible, what features would I lose if I search Solr indices only with the Lucene API?

  • How to do query auto-completion/suggestions in Lucene?

    - by Mat Mannion
    I'm looking for a way to do query auto-completion/suggestions in Lucene. I've Googled around a bit and played around a bit, but all of the examples I've seen seem to be setting up filters in Solr. We don't use Solr and aren't planning to move to using Solr in the near future, and Solr is obviously just wrapping around Lucene anyway, so I imagine there must be a way to do it! I've looked into using EdgeNGramFilter, and I realise that I'd have to run the filter on the index fields and get the tokens out and then compare them against the inputted Query... I'm just struggling to make the connection between the two into a bit of code, so help is much appreciated! To be clear on what I'm looking for (I realised I wasn't being overly clear, sorry) - I'm looking for a solution where when searching for a term, it'd return a list of suggested queries. When typing 'inter' into the search field, it'll come back with a list of suggested queries, such as 'internet', 'international', etc.

  • Can a raw Lucene index be loaded by Solr?

    - by wynz
    Some colleagues of mine have a large Java web app that uses a search system built with Lucene Java. What I'd like to do is have a nice HTTP-based API to access those existing search indexes. I've used Nutch before and really liked how simple the OpenSearch implementation made it to grab results as RSS. I've tried setting Solr's dataDir in solrconfig.xml, hoping it would happily pick up the existing index files, but it seems to just ignore them. My main question is: Can Solr be used to access Lucene indexes created elsewhere? Or might there be a better solution?

  • How do I do the SQL equivalent of "DISTINCT" in CouchDB?

    - by Blaine LaFreniere
    I have a bunch of MP3 metadata in couchDB. I want to return every album that is in the MP3 metadata, but no duplicates. A typical document looks like this:

        {
          "_id": "005e16a055ba78589695c583fbcdf7e26064df98",
          "_rev": "2-87aa12c52ee0a406084b09eca6116804",
          "name": "Fifty-Fifty Clown",
          "number": 15,
          "artist": "Cocteau Twins",
          "bitrate": 320,
          "album": "Stars and Topsoil: A Collection (1982-1990)",
          "path": "Cocteau Twins/Stars and Topsoil: A Collection (1982-1990)/15 - Fifty-Fifty Clown.mp3",
          "year": 0,
          "genre": "Shoegaze"
        }
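    One common way to approach this kind of "distinct" query is a view that emits the album as the key, paired with a reduce and queried with `group=true`, so that each distinct key comes back as a single row. The sketch below only simulates that idea locally: `emit` is a stand-in for the function CouchDB provides, the grouping loop stands in for a `_count` reduce with grouping, and the second and third sample documents are made up.

```javascript
// Hypothetical stand-in for CouchDB's emit(); a real view server supplies it.
const rows = [];
function emit(key, value) { rows.push({ key, value }); }

// Map function: one row per track, keyed by its album name.
function map(doc) {
  if (doc.album) emit(doc.album, null);
}

// The first document mirrors the question; the other two are invented.
const docs = [
  { name: "Fifty-Fifty Clown", album: "Stars and Topsoil: A Collection (1982-1990)" },
  { name: "Another Track", album: "Stars and Topsoil: A Collection (1982-1990)" },
  { name: "Some Song", album: "Another Album" },
];
docs.forEach(map);

// Local stand-in for a counting reduce queried with group=true:
// rows sharing a key collapse into one entry holding the track count.
const grouped = new Map();
for (const { key } of rows) grouped.set(key, (grouped.get(key) || 0) + 1);
const albums = [...grouped.keys()].sort();

console.log(albums);
```

    In a real database the grouping happens on the server, so only one row per album is sent back to the client.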

  • Lucene.Net: How can I add a date filter to my search results?

    - by rockinthesixstring
    I've got my searcher working really well, however it does tend to return results that are obsolete. My site is much like NerdDinner whereby events in the past become irrelevant. I'm currently indexing like this:

        Public Function AddIndex(ByVal searchableEvent As [Event]) As Boolean Implements ILuceneService.AddIndex
            Dim writer As New IndexWriter(luceneDirectory, New StandardAnalyzer(), False)
            Dim doc As Document = New Document
            doc.Add(New Field("id", searchableEvent.ID, Field.Store.YES, Field.Index.UN_TOKENIZED))
            doc.Add(New Field("fullText", FullTextBuilder(searchableEvent), Field.Store.YES, Field.Index.TOKENIZED))
            doc.Add(New Field("user", If(searchableEvent.User.UserName = Nothing, "User" & searchableEvent.User.ID, searchableEvent.User.UserName), Field.Store.YES, Field.Index.TOKENIZED))
            doc.Add(New Field("title", searchableEvent.Title, Field.Store.YES, Field.Index.TOKENIZED))
            doc.Add(New Field("location", searchableEvent.Location.Name, Field.Store.YES, Field.Index.TOKENIZED))
            doc.Add(New Field("date", searchableEvent.EventDate, Field.Store.YES, Field.Index.UN_TOKENIZED))
            writer.AddDocument(doc)
            writer.Optimize()
            writer.Close()
            Return True
        End Function

    Notice how I have a "date" index that stores the event date. My search then looks like this:

        ''# code omitted
        Dim reader As IndexReader = IndexReader.Open(luceneDirectory)
        Dim searcher As IndexSearcher = New IndexSearcher(reader)
        Dim parser As QueryParser = New QueryParser("fullText", New StandardAnalyzer())
        Dim query As Query = parser.Parse(q.ToLower)

        ''# We're using 10,000 as the maximum number of results to return
        ''# because I have a feeling that we'll never reach that full amount
        ''# anyways. And if we do, who in their right mind is going to page
        ''# through all of the results?
        Dim topDocs As TopDocs = searcher.Search(query, Nothing, 10000)
        Dim doc As Document = Nothing

        ''# loop through the topDocs and grab the appropriate 10 results based
        ''# on the submitted page number
        While i <= last AndAlso i < topDocs.totalHits
            doc = searcher.Doc(topDocs.scoreDocs(i).doc)
            IDList.Add(doc.[Get]("id"))
            i += 1
        End While
        ''# code omitted

    I did try the following, but it was to no avail (threw a NullReferenceException):

        While i <= last AndAlso i < topDocs.totalHits
            If Date.Parse(doc.[Get]("date")) >= Date.Today Then
                doc = searcher.Doc(topDocs.scoreDocs(i).doc)
                IDList.Add(doc.[Get]("id"))
                i += 1
            End If
        End While

    I also found the following documentation, but I can't make heads or tails of it: http://lucene.apache.org/java/1_4_3/api/org/apache/lucene/search/DateFilter.html

  • Error about 'invalid JSON' with couchDB view but the json's fine...

    - by Chris Huang-Leaver
    I am trying to set up the following view on CouchDB:

        {
          "_id": "_design/id",
          "_rev": "1-9be2e55e05ac368da3047841f301203d",
          "language": "javascript",
          "views": {
            "by_id": {
              "map": "function(doc) { emit(doc.id, doc)}"
            },
            "from_user_id": {
              "map": "function(doc) { if (doc.from_user_id) {emit(doc.from_user_id, doc)}}"
            },
            "from_user": {
              "map": "function(doc) { if (doc.from_user) {emit(doc.from_user, doc)}}"
            },
            "to_user_id": {
              "map": "function(doc) {if (doc.to_user_id){ emit(doc.to_user_id, doc)}}"
            },
            "to_user": {
              "map": "function(doc) {if (doc.to_user){ emit(doc.to_user, doc)}}"
            },
            "max_id": {
              "map": "function(doc) { if (doc.id) {emit(doc._id, eval(doc.id))}}",
              "reduce": "function(key,value) { a = value[0]; for (i=1; i <value.length; ++i){a = Math.max(a,value[i])} return a}"
            }
          }
        }

    When I try to 'PUT' this using curl:

        curl -X PUT -d keys.json $CDB/_design/id
        {"error":"bad_request","reason":"invalid UTF-8 JSON"}

    I know it's not invalid JSON, because I tested it using the 'json' library built into Python 2.6, and it loads fine. JS screw-ups give me the error 'must evaluate to a function'. What else might be wrong with it?

  • Can a CouchDB document update handler get an update conflict?

    - by jhs
    How likely is a revision conflict when using an update handler? Should I concern myself with conflict-handling code when writing a robust update function?

    As described in Document Update Handlers, CouchDB 0.10 and later allows on-demand server-side document modification. Update handlers can process non-JSON formats; but the other major features are these:

      * An HTTP front-end to arbitrarily complex document modification code
      * Similar code needn't be written for all possible clients (a DRY architecture)
      * Execution is faster and less likely to hit a revision conflict

    I am unclear about the third point. Executing locally, the update handler will run much faster and with lower latency. But in situations with high contention, that does not guarantee a successful update. Or does the update handler guarantee a successful update?

  • How can I get a view of favorite user documents by user in Couchdb map/reduce?

    - by Jeremy Raymond
    My CouchDB database has a main document type that looks something like:

        { "_id" : "doc1", "type" : "main_doc", "title" : "the first doc" ... }

    There is another type of document that stores user information. I want users to be able to tag documents as favorites. Different users can save the same or different documents as favorites. My idea was to introduce a favorite document to track this, something like:

        { "_id" : "fav1", "type" : "favorite", "user_id" : "user1", "doc_id" : "doc1" }

    It's easy enough to create a view with user_id as the key to get a list of their favorite doc IDs. E.g.:

        function(doc) {
          if (doc.type == "favorite") {
            emit(doc.user_id, doc.doc_id);
          }
        }

    However, I want the list of favorites to display the user_id, doc_id and title from the document. So output something like:

        { "key" : "user1", "value" : ["doc1", "the first doc"] }

  • CouchDB emit with lookup key that is array, such that order of array elements are ignored.

    - by MatternPatching
    When indexing a couchdb view, you can emit an array as the key such as: emit(["one", "two", "three"], doc); I appreciate the fact that when searching the view, the order is important, but sometimes I would like the view to ignore it. I have thought of a couple of options:

      1. By convention, just emit the contents in alphabetical order, and ensure that looking up uses the same convention.
      2. Somehow hash in a manner that disregards the order, and emit/search based on that hash. (This is fairly easy, if you simply hash each one individually, "sum" the hashes, then mod.)

    Note: I'm sure this may be covered somewhere in the authoritative guide, but I was unsuccessful in finding it.
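    The first option (sort by convention on both sides) can be sketched as below. This is a minimal local illustration, not CouchDB itself: `emit` and the `Map`-based index are stand-ins for the view server and the view, `normalizeKey` and `lookup` are hypothetical helper names, and the `tags` field is an assumed place where the array lives in the document.

```javascript
// Sort a copy of the key parts so that element order no longer matters.
// The default sort() compares strings by UTF-16 code units, which is enough
// as long as the indexing side and the lookup side use the same rule.
function normalizeKey(parts) {
  return [...parts].sort();
}

// Hypothetical stand-ins for the view index and CouchDB's emit().
const index = new Map();
function emit(key, value) { index.set(JSON.stringify(key), value); }

// Map function: emit the normalized array as the key.
// `tags` is an assumed field name used only for this example.
function map(doc) {
  emit(normalizeKey(doc.tags), doc._id);
}
map({ _id: "doc1", tags: ["two", "three", "one"] });

// Lookup applies the same normalization before querying.
function lookup(parts) {
  return index.get(JSON.stringify(normalizeKey(parts)));
}

console.log(lookup(["three", "one", "two"]));
```

    The design point is simply that the normalization lives in one function shared by both sides, so the convention cannot drift between indexing and querying.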

  • How can I "undelete" a set of documents in CouchDB?

    - by radicand
    I have a large set of documents in a CouchDB database that were just accidentally bulk deleted using _deleted:true. I also have a backup for this set of data that includes their last known good revision and metadata. I need to maintain the same _id, so simple restore with a new _id is not an option. Compaction has not been run and I can access any of these documents via the &rev= url parameter as well as their attachments (which are needed). What I need to do is "restore" these documents to the revision I have on file. Surprisingly, I have come up empty with any queries on how to achieve this. Tips or hacks appreciated.

  • Using DesktopCouch without Ubuntu One?

    - by burli
    I want to know if it is possible to use DesktopCouch without Ubuntu One, but with a local CouchDB server. I found a pairing tool, but it crashes when I try to pair two computers. I can find the local DesktopCouch instances with the Avahi Zeroconf Browser, and it should be possible to find them with Python and start a replication. To make a long story short: I want to sync DesktopCouch databases in my local network without Ubuntu One. Is that possible?
