Search Results

Search found 631 results on 26 pages for 'couchdb lucene'.


  • Lucene based database search engine

    - by Abhay
    Hi all, I am planning to add a search feature to my web application. The items to be searched are stored in a relational database. To build a full-text search engine, I have the following questions: For a database-backed search engine, should I use plain Lucene or a utility built on Lucene, such as Solr, LuSql, or Compass? In the case of Solr, can it be embedded in the web application rather than deployed as a separate WAR? Thanks for your time. Regards

    Read the article

  • Lucene MultiFieldQueryParser which column of the three generated the hit

    - by user549432
    I am using Lucene's MultiFieldQueryParser, implemented as shown below: QueryParser parser = new MultiFieldQueryParser (Version.LUCENE_30,new String[] {"First Name","Middle Name","Last Name"}, standardAnalyzer); Query query = parser.parse(queryString); I use it to find matches for the input string in my DB columns First Name, Middle Name and Last Name. I get hits with both normal and fuzzy search. The only problem I am facing is finding which of the three columns generated the hit. Can you please help me here? Thanks

    Read the article

  • How can I index HTML documents?

    - by Swami
    I am using Lucene.NET to do full-text searching. Until now I have been indexing PDF docs, but now I have a few web pages that I need to index. What's the best/easiest way to index HTML documents so I can add them to my Lucene index? I am using .NET/C#.

    Read the article

  • Situations to prefer Apache Lucene over Solr?

    - by Karussell
    There are several advantages to using Solr (out-of-the-box faceted search, grouping, replication, HTTP administration vs. Luke, ...). Even if I embed search functionality in my Java application, I could use SolrJ to avoid the HTTP trade-off of using Solr. So, when would you recommend using "pure Lucene"? Does it perform better or require less RAM? Is it easier to unit test? PS: I am aware of this question.

    Read the article

  • Have boost effect on lucene/compass field search.

    - by PeterP
    Hi there, In our compass mapping, we're boosting "better" documents to push them up in the list of search results. Something like this: <boost name="boostFactor" default="1.0"/> <property name="name"><meta-data>name</meta-data></property> While this works fine for fulltext search, it does not when doing a field search, e.g. the boost is ignored when searching something like name:Peter Is there any way to enable boosting for field searches? Thanks for your help and sorry if this is a dumb question - I am new to Lucene/Compass. Best regards, Peter

    Read the article

  • Where can I find different versions of Lucene.Net Analyzer

    - by Vinay Pandey
    Hi all, I know it's a silly question, but I am struggling to enable search in Japanese and other such languages for my web application using Lucene.Net. I know that a different analyzer can be used for each language, but I could not find any DLL for the analyzers or an example of how to use them. The questions are: Will using different analyzers be a good option for a web application, given that the search text can be in any form? Where can I find the DLLs and a sample application for implementing search across different languages? I have spent a whole day on this with no luck :(.

    Read the article

  • Running Long Process: Indexing 5GB docs with Lucene

    - by Robert Dondo
    Situation: I have an ASP.NET application that will search through docs using Lucene. I want to run the initial indexing (indexing will be incremental after the initial run, so there won't be any need to index the whole directory again in future). Currently, I have about 5GB of docs (45,000 files). Problem: My application times out before completing the process. I have altered the timeout like this: HttpContext.Current.Server.ScriptTimeout = 200000; but it still does not complete the process. How can I run the indexing?

    Read the article

  • StructureMap 'conditional singleton' for Lucene.Net IndexReader

    - by Gareth D
    I have a thread-safe object that is expensive to create and needs to be available throughout my application (a Lucene.Net IndexReader). The object can become invalid, at which point I need to recreate it (IndexReader.IsCurrent is false, so I need a new instance from IndexReader.Reopen). I'd like to be able to use an IoC container (StructureMap) to manage the creation of the object, but I can't work out if this scenario is possible. It feels like some kind of "conditional singleton" lifecycle. Does StructureMap provide such a feature? Any alternative suggestions?

    Read the article

  • How to get total number of potential results in Lucene

    - by Slace
    I'm using lucene on a site of mine and I want to show the total result count from a query, for example: Showing results x to y of z But I can't find any method which will return me the total number of potential results. I can only seem to find methods which you have to specify the number of results you want, and since I only want 10 per page it seems logical to pass in 10 as the number of results. Or am I doing this wrong, should I be passing in say 1000 and then just taking the 10 in the range that I require?
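
For reference, in the Java releases of Lucene 2.9 and 3.x, `IndexSearcher.search(Query, int)` returns a `TopDocs` whose `totalHits` field counts every matching document, not only the n that were requested; whether the .NET port in use here exposes it under the same name is an assumption. Given that total, the "Showing results x to y of z" line is plain arithmetic. A minimal sketch, with the hypothetical helper `ResultWindow` standing in for whatever the site actually uses:

```java
/** Hypothetical helper: computes the "Showing results x to y of z" window for one page. */
final class ResultWindow {
    final int from;   // 1-based position of the first hit shown, 0 if the page is empty
    final int to;     // 1-based position of the last hit shown, 0 if the page is empty
    final int total;  // total number of matches reported by the search engine

    ResultWindow(int page, int pageSize, int totalHits) {
        if (page < 1 || pageSize < 1 || totalHits < 0) {
            throw new IllegalArgumentException("page and pageSize must be >= 1, totalHits >= 0");
        }
        int offset = (page - 1) * pageSize;      // 0-based offset of the first hit on this page
        boolean empty = offset >= totalHits;     // page lies beyond the last hit
        this.from = empty ? 0 : offset + 1;
        this.to = empty ? 0 : Math.min(offset + pageSize, totalHits);
        this.total = totalHits;
    }

    /** Number of top hits to request from the searcher so that this page is covered. */
    static int hitsToRequest(int page, int pageSize) {
        return page * pageSize;
    }

    @Override
    public String toString() {
        return "Showing results " + from + " to " + to + " of " + total;
    }

    public static void main(String[] args) {
        System.out.println(new ResultWindow(3, 10, 27));
    }
}
```

Requesting `page * pageSize` hits and displaying only the last `pageSize` of them avoids an arbitrary cap such as 1000, at the cost of re-collecting the earlier pages on each request.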

    Read the article

  • Lucene case sensitive & insensitive search

    - by zvikico
    I have a Lucene index which is currently case sensitive. I want to add the option of having a case insensitive search as a fall-back. This means that results that match the case will get more weight and will appear first. For example, if the number of results is limited to 10, and there are 10 matches which match my case, this is enough. If I only found 7 results, I can add 3 more results from the case-insensitive search. My case is actually more complex, since I have items with different weights. Ideally, having a match with "wrong" case will add some weight. Needless to say, I do not want duplicate results. One possible approach is to have 2 indexes. One with case and one without and search both. Naturally, there's some redundancy here, since I need to index twice. Is there a better solution? Ideas?
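
The fallback described above (exact-case hits first, then case-insensitive hits to fill the remaining slots, no duplicates) can be sketched independently of Lucene. This is only an illustration of the merge step: `FallbackMerge` and the use of plain string ids are assumptions, and it does not address the per-item weights mentioned in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: fills a result page from two ranked id lists without duplicates. */
final class FallbackMerge {

    /**
     * Returns up to {@code limit} ids: all of {@code exact} first (in rank order),
     * then ids from {@code insensitive} that were not already present.
     */
    static List<String> merge(List<String> exact, List<String> insensitive, int limit) {
        Set<String> merged = new LinkedHashSet<>();   // keeps insertion order, drops duplicates
        for (String id : exact) {
            if (merged.size() >= limit) break;
            merged.add(id);
        }
        for (String id : insensitive) {
            if (merged.size() >= limit) break;
            merged.add(id);                           // no-op when the id is already an exact hit
        }
        return new ArrayList<>(merged);
    }
}
```

A single-index alternative, offered as an assumption rather than a tested recipe, is to index each value twice in the same document (an original-case field and a lowercased field) and query both, boosting the original-case clause so that exact-case matches rank first.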

    Read the article

  • Can't get CouchDB external HTTP handlers to work.

    - by fuzzy lollipop
    following the instructions here http://wiki.apache.org/couchdb/ExternalProcesses this is what I get { * error: "{{badarg,[{erlang,port_command, [#Port<0.2056>, [123, [34,<<"info">>,34], 58, [123, [34,"db_name",34], 58, [34,<<"transfer_central">>,34], 44, [34,"doc_count",34], 58,"39441",44, [34,"doc_del_count",34], 58,"0",44, [34,"update_seq",34], 58,"56508",44, [34,"purge_seq",34], 58,"0",44, [34,"compact_running",34], 58,<<"false">>,44, [34,"disk_size",34], 58,"43593828",44, [34,"instance_start_time",34], 58, [34,<<"1272560477320483">>,34], 44, [34,"disk_format_version",34], 58,"5",125], 44, [34,<<"id">>,34], 58,<<"null">>,44, [34,<<"method">>,34], 58, [34,"GET",34], 44, [34,<<"path">>,34], 58, [91, [34,<<"transfer_central">>,34], 44, [34,<<"_test">>,34], 93], 44, [34,<<"query">>,34], 58,<<"{}">>,44, [34,<<"headers">>,34], 58, [123, [34,<<"Accept">>,34], 58, [34, <<"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/json">>, 34], 44, [34,<<"Accept-Charset">>,34], 58, [34,<<"ISO-8859-1,utf-8;q=0.7,*;q=0.7">>,34], 44, [34,<<"Accept-Encoding">>,34], 58, [34,<<"gzip,deflate">>,34], 44, [34,<<"Accept-Language">>,34], 58, [34,<<"en-us,en;q=0.5">>,34], 44, [34,<<"Connection">>,34], 58, [34,<<"keep-alive">>,34], 44, [34,<<"Host">>,34], 58, [34,<<"127.0.0.1:5984">>,34], 44, [34,<<"Keep-Alive">>,34], 58, [34,<<"115">>,34], 44, [34,<<"User-Agent">>,34], 58, [34, <<"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3">>, 34], 125], 44, [34,<<"body">>,34], 58, [34,"undefined",34], 44, [34,<<"peer">>,34], 58, [34,<<"127.0.0.1">>,34], 44, [34,<<"form">>,34], 58,<<"{}">>,44, [34,<<"cookie">>,34], 58,<<"{}">>,44, [34,<<"userCtx">>,34], 58, [123, [34,<<"db">>,34], 58, [34,<<"transfer_central">>,34], 44, [34,<<"name">>,34], 58,<<"null">>,44, [34,<<"roles">>,34], 58,<<"[]">>,125], 125,10]]}, {couch_os_process,writeline,2}, {couch_os_process,writejson,2}, {couch_os_process,handle_call,3}, 
{gen_server,handle_msg,5}, {proc_lib,init_p_do_apply,3}]}, {gen_server,call, [<0.110.0>, {prompt,{[{<<"info">>, {[{db_name,<<"transfer_central">>}, {doc_count,39441}, {doc_del_count,0}, {update_seq,56508}, {purge_seq,0}, {compact_running,false}, {disk_size,43593828}, {instance_start_time,<<"1272560477320483">>}, {disk_format_version,5}]}}, {<<"id">>,null}, {<<"method">>,'GET'}, {<<"path">>,[<<"transfer_central">>,<<"_test">>]}, {<<"query">>,{[]}}, {<<"headers">>, {[{<<"Accept">>, <<"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/json">>}, {<<"Accept-Charset">>, <<"ISO-8859-1,utf-8;q=0.7,*;q=0.7">>}, {<<"Accept-Encoding">>,<<"gzip,deflate">>}, {<<"Accept-Language">>,<<"en-us,en;q=0.5">>}, {<<"Connection">>,<<"keep-alive">>}, {<<"Host">>,<<"127.0.0.1:5984">>}, {<<"Keep-Alive">>,<<"115">>}, {<<"User-Agent">>, <<"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3">>}]}}, {<<"body">>,undefined}, {<<"peer">>,<<"127.0.0.1">>}, {<<"form">>,{[]}}, {<<"cookie">>,{[]}}, {<<"userCtx">>, {[{<<"db">>,<<"transfer_central">>}, {<<"name">>,null}, {<<"roles">>,[]}]}}]}}, infinity]}}" * reason: "{gen_server,call, [<0.109.0>, {execute,{[{<<"info">>, {[{db_name,<<"transfer_central">>}, {doc_count,39441}, {doc_del_count,0}, {update_seq,56508}, {purge_seq,0}, {compact_running,false}, {disk_size,43593828}, {instance_start_time,<<"1272560477320483">>}, {disk_format_version,5}]}}, {<<"id">>,null}, {<<"method">>,'GET'}, {<<"path">>,[<<"transfer_central">>,<<"_test">>]}, {<<"query">>,{[]}}, {<<"headers">>, {[{<<"Accept">>, <<"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8,application/json">>}, {<<"Accept-Charset">>, <<"ISO-8859-1,utf-8;q=0.7,*;q=0.7">>}, {<<"Accept-Encoding">>,<<"gzip,deflate">>}, {<<"Accept-Language">>,<<"en-us,en;q=0.5">>}, {<<"Connection">>,<<"keep-alive">>}, {<<"Host">>,<<"127.0.0.1:5984">>}, {<<"Keep-Alive">>,<<"115">>}, {<<"User-Agent">>, <<"Mozilla/5.0 (Macintosh; U; Intel Mac OS 
X 10.6; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3">>}]}}, {<<"body">>,undefined}, {<<"peer">>,<<"127.0.0.1">>}, {<<"form">>,{[]}}, {<<"cookie">>,{[]}}, {<<"userCtx">>, {[{<<"db">>,<<"transfer_central">>}, {<<"name">>,null}, {<<"roles">>,[]}]}}]}}, infinity]}" }

    Read the article

  • CouchDB, HDFS, HBase or which is right for my situation?

    - by Lucas
    Hello all, This question is regarding data storage systems such as CouchDB, HDFS and HBase, specifically, which is right. I am looking at making a simple and customized Document Management System for my organization. Basically, we need the ability to store some Word Documents, PDFs and other similar files. I also want to store metadata about these files (e.g., Author, Dates, etc). Usage permissions would also be handy, but that can probably be built using meta-data. I would also need the ability to full-text index. The ability to version, while not required would be extremely useful. I would like the ability to simply add hardware to expand the resources of the system and the system must support Network Attached Storage over the CIFS or NFS protocol(s). I have read about CouchDB, HDFS and HBase. My preferred programming language is C# as all of my end-users will be running Windows machines and I will want to make both web and winforms client implementations. My question is which solution best fits my needs? Based on my research it appears that CouchDB (utilizing the CouchDB-Lounge and CouchDB-Lucene) perfectly fits my needs. However, I am worried that since I have worked with CouchDB that I might be overlooking something useful for my needs in HDFS or HBase or something similar due to a bias. Any and all opinions are welcome as I am looking for the community input as I really do not want to make the wrong choice at the start of my project. Please ask if you need more information. I thank you all for your time, input and assistance.

    Read the article

  • Is there any reason not to go directly from client-side Javascript to a database?

    - by Chris Smith
    So, let's say I'm going to build a Stack Exchange clone and I decide to use something like CouchDB as my backend store. If I use their built-in authentication and database-level authorization, is there any reason not to allow the client-side Javascript to write directly to the publicly available CouchDB server? Since this is basically a CRUD application and the business logic consists of "Only the author can edit their post" I don't see much of a need to have a layer between the client-side stuff and the database. I would simply use validation on the CouchDB side to make sure someone isn't putting in garbage data and make sure that permissions are set properly so that users can only read their own _user data. The rendering would be done client-side by something like AngularJS. In essence you could just have a CouchDB server and a bunch of "static" pages and you're good to go. You wouldn't need any kind of server-side processing, just something that could serve up the HTML pages. Opening my database up to the world seems wrong, but in this scenario I can't think of why as long as permissions are set properly. It goes against my instinct as a web developer, but I can't think of a good reason. So, why is this a bad idea? EDIT: Looks like there is a similar discussion here: Writing Web "server less" applications EDIT: Awesome discussion so far, and I appreciate everyone's feedback! I feel like I should add a few generic assumptions instead of calling out CouchDB and AngularJS specifically. So let's assume that: The database can authenticate users directly from its hidden store. All database communication would happen over SSL. Data validation can (but maybe shouldn't?) be handled by the database. The only authorization we care about other than admin functions is someone only being allowed to edit their own post. We're perfectly fine with everyone being able to read all data (EXCEPT user records which may contain password hashes). Administrative functions would be restricted by database authorization. No one can add themselves to an administrator role. The database is relatively easy to scale. There is little to no true business logic; this is a basic CRUD app.

    Read the article

  • Lucene.NET 2.9 and BitArray/DocIdSet

    - by Paul Knopf
    I found a great example of grabbing facet counts on a base query. It stores the bit array of the base query to improve performance each time a facet gets counted. var genreQuery = new TermQuery(new Term("genre", genre)); var genreQueryFilter = new QueryFilter(genreQuery); BitArray genreBitArray = genreQueryFilter.Bits(searcher.GetIndexReader()); Console.WriteLine("There are " + GetCardinality(genreBitArray) + " document with the genre " + genre); // Next perform a regular search and get its BitArray result Query searchQuery = MultiFieldQueryParser.Parse(term, new[] {"title", "description"}, new[] {BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD}, new StandardAnalyzer()); var searchQueryFilter = new QueryFilter(searchQuery); BitArray searchBitArray = searchQueryFilter.Bits(searcher.GetIndexReader()); Console.WriteLine("There are " + GetCardinality(searchBitArray) + " document containing the term " + term); The only problem is that I am using a newer version of Lucene.NET (2.9) and Filter.Bits is obsolete. We are told to use DocIdSet instead (rather than BitArray). I cannot find out how to do the bitArray.And(bitArray) with a DocIdSet. I looked in Reflector and found OpenIdSet, which has And operations. I'm not sure whether OpenIdSet is the route to go; I'm just stating what I found. Thanks in advance!
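
The facet-count technique in this post comes down to intersecting two sets of document ids and counting the bits that survive. A minimal sketch of that step, using `java.util.BitSet` as a stand-in for Lucene's own bit set type; the class name `FacetCount` is hypothetical. The Java release of Lucene 2.9 has an `OpenBitSet` class that extends `DocIdSet` and offers comparable `and` and `cardinality` operations, which may be what the post calls OpenIdSet; whether the .NET port matches is an assumption.

```java
import java.util.BitSet;

/** Hypothetical helper: counts documents that match both a base query and a facet value. */
final class FacetCount {

    /** Returns the size of the intersection without modifying either input set. */
    static int intersectionCount(BitSet base, BitSet facet) {
        BitSet both = (BitSet) base.clone();  // copy so the cached base set stays reusable
        both.and(facet);                      // keep only doc ids present in both sets
        return both.cardinality();            // number of bits still set
    }

    public static void main(String[] args) {
        BitSet searchHits = new BitSet();
        BitSet genreHits = new BitSet();
        for (int doc : new int[] {1, 3, 5, 7}) searchHits.set(doc);
        for (int doc : new int[] {3, 4, 5}) genreHits.set(doc);
        System.out.println("Documents in both sets: " + intersectionCount(searchHits, genreHits));
    }
}
```

Cloning the base set before calling `and` matters here because the whole point of the technique is to reuse the base query's bits for every facet value.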

    Read the article

  • Lucene document Boosting

    - by athreyar
    Hello, I am having a problem with Lucene boosting. I am trying to boost a particular document that matches the specified (firstname) field. I have posted part of the code: private static Document createDoc(String lucDescription,String primaryk,String specialString){ Document doc = new Document(); doc.add(new Field("lucDescription",lucDescription, Field.Store.NO, Field.Index.TOKENIZED)); doc.add(new Field("primarykey",primaryk,Field.Store.YES,Field.Index.NO)); doc.add(new Field("specialDescription",specialString, Field.Store.NO, Field.Index.UN_TOKENIZED)); doc.setBoost ((float)(0.00001)); if (specialString.equals("chris")) doc.setBoost ((float)(100000.1)); return doc; } Why is this not working? public static String dbSearch(String searchString){ List pkList = new ArrayList(); String conCat="("; try{ String querystr = searchString; Query query = new QueryParser("lucDescription", new StandardAnalyzer()).parse(querystr); IndexSearcher searchIndex = new IndexSearcher("/home/athreya/docsIndexFile"); // Index of the User table-- /home/araghu/aditya/indexFile. Hits hits = searchIndex.search(query); System.out.println("Found " + hits.length() + " hits."); for(int iterator=0;iterator Thank you in advance, Athreya

    Read the article

  • How to use multifieldquery and filters in Lucene.net

    - by Khotu Nam
    I want to perform a multi field search on a lucene.net index but filter the results based on one of the fields. Here's what I'm currently doing: To index the fields the definitions are: doc.Add(new Field("id", id.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED)); doc.Add(new Field("title", title, Field.Store.NO, Field.Index.TOKENIZED)); doc.Add(new Field("summary", summary, Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES)); doc.Add(new Field("description", description, Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES)); doc.Add(new Field("distribution", distribution, Field.Store.NO, Field.Index.UN_TOKENIZED)); When I perform the search I do the following: MultiFieldQueryParser parser = new MultiFieldQueryParser(new string[]{"title", "summary", "description"}, analyzer); parser.SetDefaultOperator(QueryParser.Operator.AND); Query query = parser.Parse(text); BooleanQuery bq = new BooleanQuery(); TermQuery tq = new TermQuery(new Term("distribution", distribution)); bq.Add(tq, BooleanClause.Occur.MUST); Filter filter = new QueryFilter(bq); Hits hits = searcher.Search(query, filter); However, the result is always 0 hits. What am I doing wrong?

    Read the article

  • How to create more complex Lucene query strings?

    - by boris callens
    This question is a spin-off from this question. My inquiry is two-fold, but because both parts are related I think it is a good idea to put them together. How do I create queries programmatically? I know I could build a string and have it parsed by the query parser, but from the bits and pieces of information I have gathered from other resources, there is also a programmatic way to do this. What are the syntax rules for Lucene queries? --EDIT-- I'll give an example requirement for a query I would like to make: Say I have 5 fields: First Name, Last Name, Age, Address, Everything. All fields are optional; the last field should search over all the other fields. I go over every field and check whether it IsNullOrEmpty(). If it isn't, I would like to append a part to my query that adds the relevant search clause. First name and last name should be exact matches and have more weight than the other fields. Age is a string and should match exactly. Address can vary in word order. Everything can also vary in word order. How should I go about this?
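
As an illustration of the string-building route, here is a minimal sketch that assembles a query in Lucene's classic query-parser syntax (`field:"phrase"`, `^` for boost, `+` for a required clause). The field names (`firstName`, `lastName`, `age`, `address`, `everything`), the boost of 2, and the class `PersonQueryBuilder` are all assumptions made for the example; `everything` is assumed to be a catch-all field populated at index time. The Java API also ships `QueryParser.escape`, which should be preferred over the simplified `escape` below when Lucene is on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds a Lucene query string from optional form fields. */
final class PersonQueryBuilder {
    // Characters with special meaning in the classic query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    /** Backslash-escapes every special character (simplified stand-in for QueryParser.escape). */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) sb.append('\\');
            sb.append(c);
        }
        return sb.toString();
    }

    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    /** Required phrase clause, e.g. +firstName:"John"^2 (the boost is omitted when it is 1). */
    static String exact(String field, String value, int boost) {
        String clause = "+" + field + ":\"" + escape(value.trim()) + "\"";
        return boost > 1 ? clause + "^" + boost : clause;
    }

    /** Required clause whose words must all occur, in any order, e.g. +address:(+Main +Street). */
    static String anyOrder(String field, String value) {
        StringBuilder sb = new StringBuilder("+" + field + ":(");
        String[] words = value.trim().split("\\s+");
        for (int i = 0; i < words.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append('+').append(escape(words[i]));
        }
        return sb.append(')').toString();
    }

    /** Skips blank inputs and joins the remaining clauses with spaces. */
    static String build(String firstName, String lastName, String age,
                        String address, String everything) {
        List<String> clauses = new ArrayList<>();
        if (!isBlank(firstName)) clauses.add(exact("firstName", firstName, 2));
        if (!isBlank(lastName)) clauses.add(exact("lastName", lastName, 2));
        if (!isBlank(age)) clauses.add(exact("age", age, 1));
        if (!isBlank(address)) clauses.add(anyOrder("address", address));
        if (!isBlank(everything)) clauses.add(anyOrder("everything", everything));
        return String.join(" ", clauses);
    }
}
```

The programmatic route would build the same structure from `BooleanQuery`, `TermQuery` and `PhraseQuery` objects instead of a string, which sidesteps escaping entirely but also bypasses the analyzer unless it is applied by hand.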

    Read the article

  • Lucene setboost doesn't work

    - by Keven
    Hi all, our team just upgraded Lucene from 2.3 to 3.0, and we are confused about setBoost and getBoost on Document. What we want is to set a boost for each document when adding it to the index, so that the documents in a search response are ordered according to the boost we set. But the order does not seem to change at all; even the boost of each document in the search response is still 1.0. Could someone give me a hint? Following is our code: String[] a = new String[] { "schindler", "spielberg", "shawshank", "solace", "sorcerer", "stone", "soap", "salesman", "save" }; List strings = Arrays.asList(a); AutoCompleteIndex index = new Index(); IndexWriter writer = new IndexWriter(index.getDirectory(), AnalyzerFactory.createAnalyzer("en_US"), true, MaxFieldLength.LIMITED); float i = 1f; for (String string : strings) { Document doc = new Document(); Field f = new Field(AutoCompleteIndexFactory.QUERYTEXTFIELD, string, Field.Store.YES, Field.Index.NOT_ANALYZED); doc.setBoost(i); doc.add(f); writer.addDocument(doc); i += 2f; } writer.close(); IndexReader reader2 = IndexReader.open(index.getDirectory()); for (int j = 0; j < reader2.maxDoc(); j++) { if (reader2.isDeleted(j)) { continue; } Document doc = reader2.document(j); Field f = doc.getField(AutoCompleteIndexFactory.QUERYTEXTFIELD); System.out.println(f.stringValue() + ":" + f.getBoost() + ", docBoost:" + doc.getBoost()); doc.setBoost(j); }

    Read the article

  • Multiple word Autosuggest using Lucene.Net

    - by eric
    I am currently working on a search application which uses Lucene.Net to index data from the database into an index file. I have a product catalog which has name, short and long description, SKU and other fields. The data is stored in the index using StandardAnalyzer. I am trying to add auto-suggestion for a text field, using TermEnum to get all the keyword terms and their scores from the index. But the terms returned are single words. For example, if I type co, the suggestions returned are costume, count, collection, cowboy, combination etc. But I want the suggestions to be phrases. For example, if I search for co, the suggestions should be cowboy costume, costume for adults, combination locks etc. The following is the code used to get the suggestions: public string[] GetKeywords(string strSearchExp) { IndexReader rd = IndexReader.Open(mIndexLoc); TermEnum tenum = rd.Terms(new Term("Name", strSearchExp)); string[] strResult = new string[10]; int i = 0; Dictionary<string, double> KeywordList = new Dictionary<string, double>(); do { //terms = tenum.Term(); if (tenum.Term() != null) { //strResult[i] = terms.text.ToString(); KeywordList.Add(tenum.Term().text.ToString(), tenum.DocFreq()); } } while (tenum.Next() && tenum.Term().text.StartsWith(strSearchExp) && tenum.Term().text.Length > 1); var sortedDict = (from entry in KeywordList orderby entry.Value descending select entry); foreach (KeyValuePair<string, double> data in sortedDict) { if (data.Key.Length > 1) { strResult[i] = data.Key; i++; } if (i >= 10) //Exit the for Loop if the count exceeds 10 break; } tenum.Close(); rd.Close(); return strResult; } Can anyone please give me directions to achieve this? Thanks for looking into this.
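
Term enumeration returns single words here because StandardAnalyzer splits each name into word-level terms. One possible direction, stated as an assumption rather than a verified fix, is to index the whole product name a second time in a separate un-tokenized field, so that each term in that field is a full phrase and the same prefix scan yields phrases. The sketch below shows only the prefix-scan idea, with a sorted `TreeSet` standing in for the index's sorted term dictionary; `PhraseSuggester` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

/** Hypothetical helper: prefix suggestions over whole phrases kept in sorted order. */
final class PhraseSuggester {
    // Sorted set plays the role of the term dictionary of an un-tokenized field.
    private final TreeSet<String> phrases = new TreeSet<>();

    /** Stores the phrase as one lowercased entry, the way an un-tokenized field keeps one term. */
    void add(String phrase) {
        phrases.add(phrase.trim().toLowerCase(Locale.ROOT));
    }

    /** Returns up to {@code max} phrases starting with {@code prefix}, in alphabetical order. */
    List<String> suggest(String prefix, int max) {
        String p = prefix.trim().toLowerCase(Locale.ROOT);
        List<String> out = new ArrayList<>();
        // tailSet jumps to the first entry >= prefix; matching entries are contiguous from there.
        for (String candidate : phrases.tailSet(p, true)) {
            if (!candidate.startsWith(p) || out.size() >= max) break;
            out.add(candidate);
        }
        return out;
    }
}
```

This version only suggests phrases that begin with the typed prefix and orders them alphabetically; ranking by document frequency, as the posted code does, or matching a prefix in the middle of a phrase would need additional work.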

    Read the article

  • java AbstractMethodError

    - by Akhil
    How to handle this error in lucene: java.lang.AbstractMethodError: org.apache.lucene.store.Directory.listAll()[Ljava/lang/String; at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:568) at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69) at org.apache.lucene.index.IndexReader.open(IndexReader.java:316) at org.apache.lucene.index.IndexReader.open(IndexReader.java:188) I am making a lucene function call but unfortunately it itself calls an abstract method of some class, as is evident from the error above. What is the work around for this? Thanks, --Akhil

    Read the article

  • multiple key ranges as parameters to a couchdb view

    - by kolosy
    is there a way to send multiple startKey/endKey pairs to a view, akin to the keys: [] array that can be posted for keys? the underlying problem - let's say my documents have "categories" and timestamps. if i want all documents in the "foo" category that have a timestamp that's within the last two hours, it's simple: function (doc) { emit([doc.category, doc.timestamp], null); } and then query as GET server:5894/.../myview?startKey=[foo, |now - 2 hours|]&endkey=[foo, |now|] the problem comes when i want something in categories foo or bar, within the last two hours. if i didn't care about time, i could just pull directly by key through the keys collection. unfortunately, i have no such option with ranges. what i ended up doing in the meantime is rounding the timestamp to two-hour blocks, and then multiplexing the query out: POST server:5894/.../myview keys=[[foo, 0 hours], [foo, 2 hours], [bar, 0 hours], [bar, 2 hours]] it works, but will get messy if i want to go back a large amount of time (in relationship to the blocksize)
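
The multiplexing workaround described above can be generated rather than written by hand. A minimal sketch, assuming the view emits `[doc.category, blockStart]` where `blockStart` is the timestamp in milliseconds rounded down to a two-hour block, and assuming category names contain no characters that need JSON escaping; `BlockKeys` and its method names are hypothetical. The output is the JSON body for a POST to the view with a `keys` array.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds the "keys" body for a view keyed on [category, blockStart]. */
final class BlockKeys {
    static final long BLOCK_MILLIS = 2L * 60 * 60 * 1000;   // two-hour blocks, as in the post

    /** Rounds a timestamp down to the start of its block. */
    static long blockStart(long timestampMillis) {
        return Math.floorDiv(timestampMillis, BLOCK_MILLIS) * BLOCK_MILLIS;
    }

    /** One key per category and per block that overlaps [fromMillis, toMillis]. */
    static String keysBody(List<String> categories, long fromMillis, long toMillis) {
        List<String> keys = new ArrayList<>();
        for (String category : categories) {
            for (long block = blockStart(fromMillis); block <= toMillis; block += BLOCK_MILLIS) {
                // Assumes the category needs no JSON escaping.
                keys.add("[\"" + category + "\"," + block + "]");
            }
        }
        return "{\"keys\":[" + String.join(",", keys) + "]}";
    }
}
```

The number of keys grows with (time span / block size) × number of categories, which is the messiness the post anticipates for long look-back windows; a larger block size trades fewer keys for more client-side filtering of rows outside the exact range.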

    Read the article
