couchdb lucene - Page 4

how to install lucene 3.0.0 in ubuntu 8.10

- by kshama

Hi, I have downloaded Lucene 3.0.0. When I ran the command java -jar lucene-core-3.0.0.jar in the directory that contains the jar, I got this message: Failed to load Main-Class manifest attribute from lucene-core-3.0.0.jar. How do I proceed? Please help me. Thanks in advance.

Custom Lucene Sharding with Hibernate Search

- by Timo Westkämper

Does anyone have experience with custom Lucene sharding / partitioning using Hibernate Search? The Hibernate Search documentation says the following about Lucene sharding: "In some cases, it is necessary to split (shard) the indexing data of a given entity type into several Lucene indexes. This solution is not recommended unless there is a pressing need because by default, searches will be slower as all shards have to be opened for a single search. In other words don't do it until you have problems :)" Has anyone implemented sharding for Hibernate Search in such a way that queries can also be targeted at one of the shards? In our case we have Lucene queries that should target only one shard per query.

Lucene wildcard queries

- by Javi

Hello, I have a question about Lucene. I get text from a form and want to perform a full-text search across several fields. Suppose the input is "textToLook". My Lucene Analyzer has several filters. One of them is a LowerCaseFilter, so words are lowercased when the index is created. If I want to search two fields, field1 and field2, the Lucene query would be something like this (note that 'textToLook' is now 'texttolook'):

    field1:texttolook* field2:texttolook*

In my class I create the query like this. It works when there is no wildcard.

    String text = "textToLook";
    String[] fields = {"field1", "field2"};
    // analyzer is the same as the one used for indexing
    Analyzer analyzer = fullTextEntityManager.getSearchFactory().getAnalyzer("customAnalyzer");
    MultiFieldQueryParser parser = new MultiFieldQueryParser(fields, analyzer);
    org.apache.lucene.search.Query queryTextoLibre = parser.parse(text);

With this code the query is:

    field1:texttolook field2:texttolook

but if I set text to "textToLook*" I get:

    field1:textToLook* field2:textToLook*

which does not match anything, because the indexed terms are lowercase. I have read this on the Lucene website: "Wildcard, Prefix, and Fuzzy queries are not passed through the Analyzer, which is the component that performs operations such as stemming and lowercasing".

My problem cannot be solved by making the search case insensitive, because my analyzer has other filters which, for example, remove some word suffixes. I think I could solve it by finding out what the text looks like after it has gone through my analyzer's filters, then adding the "*", and then building the Query with MultiFieldQueryParser. In this example I would start with "textToLook", get "texttolook" after the filters, and then build "texttolook*".

Is there any way to get the value of my text variable after it has gone through all of my analyzer's filters? How can I get all the filters of my analyzer? Is this possible? Thanks
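The approach described in the question (normalise the text the same way the index does, and only then append the wildcard) can be sketched outside Lucene. This is a minimal Python illustration only: `analyze`, `strip_suffix` and the suffix list are hypothetical stand-ins for the custom analyzer's filters, not Lucene or Hibernate Search API.

```python
# Hypothetical stand-in for an analyzer chain: lowercase, then strip a suffix.
SUFFIXES = ("ing", "es", "s")  # assumed suffix list, for illustration only


def strip_suffix(token):
    """Remove the first matching suffix, leaving at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Apply the same normalisation to every whitespace-separated token."""
    return [strip_suffix(tok.lower()) for tok in text.split()]


def prefix_query(text, fields):
    """Normalise first, then append '*' so the prefix matches indexed terms."""
    terms = analyze(text)
    return " ".join(f"{field}:{term}*" for field in fields for term in terms)


query = prefix_query("textToLook", ["field1", "field2"])
print(query)
```

The point of the design is ordering: the wildcard is attached after normalisation, so the query parser never has to analyse a wildcard term.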

Lucene Query Syntax

- by Don

Hi, I'm trying to use Lucene to query a domain that has the following structure:

    Student 1-------* Attendance *---------1 Course

The data in the domain is summarised below:

    Course.name   Attendance.mandatory   Student.name
    -------------------------------------------------
    cooking       N                      Bob
    art           Y                      Bob

If I execute the query "courseName:cooking AND mandatory:Y" it returns Bob, because Bob is attending the cooking course, and Bob is also attending a mandatory course. However, what I really want to query for is "students attending a mandatory cooking course", which in this case would return nobody. Is it possible to formulate this as a Lucene query? I'm actually using Compass, rather than Lucene directly, so I can use either CompassQueryBuilder or Lucene's query language.

For the sake of completeness, the domain classes themselves are shown below. These classes are Grails domain classes, but I'm using the standard Compass annotations and Lucene query syntax.

    @Searchable
    class Student {
        @SearchableProperty(accessor = 'property')
        String name

        static hasMany = [attendances: Attendance]

        @SearchableId(accessor = 'property')
        Long id

        @SearchableComponent
        Set<Attendance> getAttendances() { return attendances }
    }

    @Searchable(root = false)
    class Attendance {
        static belongsTo = [student: Student, course: Course]

        @SearchableProperty(accessor = 'property')
        String mandatory = "Y"

        @SearchableId(accessor = 'property')
        Long id

        @SearchableComponent
        Course getCourse() { return course }
    }

    @Searchable(root = false)
    class Course {
        @SearchableProperty(accessor = 'property', name = "courseName")
        String name

        @SearchableId(accessor = 'property')
        Long id
    }
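The behaviour described above is what one would expect if the attendance and course fields end up flattened into a single searchable document per student; that flattening is an assumption here, not something the question confirms. A minimal Python sketch of the difference between matching on the flattened fields and matching per attendance record (function names are made up for illustration):

```python
# Data from the question: one student with two attendance records.
students = [
    {
        "name": "Bob",
        "attendances": [
            {"courseName": "cooking", "mandatory": "N"},
            {"courseName": "art", "mandatory": "Y"},
        ],
    }
]


def flattened_match(student, course_name, mandatory):
    """Each condition may be satisfied by a different attendance record."""
    names = {a["courseName"] for a in student["attendances"]}
    flags = {a["mandatory"] for a in student["attendances"]}
    return course_name in names and mandatory in flags


def per_attendance_match(student, course_name, mandatory):
    """Both conditions must hold on the same attendance record."""
    return any(
        a["courseName"] == course_name and a["mandatory"] == mandatory
        for a in student["attendances"]
    )


flat = [s["name"] for s in students if flattened_match(s, "cooking", "Y")]
strict = [s["name"] for s in students if per_attendance_match(s, "cooking", "Y")]
print(flat, strict)
```

The flattened check returns Bob, the per-attendance check returns nobody, which is the result the question is after.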

How does Lucene work

- by Midhat

I am trying to find out how Lucene search works so fast. I can't find any useful docs on the web. If you have anything (short of the Lucene source code) to read, let me know. A text search query using MySQL 5 full-text search with an index takes about 18 minutes in my case. A Lucene search for the same query takes less than a second.

Integrate Lucene or any other search product with SQL server 2005

- by HBACHARYA

Hi, I need to use full-text search with SQL Server 2005. I have explored its built-in approach (SQL Server full-text indexing), but it seems less powerful. I have also looked at the features of Lucene. Now my questions: is it possible to integrate Lucene and SQL Server in any way?

1. Can my T-SQL queries use a Lucene index to return results? (Maybe using a CLR-based function internally.)
2. How do I update the Lucene index while data in the tables is being updated?
3. What could the overall architecture be?
4. Are there any commercial products available that provide this kind of support?

Thanks, HB

Minimal deployment of couchdb on windows

- by MartinStettner

Hi, I'd like to use CouchDB for a client-only application on Windows (the document-oriented structure and the synchronization features would be perfect for me). There is a Windows installer package here, but the installer itself is about 45 MB, and when installed it takes more than 100 MB on my HD. This is far too much for my (relatively small) application. I noticed that there are a lot of "src" directories in the couchdb/lib subdirs. I've been experimenting with removing some of them and it didn't seem to break the system. Now I'm wondering what the "minimal" set of files (preferably binary-only) needed to run a local CouchDB server would be. Are there already any efforts to create such a deployment-friendly installer? Or could anyone give some (even very general) hints on how to create it? How much disk space would be minimally required for such an installation? Needless to say, I'm not at all familiar with either the CouchDB internals or the Erlang system :). But perhaps I could figure it out if I got some direction (or I could stop trying if someone told me that this would be impossible or didn't make sense at all ...) Thanks anyway!

How to index a string like "aaa.bbb.ddd-fff" in Lucene?

- by user46703

Hi, I have to index a lot of documents that contain reference numbers like "aaa.bbb.ddd-fff". The structure can change, but it's always some arbitrary numbers or characters combined with "/", "-", "_" or some other delimiter. The users want to be able to search for any of the substrings like "aaa" or "ddd" and also for combinations like "aaa.bbb" or "ddd-fff". The best I have been able to come up with is to create my own token filter, modeled after the synonym filter in "Lucene in Action", which emits multiple terms for each input. In my case I return "aaa.bbb", "bbb.ddd", "bbb.ddd-fff" and all other combinations of the substrings. This works pretty well, but when I index large documents (100 MB) that contain lots of such strings I tend to get out-of-memory exceptions, because my filter returns multiple terms for each input string. Is there a better way to index these strings?
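To make the memory concern concrete, here is a rough Python sketch of the kind of expansion the question describes: every contiguous run of parts is emitted as its own term. It is not a Lucene TokenFilter; the delimiter set and the function name `contiguous_terms` are assumptions for illustration.

```python
import re

DELIMITERS = re.compile(r"([./_-])")  # assumed delimiter set


def contiguous_terms(reference):
    """Return every contiguous run of parts, keeping the original delimiters."""
    # The capturing group keeps delimiters: ['aaa', '.', 'bbb', '.', 'ddd', '-', 'fff']
    pieces = DELIMITERS.split(reference)
    parts = pieces[0::2]
    terms = []
    for start in range(len(parts)):
        for end in range(start, len(parts)):
            terms.append("".join(pieces[2 * start : 2 * end + 1]))
    return terms


terms = contiguous_terms("aaa.bbb.ddd-fff")
print(len(terms), terms)
```

A reference with n parts yields n(n+1)/2 terms (10 for the four-part example), so the number of emitted terms grows quadratically with the length of each reference string.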

Lucene Analyzer to Use With Special Characters and Punctuation?

- by Brandon

I have a Lucene index that has several documents in it. Each document has multiple fields such as:

    Id
    Project
    Name
    Description

The Id field will be a unique identifier such as a GUID. Project is a user's ProjectID, and a user can only view documents for their project. Name and Description contain text that can have special characters.

When a user performs a search on the Name field, I want to match as well as I can. For example, searching for

    First

should return both:

    First.Last
    First.Middle.Last

Name can also be something like:

    Test (NameTest)

where, if a user types in 'Test', 'Name', or '(NameTest)', they can find the result. However, if I say that Project is 'ProjectA', then that needs to be an exact match (case-insensitive search). The same goes for the Id field.

Which fields should I set up as Tokenized and which as Untokenized? Also, is there a good Analyzer I should consider to make this happen? I am stuck trying to decide the best route to implement the desired searching.

Lucene.Net - How to treat a space-separated phrase as a single token?

- by Gareth D

I've implemented a search facility using Lucene.Net. The index includes UK academic qualifications, including "A Level". I'd like the users to be able to search using the phrase "A Level", but with the StandardAnalyzer the "A" is stripped out as a stop word and therefore only "Level" is indexed/searched. What's my best option to work around this? I'm guessing I need to somehow tokenise "A Level" to "A-Level" or similar by creating a custom analyser. Is this the best approach? Note that I don't want the whole search to be a phrase query, i.e. in my search box I want the user to be able to enter "A Level" AND English Maths Physics, and this would return any results with "A Level" and any of English, Maths or Physics. Question updated to reflect this.
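The idea of turning "A Level" into a single token before stop-word removal can be sketched as follows. This is a plain Python illustration, not Lucene.Net code: the stop list, the `PROTECTED` phrase table and the `a_level` token are all hypothetical.

```python
import re

STOP_WORDS = {"a", "an", "and", "the"}  # assumed, simplified stop list
PROTECTED = {"a level": "a_level"}      # phrases to keep as one token (hypothetical)


def tokenize(text):
    """Join protected phrases into single tokens, then drop stop words."""
    lowered = text.lower()
    for phrase, token in PROTECTED.items():
        lowered = re.sub(r"\b" + re.escape(phrase) + r"\b", token, lowered)
    return [t for t in re.findall(r"\w+", lowered) if t not in STOP_WORDS]


print(tokenize("A Level English"))
print(tokenize("A high level"))
```

The same substitution would have to be applied at both index time and query time for the single token to match.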

When to use CouchDB vs RDBMS

- by Andrew Whitehouse

I am looking at CouchDB, which has a number of appealing features over relational databases, including an intuitive REST/HTTP interface, easy replication, and data stored as documents rather than normalised tables. I appreciate that this is not a mature product and so should be adopted with caution, but I am wondering whether it is actually a viable replacement for an RDBMS (in spite of the intro page saying otherwise - http://couchdb.apache.org/docs/intro.html). Under what circumstances would CouchDB be a better choice of database than an RDBMS (e.g. MySQL), e.g. in terms of scalability, design + development time, reliability and maintenance? Are there still cases where an RDBMS is clearly the right choice? Is this an either-or choice, or is a hybrid solution more likely to emerge as best practice?

MongoDB or CouchDB - fit for production?

- by Alan

I was wondering if anyone can tell me whether MongoDB or CouchDB is ready for a production environment. I'm now looking at these storage solutions (I'm favouring MongoDB at the moment); however, these projects are quite young, so I foresee that I'm going to have to work quite hard to convince my manager that we should adopt this new technology. What I'd like to know is:

1) Who is using MongoDB or CouchDB today in a production environment?
2) How are you using MongoDB/CouchDB?
3) What problems (if any) did you come across when you adopted this new storage mechanism (and how did you overcome them)?
4) How did you deal with any migration issues?
5) Do you have any good/bad experiences with either of these solutions that you'd like to share?

Thanks.

CouchDB read authorization

- by mdikici

On the CouchDB website, under technical overview - security and validation (http://couchdb.apache.org/docs/overview.html), the reader access part says: "To protect document contents, CouchDB documents can have a reader list. This is an optional list of reader-names allowed to read the document. When a reader list is used, protected documents are only viewable by listed users." I searched for how to use it but found nothing. So is it actually used, and if it is, how? Thanks. -- Mustafa

I want absolute atomicity on a single couchdb instance (insert, fail if already existing)

- by MatternPatching

I've come to really love the CouchDB style of organizing and updating data, but there are a few situations where I really need to be able to create an entry and determine whether an equivalent entry already exists before returning to the user. The only situation where this is absolutely necessary in my application is user registration. I'm fine with having all user registration writes go to a particular, designated CouchDB instance known as the "registration-instance". I want to hash the user_id into some _id to use, then execute a PUT with this _id, but fail if the _id has already been inserted. I need to tell the user that the user name is already reserved, and I cannot detect the conflict later and resolve it at that point, because the user would be under the impression that they had reserved the user name. I don't see why CouchDB couldn't provide some way to do this, under the assumption that inserts for a particular "type" of document are always routed to a particular instance.
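On a single instance, CouchDB's document API already behaves this way: a PUT that creates a new document returns 201 Created, while a PUT to an _id that already exists, sent without a matching _rev, is rejected with 409 Conflict. A minimal Python sketch of the two pieces the question describes follows. The "user-" prefix, the SHA-1 hash and the name normalisation are assumptions chosen for illustration, and the HTTP call itself is left out.

```python
import hashlib


def registration_doc_id(user_name):
    """Derive a deterministic _id from the user name (prefix and hash are assumptions)."""
    digest = hashlib.sha1(user_name.strip().lower().encode("utf-8")).hexdigest()
    return "user-" + digest


def interpret_put_status(status_code):
    """Map the HTTP status of PUT /<db>/<_id> (sent without a _rev) to an outcome."""
    if status_code == 201:
        return "reserved"       # document created, name is now taken by this user
    if status_code == 409:
        return "already taken"  # a document with this _id already exists
    return "error"


doc_id = registration_doc_id("Alice")
print(doc_id, interpret_put_status(409))
```

Because the _id is derived only from the normalised name, two registrations for the same name always target the same document, and the second one gets the conflict response immediately.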

UUIDs in CouchDB

- by PartlyCloudy

I am wondering about the format in which UUIDs are represented by default in CouchDB. While RFC 4122 describes UUIDs like 550e8400-e29b-11d4-a716-446655440000, CouchDB uses a continuous string of characters like 3069197232055d39bc5bc39348a36417. I've spent some time searching both their wiki and their documentation for what this actually is, but without any result. Do you know whether this is a non-RFC-conformant format that simply omits all the "-" characters, or a completely different representation of the 128 bits? The background is that I'm using Java UUIDs, which are formatted as noted in the RFC. I see the advantage that the CouchDB style is probably handier for building internal trees, but I want to be sure to use a consistent implementation.
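Whatever the answer about CouchDB's internals, converting between the two textual forms is mechanical, since both are 32 hexadecimal digits (128 bits). A small Python sketch using the standard `uuid` module with the two values from the question:

```python
import uuid

rfc_style = "550e8400-e29b-11d4-a716-446655440000"
couch_style = "3069197232055d39bc5bc39348a36417"

# Dashed RFC 4122 text -> 32 hex characters without dashes.
as_plain_hex = uuid.UUID(rfc_style).hex

# 32 hex characters -> dashed text; uuid.UUID accepts the string with or without dashes.
as_dashed = str(uuid.UUID(couch_style))

print(as_plain_hex)
print(as_dashed)
```

This only shows the textual conversion. It does not establish that CouchDB sets the RFC 4122 version and variant bits in the IDs it generates.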

CouchDB: How to change view function via javascript?

- by osti

Hello Guys, I am playing around with CouchDB to test if it is "possible" [1] to store scientific data (simulated and experimental raw data + metadata). A big pro is the schema-less approach of CouchDB: we have to be very flexible with the metadata, as the set of parameters changes very often. Up to now I have some code to feed raw data, plots (both as attachments), and hierarchical metadata (as JSON) into CouchDB documents, and have written some prototype Javascript for filtering and showing. But the filtering is done on the client side (a.k.a. browser): The map function simply returns everything. How could I change the (or push a second) map function of a specific _design-document with simple browser-JS? I do not think that a temporary view would yield any performance gain... Thanks for your time and answers. [1]: of course it is possible, but is it also useful? feasible? reasonable?

How to handle very frequent updates to a Lucene index

- by fsm

I am trying to prototype an indexing/search application which uses very volatile indexing data sources (forums, social networks etc). Here are some of the performance requirements:

1. Very fast turn-around time: any new data (such as a new message on a forum) should be available in the search results very soon (less than a minute).
2. I need to discard old documents on a fairly regular basis to ensure that the search results are not dated.
3. Last but not least, the search application needs to be responsive (latency on the order of 100 milliseconds, and it should support at least 10 qps).

All of the requirements I currently have can be met without using Lucene (and that would let me satisfy all of 1, 2 and 3), but I am anticipating other requirements in the future (like search relevance etc) which Lucene makes easier to implement. However, since Lucene is designed for use cases far more complex than the one I'm currently working on, I'm having a hard time satisfying my performance requirements. Here are some questions:

a. I read that the optimize() method in the IndexWriter class is expensive and should not be used by applications that do frequent updates. What are the alternatives?
b. In order to do incremental updates, I need to keep committing new data and also keep refreshing the index reader to make sure it has the new data available. These are going to affect 1 and 3 above. Should I try duplicate indices? What are some common approaches to solving this problem?
c. I know that Lucene provides a delete method which lets you delete all documents that match a certain query. In my case, I need to delete all documents which are older than a certain age. One option is to add a date field to every document and use that to delete documents later. Is it possible to do range queries on document ids (I can create my own id field, since I think the one created by Lucene keeps changing) to delete documents? Is it any faster than comparing dates represented as strings?

I know these are very open questions, so I am not looking for a detailed answer. I will try to treat all of your answers as suggestions and use them to inform my design. Thanks! Please let me know if you need any other information.
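On the last point, a date stored as a fixed-width string sorts in the same order as the underlying timestamps, so a plain string range can select everything older than a cutoff. A minimal Python sketch of that property follows; the key format, the document ids and the `sortable_key` helper are assumptions for illustration, and the string comparison merely stands in for a range query on a date field.

```python
from datetime import datetime, timezone


def sortable_key(moment):
    """Fixed-width UTC timestamp; lexicographic order equals chronological order."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")


# Hypothetical documents mapped to their indexed date keys.
docs = {
    "post-1": sortable_key(datetime(2010, 3, 1, 12, 0, tzinfo=timezone.utc)),
    "post-2": sortable_key(datetime(2010, 3, 20, 8, 30, tzinfo=timezone.utc)),
    "post-3": sortable_key(datetime(2010, 4, 2, 9, 15, tzinfo=timezone.utc)),
}

cutoff = sortable_key(datetime(2010, 3, 15, tzinfo=timezone.utc))

# Plain string comparison stands in for a term range query on the date field.
expired = sorted(doc_id for doc_id, key in docs.items() if key < cutoff)
print(expired)
```

Coarser keys (for example day resolution) produce fewer distinct terms, which is one knob to consider if only the age in days matters.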

Adding functions to JavaScript View Engine in CouchDB 0.11

- by fuzzy lollipop

I want to format dates in a specific format, but I can't figure out how to add my date function to the JavaScript view engine in CouchDB. Preferably I would like to add my formatDate() function to the Date prototype so that it is available everywhere. Does anyone know how to add this so that the CouchDB JavaScript view engine will see it?

Creating views with PHP for couchDB

- by Industrial

Hi! I have started to try out NoSQL databases and am currently testing CouchDB. It seems like a good solution, but I get a real headache when I follow the available examples on how to create views (queries) to select documents from a database and sort them. Everything I can find is about JavaScript, and it would be great to see some examples for PHP, since that is the language we will use. So, how do I create views using PHP for CouchDB?

CouchDB on Windows?

- by epitka

I started exploring CouchDB and I am interested in the following: Is there, or will there be, a Windows install? If there is, is there a shared hosting provider that offers CouchDB? Not knowing much about it: can it somehow be embedded in my application or bin deployed (don't laugh)?

Deleting document attachments in CouchDb

- by henrik_lundgren

In CouchDB's documentation, the described method of deleting document attachments is to send a DELETE call to the attachment's URL. However, I have noticed that if you edit the document and remove the attachment stub from the _attachments field, it will not be accessible anymore. If I remove foo.txt from the document below and save it to CouchDB, it will be gone the next time I access the document:

    {
      "_id": "attachment_doc",
      "_rev": 1589456116,
      "_attachments": {
        "foo.txt": {
          "stub": true,
          "content_type": "text/plain",
          "length": 29
        }
      }
    }

Is the attachment actually deleted on disk, or is just the reference to it deleted?

CouchDB in-place updates

- by Jason

Hi, according to http://wiki.apache.org/couchdb/Document_Update_Handlers, CouchDB (0.10 and above) now supports in-place updates. I'm having trouble understanding how this works. I tried to use the example provided but I couldn't get it to work. Can someone provide some examples, and the URIs used to access the in-place updates? Thanks

Where does lucene .net cache the search results?

- by Lanceomagnifico

Hi, I'm trying to figure out where Lucene stores the cached query results, how it's configured to do so, and how long it caches for. This is for an ASP.NET 3.5 solution.

I'm getting this problem: if I run a search and sort the result by a particular product field, it seems to work the very first time each search and sort combination is used. If I then go in and change some product attributes, reindex and run the same search and sort, I get the products returned in the same order as the very first result.

Example:

    Product A is named: foo
    Product B is named: bar

For the first search, sort by name desc. This results in:

    Product A
    Product B

Now mix up the data a bit. Change the names to:

    Product A named: bar
    Product B named: foo

Reindex, and verify that the index contains the changes for these two products. Search again. Result:

    Product A
    Product B

Since I changed the alphabetical order of the names, I expected:

    Product B
    Product A

So I think that Lucene is caching the search results. (Which, btw, is a very good thing.) I just need to know where/how to clear these results. I've tried deleting the index files and doing an IISreset to clear the memory, but it seems to have no effect. So I'm thinking there is another set of Lucene files outside of the indexes that Lucene uses for caching.

EDIT: I just found out that the field you wish to sort on must be indexed as un-tokenized. I had the field as tokenized, so sorting didn't work.

Lucene.net create+lock errors in ASP.NET

- by acidzombie24

I have an issue with Lucene.net. It throws a lock exception. After poking around I noticed these things:

My code below works in an app, but when I call it in Application_Start I get a NoSuchDirectoryException.

If I do not close the writer (my code below does not), I WILL get a LockObtainFailedException with the message "Lock obtain timed out: SimpleFSLock@<FULL_PATH>" from either the app or ASP.NET.

The threads linked below hint that spawned threads get fewer permissions than I do (but my main thread has problems as well...) and that one solution is to impersonate IIS. I am using Visual Studio 2010. I am not sure how full-blown it is, but my attempt at impersonation failed.

So my question is: how do I have Lucene create the directory, and not throw an exception if I don't close the writer for some reason (such as the power going out)?

http://stackoverflow.com/questions/2341163/why-is-my-lucene-index-getting-locked/2499285#2499285
http://stackoverflow.com/questions/1123517/lucene-net-and-i-o-threading-issue/1123981#1123981

    static IndexWriter writer = null;

    static void lucene_init()
    {
        bool create = false;
        string dirname = "LuceneIndex_z";
        if (System.IO.Directory.Exists(dirname) == false)
            create = true;
        var directory = FSDirectory.GetDirectory(dirname);
        var analyzer = new StandardAnalyzer();
        writer = new IndexWriter(directory, analyzer, create);
    }

Lucene.net create+lock errors in ASP.NET

- by acidzombie24

I have an issue with Lucene.net. It throws a lock exception. After poking around I noticed these things:

My code below works in an app, but when I call it in Application_Start I get a NoSuchDirectoryException.

If I do not close the writer (my code below does not), I WILL get a LockObtainFailedException with the message "Lock obtain timed out: SimpleFSLock@<FULL_PATH>" from either the app or ASP.NET.

The threads linked below hint that spawned threads get fewer permissions than I do (but my main thread has problems as well...) and that one solution is to impersonate IIS. I am using Visual Studio 2010. I am not sure how full-blown it is, but my attempt at impersonation failed.

So my question is: how do I have Lucene create the directory, and not throw an exception if I don't close the writer for some reason (such as the power going out)?

http://stackoverflow.com/questions/2341163/why-is-my-lucene-index-getting-locked/2499285#2499285
http://stackoverflow.com/questions/1123517/lucene-net-and-i-o-threading-issue/1123981#1123981

    static IndexWriter writer = null;

    static void lucene_init()
    {
        bool create = false;
        // I now use a full path. I still get NoSuchDirectoryException
        //string dirname = "LuceneIndex_z";
        if (System.IO.Directory.Exists(dirname) == false)
            create = true;
        var directory = FSDirectory.GetDirectory(dirname);
        var analyzer = new StandardAnalyzer();
        writer = new IndexWriter(directory, analyzer, create);
    }

Search Results

Search found 631 results on 26 pages for 'couchdb lucene'.
