lucene - Page 4 - Developer IT

Couple o' quick questions on Apache Lucene

- by Doug

-- I don't want to start any religious wars, but a quick google search indicates that Apache Lucene is the preferred open source tool for indexing and searching. Are there others? -- What file format does Lucene use to store its index file(s)? Thank is advance. Doug

Read the article

lucene index missing files

- by Akhil

I have _0.cfs file of a lucene index directory but segments.gen and segments_2 are missing. Can I generate the segments.gen and segments_2 files without having to regenerate the _0.cfs file. Does these "segments" files contain any index specific data, which will thus force me to regnerate the entire index again. Or can I just generate the two "segments" file by copying these from another lucen index directory gnerated with the same lucene version.

Read the article

Lucene.NET performance

- by Paul Knopf

I have a website that runs of a third party search provider that is expensive. I am going to roll my own. Is Lucene.NET capable of ~25,000 products (or documents), each with maybe ten attributes used for filtering? I am looking to do a "narrow/drill down" or "faceted search". Does that sound like to much to ask from Lucene.NET?

Read the article

NHibernate.Search with Lucene.NET without using DB?

- by Samnang

Could I use NHibernate.Search only with lucene’s index without database? Because I would like to store all data only in my lucene’s index, but I really like features in NHibernate.Search.

Read the article

Where can I find open source applications that use lucene.net

- by Luke101

I am looking for any open source application that uses lucene.net. I am working on a complicated web application and would like to see how others have implemented lucene.net.

Read the article

Resources for getting started with Lucene.Net?

- by Matt Dotson

I'm building a simple site that allows users to post text content and I want to add it to a search index as it gets posted, so my site search is up to date. From what I can tell Lucene.NET is a good full text search framework. I've found very few examples of how to use it though. Can anyone post some good references for learning about Lucene?

Read the article

Apache Lucene or another Search in iPhone app

- by lostInTransit

Hi I would like to implement a search functionality within my iPhone app which can search for terms within all the documents in the application. I believe I cannot use Apache Lucene directly since it is in Java. Can I use Lucy which is a C port of Lucene (not sure if Perl and Ruby would work on it)? Or is there any other open-source search engine which I can use in my iPhone app for search within the app? Thanks

Read the article

C# Lucene get all the index

- by ngc224

Hello, I am working on a windows application using Lucene. I want to get all the indexed keywords and use them as a source for a auto-suggest on search field. How can I receive all the indexed keywords in Lucene? I am fairly new in C#. Code itself is appreciated. Thanks.

Read the article

What is the VInt in Lucene ?

- by Mehdi Amrollahi

I want to know what is the VInt in Lucene ? I read this article , but i don't understand what is it and where does Lucene use it ? Thanks .

Read the article

What is the segment in Lucene ?

- by Mehdi Amrollahi

I don't understand , what are the segments in Lucene ? What are the benefits of Lucene's segments? Thanks .

Read the article

Lucene: How to have more than 100 results?

- by zshis

Hi all, my question is simple but i cant fin de answer. Is there a way to set in Lucene to retrieve an amount of results higher than 100 in a query? Im using lucene 2.4.0 now. Thanks all.

Read the article

Solr/Lucene user click based ranking

- by Danim

I am facing the problem of sort Lucene results based on user click log. I would like that more accessed results comes first. Does anyone knows how to configure or implement such property in Lucene or Solr? Thank you very much.

Read the article

Indexing different type of Entities/Objects with Solr Lucene

- by Yos

Let's say I want to index my shop using Solr Lucene. I have many types of entities : Products, Product Reviews, Articles How do I get my Lucene to index those types, but each type with different Schema ?

Read the article

How get the offset of term in Lucene ?

- by Mehdi Amrollahi

I want to get the offset of one term in the Lucene . How can i get it ? I vectored my content as Field.TermVector.WITH_POSITIONS_OFFSETS Is there any method in Lucene that give me offset of the term in one Document ?

Read the article

Read huge free text docs in one file for lucene indexing

- by Jun

I have heaps of free text news docs in one big file. The structure of each news doc is like: (Header line) Category, Doc1, Date (day, month, year) (body text) ... ... ... (Header line) Category, Doc2, Date (day, month, year) (body text) ... ... ... If I extract each doc from the big file, it costs too much time and not efficient. Therefore, I decide to read the file line by line and feed information to lucene the same time. I write c# code to index each doc to lucene like: Streamreader sr = new Streamreader(file); string line = ""; while((line = sr.ReadLine()) != null) { How can I tell this line is a doc header line from text line and get the metadata and all the text lines of a doc for lucene to index. Also, the text is read by OCR which can not give correct line-separating. Captions are mixed with content text iterate the process till the end of the file } with thanks

Read the article

How do i implement tag searching with lucene?

- by acidzombie24

I havent used lucene. Last time i ask (many months ago, maybe a year) people suggested lucene. As am example say there are 3 items tag like this apples carrots apples carrots apple banana if a user search apples i dont care if there is any preference from 1,2 and 4. However i seen many forums do this which i hated is when a user search apple carrots 2 and 3 are get high results while 1 is hard to find even though it matches my search more closely. I HATED this in forums. Also i would like the ability to do search carrots -apples which will only get me 3. I am not sure what should happen if i search carrots banana but anyways as long as more 2 and 3 results are lower priority then 1 when i search apples carrots i'll be happy. Can lucene do this? and where do i start? i see a lot of classes and many of them talk about docs. What should i use for tagging?

Read the article

Luke like tool in C#

- by Kapil

Is there are Luke like tool for viewing lucene indexes in C# using the Lucene.NET api?

Read the article

Building a case for solr

- by Midhat

Our product consists of multiple applications, All using Lucene. 2 of the applications I am involved with have Lucene indexes of about 3 GB and 12GB. Another team is building an application, for which they estimate the LUCENE INDEX size to be close to 1 Terabyte. New documents are added to the indexes every 15 days approx. We do not have any apparent performance issues with the current applications. So my question is SHould we be using Solr now? When should one stop using Lucene and graduate to Solr? Any disadvantages/problems for using Solr? The client applications are made in ASP.Net, but I assume they will be able to use a solr server using solrnet

Read the article

Tokenizing Twitter Posts in Lucene

- by Amaç Herdagdelen

Hello, My question in a nutshell: Does anyone know of a TwitterAnalyzer or TwitterTokenizer for Lucene? More detailed version: I want to index a number of tweets in Lucene and keep the terms like @user or #hashtag intact. StandardTokenizer does not work because it discards the punctuation (but it does other useful stuff like keeping domain names, email addresses or recognizing acronyms). How can I have an analyzer which does everything StandardTokenizer does but does not touch terms like @user and #hashtag? My current solution is to preprocess the tweet text before feeding it into the analyzer and replace the characters by other alphanumeric strings. For example, String newText = newText.replaceAll("#", "hashtag"); newText = newText.replaceAll("@", "addresstag"); Unfortunately this method breaks legitimate email addresses but I can live with that. Does that approach make sense? Thanks in advance! Amaç

Read the article

Lucene Error While Reading binary block : java.io.EOFException

- by tushar Khairnar

Hi, I am getting java.io.EOFException while reading a binary block from lucene index. I am storing java object as byte-array in lucene index field and reading it when hit occurs. Here is stack trace : Caused by: java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780) at java.io.ObjectInputStream.(ObjectInputStream.java:280) at org.terracotta.modules.searchable.util.SerializationUtil$OIS.(SerializationUtil.java:20) I have some background threads which write into index. But i buffer them and then write them at once like 1000. Occasionally I also issue optimize() on index. When I write, I am re-opening IndexReader. Does this is happening because of IndexReader re-opening call? Thanks. Regards Tushar

Read the article

Lucene numDocs and doqFreq on custom similarity class

- by David A

Hi All, im doing an aplication with Lucene (im a noob with it) and im facing some problems. My aplication uses the Lucene 2.4.0 library with a custom similaraty implementation (the jar is imported) In my app im calculating doqFreq and numDocs manually (im adding the values of all indexes and then i calculate a global value in order to use it on every query) and i want to use that values on a custom similarity implementation in order to calculate a new IDF. The problem is that I dont know how to use (or send) the new doqFreq and numDocs values from my app on that new similarty implementation as I dont want to change lucene´s code apart from this extra class. Any suggestions or examples? I read the docs but i dont now how to aproach this :s Thanks

Read the article

Lucene.NET search index approach

- by Tim Peel

Hi, I am trying to put together a test case for using Lucene.NET on one of our websites. I'd like to do the following: Index in a single unique id. Index across a comma delimitered string of terms or tags. For example. Item 1: Id = 1 Tags = Something,Separated-Term I will then be structuring the search so I can look for documents against tag i.e. tags:something OR tags:separate-term I need to maintain the exact term value in order to search against it. I have something running, and the search query is being parsed as expected, but I am not seeing any results. Here's some code. My parser (_luceneAnalyzer is passed into my indexing service): var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_CURRENT, "Tags", _luceneAnalyzer); parser.SetDefaultOperator(QueryParser.Operator.AND); return parser; My Lucene.NET document creation: var doc = new Document(); var id = new Field( "Id", NumericUtils.IntToPrefixCoded(indexObject.id), Field.Store.YES, Field.Index.NOT_ANALYZED, Field.TermVector.NO); var tags = new Field( "Tags", string.Join(",", indexObject.Tags.ToArray()), Field.Store.NO, Field.Index.ANALYZED, Field.TermVector.YES); doc.Add(id); doc.Add(tags); return doc; My search: var parser = BuildQueryParser(); var query = parser.Parse(searchQuery); var searcher = Searcher; TopDocs hits = searcher.Search(query, null, max); IList<SearchResult> result = new List<SearchResult>(); float scoreNorm = 1.0f / hits.GetMaxScore(); for (int i = 0; i < hits.scoreDocs.Length; i++) { float score = hits.scoreDocs[i].score * scoreNorm; result.Add(CreateSearchResult(searcher.Doc(hits.scoreDocs[i].doc), score)); } return result; I have two documents in my index, one with the tag "Something" and one with the tags "Something" and "Separated-Term". It's important for the - to remain in the terms as I want an exact match on the full value. When I search with "tags:Something" I do not get any results. Question What Analyzer should I be using to achieve the search index I am after? Are there any pointers for putting together a search such as this? Why is my current search not returning any results? Many thanks

Read the article

Lucene 2.2 arabic analyzer

- by Mustafa Zidan

Is it possible to modify Lucene 2.2 to add Arabic analyzer and if anyone have done this already where can I get source/jar

Read the article

Lucene real-time indexing?

- by Kip

What is the best way to achieve Lucene real-time indexing?

Read the article

Get highest frequency terms from Lucene index

- by Julia

Hello! i need to extract terms with highest frequencies from several lucene indexes, to use them for some semantic analysis. So, I want to get maybe top 30 most occuring terms(still did not decide on threshold, i will analyze results) and their per-index counts. I am aware that I might lose some precision because of potentionally dropped duplicates, but for now, lets say i am ok with that. So for the proposed solutions, (needless to say maybe) speed is not important, since I would do static analysis, I would put accent on simplicity of implementation because im not so skilled with Lucene (not the programming guru too :/ ) and cant wrap my mind around many concepts of it.. I can not find any code samples from something similar, so all concrete advices (code, pseudocode, links to code samples...) I will apretiate very much!!! Thank you!

Search Results

Search found 393 results on 16 pages for 'lucene'.

Page 4/16 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >

- by Doug

- by Akhil

- by Paul Knopf

- by Samnang

- by Luke101

- by Matt Dotson

- by lostInTransit

- by ngc224

- by Mehdi Amrollahi

- by Mehdi Amrollahi

- by zshis

- by Danim

- by Yos

- by Mehdi Amrollahi

- by Jun

- by acidzombie24

- by Kapil

- by Midhat

- by Amaç Herdagdelen

- by tushar Khairnar

- by David A

- by Tim Peel

- by Mustafa Zidan

- by Kip

- by Julia

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12 | Next Page >