Search Results

Search found 12803 results on 513 pages for 'lucene index'.

Page 3/513 | < Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >

  • Using Lucene to Query File properties in Windows

    - by sneha
    Hi All, I am planning to use Apache lucense in one of my projects, I want to index files based on the file properties (I won’t be indexing the data) and I want lucense to query the index so that I can quickly find list of files to based on the properties . E.g: give me all the files with access time greater than 10/10/2005 and access time less than 10/04/2010 and file created by james. Can i use Lucene for these kind of projects ? or i better of using windows search (the foor print is very heavy almost 5 MB :( ) and i have to bundling this as part of my application is seems to tough. Can you please suggest is there any better alternatives here?

    Read the article

  • My Lucene queries only ever find one hit

    - by Bob
    I'm getting started with Lucene.Net (stuck on version 2.3.1). I add sample documents with this: Dim indexWriter = New IndexWriter(indexDir, New Standard.StandardAnalyzer(), True) Dim doc = Document() doc.Add(New Field("Title", "foo", Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.NO)) doc.Add(New Field("Date", DateTime.UtcNow.ToString, Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.NO)) indexWriter.AddDocument(doc) indexWriter.Close() I search for documents matching "foo" with this: Dim searcher = New IndexSearcher(indexDir) Dim parser = New QueryParser("Title", New StandardAnalyzer()) Dim Query = parser.Parse("foo") Dim hits = searcher.Search(Query) Console.WriteLine("Number of hits = " + hits.Length.ToString) No matter how many times I run this, I only ever get one result. Any ideas?

    Read the article

  • Lucene.Net support phrases?: What is best approach to tokenize comma-delimited data (atomically) in

    - by Pete Alvin
    I have a database with a column I wish to index that has comma-delimited names, e.g., User.FullNameList = "Helen Ready, Phil Collins, Brad Paisley" I prefer to tokenize each name atomically (name as a whole searchable entity). What is the best approach for this? Did I miss a simple option to set the tokenize delimiter? Do I have to subclass or write my own class that to roll my own tokenizer? Something else? ;) Or does Lucene.net not support phrases? Or is it smart enough to handle this use case automatically? I'm sure I'm not the first person to have to do this. Googling produced no noticeable solutions.

    Read the article

  • Hyphens in Lucene

    - by user72185
    Hi, I'm playing around with Lucene and noticed that the use of a hyphen (e.g. "semi-final") will result in two words ("semi" and "final" in the index. How is this supposed to match if the users searches for "semifinal", in one word? Edit: I'm just playing around with the StandardTokenizer class actually, maybe that is why? Am I missing a filter? Thanks! (Edit) My code looks like this: StandardAnalyzer sa = new StandardAnalyzer(); TokenStream ts = sa.TokenStream("field", new StringReader("semi-final")); while (ts.IncrementToken()) { string t = ts.ToString(); Console.WriteLine("Token: " + t); }

    Read the article

  • Disabling scoring in Lucene(.NET)

    - by user72185
    Hi, When searching, is there a way to disable scoring for any query? The scenario is that the user refines his query by trying different combinations of words, phrases etc., and needs realtime (well, reasonably fast at least) responses on the number of hits. Search time slows down a lot when there are millions of hits due to scoring, but the user really doesn't care about all these documents. As soon as he sees there are 1M+ hits he will start adding additional words to the query. A "Sort by relevance" option would allow him to do this quickly, while turning scoring back on when the number of hits is reasonable. Is this possible? I'm using Lucene.NET 2.9.2 but AFAIK it is identical to the Java version.

    Read the article

  • 301 redirect from "/index.html" to root if index.html not exist

    - by Andrij Muzychka
    Can I create 301 redirect from "index.html" to root directory if file "index.html" not exist? For example: link "http://example.com/index.html" show "404 Error" page. I need 301 redirect to root directory: "http://example.com/" in .htaccess I add rule: Options +FollowSymLinks RewriteCond %{THE_REQUEST} ^.*/index.html RewriteRule ^(.*)index.html$ http://example.com/$1 [R=301,L] but it doesn't work. Can you help me solve this problem?

    Read the article

  • SQL SERVER Size of Index Table for Each Index Solution 2

    Earlier I had ran puzzle where I asked question regarding size of index table for each index in database over here SQL SERVER Size of Index Table A Puzzle to Find Index Size for Each Index on Table. I had received good amount answers and I had blogged about that here SQL SERVER [...]...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Lucene .Net Searching with TermVector

    - by Ashish
    in Lucene.Net,i am creating the document for searching a word and want to display before 10 words and after 10 words.i have used TermVector. Lucene.Net.Documents.Field fldContent = new Lucene.Net.Documents.Field("content", content, Lucene.Net.Documents.Field.Store.YES, Lucene.Net.Documents.Field.Index.TOKENIZED, Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS); Can anyone help me how to find out the keyword position and extract nearest 15 words. please send some code. Thanks Ashish

    Read the article

  • Get highest frequency terms from Lucene index

    - by Julia
    Hello! i need to extract terms with highest frequencies from several lucene indexes, to use them for some semantic analysis. So, I want to get maybe top 30 most occuring terms(still did not decide on threshold, i will analyze results) and their per-index counts. I am aware that I might lose some precision because of potentionally dropped duplicates, but for now, lets say i am ok with that. So for the proposed solutions, (needless to say maybe) speed is not important, since I would do static analysis, I would put accent on simplicity of implementation because im not so skilled with Lucene (not the programming guru too :/ ) and cant wrap my mind around many concepts of it.. I can not find any code samples from something similar, so all concrete advices (code, pseudocode, links to code samples...) I will apretiate very much!!! Thank you!

    Read the article

  • how do i filter my lucene search results?

    - by Andrew Bullock
    Say my requirement is "search for all users by name, who are over 18" If i were using SQL, i might write something like: Select * from [Users] Where ([firstname] like '%' + @searchTerm + '%' OR [lastname] like '%' + @searchTerm + '%') AND [age] >= 18 However, im having difficulty translating this into lucene.net. This is what i have so far: var parser = new MultiFieldQueryParser({ "firstname", "lastname"}, new StandardAnalyser()); var luceneQuery = parser.Parse(searchterm) var query = FullTextSession.CreateFullTextQuery(luceneQuery, typeof(User)); var results = query.List<User>(); How do i add in the "where age = 18" bit? I've heard about .SetFilter(), but this only accepts LuceneQueries, and not IQueries. If SetFilter is the right thing to use, how do I make the appropriate filter? If not, what do I use and how do i do it? Thanks! P.S. This is a vastly simplified version of what I'm trying to do for clarity, my WHERE clause is actually a lot more complicated than shown here. In reality i need to check if ids exist in subqueries and check a number of unindexed properties. Any solutions given need to support this. Thanks

    Read the article

  • "no inclosing instance error " while getting top term frequencies for document from Lucene index

    - by Julia
    Hello ! I am trying to get the most occurring term frequencies for every particular document in Lucene index. I am trying to set the treshold of top occuring terms that I care about, maybe 20 However, I am getting the "no inclosing instance of type DisplayTermVectors is accessible" when calling Comparator... So to this function I pass vector of every document and max top terms i would like to know protected static Collection getTopTerms(TermFreqVector tfv, int maxTerms){ String[] terms = tfv.getTerms(); int[] tFreqs = tfv.getTermFrequencies(); List result = new ArrayList(terms.length); for (int i = 0; i < tFreqs.length; i++) { TermFrq tf = new TermFrq(terms[i], tFreqs[i]); result.add(tf); } Collections.sort(result, new FreqComparator()); if(maxTerms < result.size()){ result = result.subList(0, maxTerms); } return result; } /Class for objects to hold the term/freq pairs/ static class TermFrq{ private String term; private int freq; public TermFrq(String term,int freq){ this.term = term; this.freq = freq; } public String getTerm(){ return this.term; } public int getFreq(){ return this.freq; } } /*Comparator to compare the objects by the frequency*/ class FreqComparator implements Comparator{ public int compare(Object pair1, Object pair2){ int f1 = ((TermFrq)pair1).getFreq(); int f2 = ((TermFrq)pair2).getFreq(); if(f1 > f2) return 1; else if(f1 < f2) return -1; else return 0; } } Explanations and corrections i will very much appreciate, and also if someone else had experience with term frequency extraction and did it better way, I am opened to all suggestions! Please help!!!! Thanx!

    Read the article

  • Webserver directory index: index.xml?

    - by Marius
    Hello there, I am making my first RSS-Feed, and I want to host it like this: www.example.com/rss/ I tried to name the xml-file "index.xml" and place it inside the directory, however, when I type http://www.example.com/rss/ i arrive at "Index of /rss" where the file is listed as being part of the directory, but it is not loaded automatically. What can be done about this? Thank you for your time. Kind regards, Marius

    Read the article

  • How can I index HTML documents?

    - by Swami
    I am using Lucene .NEt to do full-text searching. Till now I have been indexing PDF docs, but now I have a few webpages that I need to index. What's the best/easiest way to index HTML documents to add to my Lucene index? I am using .NET/C#

    Read the article

  • How do I get Lucene (.NET) to highlight correctly with wildcards?

    - by Scott Stafford
    I am using the Lucene.NET API directly in my ASP.NET/C# web application. When I search using a wildcard, like "fuc*", the highlighter doesn't highlight anything, but when I search for the whole word, like "fuchsia", it highlights fine. Does Lucene have the ability to highlight using the same logic it used to match with? Various maybe-relevant code-snippets below: var formatter = new Lucene.Net.Highlight.SimpleHTMLFormatter( "<span class='srhilite'>", "</span>"); var fragmenter = new Lucene.Net.Highlight.SimpleFragmenter(100); var scorer = new Lucene.Net.Highlight.QueryScorer(query); var highlighter = new Lucene.Net.Highlight.Highlighter(formatter, scorer); highlighter.SetTextFragmenter(fragmenter); and then on each hit... string description = Server.HtmlEncode(doc.Get("Description")); var stream = analyzer.TokenStream("Description", new System.IO.StringReader(description)); string highlighted_text = highlighter.GetBestFragments( stream, description, 1, "..."); And I'm using the QueryParser and the StandardAnalyzer.

    Read the article

  • How do I detect if there is already a similar document stored in Lucene index.

    - by Jenea
    Hi. I need to exclude duplicates in my database. The problem is that duplicates are not considered exact match but rather similar documents. For this purpose I decided to use FuzzyQuery like follows: var fuzzyQuery = new global::Lucene.Net.Search.FuzzyQuery( new Term("text", queryText), 0.8f, 0); hits = _searcher.Search(query); The idea was to set the minimal similarity to 0.8 (that I think is high enough) so only similar documents will be found excluding those that are not sufficiently similar. To test this code I decided to see if it finds already existing document. To the variable queryText was assigned a value that is stored in the index. The code from above found nothing, in other words it doesn't detect even exact match. Index was build by this code: doc.Add(new global::Lucene.Net.Documents.Field( "text", text, global::Lucene.Net.Documents.Field.Store.YES, global::Lucene.Net.Documents.Field.Index.TOKENIZED, global::Lucene.Net.Documents.Field.TermVector.WITH_POSITIONS_OFFSETS)); I followed recomendations from bellow and the results are: TermQuery doesn't return any result. Query contructed with var _analyzer = new RussianAnalyzer(); var parser = new global::Lucene.Net.QueryParsers .QueryParser("text", _analyzer); var query = parser.Parse(queryText); var _searcher = new IndexSearcher (Settings.General.Default.LuceneIndexDirectoryPath); var hits = _searcher.Search(query); Returns several results with the maximum score the document that has exact match and other several documents that have similar content.

    Read the article

  • Search for index.php and index.html and replace string

    - by Jonas
    Hello. I recently had some sort of Malware on my computer that added to all index.php and index.html ON THE WEBSERVER! the following string(s): echo "<iframe src=\"http://fabujob.com/?click=AD4A4\" width=1 height=1 style=\"visibility:hidden;position:absolute\"></iframe>"; echo "<iframe src=\"http://fabujob.com/?click=AC785\" width=1 height=1 style=\"visibility:hidden;position:absolute\"></iframe>"; So the parameter after "click=" always changes. These two were only examples. Is there a way to do that quick and fast? . . EDIT: It is on my webserver, so no use of find...

    Read the article

  • SQL SERVER Understanding ALTER INDEX ALL REBUILD with Disabled Clustered Index

    This blog is in response to the ongoing communication with the reader who had earlier asked the question of SQL SERVER Disable Clustered Index and Data Insert. The same reader has asked me the difference between ALTER INDEX ALL REBUILD and ALTER INDEX REBUILD along with disabled clustered index. Instead of writing a big [...]...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • SQL SERVER – Video – Performance Improvement in Columnstore Index

    - by pinaldave
    I earlier wrote an article about SQL SERVER – Fundamentals of Columnstore Index and it got very well accepted in community. However, one of the suggestion I keep on receiving for that article is that many of the reader wanted to see columnstore index in the action but they were not able to do that. Some of the readers did not install SQL Server 2012 or some did not have good machine to recreate the big table involved in the demo. For the same reason, I have created small video for that. I have written two more article on columstore index. Please read them as followup to the video: SQL SERVER – How to Ignore Columnstore Index Usage in Query SQL SERVER – Updating Data in A Columnstore Index Reference: Pinal Dave (http://blog.SQLAuthority.com) Filed under: Pinal Dave, PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Scripts, SQL Server, SQL Tips and Tricks, T SQL, Technology, Video

    Read the article

  • How important is index size when searching?

    - by Michael K
    My company has recently began using Apache Solr to search its data. As we learn how to use it we have gone down the path of indexing multiple fields to get the results we need. Most of these are either N-Grammed or Edge-N-Grammed. Gramming by nature takes up a lot of space, which takes more time to search. Space is cheap, but time is less so. Index time is not too important, since a delta-import (only get the changes since last index) is extremely quick and you only pay a penalty on the first import. What we've not been able to determine is what effect the index size has on query times. Obviously a larger index takes longer to search, but the time added by n-gramming a field is difficult to predict. How do you determine whether a field is worth gramming? Can you predict how much longer a query will take when you gram a field?

    Read the article

  • How do i keep my DB and lucene in sync?

    - by acidzombie24
    So i can have a transaction in sql. But i am sure its not a good idea to wait in the middle of a transaction for lucene to finish also i am unsure if lucene is permanently saved in the DB until i do something there. Whats the best way to keep my DB and lucene in sync? I am thinking of adding a lucene_queue in my sql db and everytime i make a change i add it into the queue (removing older queue if any) and delete it once it is done. Is this the best way? Also i am unsure how to make lucene permanently keep the changes i made and how frequent i can/should do it.

    Read the article

  • Should I specify both INDEX and UNIQUE INDEX?

    - by Matt Huggins
    On one of my PostgreSQL tables, I have a set of two fields that will be defined as being unique in the table, but will also both be used together when selecting data. Given this, do I only need to define a UNIQUE INDEX, or should I specify an INDEX in addition to the UNIQUE INDEX? This? CREATE UNIQUE INDEX mytable_col1_col2_idx ON mytable (col1, col2); Or this? CREATE UNIQUE INDEX mytable_col1_col2_uidx ON mytable (col1, col2); CREATE INDEX mytable_col1_col2_idx ON mytable (col1, col2);

    Read the article

  • My website google index suddenly increase and also suddenly reduced

    - by Jeg Bagus
    Yesterday before i sleep, i check my site index. i get about 50 index on google. today morning when i wake up, i get 250 index on google. and my page ranking better on several keyword. than i add 1 page and 2 canonical link, add 404 page header, and resubmit sitemap. and after 2 hour, its going down to 50 index again. and my page ranking just rolled back to previous day. what is actually happen? is it because i resubmit sitemap? until now, google still crawl my website. do they try to refresh the index?

    Read the article

  • pylucene: install error

    - by Pradeep
    I am trying to install Pylucene (pylucene-3.3-3-src.tar.gz) on my ubuntu linux 11.10. I have python 2.7.2. I was able to compile JCC (I think) because I didnt see any error when I installed it. When I tried to install Pylucene I get the following error. Can someone help? Thanks. ICU not installed /usr/bin/python -m jcc --shared --jar lucene-java-3.3/lucene/build/lucene-core-3.3.jar --jar lucene-java-3.3/lucene/build/contrib/analyzers/common/lucene-analyzers-3.3.jar --jar lucene-java-3.3/lucene/build/contrib/memory/lucene-memory-3.3.jar --jar lucene-java-3.3/lucene/build/contrib/highlighter/lucene-highlighter-3.3.jar --jar build/jar/extensions.jar --jar lucene-java-3.3/lucene/build/contrib/queries/lucene-queries-3.3.jar --jar lucene-java-3.3/lucene/build/contrib/grouping/lucene-grouping-3.3.jar --package java.lang java.lang.System java.lang.Runtime --package java.util java.util.Arrays java.util.HashMap java.util.HashSet java.text.SimpleDateFormat java.text.DecimalFormat java.text.Collator --package java.util.regex --package java.io java.io.StringReader java.io.InputStreamReader java.io.FileInputStream --exclude org.apache.lucene.queryParser.Token --exclude org.apache.lucene.queryParser.TokenMgrError --exclude org.apache.lucene.queryParser.QueryParserTokenManager --exclude org.apache.lucene.queryParser.ParseException --exclude org.apache.lucene.search.regex.JakartaRegexpCapabilities --exclude org.apache.regexp.RegexpTunnel --exclude org.apache.lucene.analysis.cn.smart.AnalyzerProfile --python lucene --mapping org.apache.lucene.document.Document 'get:(Ljava/lang/String;)Ljava/lang/String;' --mapping java.util.Properties 'getProperty:(Ljava/lang/String;)Ljava/lang/String;' --sequence java.util.AbstractList 'size:()I' 'get:(I)Ljava/lang/Object;' --rename org.apache.lucene.search.highlight.SpanScorer=HighlighterSpanScorer --version 3.3 --module python/collections.py --module python/ICUNormalizer2Filter.py --module python/ICUFoldingFilter.py --module python/ICUTransformFilter.py --files 3 --build /usr/bin/python: No module named jcc make: *** [compile] Error 1 Here is my Makefile configuration which I uncommented PREFIX_PYTHON=/usr ANT=ant PYTHON=$(PREFIX_PYTHON)/bin/python JCC=$(PYTHON) -m jcc --shared NUM_FILES=3

    Read the article

< Previous Page | 1 2 3 4 5 6 7 8 9 10 11 12  | Next Page >