Search Results

Search found 12803 results on 513 pages for 'lucene index'.

Page 6/513 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >

Lucene.net create+lock errors in ASP.NET

- by acidzombie24

I have an issue with Lucene.net. It throws a lock exception. After poking around i notice these things. My code below works in an app bit when calling in Application_Start i get a NoSuchDirectoryException. Not closing the writer (as my code doesnt do below) i WILL get a LockObtainFailedException with the message Lock obtain timed out: SimpleFSLock@<FULL_PATH> from either app or asp.net These thread hinted when spawning threads they get less permissions then i do (but! my main thread has problems as well...) and one solution is to impersonate IIS. I am using visual studios 2010. I am not sure how full blown it is but my attempt to impersonate it failed. So my question is how do i have lucene create the directory and not throw an exception if dont close the writer for some reason (such as power going out)? http://stackoverflow.com/questions/2341163/why-is-my-lucene-index-getting-locked/2499285#2499285 http://stackoverflow.com/questions/1123517/lucene-net-and-i-o-threading-issue/1123981#1123981 static IndexWriter writer = null; static void lucene_init() { bool create = false; //I now use a full path. I still get NoSuchDirectoryException //string dirname = "LuceneIndex_z"; if (System.IO.Directory.Exists(dirname) == false) create = true; var directory = FSDirectory.GetDirectory(dirname); var analyzer = new StandardAnalyzer(); writer = new IndexWriter(directory, analyzer, create); }

Read the article
Lucene stop words not removed during searching need a substitute for AnalyzingQueryParser

- by iamrohitbanga

I have created a Lucene index with the following analyzer. public class DocSpecAnalyzer extends Analyzer { private static CharArraySet stopSet;// = new HashSet<String>(Arrays.asList());//STOP_WORDS_SET; static { stopSet = new CharArraySet(FDConstants.stopwords, true); // uncommenting this displays all the stop words // for (String s: FDConstants.stopwords) { // System.out.println(s); // } } /** * Specifies whether deprecated acronyms should be replaced with HOST type. * See {@linkplain https://issues.apache.org/jira/browse/LUCENE-1068} */ private final boolean enableStopPositionIncrements; private final Version matchVersion; public DocSpecAnalyzer(Version matchVersion) { this.matchVersion = matchVersion; enableStopPositionIncrements = StopFilter.getEnablePositionIncrementsVersionDefault(matchVersion); } public TokenStream tokenStream(String fieldName, Reader reader) { StandardTokenizer tokenStream = new StandardTokenizer(matchVersion, reader); tokenStream.setMaxTokenLength(DEFAULT_MAX_TOKEN_LENGTH); TokenStream result = new StandardFilter(tokenStream); result = new LowerCaseFilter(result); result = new StopFilter(enableStopPositionIncrements, result, stopSet); result = new PorterStemFilter(result); return result; } /** Default maximum allowed token length */ public static final int DEFAULT_MAX_TOKEN_LENGTH = 255; } Now when I search for documents for a query containing stop words, i get hits for stop words also. It is because of http://lucene.apache.org/java/2_9_2/api/contrib-misc/org/apache/lucene/queryParser/analyzing/AnalyzingQueryParser.html not handling stop words. Is there a substitute? Update: forgot to mention that I need to do a fuzzy search. that is why i am using an AnalyzingQueryParser. Update portion of code that invokes AnalyzingQueryParser AnalyzingQueryParser parser = new AnalyzingQueryParser(Version.LUCENE_CURRENT,"description", analyzer); // fuzzy matching preparation String fuzzyStr = TextQuery.prepareFuzzy(tq.text, fuzzyDist); Query query = parser.parse(fuzzyStr); TopScoreDocCollector collector = TopScoreDocCollector.create(numHits, true); searcher.search(query, collector);

Read the article
Lucene stop words not removed during searching

- by iamrohitbanga

I have created a Lucene index with the following analyzer. public class DocSpecAnalyzer extends Analyzer { private static CharArraySet stopSet;// = new HashSet<String>(Arrays.asList());//STOP_WORDS_SET; static { stopSet = new CharArraySet(FDConstants.stopwords, true); // uncommenting this displays all the stop words // for (String s: FDConstants.stopwords) { // System.out.println(s); // } } /** * Specifies whether deprecated acronyms should be replaced with HOST type. * See {@linkplain https://issues.apache.org/jira/browse/LUCENE-1068} */ private final boolean enableStopPositionIncrements; private final Version matchVersion; public DocSpecAnalyzer(Version matchVersion) { this.matchVersion = matchVersion; enableStopPositionIncrements = StopFilter.getEnablePositionIncrementsVersionDefault(matchVersion); } public TokenStream tokenStream(String fieldName, Reader reader) { StandardTokenizer tokenStream = new StandardTokenizer(matchVersion, reader); tokenStream.setMaxTokenLength(DEFAULT_MAX_TOKEN_LENGTH); TokenStream result = new StandardFilter(tokenStream); result = new LowerCaseFilter(result); result = new StopFilter(enableStopPositionIncrements, result, stopSet); result = new PorterStemFilter(result); return result; } /** Default maximum allowed token length */ public static final int DEFAULT_MAX_TOKEN_LENGTH = 255; } Now when I search for documents for a query containing stop words, i get hits for stop words also. As I post this problem, I found the bug. It is because of http://lucene.apache.org/java/2_9_2/api/contrib-misc/org/apache/lucene/queryParser/analyzing/AnalyzingQueryParser.html not handling stop words. Is there a substitute? Update: forgot to mention that I need to do a fuzzy search. that is why i am using an AnalyzingQueryParser.

Read the article
SQL Server – Learning SQL Server Performance: Indexing Basics – Video

- by pinaldave

Today I remember one of my older cartoon years ago created for Indexing and Performance. Every single time when Performance is discussed, Indexes are mentioned along with it. In recent times, data and application complexity is continuously growing. The demand for faster query response, performance, and scalability by organizations is increasing and developers and DBAs need to now write efficient code to achieve this. DBA and Developers A DBA’s role is critical, because a production environment has to run 24×7, hence maintenance, trouble shooting, and quick resolutions are the need of the hour. The first baby step into any performance tuning exercise in SQL Server involves creating, analysing, and maintaining indexes. Though we have learnt indexing concepts from our college days, indexing implementation inside SQL Server can vary. Understanding this behaviour and designing our applications appropriately will make sure the application is performed to its highest potential. Video Learning Vinod Kumar and myself we often thought about this and realized that practical understanding of the indexes is very important. One can not master every single aspects of the index. However there are some minimum expertise one should gain if performance is one of the concern. We decided to build a course which just addresses the practical aspects of the performance. In this course, we explored some of these indexing fundamentals and we elaborated on how SQL Server goes about using indexes. At the end of this course of you will know the basic structure of indexes, practical insights into implementation, and maintenance tips and tricks revolving around indexes. Finally, we will introduce SQL Server 2012 column store indexes. We have refrained from discussing internal storage structure of the indexes but have taken a more practical, demo-oriented approach to explain these core concepts. Course Outline Here are salient topics of the course. We have explained every single concept along with a practical demonstration. Additionally shared our personal scripts along with the same. Introduction Fundamentals of Indexing Index Fundamentals Index Fundamentals – Visual Representation Practical Indexing Implementation Techniques Primary Key Over Indexing Duplicate Index Clustered Index Unique Index Included Columns Filtered Index Disabled Index Index Maintenance and Defragmentation Introduction to Columnstore Index Indexing Practical Performance Tips and Tricks Index and Page Types Index and Non Deterministic Columns Index and SET Values Importance of Clustered Index Effect of Compression and Fillfactor Index and Functions Dynamic Management Views (DMV) – Fillfactor Table Scan, Index Scan and Index Seek Index and Order of Columns Final Checklist: Index and Performance Well, we believe we have done our part, now waiting for your comments and feedback. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: PostADay, SQL, SQL Authority, SQL Index, SQL Performance, SQL Query, SQL Server, SQL Tips and Tricks, SQLServer, T SQL, Technology, Video

Read the article
Capturing index operations using a DDL trigger

- by AaronBertrand

Today on twitter the following question came up on the #sqlhelp hash tag, from DaveH0ward : Is there a DMV that can tell me the last time an index was rebuilt? SQL 2008 My initial response: I don't believe so, you'd have to be monitoring for that ... perhaps a DDL trigger capturing ALTER_INDEX? Then I remembered that the default trace in SQL Server ( as long as it is enabled ) will capture these events. My follow-up response: You can get it from the default trace, blog post forthcoming So here is...(read more)

Read the article
Google webmaster Index Status. Total Indexed=0

- by hammad

I previously changed my domain from www.visualstudiolearn.blogspot.com to www.visualstudiolearn.com... i had around 300 posts with the previous domain name and most of them where showing up on Google. Now that i have changed my domain name the index status shows total indexed as 0 and when i go to the advanced tab it says 304(not selected) and 217 blocked my robots. Im really depressed because of this situation. could you please help out???

Read the article
Capturing index operations using a DDL trigger

- by AaronBertrand

Today on twitter the following question came up on the #sqlhelp hash tag, from DaveH0ward : Is there a DMV that can tell me the last time an index was rebuilt? SQL 2008 My initial response: I don't believe so, you'd have to be monitoring for that ... perhaps a DDL trigger capturing ALTER_INDEX? Then I remembered that the default trace in SQL Server ( as long as it is enabled ) will capture these events. My follow-up response: You can get it from the default trace, blog post forthcoming So here is...(read more)

Read the article
Rewrite URL to index.php but avoid index.php in the URL

- by Tom

I'm trying to internally redirect all requests to index.php and externally redirect all requests that contain index.php using a .htaccess file. So URLs like http://host/test should be processed by index.php and URLs like http://host/index.php/test should be redirected to http://host/test and then processed by index.php (without redirecting the browser to index.php) I tried the following but always get a message "Too many redirects...": RewriteRule ^index\.php/?(.*)$ /$1 [R,L] RewriteRule .* index.php/$0 [L]

Read the article
Undefined index: email in C:\wamp\www\emailvalidate.php on line 4

- by klari

I am new to Ajax. In my Ajax I get the following error message : Notice: Undefined index: address in C:\wamp\www\test\sample.php on line 11 I googled but I didn't get a solution for my specified issue. Here is what I did. HTML Form with Ajax (test1.php) <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Untitled Document</title> <script> function loadXMLDoc() { var xmlhttp; if (window.XMLHttpRequest) {// code for IE7+, Firefox, Chrome, Opera, Safari xmlhttp=new XMLHttpRequest(); } else {// code for IE6, IE5 xmlhttp=new ActiveXObject("Microsoft.XMLHTTP"); } xmlhttp.onreadystatechange=function() { if (xmlhttp.readyState==4 && xmlhttp.status==200) { document.getElementById("mydiv").innerHTML=xmlhttp.responseText; } } xmlhttp.open("POST","test2.php",true); xmlhttp.send(); } </script> </head> <body> <form id="form1" name="form1" method="post" action="sample.php"> <p> <label for="mail"></label> <input type="text" name="mail" id="mail" onblur="loadXMLDoc()" /> <input type="submit" name="submit" id="submit" value="Submit" /> </p> <p><div id = 'mydiv'></div></p> </form> </body> </html> test2.php <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Untitled Document</title> </head> <body> <?php echo "Your Address is ".$_POST['address']; ?> </body> </html> I am sure it is very simple issue but I don't know how to solve it. Any help ?

Read the article
How to use multifieldquery and filters in Lucene.net

- by Khotu Nam

I want to perform a multi field search on a lucene.net index but filter the results based on one of the fields. Here's what I'm currently doing: To index the fields the definitions are: doc.Add(new Field("id", id.ToString(), Field.Store.YES, Field.Index.UN_TOKENIZED)); doc.Add(new Field("title", title, Field.Store.NO, Field.Index.TOKENIZED)); doc.Add(new Field("summary", summary, Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES)); doc.Add(new Field("description", description, Field.Store.NO, Field.Index.TOKENIZED, Field.TermVector.YES)); doc.Add(new Field("distribution", distribution, Field.Store.NO, Field.Index.UN_TOKENIZED)); When I perform the search I do the following: MultiFieldQueryParser parser = new MultiFieldQueryParser(new string[]{"title", "summary", "description"}, analyzer); parser.SetDefaultOperator(QueryParser.Operator.AND); Query query = parser.Parse(text); BooleanQuery bq = new BooleanQuery(); TermQuery tq = new TermQuery(new Term("distribution", distribution)); bq.Add(tq, BooleanClause.Occur.MUST); Filter filter = new QueryFilter(bq); Hits hits = searcher.Search(query, filter); However, the result is always 0 hits. What am I doing wrong?

Read the article
string matching algorithms used by lucene

- by iamrohitbanga

i want to know the string matching algorithms used by Apache Lucene. i have been going through the index file format used by lucene given here. it seems that lucene stores all words occurring in the text as is with their frequency of occurrence in each document. but as far as i know that for efficient string matching it would need to preprocess the words occurring in the Documents. example: search for "iamrohitbanga is a user of stackoverflow" (use fuzzy matching) in some documents. it is possible that there is a document containing the string "rohit banga" to find that the substrings rohit and banga are present in the search string, it would use some efficient substring matching. i want to know which algorithm it is. also if it does some preprocessing which function call in the java api triggers it.

Read the article
Intersecting boundaries with lucene

- by Silvio Donnini

I'm using Lucene, and I'm trying to find a way to index and retrieve documents that have a ranged property. For example I have: Document 1: Price:[30 TO 50] Document 2: Price:[45 TO 60] Document 3: Price:[60 TO 70] And I would like to search for all the documents whose ranges intersect a specific interval, in the above example, if I search for Price in [55 TO 65] I should get Document 2 and Document 3 as results. I don't think NumericRangeQueries alone would do the trick, I need to work on the index with something similar to R-trees, but are they implemented in Lucene? Also, I suppose that what I need should be a subclass of MultiTermQuery, because the query Price in [55 TO 65] has two boundaries, but I don't see anything suitable among MultiTermQuery's subclasses. Any help is appreciated, thanks, Silvio P.S. I'm using Lucene 2.9.0, but I can update to the latest release if needed.

Read the article
Rollback in lucene

- by Petrick Lim

Is there a rollback in lucene? I'm saving & updating database repository & lucene repository simultaneously so that the lucene index & database are in sync.. ex. CustomerRepository.add(customer); SupplierRepository.add(supplier); CustomerLuceneRepository.add(customer); SupplierLuceneRepository.add(supplier); // If this here fails i cannot rollback the customer above DataContext.SubmitChanges();

Read the article
Why are my Lucene Document results empty?

- by vegashacker

I'm running a simple test--trying to index something and then search for it. I index a simple document, but then when a search for a string in it, I get back what looks to be an empty document (it has no fields). Lucene seems to be doing something, because if I search for a word that's not in the document, it returns 0 results. Any reason why Lucene would reliably return a document when it finds one that matches the given query, and yet that document has nothing in it? Thanks! PS: I'm actually running Lucandra (Lucene + Cassandra). That certainly may be a relevant detail, but not sure.

Read the article
lucene index missing files

- by Akhil

I have _0.cfs file of a lucene index directory but segments.gen and segments_2 are missing. Can I generate the segments.gen and segments_2 files without having to regenerate the _0.cfs file. Does these "segments" files contain any index specific data, which will thus force me to regnerate the entire index again. Or can I just generate the two "segments" file by copying these from another lucen index directory gnerated with the same lucene version.

Read the article
SQL SERVER Size of Index Table for Each Index Solution 3 Powershell

Laerte Junior If you are a Powershell user, the name of the Laerte Junior is not a new name. He is the one man with exceptional knowledge of Powershell. He is not only very knowledgeable, but also very kind and eager to those in need. I have been attempting to setup Powershell for many days, [...]...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
SQL SERVER Size of Index Table for Each Index Solution 3 Powershell

Laerte Junior If you are a Powershell user, the name of the Laerte Junior is not a new name. He is the one man with exceptional knowledge of Powershell. He is not only very knowledgeable, but also very kind and eager to those in need. I have been attempting to setup Powershell for many days, [...]...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

Read the article
How to properly remove URL's from Google's index?

- by ElHaix

On some of our sites, we now have several thousand pages that dilute our website's keyword density. The website is an MVC site with SEO routing. If I submit a new sitemap with say only the 2000 or so pages that we want indexed, even though navigating to the diluting pages still works, will Google re-index the site with only those 2000 pages, dropping the superfluous ones? For example, I want to keep roughly 2000 of the following: www.mysite.com/some-search-term-1/some-good-keywords www.mysite.com/some-search-term-2/some-more-good-keywords And remove several thousand of the following that have already been indexed. www.mysite.com/some-search-term-xx/some-poor-keywords www.mysite.com/some-search-term-xx/some-poor-more-keywords These pages are not actually "removed" as navigating to these URL's still renders a page. Even though there are potentially hundreds of thousands of pages, I only want say 2000 to be re-indexed and retained. The others removed (without having to do these manually). Thanks.

Read the article
NLP - Queries using semantic wildcards in full text searching, maybe with Lucene?

- by Zsolt

Let's say I have a big corpus (for example in english or an arbitrary language), and I want to perform some semantic search on it. For example I have the query: "Be careful: [art] armada of [sg] is coming to [do sg]!" And the corpus contains the following sentence: "Be careful: an armada of alien ships is coming to destroy our planet!" It can be seen that my query string could contain "semantic placeholders", such as: [art] - some placeholder for articles (for example a / an in English) [sg], [do sg] - some placeholders for NPs and VPs (subjects and predicates) I would like to develop a library which would be capable to handle these queries efficiently. I suspect that some kind of POS-tagging would be necessary for parsing the text, but because I don't want to fully reimplement an already existing full-text search engine to make it work, I'm considering that how could I integrate this behaviour into a search engine like Lucene? I know there are SpanQueries which could behave similarly in some cases, but as I can see, Lucene doesn't do any semantic stuff with stored texts. It is possible to implement a behavior like this? Or do I have to write an own search engine?

Read the article
How to make sure Solr/Lucene won't die with java.lang.OutOfMemoryError?

- by taw

I'm really puzzled why it keeps dying with java.lang.OutOfMemoryError during indexing even though it has a few GBs of memory. Is there a fundamental reason why it needs manual tweaking of config files / jvm parameters instead of it just figuring out how much memory is available and limiting itself to that? No other programs except Solr ever have this kind of problem. Yes, I can keep tweaking JVM heap size every time such crashes happen, but this is all so backwards. Here's stack trace of the latest such crash in case it is relevant: SEVERE: java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOfRange(Arrays.java:3209) at java.lang.String.<init>(String.java:216) at org.apache.lucene.index.TermBuffer.toTerm(TermBuffer.java:122) at org.apache.lucene.index.SegmentTermEnum.term(SegmentTermEnum.java:169) at org.apache.lucene.search.FieldCacheImpl$StringIndexCache.createValue(FieldCacheImpl.java:701) at org.apache.lucene.search.FieldCacheImpl$Cache.get(FieldCacheImpl.java:208) at org.apache.lucene.search.FieldCacheImpl.getStringIndex(FieldCacheImpl.java:676) at org.apache.lucene.search.FieldComparator$StringOrdValComparator.setNextReader(FieldComparator.java:667) at org.apache.lucene.search.TopFieldCollector$OneComparatorNonScoringCollector.setNextReader(TopFieldCollector.java:94) at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:245) at org.apache.lucene.search.Searcher.search(Searcher.java:171) at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:988) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:182) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447) at java.lang.Thread.run(Thread.java:619)

Read the article
Lucene Fuzzy Match on Phrase instead of Single Word

- by Koobz

I'm trying to do a fuzzy match on the Phrase "Grand Prarie" (deliberately misspelled) using Apache Lucene. Part of my issue is that the ~ operator only does fuzzy matches on single word terms and behaves as a proximity match for phrases. Is there a way to do a fuzzy match on a phrase with lucene?

Read the article
How-to index arrays (tags) in CouchDB using couchdb-lucene

- by Lucas

The setup: I have a project that is using CouchDB. The documents will have a field called "tags". This "tags" field is an array of strings (e.g., "tags":["tag1","tag2","etc"]). I am using couchdb-lucene as my search provider. The question: What function can be used to get couchdb-lucene to index the elements of "tags"? If you have an idea but no test environment, type it out, I'll try it and give the result here.

Read the article
Why Lucene merge indexes ?

- by Mehdi Amrollahi

I want to know why Lucene merge indexes ? It's better to say , why does not Lucene merge all indexes to one index ? What is the advantage of this merging method ?

Read the article
Couple o' quick questions on Apache Lucene

- by Doug

-- I don't want to start any religious wars, but a quick google search indicates that Apache Lucene is the preferred open source tool for indexing and searching. Are there others? -- What file format does Lucene use to store its index file(s)? Thank is advance. Doug

Read the article
Lucene.NET performance

- by Paul Knopf

I have a website that runs of a third party search provider that is expensive. I am going to roll my own. Is Lucene.NET capable of ~25,000 products (or documents), each with maybe ten attributes used for filtering? I am looking to do a "narrow/drill down" or "faceted search". Does that sound like to much to ask from Lucene.NET?

Read the article

< Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >