Search Results

Search found 17712 results on 709 pages for 'zend search lucene'.

Page 19/709 | < Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26  | Next Page >

  • How do you boost term relevance in Sql Server Full Text Search like you can in Lucene?

    - by Snives
    I'm doing a typical full text search using containstable using 'ISABOUT(term1,term2,term3)' and although it supports term weighting that's not what I need. I need the ability to boost the relevancy of terms contained in certain portions of text. For example, it is customary for metatags or page title to be weighted differently than body text when searching web pages. Although I'm not dealing with web pages I do seek the same functionality. In Lucene it's called Document Field Level Boosting. How would one natively do this in Sql Server Full Text Search?

    Read the article

  • Lucene: Fastest way to return the document occurance of a phrase?

    - by dont say the kid's name
    Hi Guys, I am trying to use Lucene (actually PyLucene!) to find out how many documents contain my exact phrase. My code currently looks like this... but it runs rather slow. Does anyone know a faster way to return document counts? phraseList = ["some phrase 1", "some phrase 2"] #etc, a list of phrases... countsearcher = IndexSearcher(SimpleFSDirectory(File(STORE_DIR)), True) analyzer = StandardAnalyzer(Version.LUCENE_CURRENT) for phrase in phraseList: query = QueryParser(Version.LUCENE_CURRENT, "contents", analyzer).parse("\"" + phrase + "\"") scoreDocs = countsearcher.search(query, 200).scoreDocs print "count is: " + str(len(scoreDocs))

    Read the article

  • How to secure an Internet-facing Elastic Search implementation in a shared hosting environment?

    - by casperOne
    (Originally asked on StackOverflow, and recommended that I move it here) I've been going over the documentation for Elastic Search and I'm a big fan and I'd like to use it to handle the search for my ASP.NET MVC app. That introduces a few interesting twists, however. If the ASP.NET MVC application was on a dedicated machine, it would be simple to spool up an instance of Elastic Search and use the TCP Transport to connect locally. However, I'm not on a dedicated machine for the ASP.NET MVC application, nor does it look like I'll move to one anytime soon. That leaves hosting Elastic Search on another machine (in the *NIX world) and I would probably go with shared hosting there. One of the biggest things lacking from Elastic Search, however, is the fact that it doesn't support HTTPS and basic authentication out of the box. If it did, then this question wouldn't exist; I'd simply host it somewhere and make sure to have an incredibly secure password and HTTPS enabled (possibly with a self-signed certificate). But that's not the case. That given, what is a good way to expose Elastic Search over the Internet in a secure way? Note, I'm looking for something that hopefully, will not require writing code to provide shims for the methods that I want (in other words, writing forwarders).

    Read the article

  • Zend_Search_Lucene vs SOLR

    - by spacemonkey
    Hi, I have recenlty stumbled into Zend Lucene port of Lucene project. I have a little bit experience with SOLR so I would like to know what is the difference between two of them especially from performance and installation side. As much as I know SOLR requires Tomcat serverlet running in web hosting in order to work, what about Zend Lucene library? I am also a bit confused what means "being implemented on the top of Lucene"?

    Read the article

  • Tokenizing Twitter Posts in Lucene

    - by Amaç Herdagdelen
    Hello, My question in a nutshell: Does anyone know of a TwitterAnalyzer or TwitterTokenizer for Lucene? More detailed version: I want to index a number of tweets in Lucene and keep the terms like @user or #hashtag intact. StandardTokenizer does not work because it discards the punctuation (but it does other useful stuff like keeping domain names, email addresses or recognizing acronyms). How can I have an analyzer which does everything StandardTokenizer does but does not touch terms like @user and #hashtag? My current solution is to preprocess the tweet text before feeding it into the analyzer and replace the characters by other alphanumeric strings. For example, String newText = newText.replaceAll("#", "hashtag"); newText = newText.replaceAll("@", "addresstag"); Unfortunately this method breaks legitimate email addresses but I can live with that. Does that approach make sense? Thanks in advance! Amaç

    Read the article

  • Lucene Error While Reading binary block : java.io.EOFException

    - by tushar Khairnar
    Hi, I am getting java.io.EOFException while reading a binary block from lucene index. I am storing java object as byte-array in lucene index field and reading it when hit occurs. Here is stack trace : Caused by: java.io.EOFException at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281) at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750) at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780) at java.io.ObjectInputStream.(ObjectInputStream.java:280) at org.terracotta.modules.searchable.util.SerializationUtil$OIS.(SerializationUtil.java:20) I have some background threads which write into index. But i buffer them and then write them at once like 1000. Occasionally I also issue optimize() on index. When I write, I am re-opening IndexReader. Does this is happening because of IndexReader re-opening call? Thanks. Regards Tushar

    Read the article

  • Modify PHP Search Script to Handle Multiple Entries For a Single Input

    - by Thomas
    I need to modify a php search script so that it can handle multiple entries for a single field. The search engine is designed for a real estate website. The current search form allows users to search for houses by selecting a single neighborhood from a dropdown menu. Instead of a dropdown menu, I would like to use a list of checkboxes so that the the user can search for houses in multiple neighborhoods at one time. I have converted all of the dropdown menu items into checkboxes on the HTML side but the PHP script only searches for houses in the last checkbox selected. For example, if I selected: 'Dallas' 'Boston' 'New York' the search engine will only search for houses in New York. Im new to PHP, so I am a little at a loss as to how to modify this script to handle the behavior I have described: <?php require_once(dirname(__FILE__).'/extra_search_fields.php'); //Add Widget for configurable search. add_action('plugins_loaded',array('DB_CustomSearch_Widget','init')); class DB_CustomSearch_Widget extends DB_Search_Widget { function DB_CustomSearch_Widget($params=array()){ DB_CustomSearch_Widget::__construct($params); } function __construct($params=array()){ $this->loadTranslations(); parent::__construct(__('Custom Fields ','wp-custom-fields-search'),$params); add_action('admin_print_scripts', array(&$this,'print_admin_scripts'), 90); add_action('admin_menu', array(&$this,'plugin_menu'), 90); add_filter('the_content', array(&$this,'process_tag'),9); add_shortcode( 'wp-custom-fields-search', array(&$this,'process_shortcode') ); wp_enqueue_script('jquery'); if(version_compare("2.7",$GLOBALS['wp_version'])>0) wp_enqueue_script('dimensions'); } function init(){ global $CustomSearchFieldStatic; $CustomSearchFieldStatic['Object'] = new DB_CustomSearch_Widget(); $CustomSearchFieldStatic['Object']->ensureUpToDate(); } function currentVersion(){ return "0.3.16"; } function ensureUpToDate(){ $version = $this->getConfig('version'); $latest = $this->currentVersion(); if($version<$latest) $this->upgrade($version,$latest); } function upgrade($current,$target){ $options = $this->getConfig(); if(version_compare($current,"0.3")<0){ $config = $this->getDefaultConfig(); $config['name'] = __('Default Preset','wp-custom-fields-search'); $options['preset-default'] = $config; } $options['version']=$target; update_option($this->id,$options); } function getInputs($params = false,$visitedPresets=array()){ if(is_array($params)){ $id = $params['widget_id']; } else { $id = $params; } if($visitedPresets[$id]) return array(); $visitedPresets[$id]=true; global $CustomSearchFieldStatic; if(!$CustomSearchFieldStatic['Inputs'][$id]){ $config = $this->getConfig($id); $inputs = array(); if($config['preset']) $inputs = $this->getInputs($config['preset'],$visitedPresets); $nonFields = $this->getNonInputFields(); if($config) foreach($config as $k=>$v){ if(in_array($k,$nonFields)) continue; if(!(class_exists($v['input']) && class_exists($v['comparison']) && class_exists($v['joiner']))) { continue; } $inputs[] = new CustomSearchField($v); } foreach($inputs as $k=>$v){ $inputs[$k]->setIndex($k); } $CustomSearchFieldStatic['Inputs'][$id]=$inputs; } return $CustomSearchFieldStatic['Inputs'][$id]; } function getTitle($params){ $config = $this->getConfig($params['widget_id']); return $config['name']; } function form_processPost($post,$old){ unset($post['###TEMPLATE_ID###']); if(!$post) $post=array('exists'=>1); return $post; } function getDefaultConfig(){ return array('name'=>'Site Search', 1=>array( 'label'=>__('Key Words','wp-custom-fields-search'), 'input'=>'TextField', 'comparison'=>'WordsLikeComparison', 'joiner'=>'PostDataJoiner', 'name'=>'all' ), 2=>array( 'label'=>__('Category','wp-custom-fields-search'), 'input'=>'DropDownField', 'comparison'=>'EqualComparison', 'joiner'=>'CategoryJoiner' ), ); } function form_outputForm($values,$pref){ $defaults=$this->getDefaultConfig(); $prefId = preg_replace('/^.*\[([^]]*)\]$/','\\1',$pref); $this->form_existsInput($pref); $rand = rand(); ?> <div id='config-template-<?php echo $prefId?>' style='display: none;'> <?php $templateDefaults = $defaults[1]; $templateDefaults['label'] = 'Field ###TEMPLATE_ID###'; echo $this->singleFieldHTML($pref,'###TEMPLATE_ID###',$templateDefaults); ?> </div> <?php foreach($this->getClasses('input') as $class=>$desc) { if(class_exists($class)) $form = new $class(); else $form = false; if(compat_method_exists($form,'getConfigForm')){ if($form = $form->getConfigForm($pref.'[###TEMPLATE_ID###]',array('name'=>'###TEMPLATE_NAME###'))){ ?> <div id='config-input-templates-<?php echo $class?>-<?php echo $prefId?>' style='display: none;'> <?php echo $form?> </div> <?php } } } ?> <div id='config-form-<?php echo $prefId?>'> <?php if(!$values) $values = $defaults; $maxId=0; $presets = $this->getPresets(); array_unshift($presets,__('NONE','wp-custom-fields-search')); ?> <div class='searchform-name-wrapper'><label for='<?php echo $prefId?>[name]'><?php echo __('Search Title','wp-custom-fields-search')?></label><input type='text' class='form-title-input' id='<?php echo $prefId?>[name]' name='<?php echo $pref?>[name]' value='<?php echo $values['name']?>'/></div> <div class='searchform-preset-wrapper'><label for='<?php echo $prefId?>[preset]'><?php echo __('Use Preset','wp-custom-fields-search')?></label> <?php $dd = new AdminDropDown($pref."[preset]",$values['preset'],$presets); echo $dd->getInput()."</div>"; $nonFields = $this->getNonInputFields(); foreach($values as $id => $val){ $maxId = max($id,$maxId); if(in_array($id,$nonFields)) continue; echo "<div id='config-form-$prefId-$id'>".$this->singleFieldHTML($pref,$id,$val)."</div>"; } ?> </div> <br/><a href='#' onClick="return CustomSearch.get('<?php echo $prefId?>').add();"><?php echo __('Add Field','wp-custom-fields-search')?></a> <script type='text/javascript'> CustomSearch.create('<?php echo $prefId?>','<?php echo $maxId?>'); <?php foreach($this->getClasses('joiner') as $joinerClass=>$desc){ if(compat_method_exists($joinerClass,'getSuggestedFields')){ $options = eval("return $joinerClass::getSuggestedFields();"); $str = ''; foreach($options as $i=>$v){ $k=$i; if(is_numeric($k)) $k=$v; $options[$i] = json_encode(array('id'=>$k,'name'=>$v)); } $str = '['.join(',',$options).']'; echo "CustomSearch.setOptionsFor('$joinerClass',".$str.");\n"; }elseif(eval("return $joinerClass::needsField();")){ echo "CustomSearch.setOptionsFor('$joinerClass',[]);\n"; } } ?> </script> <?php } function getNonInputFields(){ return array('exists','name','preset','version'); } function singleFieldHTML($pref,$id,$values){ $prefId = preg_replace('/^.*\[([^]]*)\]$/','\\1',$pref); $pref = $pref."[$id]"; $htmlId = $pref."[exists]"; $output = "<input type='hidden' name='$htmlId' value='1'/>"; $titles="<th>".__('Label','wp-custom-fields-search')."</th>"; $inputs="<td><input type='text' name='$pref"."[label]' value='$values[label]' class='form-field-title'/></td><td><a href='#' onClick='return CustomSearch.get(\"$prefId\").toggleOptions(\"$id\");'>".__('Show/Hide Config','wp-custom-fields-search')."</a></td>"; $output.="<table class='form-field-table'><tr>$titles</tr><tr>$inputs</tr></table>"; $output.="<div id='form-field-advancedoptions-$prefId-$id' style='display: none'>"; $inputs='';$titles=''; $titles="<th>".__('Data Field','wp-custom-fields-search')."</th>"; $inputs="<td><div id='form-field-dbname-$prefId-$id' class='form-field-title-div'><input type='text' name='$pref"."[name]' value='$values[name]' class='form-field-title'/></div></td>"; $count=1; foreach(array('joiner'=>__('Data Type','wp-custom-fields-search'),'comparison'=>__('Compare','wp-custom-fields-search'),'input'=>__('Widget','wp-custom-fields-search')) as $k=>$v){ $dd = new AdminDropDown($pref."[$k]",$values[$k],$this->getClasses($k),array('onChange'=>'CustomSearch.get("'.$prefId.'").updateOptions("'.$id.'","'.$k.'")','css_class'=>"wpcfs-$k")); $titles="<th>".$v."</th>".$titles; $inputs="<td>".$dd->getInput()."</td>".$inputs; if(++$count==2){ $output.="<table class='form-field-table form-class-$k'><tr>$titles</tr><tr>$inputs</tr></table>"; $count=0; $inputs = $titles=''; } } if($titles){ $output.="<table class='form-field-table'><tr>$titles</tr><tr>$inputs</tr></table>"; $inputs = $titles=''; } $titles.="<th>".__('Numeric','wp-custom-fields-search')."</th><th>".__('Widget Config','wp-custom-fields-search')."</th>"; $inputs.="<td><input type='checkbox' ".($values['numeric']?"checked='true'":"")." name='$pref"."[numeric]'/></td>"; if(class_exists($widgetClass = $values['input'])){ $widget = new $widgetClass(); if(compat_method_exists($widget,'getConfigForm')) $widgetConfig=$widget->getConfigForm($pref,$values); } $inputs.="<td><div id='$this->id"."-$prefId"."-$id"."-widget-config'>$widgetConfig</div></td>"; $output.="<table class='form-field-table'><tr>$titles</tr><tr>$inputs</tr></table>"; $output.="</div>"; $output.="<a href='#' onClick=\"return CustomSearch.get('$prefId').remove('$id');\">Remove Field</a>"; return "<div class='field-wrapper'>$output</div>"; } function getRootURL(){ return WP_CONTENT_URL .'/plugins/' . dirname(plugin_basename(__FILE__) ) . '/'; } function print_admin_scripts($params){ $jsRoot = $this->getRootURL().'js'; $cssRoot = $this->getRootURL().'css'; $scripts = array('Class.js','CustomSearch.js','flexbox/jquery.flexbox.js'); foreach($scripts as $file){ echo "<script src='$jsRoot/$file' ></script>"; } echo "<link rel='stylesheet' href='$cssRoot/admin.css' >"; echo "<link rel='stylesheet' href='$jsRoot/flexbox/jquery.flexbox.css' >"; } function getJoiners(){ return $this->getClasses('joiner'); } function getComparisons(){ return $this->getClasses('comparison'); } function getInputTypes(){ return $this->getClasses('input'); } function getClasses($type){ global $CustomSearchFieldStatic; if(!$CustomSearchFieldStatic['Types']){ $CustomSearchFieldStatic['Types'] = array( "joiner"=>array( "PostDataJoiner" =>__( "Post Field",'wp-custom-fields-search'), "CustomFieldJoiner" =>__( "Custom Field",'wp-custom-fields-search'), "CategoryJoiner" =>__( "Category",'wp-custom-fields-search'), "TagJoiner" =>__( "Tag",'wp-custom-fields-search'), "PostTypeJoiner" =>__( "Post Type",'wp-custom-fields-search'), ), "input"=>array( "TextField" =>__( "Text Input",'wp-custom-fields-search'), "DropDownField" =>__( "Drop Down",'wp-custom-fields-search'), "RadioButtonField" =>__( "Radio Button",'wp-custom-fields-search'), "HiddenField" =>__( "Hidden Constant",'wp-custom-fields-search'), ), "comparison"=>array( "EqualComparison" =>__( "Equals",'wp-custom-fields-search'), "LikeComparison" =>__( "Phrase In",'wp-custom-fields-search'), "WordsLikeComparison" =>__( "Words In",'wp-custom-fields-search'), "LessThanComparison" =>__( "Less Than",'wp-custom-fields-search'), "MoreThanComparison" =>__( "More Than",'wp-custom-fields-search'), "AtMostComparison" =>__( "At Most",'wp-custom-fields-search'), "AtLeastComparison" =>__( "At Least",'wp-custom-fields-search'), "RangeComparison" =>__( "Range",'wp-custom-fields-search'), //TODO: Make this work... // "NotEqualComparison" =>__( "Not Equal To",'wp-custom-fields-search'), ) ); $CustomSearchFieldStatic['Types'] = apply_filters('custom_search_get_classes',$CustomSearchFieldStatic['Types']); } return $CustomSearchFieldStatic['Types'][$type]; } function plugin_menu(){ add_options_page('Form Presets','WP Custom Fields Search',8,__FILE__,array(&$this,'presets_form')); } function getPresets(){ $presets = array(); foreach(array_keys($config = $this->getConfig()) as $key){ if(strpos($key,'preset-')===0) { $presets[$key] = $key; if($name = $config[$key]['name']) $presets[$key]=$name; } } return $presets; } function presets_form(){ $presets=$this->getPresets(); if(!$preset = $_REQUEST['selected-preset']){ $preset = 'preset-default'; } if(!$presets[$preset]){ $defaults = $this->getDefaultConfig(); $options = $this->getConfig(); $options[$preset] = $defaults; if($n = $_POST[$this->id][$preset]['name']) $options[$preset]['name'] = $n; elseif($preset=='preset-default') $options[$preset]['name'] = 'Default'; else{ list($junk,$id) = explode("-",$preset); $options[$preset]['name'] = 'New Preset '.$id; } update_option($this->id,$options); $presets[$preset] = $options[$preset]['name']; } if($_POST['delete']){ check_admin_referer($this->id.'-editpreset-'.$preset); $options = $this->getConfig(); unset($options[$preset]); unset($presets[$preset]); update_option($this->id,$options); list($preset,$name) = each($presets); } $index = 1; while($presets["preset-$index"]) $index++; $presets["preset-$index"] = __('New Preset','wp-custom-fields-search'); $linkBase = $_SERVER['REQUEST_URI']; $linkBase = preg_replace("/&?selected-preset=[^&]*(&|$)/",'',$linkBase); foreach($presets as $key=>$name){ $config = $this->getConfig($key); if($config && $config['name']) $name=$config['name']; if(($n = $_POST[$this->id][$key]['name'])&&(!$_POST['delete'])) $name = $n; $presets[$key]=$name; } $plugin=&$this; ob_start(); wp_nonce_field($this->id.'-editpreset-'.$preset); $hidden = ob_get_contents(); $hidden.="<input type='hidden' name='selected-preset' value='$preset'>"; $shouldSave = $_POST['selected-preset'] && !$_POST['delete'] && check_admin_referer($this->id.'-editpreset-'.$preset); ob_end_clean(); include(dirname(__FILE__).'/templates/options.php'); } function process_tag($content){ $regex = '/\[\s*wp-custom-fields-search\s+(?:([^\]=]+(?:\s+.*)?))?\]/'; return preg_replace_callback($regex, array(&$this, 'generate_from_tag'), $content); } function process_shortcode($atts,$content){ return $this->generate_from_tag(array("",$atts['preset'])); } function generate_from_tag($reMatches){ global $CustomSearchFieldStatic; ob_start(); $preset=$reMatches[1]; if(!$preset) $preset = 'default'; wp_custom_fields_search($preset); $form = ob_get_contents(); ob_end_clean(); return $form; } } global $CustomSearchFieldStatic; $CustomSearchFieldStatic['Inputs'] = array(); $CustomSearchFieldStatic['Types'] = array(); class AdminDropDown extends DropDownField { function AdminDropDown($name,$value,$options,$params=array()){ AdminDropDown::__construct($name,$value,$options,$params); } function __construct($name,$value,$options,$params=array()){ $params['options'] = $options; $params['id'] = $params['name']; parent::__construct($params); $this->name = $name; $this->value = $value; } function getHTMLName(){ return $this->name; } function getValue(){ return $this->value; } function getInput(){ return parent::getInput($this->name,null); } } if (!function_exists('json_encode')) { function json_encode($a=false) { if (is_null($a)) return 'null'; if ($a === false) return 'false'; if ($a === true) return 'true'; if (is_scalar($a)) { if (is_float($a)) { // Always use "." for floats. return floatval(str_replace(",", ".", strval($a))); } if (is_string($a)) { static $jsonReplaces = array(array("\\", "/", "\n", "\t", "\r", "\b", "\f", '"'), array('\\\\', '\\/', '\\n', '\\t', '\\r', '\\b', '\\f', '\"')); return '"' . str_replace($jsonReplaces[0], $jsonReplaces[1], $a) . '"'; } else return $a; } $isList = true; for ($i = 0, reset($a); $i < count($a); $i++, next($a)) { if (key($a) !== $i) { $isList = false; break; } } $result = array(); if ($isList) { foreach ($a as $v) $result[] = json_encode($v); return '[' . join(',', $result) . ']'; } else { foreach ($a as $k => $v) $result[] = json_encode($k).':'.json_encode($v); return '{' . join(',', $result) . '}'; } } } function wp_custom_fields_search($presetName='default'){ global $CustomSearchFieldStatic; if(strpos($presetName,'preset-')!==0) $presetName="preset-$presetName"; $CustomSearchFieldStatic['Object']->renderWidget(array('widget_id'=>$presetName,'noTitle'=>true),array('number'=>$presetName)); } function compat_method_exists($class,$method){ return method_exists($class,$method) || in_array(strtolower($method),get_class_methods($class)); }

    Read the article

  • Lucene numDocs and doqFreq on custom similarity class

    - by David A
    Hi All, im doing an aplication with Lucene (im a noob with it) and im facing some problems. My aplication uses the Lucene 2.4.0 library with a custom similaraty implementation (the jar is imported) In my app im calculating doqFreq and numDocs manually (im adding the values of all indexes and then i calculate a global value in order to use it on every query) and i want to use that values on a custom similarity implementation in order to calculate a new IDF. The problem is that I dont know how to use (or send) the new doqFreq and numDocs values from my app on that new similarty implementation as I dont want to change lucene´s code apart from this extra class. Any suggestions or examples? I read the docs but i dont now how to aproach this :s Thanks

    Read the article

  • Lucene Search for japanese characters

    - by Pranali Desai
    Hi All, I have implemented lucene for my application and it works very well unless you have introduced something like japanese characters. The problem is that if I have japanese string ?????????????? and I search with ? that is the first character than it works well whereas if I use more than one japanese character(????)in search token search fails and there is no document found. Are japanese characters supported in lucene? what are the settings to be done to get it working?

    Read the article

  • Exception when indexing text documents with Lucene, using SnowballAnalyzer for cleaning up

    - by Julia
    Hello!!! I am indexing the documents with Lucene and am trying to apply the SnowballAnalyzer for punctuation and stopword removal from text .. I keep getting the following error :( IllegalAccessError: tried to access method org.apache.lucene.analysis.Tokenizer.(Ljava/io/Reader;)V from class org.apache.lucene.analysis.snowball.SnowballAnalyzer Here is the code, I would very much appreciate help!!!! I am new with this.. public class Indexer { private Indexer(){}; private String[] stopWords = {....}; private String indexName; private IndexWriter iWriter; private static String FILES_TO_INDEX = "/Users/ssi/forindexing"; public static void main(String[] args) throws Exception { Indexer m = new Indexer(); m.index("./newindex"); } public void index(String indexName) throws Exception { this.indexName = indexName; final File docDir = new File(FILES_TO_INDEX); if(!docDir.exists() || !docDir.canRead()){ System.err.println("Something wrong... " + docDir.getPath()); System.exit(1); } Date start = new Date(); PerFieldAnalyzerWrapper analyzers = new PerFieldAnalyzerWrapper(new SimpleAnalyzer()); analyzers.addAnalyzer("text", new SnowballAnalyzer("English", stopWords)); Directory directory = FSDirectory.open(new File(this.indexName)); IndexWriter.MaxFieldLength maxLength = IndexWriter.MaxFieldLength.UNLIMITED; iWriter = new IndexWriter(directory, analyzers, true, maxLength); System.out.println("Indexing to dir..........." + indexName); if(docDir.isDirectory()){ File[] files = docDir.listFiles(); if(files != null){ for (int i = 0; i < files.length; i++) { try { indexDocument(files[i]); }catch (FileNotFoundException fnfe){ fnfe.printStackTrace(); } } } } System.out.println("Optimizing...... "); iWriter.optimize(); iWriter.close(); Date end = new Date(); System.out.println("Time to index was" + (end.getTime()-start.getTime()) + "miliseconds"); } private void indexDocument(File someDoc) throws IOException { Document doc = new Document(); Field name = new Field("name", someDoc.getName(), Field.Store.YES, Field.Index.ANALYZED); Field text = new Field("text", new FileReader(someDoc), Field.TermVector.WITH_POSITIONS_OFFSETS); doc.add(name); doc.add(text); iWriter.addDocument(doc); } }

    Read the article

  • Indexing community images for google image search

    - by Vittorio Vittori
    Hi, I'm trying to understand how can I do to let my site be reachable from google image search spiders. I like how last.fm solution, and I thought to use a technique like his staff do to let google find artists images on their pages. When I'm looking for an artist and I search it on google image search, as often as not I find an image from last.fm artists page, I make an example: If I search the band Pure Reason Revolution It brings me here, the artist's image page http://www.last.fm/music/Pure+Reason+Revolution/+images/4284073 Now if I take a look to the image file, i can see it's named: http://userserve-ak.last.fm/serve/500/4284073/Pure+Reason+Revolution+4.jpg so if I try to undertand how the service works I can try to say: http://userserve-ak.last.fm/serve/ the server who serve the images 500/ the selected size for the image 4284073/ the image id for database Pure+Reason+Revolution+4.jpg the image name I thought it's difficult to think the real filename for the image is Pure+Reason+Revolution+4.jpg for image overwrite problems when an user upload it, in fact if I digit: http://userserve-ak.last.fm/serve/500/4284073.jpg I probably find the real image location and filename With this tecnique the image is highly reachable from search engines and easily archived. My question is, does exist some guide or tutorial to approach on this kind of tecniques, or something similar?

    Read the article

  • lucene get matched terms in query

    - by iamrohitbanga
    what is the best way to find out which terms in a query matched against a given document returned as a hit in lucene? I have tried a weird method involving hit highlighting package in lucene contrib and also a method that searches for every word in the query against the top most document ("docId: xy AND description: each_word_in_query"). Do not get satisfactory results? hit highlighting does not report some of the words that matched for a document other than the first one. i am not sure if the second approach is the best alternative.

    Read the article

  • Prepare community images for google image search indexing

    - by Vittorio Vittori
    Hi, I'm trying to understand how can I do to let my site be reachable from google image search spiders. I like how last.fm solution, and I thought to use a technique like his staff do to let google find artists images on their pages. When I'm looking for an artist and I search it on google image search, as often as not I find an image from last.fm artists page, I make an example: If I search the band Pure Reason Revolution It brings me here, the artist's image page http://www.last.fm/music/Pure+Reason+Revolution/+images/4284073 Now if I take a look to the image file, i can see it's named: http://userserve-ak.last.fm/serve/500/4284073/Pure+Reason+Revolution+4.jpg so if I try to undertand how the service works I can try to say: http://userserve-ak.last.fm/serve/ the server who serve the images 500/ the selected size for the image 4284073/ the image id for database Pure+Reason+Revolution+4.jpg the image name I thought it's difficult to think the real filename for the image is Pure+Reason+Revolution+4.jpg for image overwrite problems when an user upload it, in fact if I digit: http://userserve-ak.last.fm/serve/500/4284073.jpg I probably find the real image location and filename With this tecnique the image is highly reachable from search engines and easily archived. My question is, does exist some guide or tutorial to approach on this kind of tecniques, or something similar?

    Read the article

  • Search a string to find which records in table are inside said string

    - by Improfane
    Hello, Say I have a string. Then I have a number of unique tokens or keywords, potentially a large number in a database. I want to search and find out which of these database strings are inside the string I provide (and get the IDs of them). Is there a way of using a query to search the provided string or must it be taken to application space? Am I right in thinking that this is not a 'full text search'? Would the best method be to insert it into the database to make it a full text search?

    Read the article

  • Prepare your site images for google image search indexing

    - by Vittorio Vittori
    Hi, I'm trying to understand how can I do to let my site be reachable from google image search spiders. I like how last.fm solution, and I thought to use a technique like his staff do to let google find artists images on their pages. When I'm looking for an artist and I search it on google image search, as often as not I find an image from last.fm artists page, I make an example: If I search the band Pure Reason Revolution It brings me here, the artist's image page http://www.last.fm/music/Pure+Reason+Revolution/+images/4284073 Now if I take a look to the image file, i can see it's named: http://userserve-ak.last.fm/serve/500/4284073/Pure+Reason+Revolution+4.jpg so if I try to undertand how the service works I can try to say: http://userserve-ak.last.fm/serve/ the server who serve the images 500/ the selected size for the image 4284073/ the image id for database Pure+Reason+Revolution+4.jpg the image name I thought it's difficult to think the real filename for the image is Pure+Reason+Revolution+4.jpg for image overwrite problems when an user upload it, in fact if I digit: http://userserve-ak.last.fm/serve/500/4284073.jpg I probably find the real image location and filename With this tecnique the image is highly reachable from search engines and easily archived. My question is, does exist some guide or tutorial to approach on this kind of tecniques, or something similar?

    Read the article

  • PHP 'smart' search engine to search Mysql tables advice

    - by Anonymous12345
    I am creating a search engine for my php based website. I need to search a mysql table. Thing is, the search engine must be pretty 'smart', so that users can easily find their items (it's a classifieds website). I have currently set up a FULLTEXT search with this piece of code: MATCH (headline) AGAINST ($querystring) But this isn't enough... For instance, lets say the field headline contains something like Bmw 330ci. If I search for 330, I wont get any results. The ending ('ci') is just one of many endings in car models which must be taken into account when searching the table. Or what if the headline field is bmw330? Also no results, because it only matches full words. Or also, what if the headline is bmw 330, and I search for bmw 520, still with FULLTEXT I will get the bmw 330 as a result, even though I searched for bmw 520... Not good! How should I solve this problem?

    Read the article

  • Can I store and join based on external attributes in Lucene/Solr

    - by Kibbee
    Is there a way to store information about documents that are stored in Lucene such that I don't have to update the entire document to update certain attributes about the documents? For instance, let's say I had a bunch of documents, and that I wanted to update a permissions list of who was allowed to see the documents on a daily, or more frequent, basis. Would it be possible to update all the permissions each day, without updating all the documents. I could do it by keeping a exactly which permissions were added and removed, but I would rather just be able to take the end list of permissions, and use that, rather than have to keep track of all the permission changes and post those entire documents to Lucene.

    Read the article

  • Using FieldSelector when searching with Lucene

    - by Christian
    I'm searching articles in PubMed via Lucene. Each of the 20,000,000 articles has an abstract with ~250 words and an ID. At the moment I store my searches, with each take multiple seconds, in a TopDocs object. Searchs can find thousands of articles. I'm just interested in the ID of the article. Does Lucene load the abstracts internally into the TopDocs? If so can I prevent that behavior through FieldSelectors or do FieldSelectors only work with IndexReader and don't work with IndexSearcher?

    Read the article

  • Lucene boost: I need to make it work better

    - by zvikico
    I'm using Lucene to index components with names and types. Some components are more important, thus, get a bigger boost. However, I cannot get my boost to work properly. I sill get some components appear later (get worse score), even though they have a higher boost. Note that the indexing is done on one field only and I've set the boost to that field alone. I'm using Lucene in Java. I don't think it has anything to do with the field length. I've seen components with the same name (but different type) get the wrong score.

    Read the article

  • URL blocked in robots.txt but still showing up on Google search [closed]

    - by Ahmad Alfy
    Possible Duplicate: Why do Google search results include pages disallowed in robots.txt? In my robots.txt I am disallowing a lot of URLs. Google webmaster tools says there're +750 URL blocked. The problem is the URLs are still showing on Google search. For example I have the following rule: Disallow: /entity/child-health/ But when I search some-keyword + child health the following URL shows up : http://www.sitename.com/entity/child-health/ Am I doing anything wrong? Is is possible for a URL to be blocked using robots.txt and still show up on search results?

    Read the article

< Previous Page | 15 16 17 18 19 20 21 22 23 24 25 26  | Next Page >