Search Results

Search found 631 results on 26 pages for 'couchdb lucene'.

Page 18/26 | < Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25  | Next Page >

  • acts_as_solr returns all rows in the database when using the model as search query

    - by chris Chan
    In our application we're using acts_as_solr for search. Everything seems to be running smoothly except for the fact that using the model name as the search query returns every single row in the table. For example, let's say we have a users table. We specify acts_as_solr in our model to search the fields first name, last name and handle acts_as_solr :fields = [:handle, :lname, :fname]. When you use "user" as the search term it returns every single user in the system, or every row in the database as a result. Has anyone else run into this?

    Read the article

  • Custom Solr sorting

    - by Tom
    Hello everyone, I've been asked to do an evaluation of Solr as an alternative for a commercial search engine. The application now has a very particular way of sorting results using something called "buckets". I'll try to explain with a bit of details: In the interface they have 2 fields: "what" and "where". Both fields are actually sets of fields (what = category, name, contact info... and where= country, state, region, city...) so the copyfield feature of Solr immediately comes to mind. Now based on the field generated the actual match the result should end up in a specific bucket. In particular the first bucket contains all the result documents that have an exact match on the category field, in the second bucket all exact matches on name, the third partial matches on category, the fourth partial matches on name, the fifth matches on contact info etc... Then within each of those first tier buckets all results are placed in second tier buckets depending on what location was matched: city, then region, then province and so on. To even complicate things more there is also a third tier bucket where results are placed according to the value of a ranking field: all documents with the value 1 in the ranking field go in bucket 1 and so on. And finally results should be randomized in the third tier bucket... On top of this they obviously want support for facets and paging. My apologies for the long mail but I would greatly appreciate feedback and/or suggestions. I'm aware that this that this is a very particular problem but everything that points me in the right direction is helpful. Cheers, Tom

    Read the article

  • Get starting and end index of a highlighted fragment in a searched field

    - by Umer
    "My search returns a highlighted fragment from a field. I want to know that in that field of particular searched document, where does that fragment starts and ends ?" for instance. consider i am searching "highlighted fragment" in above lines (consider the above para as single document). I am setting my fragmenter as : SimpleFragmenter fragmenter = new SimpleFragmenter(30); now the output of GetBestFragment is somewhat like : "returns a highlighted fragment from" Is it possible to get the starting and ending index of this fragment in the text above (say starting is 10 and ending is 45)

    Read the article

  • How well does Solr scale over large number of facet values?

    - by Continuation
    I'm using Solr and I want to facet over a field "group". Since "group" is created by users, potentially there can be a huge number of values for "group". Would Solr be able to handle a use case like this? Or is Solr not really appropriate for facet fields with a large number of values? I understand that I can set facet.limit to restrict the number of values returned for a facet field. Would this help in my case? Say there are 100,000 matching values for "group" in a search, if I set facet.limit to 50. would that speed up the query, or would the query still be slow because Solr still needs to process and sort through all the facet values and return the top 50 ones? Any tips on how to tune Solr for large number of facet values? Thanks.

    Read the article

  • Multiple Refinement/Clusters in a single search

    - by Brain Teasers
    Suppose I have options of searching are 1.occupation 2.education 3.religion 4.caste 5.country and many more. when i perform this search, i can get easily calculate refinements. problem is i need to calculate refinements in a way that for individual refinement is calculated in a manner that all search parameter is considered expect the calculated refinement check on the left side ... even i have choose the profession area, still search refinements of this is coming.... same functionality if of all refinements.Please help.I use sphinx for searching seen this type of refinements in http://ww2.shaadi.com/search

    Read the article

  • SOLR and Natural Language Parsing - Can I use it?

    - by andy
    hey guys, my requirements are pretty similar to this: Requirements http://stackoverflow.com/questions/90580/word-frequency-algorithm-for-natural-language-processing Using Solr While the answer for that question is excellent, I was wondering if I could make use of all the time I spent getting to know SOLR for my NLP. I thought of SOLR because: It's got a bunch of tokenizers and performs a lot of NLP. It's pretty use to use out of the box. It's restful distributed app, so it's easy to hook up I've spent some time with it, so using could save me time. Can I use Solr? Although the above reasons are good, I don't know SOLR THAT well, so I need to know if it would be appropriate for my requirements. Ideal Usage Ideally, I'd like to configure SOLR, and then be able to send SOLR some text, and retrieve the indexed tonkenized content. Context So you guys know, I'm working on a small component of a bigger recommendation engine.

    Read the article

  • apache solr : sum of data resulted from group by

    - by Terance Dias
    Hi, We have a requirement where we need to group our records by a particular field and take the sum of a corresponding numeric field e.x. select userid, sum(click_count) from user_action group by userid; We are trying to do this using apache solr and found that there were 2 ways of doing this: Using the field collapsing feature (http://blog.jteam.nl/2009/10/20/result-grouping-field-collapsing-with-solr/) but found 2 problems with this: 1.1. This is not part of release and is available as patch so we are not sure if we can use this in production. 1.2. We do not get the sum back but individual counts and we need to sum it at the client side. Using the Stats Component along with faceted search (http://wiki.apache.org/solr/StatsComponent). This meets our requirement but it is not fast enough for very large data sets. I just wanted to know if anybody knows of any other way to achieve this. Appreciate any help. Thanks, Terance.

    Read the article

  • determine which value produced a hit in SOLR multivalued field type

    - by harschware
    If I have a multiValued field type of text, and I put values [cat,dog,green,blue] in it. Is there a way to tell when I execute a query against that field for dog, that it was in the 1st element position for that multiValued field? Assumption: client does not have any pre-knowledge of what the field type of the field being queried is. (i.e. Solr must provide the answer and the client can't post process the return doc to figure it out because it would not know how SOLR matched the query to the result). Disclosure: I posted to solr-user list and am getting no traction so I post here now.

    Read the article

  • Two Applications using the same index file with Hibernate Search

    - by Dominik Obermaier
    Hi, I want to know if it is possible to use the same index file for an entity in two applications. Let me be more specific: We have an online Application with a frondend for the users and an application for the backend tasks (= administrator interface). Both are running on the same JBOSS AS. Both Applications are using the same database, so they are using the same entities. Of course the package names are not the same in both applications for the entities. So this is our usecase: A user should be able to search via the frondend. The user is only allowed to see results which are tagged with "visible". This tagging happens in our admin interface, so the index for the frontend should be updated every time an entity is tagged as "visible" in the backend. Of course both applications do have the same index root folder. In my index folder there are 2 index files: de.x.x.admin.model.Product de.x.x.frondend.model.Product How to "merge" this via Hibernate Search Configuration? I just did not get it via the documentation... Thanks for any help!

    Read the article

  • Solr return whether member is in multivalued field

    - by ??iu
    Is there any way to return in the fields list whether a value exists as one of the values of a multivalued field? E.g., if your schema is <schema> ... <field name="user_name" type="text" indexed="true" stored="true" required="true" /> <field name="follower" type="integer" indexed="true" stored="true" multiValued="true" /> ... </schema> A sample document might look like: <doc> <field name="user_name">tester blah</field> <field name="follower">1</field> <field name="follower">62</field> <field name="follower">63</field> <field name="follower">64</field> </doc> I would like to be able to query for, say, "tester" and follower:62 and have it match "tester blah" and have some indication of whether 62 is a follower or not in the results.

    Read the article

  • Obtain all keys of a Neo4j index

    - by MattiSG
    I have a Neo4j database whose content is generated dynamically from a big dataset. All “entry points” nodes are indexed on a named index (IndexManager.forNodes(…)). I can therefore look up a particular “entry point” node. However, I would now like to enumerate all those specific nodes, but I can't know on which key they were indexed. Is there any way to enumerate all keys of a Neo4j Index? If not, what would be the best way to store those keys, a data type that is eminently non-graph-oriented? UPDATE (thanks for asking details :) ): the list would be more than 2 million entries. The main use case would be to never update it after an initialization step, but other use cases might need it, so it has to be somewhat scalable. Also, I would really prefer avoiding killing my current resilience abilities, so storing all keys at once, as opposed to adding them incrementally, would be a last-resort solution.

    Read the article

  • Can't search in a certain field using solR

    - by intrance
    Hi, I'm setting up an environment using Nutch 1.0 + solR 1.4. In Nutch I configured the subcollection plugin which seems to work nicely. If I search as normal adding fl=* I can see the subcollection field is filled as intented. (something like <str name="subcollection">mysite.com</str>). My problem is, I would like to be able to search only in one or more given subcollections, but whenever my searchquery is something like q=subcollection:mysite.com it won't work. I've also tried to add a fl=* or searched in mysite* instead but I never get any results. Obviously solR "knows" the subcollection field, as it doesn't result with an error but simply whith an empty result. I'd be glad for any help

    Read the article

  • Loading index in MemoryIndex instance

    - by Javi
    Hello, Is there any way to load an existing index into an instance of MemoryIndex?. I have an application which uses Hibernate Search so I can use index() in FullTextEntityManager instance to index an object. I'd like to recover back the created index and insert it into a MemoryIndex instance to execute several queries over it. Is it possible? Thanks.

    Read the article

  • Optimizing Solr for Sorting

    - by devinfoley
    I'm using Solr for a realtime search index. My dataset is about 60M large documents. Instead of sorting by relevance, I need to sort by time. Currently I'm using the sort flag in the query to sort by time. This works fine for specific searches, but when searches return large numbers of results, Solr has to take all of the resulting documents and sort them by time before returning. This is slow, and there has to be a better way. What is the better way?

    Read the article

  • Loading index in MamoryIndex instance

    - by Javi
    Hello, Is there any way to load an existing index into an instance of MemoryIndex?. I have an application which uses Hibernate Search so I can use index() in FullTextEntityManager instance to index an object. I'd like to recover back the created index and insert it into a MemoryIndex instance to execute several queries over it. Is it possible? Thanks.

    Read the article

  • Can I run a site search like Lucene on a single 2 gig server that's also a web & mysql server

    - by ian.evans
    My site's pages have exceeded the limit of pages for Google Custom Search so many of the results are not found in our site search. I've been reading about Lucene, Nutch, Solr, etc and I'm wondering if I'd have the requirements for running those on a single server that also runs the site (on nginx) and our mysql server. We hae 2 gigs of RAM. I'd appreciate any suggestions for migrating to a new site search.

    Read the article

  • Fogbugz search broken

    - by apollodude217
    The Search feature on our Fogbugz server is broken. It can look up case numbers just fine, but when I search for text, I get the following: FogBugz Internal Error An internal error occurred in FogBugz. If you are unsure of how to fix this error, and you have made no changes to the source code of FogBugz, please report this error to Fog Creek Software by clicking "Submit." File: /FogBugz/list.asp Line: 1465 Error: nErr = 100 nErr = -2146232832 Unable to find the specified file: _254m.fnm at FogUtil.Search.Store.Directory.GetFile(String name) at FogUtil.Search.Store.Directory.OpenInput(String name) at Lucene.Net.Index.FieldInfos..ctor(Directory d, String name) at Lucene.Net.Index.SegmentReader.Initialize(SegmentInfo si) at Lucene.Net.Index.SegmentReader.Get(Directory dir, SegmentInfo si, SegmentInfos sis, Boolean closeDir, Boolean ownDir) at Lucene.Net.Index.SegmentReader.Get(SegmentInfo si) at Lucene.Net.Index.IndexReader.AnonymousClassWith.DoBody() at Lucene.Net.Store.Lock.With.Run() at Lucene.Net.Index.IndexReader.Open(Directory directory, Boolean closeDirectory) at Lucene.Net.Index.IndexReader.Open(Directory directory) at FogUtil.Search.Index.LoadReadonlyReader() at FogUtil.Search.Index.LoadSearcher() at FogUtil.Search.QueryGenerator.Execute(Object querytokens, Boolean fMSSQL, Boolean fMySQL, String sFilterKey, String sFilterValues) at FogUtil.Search.Index.Execute(Object query, Boolean fMSSQL, Boolean fMySQL, String sFilterKey, String sFilterValues) Browser: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB6 (.NET CLR 3.5.30729) Number: 0x800A0064 Category: Microsoft VBScript runtime Column: -1 OS Version: Microsoft Windows Server 2003 5.2.3790 Database: Microsoft SQL Server CLR Versions: 1033|v1.0.3705|v1.1.4322|v2.0.50727|v3.0 QueryString: pre=preMultiSearch&pg=pgList&pgBack=pgSearch&search=2&searchFor=asdf&sLastSearchString=&sLastSearchStringJSArgs=%27%27%2C%27%27%2C%27%27 URL: /FogBugz/default.asp Content Length: 0 Local Addr: 172.22.22.15 Remote Addr: 172.16.55.25 Time: 4/27/2010 2:45:39 PM Does anyone know what this problem is or how to fix it?

    Read the article

  • CodePlex Daily Summary for Friday, December 10, 2010

    CodePlex Daily Summary for Friday, December 10, 2010Popular ReleasesFree Silverlight & WPF Chart Control - Visifire: Visifire Silverlight, WPF Charts v3.6.5 Released: Hi, Today we are releasing final version of Visifire, v3.6.5 with the following new feature: * New property AutoFitToPlotArea has been introduced in DataSeries. AutoFitToPlotArea will bring bubbles inside the PlotArea in order to avoid clipping of bubbles in bubble chart. You can visit Visifire documentation to know more. http://www.visifire.com/visifirechartsdocumentation.php Also this release includes few bug fixes: * Chart threw exception while adding new Axis in Chart using Vi...PHPExcel: PHPExcel 1.7.5 Production: DonationsDonate via PayPal via PayPal. If you want to, we can also add your name / company on our Donation Acknowledgements page. PEAR channelWe now also have a full PEAR channel! Here's how to use it: New installation: pear channel-discover pear.pearplex.net pear install pearplex/PHPExcel Or if you've already installed PHPExcel before: pear upgrade pearplex/PHPExcel The official page can be found at http://pearplex.net. Want to contribute?Please refer the Contribute page.UserVoice Helper for WebMatrix: UserVoice Helper v0.9: This version will work with ASP.NET WebPages and ASP.NET MVC ApplicationsDNN Simple Article: DNNSimpleArticle Module V00.00.03: The initial release of the DNNSimpleArticle module (labelled V00.00.03) There are C# and VB versions of this module for this initial release. No promises that going forward there will be packages for both languages provided for future releases. This module provides the following functionality Create and display articles Display a paged list of articles Articles get created as DNN ContentItems Categorization provided through DNN Taxonomy SEO functionality for article display providi...UOB & ME: UOB_ME 2.5: latest versionCouchDB.NET: CouchDB.NET 0.1: CouchDB.NET ------- Libraries and providers to use CouchDB features from .NET This distribution includes the following projects: - MachineKeyGenerator: Command line tool to generate a machine key string for use in App.Config and Web.Config files. - CouchDB.NET: Library to facilitate the use of CouchDB features. It uses Hadi Hariri's EasyHttp library to communicate with the CouchDB server. More info at: https://github.com/hhariri/EasyHttp - CouchDb.ASP.NET: ASP.NET Membership Provider and ASP...AutoLoL: AutoLoL v1.4.3: AutoLoL now supports importing the build pages from Mobafire.com as well! Just insert the url to the build and voila. (For example: http://www.mobafire.com/league-of-legends/build/unforgivens-guide-how-to-build-a-successful-mordekaiser-24061) Stable release of AutoChat (It is still recommended to use with caution and to read the documentation) It is now possible to associate *.lolm files with AutoLoL to quickly open them The selected spells are now displayed in the masteries tab for qu...SubtitleTools: SubtitleTools 1.2: - Added auto insertion of RLE (RIGHT-TO-LEFT EMBEDDING) Unicode character for the RTL languages. - Fixed delete rows issue.PHP Manager for IIS: PHP Manager 1.1 for IIS 7: This is a final stable release of PHP Manager 1.1 for IIS 7. This is a minor incremental release that contains all the functionality available in 53121 plus additional features listed below: Improved detection logic for existing PHP installations. Now PHP Manager detects the location to php.ini file in accordance to the PHP specifications Configuring date.timezone. PHP Manager can automatically set the date.timezone directive which is required to be set starting from PHP 5.3 Ability to ...Algorithmia: Algorithmia 1.1: Algorithmia v1.1, released on December 8th, 2010.SuperSocket, an extensible socket application framework: SuperSocket 1.0 SP1: Fixed bugs: fixed a potential bug that the running state hadn't been updated after socket server stopped fixed a synchronization issue when clearing timeout session fixed a bug in ArraySegmentList fixed a bug on getting configuration valueCslaGenFork: CslaGenFork 4.0 CTP 2: The version is 4.0.1 CTP2 and was released 2010 December 7 and includes the following files: CslaGenFork 4.0.1-2010-12-07 Setup.msi Templates-2010-10-07.zip For getting started instructions, refer to How to section. Overview of the changes Since CTP1 there were 53 work items closed (28 features, 24 issues and 1 task). During this 60 days a lot of work has been done on several areas. First the stereotypes: EditableRoot is OK EditableChild is OK EditableRootCollection is OK Editable...My Web Pages Starter Kit: 1.3.1 Production Release (Security HOTFIX): Due to a critical security issue, it's strongly advised to update the My Web Pages Starter Kit to this version. Possible attackers could misuse the image upload to transmit any type of file to the website. If you already have a running version of My Web Pages Starter Kit 1.3.0, you can just replace the ftb.imagegallery.aspx file in the root directory with the one attached to this release.EnhSim: EnhSim 2.2.0 ALPHA: 2.2.0 ALPHAThis release adds in the changes for 4.03a. at level 85 To use this release, you must have the Microsoft Visual C++ 2010 Redistributable Package installed. This can be downloaded from http://www.microsoft.com/downloads/en/details.aspx?FamilyID=A7B7A05E-6DE6-4D3A-A423-37BF0912DB84 To use the GUI you must have the .NET 4.0 Framework installed. This can be downloaded from http://www.microsoft.com/downloads/en/details.aspx?FamilyID=9cfb2d51-5ff4-4491-b0e5-b386f32c0992 - Updated En...ASP.NET MVC Project Awesome (jQuery Ajax helpers): 1.4: A rich set of helpers (controls) that you can use to build highly responsive and interactive Ajax-enabled Web applications. These helpers include Autocomplete, AjaxDropdown, Lookup, Confirm Dialog, Popup Form, Popup and Pager new stuff: popup WhiteSpaceFilterAttribute tested on mozilla, safari, chrome, opera, ie 9b/8/7/6nopCommerce. ASP.NET open source shopping cart: nopCommerce 1.90: To see the full list of fixes and changes please visit the release notes page (http://www.nopCommerce.com/releasenotes.aspx).myCollections: Version 1.2: New in version 1.2: Big performance improvement. New Design (Added Outlook style View, New detail view, New Groub By...) Added Sort by Media Added Manage Movie Studio Zoom preference is now saved. Media name are now editable. Added Portuguese version You can now Hide details panel Add support for FLAC tags You can now imports books from BibTex Xml file BugFixingmytrip.mvc (CMS & e-Commerce): mytrip.mvc 1.0.49.0 beta: mytrip.mvc 1.0.49.0 beta web Web for install hosting System Requirements: NET 4.0, MSSQL 2008 or MySql (auto creation table to database) if .\SQLEXPRESS auto creation database (App_Data folder) mytrip.mvc 1.0.49.0 beta src System Requirements: Visual Studio 2010 or Web Deweloper 2010 MSSQL 2008 or MySql (auto creation table to database) if .\SQLEXPRESS auto creation database (App_Data folder) Connector/Net 6.3.4, MVC3 RC WARNING For run and debug mytrip.mvc 1.0.49.0 beta src download and ...Menu and Context Menu for Silverlight 4.0: Silverlight Menu and Context Menu v2.3 Beta: - Added keyboard navigation support with access keys - Shortcuts like Ctrl-Alt-A are now supported(where the browser permits it) - The PopupMenuSeparator is now completely based on the PopupMenuItem class - Moved item manipulation code to a partial class in PopupMenuItemsControl.cs - Moved menu management and keyboard navigation code to the new PopupMenuManager class - Simplified the layout by removing the RootGrid element(all content is now placed in OverlayCanvas and is accessed by the new ...MiniTwitter: 1.62: MiniTwitter 1.62 ???? ?? ??????????????????????????????????????? 140 ?????????????????????????? ???????????????????????????????? ?? ??????????????????????????????????New ProjectsAccountingGuid: for testing onlyChinese Nag Screen: This is a simple but effective program for learning to recognize Mandarin characters. The application sits in the system tray and displays a character random through your day. You can only get rid of it by typing in the pinyin.CouchDB.NET: .NET libraries to use CouchDB from .NET. Included are Membership and Roles provider so that you may use CouchDB as your integrated DB backend on your ASP.NET projects. Please see the readme.txt file for instructions.DataSetMapper: The idea behind DataSetMapper is to provide support for the automatic mapping of legacy DataSet based structures to proper domain objects. In essence the aim is to create the Mapping aspect of an ORM without the persistence concerns.EasyXnaAudio: EasyXnaAudio is a simple component for use in XNA Game Studio 3.1/4.0 projects that provides an easy interface to load, play, and manage songs and sounds in your game.FixMailboxSD - Exchange Mailbox Security Descriptor Canonicalizer: This is a small utility to fix mailbox security descriptors in Microsoft Exchange that have become non-canonical. It must be run on a machine with Exchange System Manager for Exchange 2003 installed, but it will work against mailboxes on 2003 or 2007 (not 2010).GearSynth Plugin: a plugin for graphsynth that makes gear trainsGroceryList: TBD with first versionIBMS Suite Build on the Associate Platform: A new way of approaching Information Systems. From the UI, users of the IS will be able to build and manipulate the IS to whatever way fits their needs. We have simplified development, removed the chasm between management and IT and give the power of simplification to the user!Ivy Nasha Framework: A PHP FrameworkjQuery helpers for ASP.NET and ASP.NET MVC: jQuery helpers makes it easier for ASP.NET developers to build jQuery scripts. It's developed in C#. JSTest.NET: JSTest.NET enabled JavaScript unit tests to be run directly in the test framework of your choice (MSTest, NUnit, xUnit, etc) and all without the need for a web browser. JSTest.NET utilizes the Windows Script Host (CScript) to run fast, fully debuggable JavaScript unit tests!Multicore Task Framework: MTF is a visual tool to simplify building robust component based .NET applications. MTF is designed to make full use of the power of multi-core processors.Nazha Script On DLR: NazhaPascalESE - a Delphi/Pascal class library for Microsoft ESENT database API: This pascal class library, primarily written for Delphi's Object Pascal, provides a lightweight and easy-to-use wrapper around the ESENT API. Perpetuum Hangar: A Character planner for the online game "Perpetuum"Projeto Exemplo: Projeto exemplo para a atividade 3 da disciplina.PSiteCode: PSiteCode Manager rScript Engine: rScript scripting engine is a managed script engine wrote in C# that supports Visual Basic and C# syntax based scripts. It provides Type's for dynamically getting and setting properties, invoking methods and run-time compilation of scripts.SharePoint 2010 User Profile WebPart: This webpart shows all user profile properties and values of the properties for a particular user profile. The results are shown in a table containing the display and technical name together with the user value.SHC: shriSHMTools: SHMTools is set of compatible software tools (mostly Matlab based) for structural health monitoring (SHM) research. This includes algorithms for system design, modeling, data acquisition, feature extraction, classification, and prognosis.SwapWin: SwapWin is a tiny and handy tool which swaps windows on different screens. Developed in C# and .NET 3.5.Teachers Diary: Teachers diary is application realizing electronic teacher's notepad with student marks. Current localization of the application is in czech language only.VkApp: Vk app for downloadingWebSpirit: A lightweighted web server implemented by C# which supports sufficient extendible feature. By zjuWPF & MEF Studio: WPF & MEF Studio

    Read the article

  • How to quickly search through a very large list of strings / records on a database

    - by Giorgio
    I have the following problem: I have a database containing more than 2 million records. Each record has a string field X and I want to display a list of records for which field X contains a certain string. Each record is about 500 bytes in size. To make it more concrete: in the GUI of my application I have a text field where I can enter a string. Above the text field I have a table displaying the (first N, e.g. 100) records that match the string in the text field. When I type or delete one character in the text field, the table content must be updated on the fly. I wonder if there is an efficient way of doing this using appropriate index structures and / or caching. As explained above, I only want to display the first N items that match the query. Therefore, for N small enough, it should not be a big issue loading the matching items from the database. Besides, caching items in main memory can make retrieval faster. I think the main problem is how to find the matching items quickly, given the pattern string. Can I rely on some DBMS facilities, or do I have to build some in-memory index myself? Any ideas? EDIT I have run a first experiment. I have split the records into different text files (at most 200 records per file) and put the files in different directories (I used the content of one data field to determine the directory tree). I end up with about 50000 files in about 40000 directories. I have then run Lucene to index the files. Searching for a string with the Lucene demo program is pretty fast. Splitting and indexing took a few minutes: this is totally acceptable for me because it is a static data set that I want to query. The next step is to integrate Lucene in the main program and use the hits returned by Lucene to load the relevant records into main memory.

    Read the article

  • What are some "mental steps" a developer must take to begin moving from SQL to NO-SQL (CouchDB, Fath

    - by Byron Sommardahl
    I have my mind firmly wrapped around relational databases and how to code efficiently against them. Most of my experience is with MySQL and SQL. I like many of the things I'm hearing about document-based databases, especially when someone in a recent podcast mentioned huge performance benefits. So, if I'm going to go down that road, what are some of the mental steps I must take to shift from SQL to NO-SQL? If it makes any difference in your answer, I'm a C# developer primarily (today, anyhow). I'm used to ORM's like EF and Linq to SQL. Before ORMs, I rolled my own objects with generics and datareaders. Maybe that matters, maybe it doesn't. Here are some more specific: How do I need to think about joins? How will I query without a SELECT statement? What happens to my existing stored objects when I add a property in my code? (feel free to add questions of your own here)

    Read the article

  • What&rsquo;s new in Subtext 2.5: full-text search, related posts and more

    In Subtext 2.5 we changed the internal search provider from the like %term% SQL based one to a more mature and powerful one powered by Lucene.net. I wrote about how Lucene.net is implemented inside Subtext, but it didnt show the benefits for the users. In this post Im explaining the visible features of the full-text search. There are 4 places where the new Lucene.net based search engine has its effect: Full-text search Related links More Results for the search OpenSearch provider...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Big Data – Operational Databases Supporting Big Data – Key-Value Pair Databases and Document Databases – Day 13 of 21

    - by Pinal Dave
    In yesterday’s blog post we learned the importance of the Relational Database and NoSQL database in the Big Data Story. In this article we will understand the role of Key-Value Pair Databases and Document Databases Supporting Big Data Story. Now we will see a few of the examples of the operational databases. Relational Databases (Yesterday’s post) NoSQL Databases (Yesterday’s post) Key-Value Pair Databases (This post) Document Databases (This post) Columnar Databases (Tomorrow’s post) Graph Databases (Tomorrow’s post) Spatial Databases (Tomorrow’s post) Key Value Pair Databases Key Value Pair Databases are also known as KVP databases. A key is a field name and attribute, an identifier. The content of that field is its value, the data that is being identified and stored. They have a very simple implementation of NoSQL database concepts. They do not have schema hence they are very flexible as well as scalable. The disadvantages of Key Value Pair (KVP) database are that they do not follow ACID (Atomicity, Consistency, Isolation, Durability) properties. Additionally, it will require data architects to plan for data placement, replication as well as high availability. In KVP databases the data is stored as strings. Here is a simple example of how Key Value Database will look like: Key Value Name Pinal Dave Color Blue Twitter @pinaldave Name Nupur Dave Movie The Hero As the number of users grow in Key Value Pair databases it starts getting difficult to manage the entire database. As there is no specific schema or rules associated with the database, there are chances that database grows exponentially as well. It is very crucial to select the right Key Value Pair Database which offers an additional set of tools to manage the data and provides finer control over various business aspects of the same. Riak Rick is one of the most popular Key Value Database. It is known for its scalability and performance in high volume and velocity database. Additionally, it implements a mechanism for collection key and values which further helps to build manageable system. We will further discuss Riak in future blog posts. Key Value Databases are a good choice for social media, communities, caching layers for connecting other databases. In simpler words, whenever we required flexibility of the data storage keeping scalability in mind – KVP databases are good options to consider. Document Database There are two different kinds of document databases. 1) Full document Content (web pages, word docs etc) and 2) Storing Document Components for storage. The second types of the document database we are talking about over here. They use Javascript Object Notation (JSON) and Binary JSON for the structure of the documents. JSON is very easy to understand language and it is very easy to write for applications. There are two major structures of JSON used for Document Database – 1) Name Value Pairs and 2) Ordered List. MongoDB and CouchDB are two of the most popular Open Source NonRelational Document Database. MongoDB MongoDB databases are called collections. Each collection is build of documents and each document is composed of fields. MongoDB collections can be indexed for optimal performance. MongoDB ecosystem is highly available, supports query services as well as MapReduce. It is often used in high volume content management system. CouchDB CouchDB databases are composed of documents which consists fields and attachments (known as description). It supports ACID properties. The main attraction points of CouchDB are that it will continue to operate even though network connectivity is sketchy. Due to this nature CouchDB prefers local data storage. Document Database is a good choice of the database when users have to generate dynamic reports from elements which are changing very frequently. A good example of document usages is in real time analytics in social networking or content management system. Tomorrow In tomorrow’s blog post we will discuss about various other Operational Databases supporting Big Data. Reference: Pinal Dave (http://blog.sqlauthority.com) Filed under: Big Data, PostADay, SQL, SQL Authority, SQL Query, SQL Server, SQL Tips and Tricks, T SQL

    Read the article

< Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25  | Next Page >