couchdb lucene - Page 23

How can I tell what files are currently open by a process (i.e. my app)?

- by chaiguy

I am using a Lucene.Net index and want to give the user an option to move the index, but am having trouble closing it down so the directory/contents can be moved (I keep getting access denied exceptions). I need to be able to have some more information so I can debug this problem, such as being able to tell what files are currently open, and as much information about each use as possible. Alternatively, is there any way to simply force close a bunch of files so they can be moved? This would make things a lot easier to solve.

Problem indexing files in Solr on Ubuntu

- by nik

Hi, I want to index some documents in Solr to learn how it works. I have installed Solr and Tomcat and can see the Solr Admin UI at localhost:8080/solr/admin/. Now I want to add some documents to the index; how do I proceed? I can find very little documentation on the Internet about this. The tutorial at http://lucene.apache.org/solr/tutorial.html#Indexing+Data says to run the command java -jar post.jar solr.xml monitor.xml, but when I run it I get a connection refused error, because the tutorial uses Jetty. After installing Jetty I tried telnet and got a "Connection refused" error again. I am not able to understand what the problem is.
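
One possible reading of the error: post.jar from the tutorial posts to http://localhost:8983/solr/update by default (the port of the bundled Jetty example), while the Solr instance in the question is served by Tomcat on port 8080. The sketch below builds the same kind of XML update request against port 8080 instead. It is a minimal illustration under that assumption, not a replacement for post.jar; the /solr/update path, the field name "id" and the helper names are assumptions that must match the actual deployment and schema.xml.

```python
import urllib.request

# Assumption: Solr is deployed under /solr on Tomcat, port 8080, as in the question.
SOLR_UPDATE_URL = "http://localhost:8080/solr/update"

# A minimal <add> message; the field name is a placeholder and must exist in schema.xml.
ADD_DOC = '<add><doc><field name="id">doc-1</field></doc></add>'
COMMIT = "<commit/>"


def build_update_request(xml_payload, url=SOLR_UPDATE_URL):
    """Build (but do not send) an HTTP POST carrying a Solr XML update message."""
    return urllib.request.Request(
        url,
        data=xml_payload.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def send(xml_payload):
    """Send one update message and return the HTTP status code."""
    with urllib.request.urlopen(build_update_request(xml_payload)) as response:
        return response.status
```

Calling send(ADD_DOC) followed by send(COMMIT) would add the document and make it searchable, provided the server is reachable at that address.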

Should HTTP POST be discouraged?

- by Tomas Sedovic

Quoting from the CouchDB documentation: It is recommended that you avoid POST when possible, because proxies and other network intermediaries will occasionally resend POST requests, which can result in duplicate document creation. To my understanding, this should not be happening on the protocol level (a confused user armed with a doubleclick is a completely different story). What is the best course of action, then? Should we really try to avoid POST requests and replace them by PUT? I don't like that as they convey a different meaning. Should we anticipate this and protect the requests by unique IDs where we want to avoid accidental duplication? I don't like that either: it complicates the code and prevents situations where multiple identical posts may be desired.
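
The unique-ID option mentioned in the question can be sketched as follows. The class is an in-memory stand-in, not CouchDB's API: post() lets the "server" pick the ID, put() lets the client pick it, and a replayed put() is answered with 409, which is roughly how CouchDB treats a PUT to an existing document without a matching revision. All names here are hypothetical.

```python
import uuid


class TinyDocStore:
    """In-memory stand-in for a document database (not CouchDB's real API)."""

    def __init__(self):
        self.docs = {}

    def post(self, doc):
        # The server picks the ID, so a replayed request creates a second document.
        doc_id = uuid.uuid4().hex
        self.docs[doc_id] = dict(doc)
        return 201, doc_id

    def put(self, doc_id, doc):
        # The client picks the ID, so a replay hits the existing document instead.
        if doc_id in self.docs:
            return 409, doc_id  # conflict: already created
        self.docs[doc_id] = dict(doc)
        return 201, doc_id


store = TinyDocStore()
comment = {"text": "hello"}

# An intermediary resends the POST: two documents now exist.
store.post(comment)
store.post(comment)
posts_created = len(store.docs)

# The client generates the ID once and reuses it on every retry.
comment_id = uuid.uuid4().hex
first = store.put(comment_id, comment)
replay = store.put(comment_id, comment)
```

Intentional duplicates remain possible in this scheme: the client simply generates a fresh ID for each document it really wants to create.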

Java escape HTML - string replace slow?

- by cpf

Hi StackOverflow, I have a Java application that makes heavy use of a large file, which it reads, processes and passes through to SolrEmbeddedServer (http://lucene.apache.org/solr/). One of the functions does basic HTML escaping:

    private String htmlEscape(String input) {
        return input.replace("&", "&amp;").replace(">", "&gt;").replace("<", "&lt;")
                    .replace("'", "&apos;").replaceAll("\"", "&quot;");
    }

While profiling the application, the program spends roughly 58% of the time in this function, a total of 47% in replace, and 11% in replaceAll. Now, is the Java replace that slow, or am I on the right path and should I consider the program efficient enough to have its bottleneck in Java and not in my code? (Or am I replacing wrong?) Thanks in advance!
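
Each of the five chained calls in the question scans the whole input and builds a new string, and replaceAll additionally goes through the regex engine. A common alternative is a single pass over the input with a lookup table. The sketch below shows that technique in Python rather than Java, purely as an illustration; the function and table names are made up, and whether it removes the bottleneck in the asker's program is something only profiling can tell.

```python
# Translation table: character -> replacement text, consulted once per character.
_HTML_ESCAPES = str.maketrans({
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    "'": "&apos;",
    '"': "&quot;",
})


def html_escape(text):
    """Escape the five special characters in one pass over the input."""
    return text.translate(_HTML_ESCAPES)
```

Because every character is looked at exactly once, the ampersands produced by the escaping are never re-escaped, so the order of the table entries does not matter (unlike the chained version, where "&" has to be replaced first).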

Mysql search design

- by neil

I'm designing a MySQL database, and I'd like some input on an efficient way to store blog/article data for searching. Right now, I've made a separate column that stores the content to be searched: no duplicate words, no words shorter than four letters, and no words that are too common. So, essentially, it's a list of keywords from the original article. A list of tags and the title field would also be searched. I'm not quite sure how MySQL indexes fulltext columns, so would storing the data like that be ineffective, or redundant somehow? A lot of the articles are on the same topic, so would the score be hurt by so many of the rows having similar keywords? Also, for this project, solutions like Sphinx, Lucene or Google Custom Search can't be used -- only PHP & MySQL. Thanks!

LINQ Changeset multi-threading

- by Xodarap

I'm using LINQ to SQL and after I submit some changes I want to spawn a thread which looks through all the changes and updates our Lucene index as necessary. My code looks vaguely like:

    (new Thread(() => { UpdateIndex(context.GetChangeSet()); })).Start();

Sometimes though I get an InvalidOperationException, which I think is because context.GetChangeSet() is not thread-safe, and so if the change set is modified in one thread while another thread is enumerating through it, problems arise. Is there a "thread-safe" version of GetChangeSet()? Or some way I can do ChangeSet.clone() or something?
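
The "clone" idea from the question amounts to taking a snapshot on the calling thread and handing only that snapshot to the worker. The sketch below shows the pattern in Python with a made-up Context class; it is not the LINQ to SQL API, and the names are hypothetical.

```python
import threading


class Context:
    """Hypothetical stand-in for the data context; not the LINQ to SQL API."""

    def __init__(self):
        self.pending = []

    def get_change_set(self):
        return self.pending  # live, mutable view of the pending changes


indexed = []


def update_index(changes):
    # Enumerates only the snapshot it was given.
    for change in changes:
        indexed.append(change)


context = Context()
context.pending.extend(["insert A", "update B"])

# Take an immutable copy on the calling thread, before the worker starts.
snapshot = tuple(context.get_change_set())
worker = threading.Thread(target=update_index, args=(snapshot,))
worker.start()

# Later changes on the main thread cannot disturb the worker's enumeration.
context.pending.append("delete C")
worker.join()
```

The important detail is that the copy is made before the thread is started; copying inside the worker would leave the same race in place.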

How to store a 250mb database in an Offline Web App

- by Couto

OK, maybe I'm not seeing the whole picture or something, but I kinda need a brainstorm. The purpose is to make a web app (HTML5, CSS, JavaScript) that has to search a 250 MB database without any internet connection, so yes, the database has to be on the client side. The hard part is that this app has to work on an iPod or iPhone without an internet connection (an initial connection to download the app is OK). LocalStorage has a 5 MB limit; CouchDB would be great since it has a web app easily accessed from JavaScript (privacy concerns don't matter at this point), so I'm pretty much out of ideas. Does anyone see an alternative, or a solution for this purpose?

What you would learn. [closed]

- by NDeveloper

Hi, I have a little free time and would like to learn a new development language/technology. I know it can be very subjective, but please share with us what you would learn and why. I have about 4 years of .NET development experience, mostly with distributed applications, and a little more than 2 years of C/C++. There are a lot of options to choose from, like Google Go/F#/Python/Scala/Java/ASP.NET/mobile app development for Android, BB, iPhone.../DB (MS SQL, Oracle or even MongoDB or CouchDB)/any new concepts, etc. I would like to treat the time as an investment, so that the knowledge gained will be useful.

NoSQL with RavenDB and ASP.NET MVC - Part 2

- by shiju

In my previous post, we discussed how to work with the RavenDB document database in an ASP.NET MVC application. We set up RavenDB for our ASP.NET MVC application and did basic CRUD operations against a simple domain entity. In this post, let's discuss domain entities with a deep object graph and how to query RavenDB documents using indexes.

Let's create two domain entities for our demo ASP.NET MVC application:

    public class Category
    {
        public string Id { get; set; }

        [Required(ErrorMessage = "Name Required")]
        [StringLength(25, ErrorMessage = "Must be less than 25 characters")]
        public string Name { get; set; }

        public string Description { get; set; }
        public List<Expense> Expenses { get; set; }

        public Category()
        {
            Expenses = new List<Expense>();
        }
    }

    public class Expense
    {
        public string Id { get; set; }
        public Category Category { get; set; }
        public string Transaction { get; set; }
        public DateTime Date { get; set; }
        public double Amount { get; set; }
    }

We have two domain entities, Category and Expense. A single category contains a list of expense transactions, and every expense transaction should have a Category.

Let's create an ASP.NET MVC view model for the Expense transaction:

    public class ExpenseViewModel
    {
        public string Id { get; set; }
        public string CategoryId { get; set; }

        [Required(ErrorMessage = "Transaction Required")]
        public string Transaction { get; set; }

        [Required(ErrorMessage = "Date Required")]
        public DateTime Date { get; set; }

        [Required(ErrorMessage = "Amount Required")]
        public double Amount { get; set; }

        public IEnumerable<SelectListItem> Category { get; set; }
    }

Let's create a contract type for the Expense Repository:

    public interface IExpenseRepository
    {
        Expense Load(string id);
        IEnumerable<Expense> GetExpenseTransactions(DateTime startDate, DateTime endDate);
        void Save(Expense expense, string categoryId);
        void Delete(string id);
    }

Let's create a concrete type for the Expense Repository for handling CRUD operations:

    public class ExpenseRepository : IExpenseRepository
    {
        private IDocumentSession session;

        public ExpenseRepository()
        {
            session = MvcApplication.CurrentSession;
        }

        public Expense Load(string id)
        {
            return session.Load<Expense>(id);
        }

        public IEnumerable<Expense> GetExpenseTransactions(DateTime startDate, DateTime endDate)
        {
            //Querying using the Index name "ExpenseTransactions"
            //filtering with dates
            var expenses = session.LuceneQuery<Expense>("ExpenseTransactions")
                .WaitForNonStaleResults()
                .Where(exp => exp.Date >= startDate && exp.Date <= endDate)
                .ToArray();
            return expenses;
        }

        public void Save(Expense expense, string categoryId)
        {
            var category = session.Load<Category>(categoryId);
            if (string.IsNullOrEmpty(expense.Id))
            {
                //new expense transaction
                expense.Category = category;
                session.Store(expense);
            }
            else
            {
                //modifying an existing expense transaction
                var expenseToEdit = Load(expense.Id);
                //Copy values to expenseToEdit
                ModelCopier.CopyModel(expense, expenseToEdit);
                //set category object
                expenseToEdit.Category = category;
            }
            //save changes
            session.SaveChanges();
        }

        public void Delete(string id)
        {
            var expense = Load(id);
            session.Delete<Expense>(expense);
            session.SaveChanges();
        }
    }

Insert/Update Expense Transaction

The Save method is used both for inserting a new expense record and for modifying an existing expense transaction. For a new expense transaction, we store the expense object with its associated category in the document session object; to edit an existing record, we load the existing expense object and assign the new values to it.

    public void Save(Expense expense, string categoryId)
    {
        var category = session.Load<Category>(categoryId);
        if (string.IsNullOrEmpty(expense.Id))
        {
            //new expense transaction
            expense.Category = category;
            session.Store(expense);
        }
        else
        {
            //modifying an existing expense transaction
            var expenseToEdit = Load(expense.Id);
            //Copy values to expenseToEdit
            ModelCopier.CopyModel(expense, expenseToEdit);
            //set category object
            expenseToEdit.Category = category;
        }
        //save changes
        session.SaveChanges();
    }

Querying Expense transactions

    public IEnumerable<Expense> GetExpenseTransactions(DateTime startDate, DateTime endDate)
    {
        //Querying using the Index name "ExpenseTransactions"
        //filtering with dates
        var expenses = session.LuceneQuery<Expense>("ExpenseTransactions")
            .WaitForNonStaleResults()
            .Where(exp => exp.Date >= startDate && exp.Date <= endDate)
            .ToArray();
        return expenses;
    }

The GetExpenseTransactions method returns expense transactions using a LINQ query expression with a Date comparison filter. The Lucene query uses an index named "ExpenseTransactions" to get the result set. In RavenDB, indexes are LINQ queries stored in the RavenDB server; they are executed in the background and perform queries against the JSON documents. Indexes work with a Lucene query expression or a set operation. Indexes are composed using a Map and a Reduce function. Check out Ayende's blog post on Map/Reduce. We can create an index using the RavenDB web admin tool as well as programmatically using its Client API. The screenshot below shows creating an index using the web admin tool.

We can also create indexes using the Raven Client API, as shown in the following code:

    documentStore.DatabaseCommands.PutIndex("ExpenseTransactions",
        new IndexDefinition<Expense,Expense>()
        {
            Map = Expenses => from exp in Expenses
                              select new { exp.Date }
        });

In the Map function, we used a LINQ expression as shown in the following:

    from exp in docs.Expenses
    select new { exp.Date };

We have not used a Reduce function for the above index. A Reduce function is useful for performing aggregate functions based on the results from the Map function. Indexes can be used with the set operations of RavenDB.

SET Operations

Unlike other document databases, RavenDB supports set-based operations that let you perform updates, deletes and inserts against the bulk_docs endpoint of RavenDB. To do this, you just pass a query to an index, as shown in the following command:

    DELETE http://localhost:8080/bulk_docs/ExpenseTransactions?query=Date:20100531

The above command uses the index named "ExpenseTransactions" to query the documents with a Date filter and deletes all the documents that match the query criteria. It is equivalent to the following query:

    DELETE FROM Expenses
    WHERE Date='2010-05-31'

Controller & Actions

We have created the Expense Repository class for performing CRUD operations on expense transactions. Let's create a controller class for handling expense transactions.

    public class ExpenseController : Controller
    {
        private ICategoryRepository categoyRepository;
        private IExpenseRepository expenseRepository;

        public ExpenseController(ICategoryRepository categoyRepository, IExpenseRepository expenseRepository)
        {
            this.categoyRepository = categoyRepository;
            this.expenseRepository = expenseRepository;
        }

        //Get Expense transactions based on dates
        public ActionResult Index(DateTime? StartDate, DateTime? EndDate)
        {
            //If date is not passed, take current month's first and last date
            DateTime dtNow;
            dtNow = DateTime.Today;
            if (!StartDate.HasValue)
            {
                StartDate = new DateTime(dtNow.Year, dtNow.Month, 1);
                EndDate = StartDate.Value.AddMonths(1).AddDays(-1);
            }
            //take last date of startdate's month, if enddate is not passed
            if (StartDate.HasValue && !EndDate.HasValue)
            {
                EndDate = (new DateTime(StartDate.Value.Year, StartDate.Value.Month, 1)).AddMonths(1).AddDays(-1);
            }
            var expenses = expenseRepository.GetExpenseTransactions(StartDate.Value, EndDate.Value);
            if (Request.IsAjaxRequest())
            {
                return PartialView("ExpenseList", expenses);
            }
            ViewData.Add("StartDate", StartDate.Value.ToShortDateString());
            ViewData.Add("EndDate", EndDate.Value.ToShortDateString());
            return View(expenses);
        }

        // GET: /Expense/Edit
        public ActionResult Edit(string id)
        {
            var expenseModel = new ExpenseViewModel();
            var expense = expenseRepository.Load(id);
            ModelCopier.CopyModel(expense, expenseModel);
            var categories = categoyRepository.GetCategories();
            expenseModel.Category = categories.ToSelectListItems(expense.Category.Id.ToString());
            return View("Save", expenseModel);
        }

        //
        // GET: /Expense/Create
        public ActionResult Create()
        {
            var expenseModel = new ExpenseViewModel();
            var categories = categoyRepository.GetCategories();
            expenseModel.Category = categories.ToSelectListItems("-1");
            expenseModel.Date = DateTime.Today;
            return View("Save", expenseModel);
        }

        //
        // POST: /Expense/Save
        // Insert/Update Expense Transaction
        [HttpPost]
        public ActionResult Save(ExpenseViewModel expenseViewModel)
        {
            try
            {
                if (!ModelState.IsValid)
                {
                    var categories = categoyRepository.GetCategories();
                    expenseViewModel.Category = categories.ToSelectListItems(expenseViewModel.CategoryId);
                    return View("Save", expenseViewModel);
                }
                var expense = new Expense();
                ModelCopier.CopyModel(expenseViewModel, expense);
                expenseRepository.Save(expense, expenseViewModel.CategoryId);
                return RedirectToAction("Index");
            }
            catch
            {
                return View();
            }
        }

        //Delete an Expense Transaction
        public ActionResult Delete(string id)
        {
            expenseRepository.Delete(id);
            return RedirectToAction("Index");
        }
    }

Download the Source - You can download the source code from http://ravenmvc.codeplex.com
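
The post says a Reduce function is useful for aggregates over the output of the Map function but does not show one. The sketch below illustrates the general map/reduce idea on plain Python dictionaries, totalling amounts per date. It is not RavenDB index syntax; the sample documents and function names are invented for the illustration.

```python
from collections import defaultdict
from datetime import date

# Invented sample documents shaped like the Expense entity in the post.
expenses = [
    {"Transaction": "Lunch", "Date": date(2010, 5, 31), "Amount": 12.5},
    {"Transaction": "Taxi", "Date": date(2010, 5, 31), "Amount": 20.0},
    {"Transaction": "Books", "Date": date(2010, 6, 1), "Amount": 35.0},
]


def map_expense(doc):
    """Map: project each document onto the fields the index needs."""
    return {"Date": doc["Date"], "Amount": doc["Amount"]}


def reduce_by_date(mapped):
    """Reduce: aggregate the mapped entries, here a total per date."""
    totals = defaultdict(float)
    for entry in mapped:
        totals[entry["Date"]] += entry["Amount"]
    return dict(totals)


totals = reduce_by_date(map_expense(doc) for doc in expenses)
```

The Map step alone corresponds to the "ExpenseTransactions" index in the post, which only projects the Date; the Reduce step is the part the post leaves out.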

Silverlight Cream for November 08, 2011 -- #1165

- by Dave Campbell

In this Issue: Brian Noyes, Michael Crump, WindowsPhoneGeek, Erno de Weerd, Jesse Liberty, Derik Whittaker, Sumit Dutta, Asim Sajjad, Dhananjay Kumar, Kunal Chowdhury, and Beth Massi.

Above the Fold:
Silverlight: "Working with Prism 4 Part 1: Getting Started" Brian Noyes
WP7: "Getting Started with the Coding4Fun toolkit Tile Control" WindowsPhoneGeek
LightSwitch: "How to Connect to and Diagram your SQL Express Database in Visual Studio LightSwitch" Beth Massi

Shoutouts:
Michael Palermo's latest Desert Mountain Developers is up
Michael Washington's latest Visual Studio #LightSwitch Daily is up

From SilverlightCream.com:

Working with Prism 4 Part 1: Getting Started
Brian Noyes has a series starting at SilverlightShow about Prism 4 ... this is the first one, so a good time to jump in and pick up on an intro and basic info about Prism plus building your first Prism app.

10 Laps around Silverlight 5 (Part 5 of 10)
Michael Crump has Part 5 of his 10-part Silverlight 5 investigation up at SilverlightShow talking about all the various text features added in Silverlight 5 Beta: Text Tracking and Leading, Linked and MultiColumn, OpenType, etc.

Getting Started with the Coding4Fun toolkit Tile Control
WindowsPhoneGeek takes on the Tile control from the Coding4Fun toolkit... as usual, great tutorial... diagrams, code, explanation

Using AppHarbor, Bitbucket and Mercurial with ASP.NET and Silverlight – Part 2 CouchDB, Cloudant and Hammock
Erno de Weerd has Part 2 of his trilogy and he's trying to beat David Anson for the long title record :) ... in this episode, he's adding in cloud storage to the mix in a 35-step tutorial.

Background Audio
Jesse Liberty's talking about background Audio... and no not the Muzak in the elevator (do they still have that?) ... he's talking about the WP7.1 BackgroundAudioPlayer

Using the ToggleSwitch in WinRT/Metro (for C#)
Derik Whittaker shows off the ToggleSwitch for WinRT/Metro... not a lot to be said about it, but he says it all :)

Part 19 - Windows Phone 7 - Access Phone Contacts
Sumit Dutta has Part 19! of his WP7 series up... talking today about getting a phone number from the directory using the PhoneNumberChooserTask

ContextMenu using MVVM
Asim Sajjad shows how to make the Context Menu ViewModel friendly in this short tutorial.

Code to make call in Windows Phone 7
Dhananjay Kumar's latest WP7 post is explaining how to make a call programmatically using the PhoneCallTask launcher.

Silverlight Page Navigation Framework - Basic Concept
Kunal Chowdhury has a 3-part tutorial series on Silverlight Navigation up. This is the first in the series, and he hits the basics... what constitutes a Page, and how to get started with the navigation framework.

How to Connect to and Diagram your SQL Express Database in Visual Studio LightSwitch
Beth Massi's latest LightSwitch post is on using the Data Designer to easily create and model database tables... during development this is in SQL Express, but it can be deployed to most any SQL Server database you like

Stay in the 'Light!

Site Search Engine for 1,000 page website

- by Ian

I manage a website with about 1,000 articles that need to be searchable by my members. The site search engines I've tried all had their own problems:

Fluid Dynamics Search Engine
Since it's written in Perl, it was a bit hacky to integrate with my PHP-based CMS. I basically had to file_get_contents the search results page. However, FDSE had the best search results.

Google CSE
Ugh, the search results SUCK. It can't find documents even using unique strings. I'm so surprised that a Google search product is this bad. Nor can I get any answers on their 'help' forums, and I am a paying user. Boo, Google. Boo.

Sphider
Again, bad search results. Unable to locate some phrases used in link text. Better results than Google CSE though. Shame on Google that a free PHP script has better search results than their paid application.

IndexTank
This one looked really promising. I got all set up with their PHP API client. But it would only randomly add articles that I submitted. Out of 700+ articles I pushed to the index through their API, only 8 made it in. Unable to find any help on this subject.
Update for IndexTank -- Got the above issue fixed, so this looks most promising so far.

The site itself runs on PHP/MySQL and FreeBSD, though this shouldn't matter for a web crawling indexer. I've looked at Lucene, but I don't know anything about Java or installing Java programs on my web server. I also do not have root access on my web server, if this would be required for installation. I really don't need a lot of fancy features. It just needs to be able to crawl my web site and return great (even decent!) search results. I don't need any crazy search operators. It doesn't need to index off my primary domain. It just needs to work! Thanks, Hive Mind!

Server-infrastructure recommendations

- by Tim van Elsloo

Here's the thing: I need a cheap, fast, reliable infrastructure that can dynamically scale (like Amazon S3: cloud-storage). I'm thinking of 3 different types of 'servers'.

Application-server
- Should be able to run CentOS (or another light Linux distro).
- Should be able to run Apache.
- Should be able to run PHP.
- Should be able to run GD (so it does rely on its CPU).
- Should be extremely reliable and fast.

Database-server
- Should be able to run MySQL.
- Should be able to... well, do nothing else :P.
- Should be extremely reliable and fast.

Storage-server
- Should be able to run some kind of file-transfer daemon (like FTP, CouchDB, etc.).
- Should be able to do nothing else.
- Should be extremely reliable and fast.

So technically, by transferring all static data to 2 different servers/services, the application-server can totally focus on the webpages.

My questions:
- What services do you recommend?
- Which is cheaper, faster and more reliable: using my own server, or using some cloud-storage/cloud-computing service (like Amazon S3, CloudFiles, etc.)?
- How can I prevent bandwidth abuse (such as DoS attacks causing the bill to be extremely high)?
- What's the difference between "including CDN" and "excluding CDN"? It seems the price doesn't differ at CloudFiles. Do you have to pay "including CDN" + "excluding CDN" when you decide to enable the delivery network, or do you only have to pay "including CDN"?
- Should I use my own nameserver too, or can I use my domain hoster's nameservers?
- What are the minimum software specifications of a nameserver? Can I write some software myself? Does anyone have a good protocol description?

I hope you can answer my questions.

Answers
- I shouldn't write my own nameserver software. Instead, I should use something like BIND (http://osspro.com/2010/05/04/linux-create-your-own-domain-name-server-dns/).

Hadoop, NOSQL, and the Relational Model

- by Phil Factor

(Guest Editorial for the IT Pro/SysAdmin Newsletter) Whereas Relational Databases fit the world of commerce like a glove, it is useless to pretend that they are a perfect fit for all human endeavours. Although, with SQL Server, we’ve made great strides in indexing text, in processing spatial data and in processing markup, there is still a problem in dealing efficiently with large volumes of ephemeral semi-structured data. Key-value stores such as Cassandra, Project Voldemort, and Riak are of great value for ephemeral data, and seem of equal value as a data-feed that provides aggregations to an RDBMS. However, the Document databases such as MongoDB and CouchDB are ideal for semi-structured data for which no fixed schema exists; analytics and logging are obvious examples. NoSQL products, such as MongoDB, tackle the semi-structured data problem with panache. MongoDB is designed with a simple document-oriented data model that scales horizontally across multiple servers. It doesn’t impose a schema, and relies on the application to enforce the data structure. This is another take on the old ‘EAV’ problem (where you don’t know in advance all the attributes of a particular entity). It uses a clever replica set design that allows automatic failover, and uses journaling for data durability. It allows indexing and ad-hoc querying. However, for SQL Server users, the obvious choice for handling semi-structured data is Apache Hadoop. There will soon be an ODBC Driver for Apache Hive and an Add-in for Excel. Additionally, there are now two Hadoop-based connectors for SQL Server: the Apache Hadoop connector for SQL Server 2008 R2, and the SQL Server Parallel Data Warehouse (PDW) connector. We can connect to Hadoop, process the semi-structured data, and then store it in SQL Server.
As one steeped in the culture of Relational SQL Databases, I might be expected to throw up my hands in the air in a gesture of contempt for a technology that was, judging by the overblown journalism on the subject, about to make my own profession as archaic as the saggar maker's bottom knocker (a potter’s assistant who helped the saggar maker to make the bottom of the saggar by placing clay in a metal hoop and bashing it). On the contrary, I find that I'm delighted with the advances made by the NoSQL databases in the past few years. Having the flow of ideas from the NoSQL providers will knock any trace of complacency out of the providers of Relational Databases and inspire them into back-fitting some features, such as horizontal scaling with sharding and automatic failover, into SQL-based RDBMSs. It will do the breed a power of good to benefit from all this lateral thinking.

Desktop Provisioning for a Small Linux Software Development Team

- by deakblue

Goal: Get a small team using a standard development image rather than 4 software devs setting up their own environments.

Why: it takes a day or days to install a distro, build-specific libraries, tools like editors and IDEs, mysql, couchdb, java, maven, python, android-sdk, etc. It's a giant PITA that, when repeated 4 times by 4 developers (not sys admins), wastes time and generates annoying divergences that crop up later (it-builds-on-my-box syndrome). There's no sharing of productivity, settings, tricks, scripts, set-ups. Some of this is helped by segregating the build systems into headless VirtualBox images. This doesn't really address tooling though, or the GUI-desktop dev that needs doing.

So I see three basic strategies: ghosting, virtualization, and finally creating a kind of in-house Linux distro (I guess Google does something like this). The target dev environment is based on Debian OpenBox and must allow a mix of 3rd gen Core i7 notebooks (8GB minimum) to work both single and multihead. Important: the lappies are not the same, but a mix of 2012 MacBooks and PCs. So:

virtualization: is doing all of your work within a VM, like VirtualBox, practical on this hardware or annoying?
ghosting: will laptops from different manufacturers make this impractical?
DIY distro: short of scripting a bunch of package installs, I don't know if there's any "distro-maker" that could keep this from being an epic project of scripting package installs.

So any advice?

setting up tracd behind mod_proxy?

- by FilmJ

I'm having trouble setting up mod_proxy and tracd. Seems almost all the search results for this problem take me to the built-in trac documentation page that mentions it as an option. I have several VirtualServers already running on the box in question, so running tracd on port 80 or 443 is not an option, but I do want to make my trac server accessible on this machine without exposing an additional port via the firewall. Making things even more complicated is that I have multiple trac repositories being served by the same instance of tracd, and so I want to set it up so that http://trac.abc.com is proxy'd to localhost:8000/projects/abcproject, and http://trac.def.com is proxy'd to localhost:8000/projects/defproject.

Currently, the setup I have below results in 100% 403 errors. The server is running as www-data and the directory where all trac files are stored is owned by www-data, AND tracd (as shown below) is running as www-data, so I'm not sure where it's getting hung up.

The relevant configuration on /var/apache2/sites-enabled/trac.abc.com:

    ProxyPass / http://localhost:8000/abcproject
    ProxyPassReverse / http://localhost:8000/abcproject

The relevant configuration on /var/apache2/sites-enabled/trac.def.com:

    ProxyPass / http://localhost:8000/defproject
    ProxyPassReverse / http://localhost:8000/defproject

The command used to instantiate tracd:

    tracd -a defproject,/var/www/vhosts/trac-common/users.htdigest,DEFProject -a abcproject,/var/www/vhosts/trac-common/users.htdigest,ABCProject -p 8000 -b localhost -e /var/www/vhosts/trac-common/projects

If I access the site at http://localhost:8000/ everything works fine, but if I try to access it via any of the proxy'd hosts I end up with 403 at every turn. I've used mod_proxy successfully as described above for other servers, such as couchdb, so maybe this has to do with the headers sent by tracd??

Byte array serialization in JSON.NET

- by Daniel Earwicker

Given this simple class:

    class HasBytes
    {
        public byte[] Bytes { get; set; }
    }

I can round-trip it through JSON using JSON.NET such that the byte array is base-64 encoded:

    var bytes = new HasBytes { Bytes = new byte[] { 1, 2, 3, 4 } };

    // turn it into a JSON string
    var json = JsonConvert.SerializeObject(bytes);

    // get back a new instance of HasBytes
    var result1 = JsonConvert.DeserializeObject<HasBytes>(json);

    // all is well
    Debug.Assert(bytes.Bytes.SequenceEqual(result1.Bytes));

But if I deserialize this-a-wise:

    var result2 = (HasBytes)new JsonSerializer().Deserialize(
        new JTokenReader(
            JToken.ReadFrom(new JsonTextReader(
                new StringReader(json)))),
        typeof(HasBytes));

... it throws an exception, "Expected bytes but got string". What other options/flags/whatever would need to be added to the "complicated" version to make it properly decode the base-64 string to initialize the byte array? Obviously I'd prefer to use the simple version but I'm trying to work with a CouchDB wrapper library called Divan, which sadly uses the complicated version, with the responsibilities for tokenizing/deserializing widely separated, and I want to make the simplest possible patch to how it currently works.
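
For readers unfamiliar with why a string shows up where bytes are expected: JSON has no binary type, so a byte array has to travel as text, and base-64 is the usual encoding. The sketch below shows that round trip with Python's standard library. It only illustrates the encoding and makes no claim about how JSON.NET or Divan implement it.

```python
import base64
import json

payload = bytes([1, 2, 3, 4])

# Serialize: the bytes are written into the JSON document as a base-64 string.
encoded = json.dumps({"Bytes": base64.b64encode(payload).decode("ascii")})

# Deserialize: the reader sees a plain string and has to know it is base-64
# in order to turn it back into bytes.
decoded = base64.b64decode(json.loads(encoded)["Bytes"])
```

The second step is where the two C# code paths in the question differ: a token reader that only sees a generic string token has no way of knowing it stands for a byte array unless something tells it the target type.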

Most useful free .NET libraries?

- by Binoj Antony

I have used a lot of free .NET libraries, some from Microsoft itself! Which ones have you found the most useful?

Dependency Injection/Inversion of Control
- Unity Framework - Microsoft
- StructureMap - Jeremy Miller
- Castle Windsor
- NInject
- Spring Framework
- Autofac
- Managed Extensibility Framework

Logging
- Logging Application Block - Microsoft
- Log4Net - Apache
- Error Logging Modules and Handlers (ELMAH)
- NLog

Compression
- SharpZipLib
- DotNetZip
- YUI Compressor (CSS and JS compression/minification)
- AjaxMinifier (in other downloads) (JS compression. Also includes MSBuild task)

Ajax
- Ajax Control Toolkit - Microsoft
- AJAXNet Pro

Data Mapper
- XmlDataMapper
- AutoMapper

ORM
- NHibernate
- Castle ActiveRecord
- Subsonic
- XmlDataMapper

Charting/Graphics
- Microsoft Chart Controls for ASP.NET 3.5 SP1
- Microsoft Chart Controls for Winforms
- ZedGraph Charting
- NPlot - Charting for ASP.NET and WinForms

PDF Creators/Generators
- PDFsharp
- iTextSharp

Unit Testing/Mocking
- NUnit
- Rhino Mocks
- Moq
- TypeMock.Net
- xUnit.net
- mbUnit
- Machine.Specifications

Automated Web Testing
- Selenium
- Watin

URL Rewriting
- url rewriter
- UrlRewriting.Net
- Url Rewriter and Reverse Proxy - Managed Fusion

Controls
- Krypton - Free winform controls
- Source Grid - A Grid control
- Devexpress - free controls

Unclassified
- CSLA Framework - Business Objects Framework
- AForge.net - AI, computer vision, genetic algorithms, machine learning
- Enterprise Library 4.1 - Logging, Exception Management, Validation, Policy Injection
- File helpers library
- C5 Collections - Collections for .NET
- Quartz.NET - Enterprise Job Scheduler for .NET Platform
- MiscUtil - Utilities by Jon Skeet
- Lucene.net - Text indexing and searching
- Json.NET - Linq over JSON
- Flee - expression evaluator
- PostSharp - AOP
- IKVM - brings the extensive world of Java libraries to .NET.

Title of the question taken from here.

[EDIT] Please provide links to these free libraries as well. Once we have a huge list of this, it can be arranged in categories! Please do not mention .NET Applications/EXEs here.

Write-only collections in MongoDB

- by rcoder

I'm currently using MongoDB to record application logs, and while I'm quite happy with both the performance and with being able to dump arbitrary structured data into log records, I'm troubled by the mutability of log records once stored. In a traditional database, I would structure the grants for my log tables such that the application user had INSERT and SELECT privileges, but not UPDATE or DELETE. Similarly, in CouchDB, I could write an update validator function that rejected all attempts to modify an existing document. However, I've been unable to find a way to restrict operations on a MongoDB database or collection beyond the three access levels (no access, read-only, "god mode") documented in the security topic on the MongoDB wiki. Has anyone else deployed MongoDB as a document store in a setting where immutability (or at least change tracking) for documents was a requirement? What tricks or techniques did you use to ensure that poorly-written or malicious application code could not modify or destroy existing log records? Do I need to wrap my MongoDB logging in a service layer that enforces the write-only policy, or can I use some combination of configuration, query hacking, and replication to ensure a consistent, auditable record is maintained?
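
The service-layer option raised at the end of the question could look roughly like the sketch below: a wrapper whose public surface offers only append and read. An in-memory list stands in for the MongoDB collection, and the class and method names are hypothetical. In a real deployment the policy only holds if application code cannot reach the database except through such a layer.

```python
import copy


class AppendOnlyLog:
    """Hypothetical service layer exposing only append and read operations."""

    def __init__(self):
        self._records = []  # stand-in for the underlying collection

    def append(self, record):
        # Store a deep copy so the caller cannot mutate the record afterwards.
        self._records.append(copy.deepcopy(record))
        return len(self._records) - 1

    def read(self, index):
        # Hand back a copy so readers cannot alter the stored record either.
        return copy.deepcopy(self._records[index])

    def __len__(self):
        return len(self._records)


log = AppendOnlyLog()
entry = {"level": "info", "msg": "started"}
position = log.append(entry)

entry["msg"] = "tampered"        # changes only the caller's own copy
log.read(position)["msg"] = "x"  # changes only a throwaway copy
```

There is deliberately no update or delete method; anything that needs to correct a record would append a new one that refers to the old position.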

Read the article

How to add Spatial Solr to a Solrnet query

- by Flo

Hi, I am running Solr on my Windows machine using Jetty. I have downloaded the Spatial Solr Plugin, which I finally managed to get up and running. I am also using SolrNet to query against Solr from my ASP.NET MVC project. Now, adding data into my index seems to work fine and the SpatialTierUpdateProcessorFactory does work as well. The problem is: how do I add the spatial query to my normal query using the SolrNet library? I have tried adding it using the "ExtraParams" parameter but that didn't work very well. Here is an example of me trying to combine the spatial query with a date range query. The date range query works fine without the spatial query attached to it: new SolrQuery("{!spatial lat=51.5224 long=-2.6257 radius=10000 unit=km calc=arc threadCount=2}") && new SolrQuery(MyCustomQuery.Query) && new SolrQuery(DateRangeQuery); which results in the following query against Solr: (({!spatial lat=51.5224 long=-2.6257 radius=100 unit=km calc=arc threadCount=2} AND *:*) AND _date:[2010-05-07T13:13:37Z TO 2011-05-07T13:13:37Z]) And the error message I get back is: The remote server returned an error: (400) Bad Request. SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse '(({!spatial lat=51.5224 lng=-2.6257 radius=10000 unit=km calc=arc threadCount=2} AND *:*) AND _date:[2010-05-07T13:09:49Z TO 2011-05-07T13:09:49Z])': Encountered " <RANGEEX_GOOP> "lng=-2.6257 "" at line 1, column 24. Was expecting: "}" ... Now, the thing is, if I use the Solr Web Admin page and execute the following query against it, everything works fine: {!spatial lat=50.8371 long=4.35536 radius=100 calc=arc unit=km threadcount=2}text:London What is the best/correct way to call the spatial function using SolrNet? Is the best way to somehow add that bit of the query manually to the query string, and if so, how? Any help is much appreciated!
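The working admin-page query differs from the failing one in where the `{!spatial ...}` block sits. Solr treats `{!...}` as local parameters only when it is the very start of the query string. Inside parentheses, the Lucene query parser reads `{` as the start of an exclusive range, which matches the RANGEEX_GOOP error. The Python sketch below only shows how such a raw query string is assembled. The host, port, and `/solr/select` path are assumptions, the helper names are hypothetical, and whether the plugin accepts a compound clause after the prefix is also an assumption. How to pass such a string through SolrNet is not shown.

```python
from urllib.parse import urlencode


def spatial_local_params(lat, lng, radius, unit="km", calc="arc", thread_count=2):
    """Render the {!spatial ...} prefix in the form the question reports as working."""
    return (
        f"{{!spatial lat={lat} long={lng} radius={radius} "
        f"unit={unit} calc={calc} threadCount={thread_count}}}"
    )


def build_q(prefix, *clauses):
    """Place the local-params prefix first, then AND the remaining clauses."""
    return prefix + " AND ".join(clauses)


q = build_q(
    spatial_local_params(51.5224, -2.6257, 100),
    "text:London",
    "_date:[2010-05-07T13:13:37Z TO 2011-05-07T13:13:37Z]",
)

# Assumed endpoint; adjust to the actual Solr installation.
url = "http://localhost:8983/solr/select?" + urlencode({"q": q})
print(url)
```

The design point is that the prefix is never wrapped in parentheses or joined with AND; only the clauses after it are combined.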

Read the article

Newbie, deciding Python or Erlang

- by Joe

Hi Guys, I'm an administrator (Unix, Linux and some Windows apps such as Exchange) by experience and have never worked in any programming language besides C# and scripting in Bash and, lately, in PowerShell. I'm starting out as a service provider and using multiple network/server monitoring tools based on open source (Nagios, OpenNMS etc.) in order to monitor them. At this moment, being inspired by a design that I came up with, to do more than what is available with open source at this time, I would like to start programming and test some of these ideas. The requirement is server software that captures a stream of data and stores it in a database (CouchDB or MongoDB preferably), while the client side (an agent installed on a server) sends this stream of data on a schedule of every 10 minutes or so. For these two core ideas, I have been reading about Python and Erlang besides Ruby. I do plan to use either Amazon or Rackspace, where the server platform would run. This gives me the scalability needed when we have more customers with many servers. For that reason alone, I thought Erlang was a better fit (I could be totally wrong, new to this game), and I understand that Erlang has limited support in some ways compared to Ruby or Python. But I'm also totally new to the programming realm of things, and any advice would be greatly appreciated. Jo

Read the article

Subsonic Access To App.Config Connection Strings From Referenced DLL in Powershell Script

- by J Wynia

I've got a DLL that contains Subsonic-generated and augmented code to access a data model. Actually, it is a merged DLL of that original assembly, Subsonic itself and a few other referenced DLLs into a single assembly, called "PowershellDataAccess.dll". However, it should be noted that I've also tried referencing each assembly individually in the script, and that doesn't work either. I am then attempting to use the objects and methods in that assembly. In this case, I'm accessing a class that uses Subsonic to load a bunch of records and creates a Lucene index from those records. The problem I'm running into is that the call into the Subsonic method to retrieve data from the database says it can't find the connection string. I'm pointing the AppDomain at the appropriate config file, which does contain that connection string, by name. Here's the script.

    $ScriptDir = Get-Location
    [System.IO.Directory]::SetCurrentDirectory($ScriptDir)
    [Reflection.Assembly]::LoadFrom("PowershellDataAccess.dll")
    [System.AppDomain]::CurrentDomain.SetData("APP_CONFIG_FILE", "$ScriptDir\App.config")
    $indexer = New-Object LuceneIndexingEngine.LuceneIndexGenerator
    $indexer.GeneratePageTemplateIndex("PageTemplateIndex");

I went digging into Subsonic itself and the following line in Subsonic is what's looking for the connection string and throwing the exception:

    ConfigurationManager.ConnectionStrings[connectionStringName]

So, out of curiosity, I created an assembly with a single class that has a single property that just runs that one line to retrieve the connection string name. I created a ps1 that called that assembly and hit that property. That prototype can find the connection string just fine. Anyone have any idea why Subsonic's portion can't seem to see the connection strings?

Read the article

Starting out NLP - Python + large data set

- by pencilNero

Hi, I've been wanting to learn Python and do some NLP, so have finally gotten round to starting. Downloaded the English Wikipedia mirror for a nice chunky dataset to start on, and have been playing around a bit, at this stage just getting some of it into a SQLite db (haven't worked with dbs in the past, unfortunately). But I'm guessing SQLite is not the way to go for a full-blown NLP project (/experiment :) - what would be the sort of things I should look at? HBase (.. and Hadoop) seem interesting, I guess I could run them in Java, prototype in Python and maybe migrate the really slow bits to Java... alternatively just run MySQL.. but the dataset is 12 GB, I wonder if that will be a problem? Also looked at Lucene, but not sure how (other than breaking the wiki articles into chunks) I'd get that to work.. What comes to mind for a really flexible NLP platform (I don't really know at this stage WHAT I want to do.. just want to learn large-scale language analysis tbh)? Many thanks.

Read the article

simple document server built over Apache HTTP server

- by abhinav

Hi, I want to build a simple document server. The requirement for now is: provide a hierarchical directory structure for placing documents (like PDFs, doc files) that is accessible through a browser, and provide the facility to search for documents by name and then be able to download them from the server. Right now, placing documents can be done manually (directly place the files into some designated directory). I can do the hierarchical structure part of the problem by adding some configs to Apache's httpd.conf file. Basically I create a root directory for documents and then give an alias to this directory in the httpd.conf file. That way, I can browse the directory structure in my browser and also download files placed there. I can provide more detail on this if needed. However, it is the searching documents by name part that I am not able to get to a clear solution for yet. I have a few ideas like integrating Lucene with the Apache server, or maybe using CouchDB, but I am not very sure of all the details to solve this problem. Could anyone suggest some clear approach as to how to solve this part?
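For the "search by name" part, one simple option that needs neither Lucene nor CouchDB is to walk the aliased document directory and match file names directly. The Python sketch below illustrates that idea under assumptions: the function name `find_documents` is hypothetical, matching is a case-insensitive substring test, and exposing it through a CGI script or small web handler is left out.

```python
import os


def find_documents(root, name_fragment):
    """Return paths (relative to root) of files whose name contains name_fragment.

    Matching is case-insensitive; results use forward slashes so they can be
    appended to the URL alias configured for the document root.
    """
    needle = name_fragment.lower()
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            if needle in filename.lower():
                full_path = os.path.join(dirpath, filename)
                relative = os.path.relpath(full_path, root)
                matches.append(relative.replace(os.sep, "/"))
    return sorted(matches)
```

Walking the tree on every request is fine for a modest number of files. A large collection would call for a prebuilt index instead, which is where a tool like Lucene becomes relevant.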

Read the article

Is Berkeley DB XML a viable database backend?

- by w00t

Apparently, BDB-XML has been around since at least 2003 but I only recently stumbled upon it on Oracle's website: Berkeley DB XML. Here's the blurb: Oracle Berkeley DB XML is an open source, embeddable XML database with XQuery-based access to documents stored in containers and indexed based on their content. Oracle Berkeley DB XML is built on top of Oracle Berkeley DB and inherits its rich features and attributes. Like Oracle Berkeley DB, it runs in process with the application with no need for human administration. Oracle Berkeley DB XML adds a document parser, XML indexer and XQuery engine on top of Oracle Berkeley DB to enable the fastest, most efficient retrieval of data. To me it seems that the underlying ideas are technically sound and probably more mature than the newer document-based DBs like CouchDB or MongoDB. It has support for C, C++, Ruby and Perl, as far as I can determine. It even has HA-capabilities like automatic replication using a master/slave model with automatic election. However, I can't seem to find any projects that use it. Is there something fundamentally wrong with it? Is the license too onerous? Is it too complicated? Why is it not being used?

Read the article

How to determine the (natural) language of a document?

- by Robert Petermeier

I have a set of documents in two languages: English and German. There is no usable meta information about these documents; a program can look at the content only. Based on that, the program has to decide which of the two languages the document is written in. Is there any "standard" algorithm for this problem that can be implemented in a few hours' time? Or alternatively, a free .NET library or toolkit that can do this? I know about LingPipe, but it is (a) Java and (b) not free for "semi-commercial" usage. This problem seems to be surprisingly hard. I checked out the Google AJAX Language API (which I found by searching this site first), but it was ridiculously bad. For six web pages in German to which I pointed it, only one guess was correct. The other guesses were Swedish, English, Danish and French... A simple approach I came up with is to use a list of stop words. My app already uses such a list for German documents in order to analyze them with Lucene.Net. If my app scans the documents for occurrences of stop words from either language, the one with more occurrences would win. A very naive approach, to be sure, but it might be good enough. Unfortunately I don't have the time to become an expert at natural-language processing, although it is an intriguing topic.
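The stop-word idea described above can be sketched in a few lines. The question asks about .NET, so the Python below is only an illustration of the counting logic. The word lists are tiny, hand-picked assumptions, and the names `STOP_WORDS`, `count_stop_words` and `guess_language` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word lists (assumption). A real application would
# use fuller lists, such as the German list already used for Lucene.Net.
STOP_WORDS = {
    "english": {"the", "and", "of", "to", "is", "that", "it", "with", "for", "not", "this"},
    "german": {"der", "die", "das", "und", "ist", "nicht", "mit", "ein", "eine", "zu", "von", "für", "den"},
}


def count_stop_words(text):
    """Count, per language, how many tokens of the text are stop words."""
    # Split into runs of letters (Unicode-aware, so umlauts stay in the word).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter()
    for language, words in STOP_WORDS.items():
        counts[language] = sum(1 for token in tokens if token in words)
    return counts


def guess_language(text):
    """Return the language with more stop-word hits, or None on a tie."""
    counts = count_stop_words(text)
    (best, best_hits), (_, second_hits) = counts.most_common(2)
    return best if best_hits > second_hits else None
```

Returning `None` on a tie (including the case of no hits at all) lets the caller flag a document for manual review instead of guessing.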

Search Results

Search found 631 results on 26 pages for 'couchdb lucene'.
