Search Results

Search found 21427 results on 858 pages for 'enterprise search'.

  • Alcatel-Lucent: Enterprise 2.0: The Top 5 Things I would Do Over

    - by Kellsey Ruppel
    Happy Monday! Does anyone else feel as if the weekend went entirely too quickly? At least for those of us in the United States, we have the 4th of July holiday next week to look forward to. This week on the blog, we are going to focus on "WebCenter by Example" and highlight best practices from customers and partners. I recently came across this article, and I think it is a great example of how we can learn from one another when it comes to social collaboration adoption. Do you agree with Jem? What things or best practices have you learned in your organizations?

    By Jem Janik, Enterprise community manager, Alcatel-Lucent

    Not so long ago, Engage, the Alcatel-Lucent employee social network and collaboration platform, celebrated its third birthday. With more than 25,000 members actively interacting each month, Engage has been a big enough success that it’s been the subject of external articles, and those of us who helped launch it often go out and speak about which aspects contributed to that success. Hindsight is still 20/20, and what it takes to successfully launch an enterprise 2.0 community is fairly well known now. Today I want to tell you what I suspect you really want to know about. As the enterprise community manager for Engage, three years in, what are the top 5 things I wish we (and I mostly mean me) could do over?

    #5 Define your analytics solution from the start

    There is so much to do when you launch a community, and initially growing it without complete chaos is quite a task. It doesn’t take too long to get to a point where you want to focus your continued efforts on growing company collaboration. Do people truly talk across regional boundaries, or have we shifted siloed conversations to a new platform? Is there one organization that doesn’t interact with another?
    If you are lucky you’ll have someone on your community team well versed in the world of databases and SQL queries, but it takes time to figure out what backend analytics data actually means. Professional support can be expensive, and it may be hard to justify later, as it typically has the community manager as its only main customer. Figure out what you think you’ll want to know and how to get it early on. The sooner the better, even if it doesn’t seem that critical at the time.

    #4 Lobbies guide you to the right places

    One piece of feedback that comes up more and more as we keep growing Engage is that it’s hard to find stuff, or that new people are not sure where to start. Something we’re doing now is defining some general topic areas of interest to act as “lobbies” into the platform, with some common hashtags to go with them. I liken this to walking into a large medical or professional building for the first time. There are hundreds of offices, and you look to a sign in the lobby to get guided to the right place for you. We’re building that sign for members now, but again we missed the boat, as the majority of the company has already had their initial Engage experience.

    #3 Clean up, clean up, clean up

    Knowledge work and folksonomies are messy! The day we opened the doors to Engage I would have said we should keep everything ever created in Engage, with the argument that it was a window into our collective knowledge so nothing should go. Well, 6000+ groups and 200,000+ pieces of content later, I’ve changed my mind. As previously mentioned, with too much “stuff” the system can be overwhelming to new members, and it makes it harder to get what you’re looking for. Do we need that help document about a tool we no longer have? NO! Do we need that group that had 1 document and 2 discussions in the last two years? NO! Should we only have one group about a given topic instead of 4? YES! Last fall, Engage defined a cleanup process for groups not used for a long time.
    We also formed a volunteer cleaning army who are extra eyes on the hunt for “stuff” that should be updated, merged, or deleted. It’s better late than never, but in line with what’s becoming a theme, I wish these efforts had started earlier.

    #2 Communications & local community management

    One of the most important aspects of my job is to make sure people who should be talking to each other are actually doing it: connecting people to the other people they should know, the groups they should join, a piece of content that shouldn’t be missed. I have worked both inside and outside of communications teams, and they are the best-informed people in your company. They know when something big is coming, how it impacts employees, how it fits with strategy, who else knows more, etc. Having communications professionals who are power users can help scale up community management because they are already so well connected. They also need to have the platform skills to pay attention without suffering email overload, to know how to grab someone’s attention, etc. I wish I’d figured this out much earlier. If I had, I would have groomed more communications colleagues into advocates and power members right at the start.

    #1 Grooming advocates vs. natural advocates

    I’ve already alluded to this above. The very best advocates are those who naturally embrace your platform and automatically start to see new ways to work within it. Those advocates seem to come out of the woodwork naturally, since some of them are early adopters. Not surprisingly, our best advocates today are those same people who were willing to come kick the tires when the community was completely empty. Unfortunately, we didn’t get a global spread of those natural advocates. I did ask around when we first launched for other people who might be good candidates, but didn’t push too hard as there were so many other things to get ready. That was a mistake.
    If I could get a redo, I would have formally asked for people to be assigned where there were gaps and groomed them into advocates. Today, as we find new advocates to fill the gaps, people are hesitant, as the initial set has three years of practice and are ahead-of-the-curve power members; it definitely would have been easier earlier on.

    As fairly early adopters of corporate-scale enterprise collaboration, we haven’t had a roadmap to follow as we’ve grown Engage, which is part of the fun! It’s clear a lot of issues are more easily tackled the earlier you identify and begin to correct them, and I’ve identified the main five I wish I could redo. In the spirit of collaboration, I hope someone else learns from my mistakes! View the original article by Jem here.

    Read the article

  • Faceted search with Solr on Windows

    - by Dr.NETjes
    With over 10 million hits a day, funda.nl is probably the largest ASP.NET website that uses Solr on a Windows platform. While all our data (i.e. real estate properties) is stored in SQL Server, we're using Solr 1.4.1 to return the faceted search results as fast as we can. And yes, Solr is very fast. We did some heavy stress testing on our Solr service, which handled over 1,000 req/sec on a single 64-bit Solr instance, and that includes converting search URLs to Solr HTTP queries and deserializing Solr's result XML back to .NET objects! Let me tell you about faceted search and how to integrate Solr in a .NET/Windows environment. I'll bet it's easier than you think :-)

    What is faceted search?

    Faceted search is the clustering of search results into categories, allowing users to drill into search results. By showing the number of hits for each facet category, users can easily see how many results match that category. If you're still a bit confused, this example from CNET explains it all:

    The SQL solution for faceted search

    Our ("pre-Solr") solution for faceted search was done by adding a lot of redundant columns to our SQL tables and doing a COUNT(...) for each of those columns. So if a user was searching for real estate properties in the city 'Amsterdam', our facet query would be something like:

        SELECT COUNT(hasGarden), COUNT(has2Bathrooms), COUNT(has3Bathrooms), COUNT(etc...)
        FROM Houses
        WHERE city = 'Amsterdam'

    While this solution worked fine for a couple of years, it wasn't very easy for developers to add new facets. Also, performing COUNTs on all matched rows only performs well if you have a limited number of rows in a table (i.e. fewer than a million).

    Enter Solr

    "Solr is an open source enterprise search server based on the Lucene Java search library, with XML/HTTP and JSON APIs, hit highlighting, faceted search, caching, replication, and a web administration interface."
    (quoted from Wikipedia's page on Solr)

    Solr isn't a database; it's more like a big index. Every time you upload data to Solr, it analyzes the data and creates an inverted index from it (like the index pages of a book). This way Solr can look up data very quickly. Explaining the inner workings of Solr is beyond the scope of this post, but if you want to learn more, please visit the Solr Wiki pages.

    Getting faceted search results from Solr is very easy. First let me show you how to send an HTTP query to Solr:

        http://localhost:8983/solr/select?q=city:Amsterdam

    This will return an XML document containing the search results (in this example only three houses in the city of Amsterdam):

        <response>
          <result name="response" numFound="3" start="0">
            <doc>
              <long name="id">3203</long>
              <str name="city">Amsterdam</str>
              <str name="steet">Keizersgracht</str>
              <int name="numberOfBathrooms">2</int>
            </doc>
            <doc>
              <long name="id">3205</long>
              <str name="city">Amsterdam</str>
              <str name="steet">Vondelstraat</str>
              <int name="numberOfBathrooms">3</int>
            </doc>
            <doc>
              <long name="id">4293</long>
              <str name="city">Amsterdam</str>
              <str name="steet">Wibautstraat</str>
              <int name="numberOfBathrooms">2</int>
            </doc>
          </result>
        </response>

    By adding a facet query part for the field "numberOfBathrooms", Solr will return the facets for this particular field. We will see that there's one house in Amsterdam with three bathrooms and two houses with two bathrooms.
        http://localhost:8983/solr/select?q=city:Amsterdam&facet=true&facet.field=numberOfBathrooms

    The complete XML response from Solr now looks like:

        <response>
          <result name="response" numFound="3" start="0">
            <doc>
              <long name="id">3203</long>
              <str name="city">Amsterdam</str>
              <str name="steet">Keizersgracht</str>
              <int name="numberOfBathrooms">2</int>
            </doc>
            <doc>
              <long name="id">3205</long>
              <str name="city">Amsterdam</str>
              <str name="steet">Vondelstraat</str>
              <int name="numberOfBathrooms">3</int>
            </doc>
            <doc>
              <long name="id">4293</long>
              <str name="city">Amsterdam</str>
              <str name="steet">Wibautstraat</str>
              <int name="numberOfBathrooms">2</int>
            </doc>
          </result>
          <lst name="facet_fields">
            <lst name="numberOfBathrooms">
              <int name="2">2</int>
              <int name="3">1</int>
            </lst>
          </lst>
        </response>

    Trying Solr for yourself

    To run Solr on your local machine and experiment with it, you should read the Solr tutorial. This tutorial really only takes 1 hour, in which you install Solr, upload sample data and get some query results. And yes, it works on Windows without a problem. Note that in the Solr tutorial, you're using Jetty as a Java Servlet Container (that's why you must start it using "java -jar start.jar"). In our environment we prefer to use Apache Tomcat to host Solr, which installs like a Windows service and works more like .NET developers expect. See the SolrTomcat page.

    Some best practices for running Solr on Windows:

      - Use the 64-bit version of Tomcat. In our tests, this doubled the req/sec we were able to handle!
      - Use a .NET XmlReader to convert Solr's XML output stream to .NET objects. Don't use XPath; it won't scale well.
      - Use filter queries ("fq" parameter) instead of the normal "q" parameter where possible.
        Filter queries are cached by Solr and will speed up Solr's response time (see FilterQueryGuidance).

    In my next post I’ll talk about how to keep Solr's indexed data in sync with the data in your SQL tables. Timestamps / rowversions will help you out here!
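
The facet request and response described in this post can be sketched in a few lines. The post itself works in .NET, so the Python below is only an illustration of the idea, not the author's implementation; the function names `build_facet_url` and `parse_facets` are hypothetical, and the sample XML is a trimmed copy of the facet section shown in the post.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET


def build_facet_url(base_url, query, facet_field):
    """Build a select URL that asks Solr for facet counts on one field."""
    params = {"q": query, "facet": "true", "facet.field": facet_field}
    return base_url + "?" + urlencode(params)


def parse_facets(xml_text):
    """Return {field_name: {facet_value: count}} from a Solr XML response."""
    root = ET.fromstring(xml_text)
    facets = {}
    # Each child <lst> of "facet_fields" holds the counts for one field.
    for field in root.findall(".//lst[@name='facet_fields']/lst"):
        facets[field.get("name")] = {
            count.get("name"): int(count.text) for count in field.findall("int")
        }
    return facets


# Trimmed copy of the facet section of the response shown in the post.
SAMPLE_RESPONSE = """
<response>
  <result name="response" numFound="3" start="0"/>
  <lst name="facet_fields">
    <lst name="numberOfBathrooms">
      <int name="2">2</int>
      <int name="3">1</int>
    </lst>
  </lst>
</response>
"""

if __name__ == "__main__":
    print(build_facet_url("http://localhost:8983/solr/select",
                          "city:Amsterdam", "numberOfBathrooms"))
    print(parse_facets(SAMPLE_RESPONSE))
```

Note that `urlencode` percent-encodes the colon in `city:Amsterdam`, which Solr decodes back on receipt. For large responses a streaming parser would be closer to the XmlReader advice given in the post; the tree-based lookup here is chosen only for brevity.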

    Read the article

  • Google tweets – Now search twitter archives using Google

    - by samsudeen
    Google has launched a Twitter archive service that lets you search tweets in real time as well as in its huge public archive (remember, Twitter crossed its 10 billionth tweet last month). The search results are displayed as tweets with the Twitter logo. To explore the Twitter search, go to the Google.com homepage, select “Show options” on the search results page, then select “Updates.” The search is similar to regular Google search, with options to dig through the tweets by timeframe. You can explore results by zooming in on a particular time range or date. In addition to the time chart, it also displays the relative volume of activity on Twitter about the topic. As you can see, there is a spike about the GSLV launch after 3 PM today. There is also a shortcut link, “Now”, in the left corner, which displays the latest results on the topics searched. The tweets also get refreshed automatically. Considering the huge volume of activity (50 million messages per day) on Twitter, the archive is only going to get bigger. By providing such a feature, Google has once again proved it is way ahead of others in search.

    Read the article

  • Finding if a Binary Tree is a Binary Search Tree

    - by dharam
    Today I had an interview where I was asked to write a program which takes a Binary Tree and returns true if it is also a Binary Search Tree, otherwise false.

    My Approach 1: Perform an inorder traversal and store the elements in O(n) time. Now scan through the array/list of elements and check if the element at the ith index is greater than the element at the (i+1)th index. If such a condition is encountered, return false and break out of the loop. (This takes O(n) time.) At the end, return true.

    But this gentleman wanted me to provide an efficient solution. I tried but I was unsuccessful, because to find out if it is a BST I have to check each node. Moreover, he was pointing me toward recursion.

    My Approach 2: A BT is a BST if for any node N, N-left is < N and N-right is > N, and the inorder successor of the left node of N is less than N and the inorder successor of the right node of N is greater than N, and the left and right subtrees are BSTs. But this is going to be complicated, and the running time doesn't seem to be good. Please help if you know any optimal solution. Thanks.
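
One common recursive formulation, which may be what the interviewer was hinting at, passes down the range of values each subtree is allowed to contain. The sketch below is not taken from the question; the `Node` class and `is_bst` name are hypothetical, and it assumes the tree holds no duplicate keys (hence the strict comparisons). It still visits every node once, so it is O(n) like Approach 1, but it needs no auxiliary list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_bst(node, low=None, high=None):
    """Return True if every value in this subtree lies strictly between low and high."""
    if node is None:
        return True  # an empty subtree is trivially a BST
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value >= high:
        return False
    # The left subtree must stay below this node, the right subtree above it.
    return (is_bst(node.left, low, node.value)
            and is_bst(node.right, node.value, high))


if __name__ == "__main__":
    valid = Node(5, Node(3, Node(1), Node(4)), Node(8))
    # 6 is a valid right child of 3 locally, but it sits in the left subtree of 5.
    invalid = Node(5, Node(3, Node(1), Node(6)), Node(8))
    print(is_bst(valid), is_bst(invalid))
```

The second tree shows why comparing a node only with its direct children is not enough: each node passes the local check, yet the tree is not a BST.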

    Read the article

  • rails search nested set (categories and sub categories)

    - by bob
    Hello, I am using the awesome_nested_set plugin (http://github.com/collectiveidea/awesome_nested_set), and currently, if I choose a subcategory as the category_id for an item, I cannot search by its parent.

        Category.parent
            Category.child

    I choose Category.child as the category that my item is in. So now my item has a category_id of 4 stored in it. If I go to a page in my Rails application, let's say the Category page, and I am on Category.parent's page, I want to show products that have the category_ids of all the descendants as well. So ideally I want a find method that can take the descendants into account. You can get the descendants of a root by calling root.descendants (a built-in plugin method). How would I go about writing a find that gets the descendants of a root, instead of what it's doing now, which is bringing up nothing unless the product has the specific category_id of Category.parent? I hope I am being clear here.

    I either need to figure out a way to create a find method or named_scope that can query and return an array of objects that have ids corresponding to the descendants of a root, or, if I have any other options, what are they? I thought about creating a field in my products table, like parent_id, which can keep track of the parent, so I can then create two named scopes, one finding the parent stuff and one finding the child stuff, and chain them. I know I can create a named scope for each child and chain them together for multiple children, but this seems a very tedious process, and if you add more children, you would need to specify more named scopes.
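
The underlying nested-set idea can be sketched outside Rails. In a nested set, every descendant's left bound falls between its ancestor's left and right bounds, so one range condition covers the parent and all of its descendants. The sketch below is an illustration in Python with SQLite, not the plugin's API; the table layout, the sample rows, and the function names are assumptions (the `lft` and `rgt` column names follow the plugin's defaults).

```python
import sqlite3


def products_in_subtree(conn, category_id):
    """Names of products filed under the category or any of its descendants."""
    sql = """
        SELECT p.name
        FROM products AS p
        JOIN categories AS c ON c.id = p.category_id
        JOIN categories AS root ON root.id = ?
        WHERE c.lft BETWEEN root.lft AND root.rgt
        ORDER BY p.name
    """
    return [row[0] for row in conn.execute(sql, (category_id,))]


def demo_connection():
    """Build a tiny in-memory catalogue (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, lft INTEGER, rgt INTEGER);
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, category_id INTEGER);
        INSERT INTO categories VALUES (1, 'parent', 1, 6);
        INSERT INTO categories VALUES (4, 'child', 2, 3);
        INSERT INTO categories VALUES (5, 'other child', 4, 5);
        INSERT INTO categories VALUES (9, 'unrelated', 7, 8);
        INSERT INTO products VALUES (1, 'widget', 4);
        INSERT INTO products VALUES (2, 'gadget', 1);
        INSERT INTO products VALUES (3, 'gizmo', 9);
    """)
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(products_in_subtree(conn, 1))  # parent page: own and descendants' products
    print(products_in_subtree(conn, 4))  # child page: only its own products
```

In Rails terms the same effect would come from filtering products on the ids of the category plus its descendants, which avoids defining one named scope per child.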

    Read the article

  • The enterprise vendor con - connecting SSDs using SATA 2 (3Gbits), thus limiting their performance

    - by tonyrogerson
    When comparing SSD against hard drive performance, it really makes me cross when folk think comparing an array of SSDs running at 3Gbits/sec to hard drives running at 6Gbits/sec is somehow valid. In a paper from DELL (http://www.dell.com/downloads/global/products/pvaul/en/PowerEdge-PowerVaultH800-CacheCade-final.pdf) on increasing database performance using the DELL PERC H800 with Solid State Drives, they compare four SSD drives connected at 3Gbits/sec against ten 10Krpm drives connected at 6Gbits/sec [Tony slaps forehead while shouting DOH!].

    It is true that in the case of hard drives it probably doesn’t make much difference whether the link is 3Gbit or 6Gbit, because SAS and SATA are both end-to-end protocols rather than a shared bus architecture like SCSI, so the hard drive doesn’t share bandwidth and probably can’t get near the 600MiBytes/second throughput that 6Gbit gives unless you are doing contiguous reads. In my own tests on a single 15Krpm SAS disk using IOMeter (8 worker threads, queue depth of 16, a stripe size of 64KiB, and an 8KiB transfer size on a drive formatted with an allocation size of 8KiB, for a 100% sequential read test) I only get 347MiBytes per second sustained throughput at an average latency of 2.87ms per IO, equating to 44.5K IOps. OK, if that was 3Gbits it would be less, around 280MiBytes per second. Oh, but wait a minute [...fingers tap desk].

    You’ll struggle to find an SSD in the commodity space that doesn’t have the SATA 3 (6Gbits) interface. SSDs are fast: not only do they have low latency and high IOps, they also offer a very large sustained transfer rate. Consider the OCZ Agility 3: it so happens that in my masters dissertation I did the same test but on a different box, and I got 374MiBytes per second at an average latency of 2.67ms per IO, equating to 47.9K IOps. The cost of a 240GB Agility 3 is £174.24 (http://www.scan.co.uk/products/240gb-ocz-agility-3-ssd-25-sata-6gb-s-sandforce-2281-read-525mb-s-write-500mb-s-85k-iops), but that same drive set in a box connected with
SATA 2 (3Gbits) would only yield around 280MiBytes per second, thus losing almost 100MiBytes per second of throughput and a ton of IOps too.

    So why the hell are “enterprise” vendors still only connecting SSDs at 3Gbits? Well, my conspiracy theory is that they have no interest in you moving to SSD because they’ll lose so much money. The argument that they use SATA 2 doesn’t wash: SATA 3 has been out for some time now, and all the commodity stuff you buy uses it.

    Consider the cost, not in terms of price per GB but price per IOps: SSDs absolutely thrash hard drives on that. It used to be true that hard drives thrashed SSDs on price per GB, but is that true now? I’m not so sure. A 300GByte 2.5” 15Krpm SAS drive costs £329.76 ex VAT (http://www.scan.co.uk/products/300gb-seagate-st9300653ss-savvio-15k3-25-hdd-sas-6gb-s-15000rpm-64mb-cache-27ms), which equates to about £1.10 per GB, compared to a 480GB OCZ Agility 3 costing £422.10 ex VAT (http://www.scan.co.uk/products/480gb-ocz-agility-3-ssd-25-sata-6gb-s-sandforce-2281-read-525mb-s-write-410mb-s-30k-iops), which equates to £0.88 per GB. OK, I compared an “enterprise” hard drive with a “commodity” SSD, so things get a little more complicated here: most “enterprise” SSDs are SLC and most commodity ones are MLC, and SLC gives better performance and wear endurance. I’ll talk about that another day.

    For now though, don’t get sucked in by vendor marketing. SATA 2 (3Gbit) just doesn’t cut it; SSDs need 6Gbit to breathe, and SSDs are pushing even that. Alas, SSDs are connected using SATA, so all the controllers I’ve seen thus far from HP and DELL only do SATA 2. Deliberate? Well, I’ll let you decide on that one.
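
The arithmetic behind the figures quoted in this post is easy to reproduce. The sketch below uses only numbers stated in the post (prices, capacities, throughput, and the 8KiB transfer size); the function names are hypothetical. Rounded rather than truncated, the hard drive comes out at roughly £1.10 per GB.

```python
def price_per_gb(price_gbp, capacity_gb):
    """Price in pounds divided by capacity in GB."""
    return price_gbp / capacity_gb


def iops_from_throughput(mib_per_sec, transfer_kib):
    """IO operations per second implied by a sustained throughput and IO size."""
    return mib_per_sec * 1024 / transfer_kib


if __name__ == "__main__":
    # 300GB 15Krpm SAS drive versus 480GB OCZ Agility 3, prices ex VAT.
    print(round(price_per_gb(329.76, 300), 2))
    print(round(price_per_gb(422.10, 480), 2))
    # 8KiB sequential reads: the hard drive test, then the SSD test.
    print(iops_from_throughput(347, 8))
    print(iops_from_throughput(374, 8))
```

The throughput conversion gives about 44.4K and 47.9K IOps, in line with the 44.5K and 47.9K quoted from the IOMeter runs.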

    Read the article

  • Why Your ERP System Isn't Ready for the Next Evolution of the Enterprise

    - by ken.pulverman
    ERP has been the backbone of enterprise software. The data held in your ERP system is the core of most companies. Efficiencies gained through accounting and resource allocation in ERP software have literally saved companies trillions of dollars. Not only does everything seem to be fine with your ERP system, you haven't had to touch it in years. Why aren't you ready for what comes next? Well, judging by the growth rates in the space (Oracle posted only a 3% growth rate, while SAP showed a 12% decline), there hasn't been much modernization going on, just a little replacement activity.

    If you are like most companies, your ERP system is connected to a proprietary middleware solution that only effectively talks with a handful of other systems you might have acquired from the same vendor. Connecting your legacy system through proprietary middleware is expensive and brittle, and if you are like most companies, you were only willing to pay an SI so much before you said "enough." So your ERP is working. It's humming along. You might not be able to get Order to Promise information when you take orders in your call center, but there are workarounds that work just fine.

    So what's the problem? The problem is that you built your business around your ERP core, and now there is such pressure to innovate your business processes to keep up that you need a whole new slew of modern apps, and you need ERP data to be accessible from everywhere. Every time you change a sales territory or a comp plan or change a benefits provider, your ERP system, literally the economic brain of your business, needs to know what's going on. And this giant need to access and provide information to your ERP is only growing.

    What makes matters even more challenging is that apps today come in every flavor under the Sun™: SaaS, cloud, managed, hybrid, outsourced, composite... and they all have different integration protocols.
    The only easy way to get ahead of all this is to modernize the way you connect and run your applications. Unlike the middleware solutions of yesteryear, modern middleware is effectively the operating system of the enterprise. In the same way that you rely on Apple, Microsoft, and Google to find a video driver for your 23" monitor or to ensure that Word or Keynote runs, modern middleware takes care of intra-application connectivity and process execution. It effectively allows you to take ERP out of the middle while ensuring connectivity to your vital data for anything you want to do. The diagram below reflects that change.

    In this model, the hegemony of ERP is over. It too has to become a stealthy modern app to help you quickly adapt to business changes while managing vital information. And through modern middleware it will connect to everything. So yes, ERP as we've known it is dead, but long live ERP as a connected application member of the modern enterprise.

    I want to thank Andrew Zoldan, Group Vice President, Oracle Manufacturing Industries Business Unit, for introducing me to how some of his biggest customers have benefited by modernizing their applications infrastructure and making ERP a connected application.

    by John Burke, Group Vice President, Applications Business Unit

    Read the article

  • Why Your ERP System Isn't Ready for the Next Evolution of the Enterprise

    - by [email protected]
    By ken.pulverman on March 24, 2010 8:51 AM

    ERP has been the backbone of enterprise software. The data held in your ERP system is the core of most companies. Efficiencies gained through accounting and resource allocation in ERP software have literally saved companies trillions of dollars. Not only does everything seem to be fine with your ERP system, you haven't had to touch it in years. Why aren't you ready for what comes next? Well, judging by the growth rates in the space (Oracle posted only a 3% growth rate, while SAP showed a 12% decline), there hasn't been much modernization going on, just a little replacement activity.

    If you are like most companies, your ERP system is connected to a proprietary middleware solution that only effectively talks with a handful of other systems you might have acquired from the same vendor. Connecting your legacy system through proprietary middleware is expensive and brittle, and if you are like most companies, you were only willing to pay an SI so much before you said "enough." So your ERP is working. It's humming along. You might not be able to get Order to Promise information when you take orders in your call center, but there are workarounds that work just fine.

    So what's the problem? The problem is that you built your business around your ERP core, and now there is such pressure to innovate your business processes to keep up that you need a whole new slew of modern apps, and you need ERP data to be accessible from everywhere. Every time you change a sales territory or a comp plan or change a benefits provider, your ERP system, literally the economic brain of your business, needs to know what's going on. And this giant need to access and provide information to your ERP is only growing.

    What makes matters even more challenging is that apps today come in every flavor under the Sun™: SaaS, cloud, managed, hybrid, outsourced, composite... and they all have different integration protocols.
    The only easy way to get ahead of all this is to modernize the way you connect and run your applications. Unlike the middleware solutions of yesteryear, modern middleware is effectively the operating system of the enterprise. In the same way that you rely on Apple, Microsoft, and Google to find a video driver for your 23" monitor or to ensure that Word or Keynote runs, modern middleware takes care of intra-application connectivity and process execution. It effectively allows you to take ERP out of the middle while ensuring connectivity to your vital data for anything you want to do. The diagram below reflects that change.

    In this model, the hegemony of ERP is over. It too has to become a stealthy modern app to help you quickly adapt to business changes while managing vital information. And through modern middleware it will connect to everything. So yes, ERP as we've known it is dead, but long live ERP as a connected application member of the modern enterprise.

    I want to thank Andrew Zoldan, Group Vice President, Oracle Manufacturing Industries Business Unit, for introducing me to how some of his biggest customers have benefited by modernizing their applications infrastructure and making ERP a connected application.

    by John Burke, Group Vice President, Applications Business Unit

    Read the article

  • shell scripting: search/replace & check file exist

    - by johndashen
    I have a perl script (or any executable) E which will take a file foo.xml and write a file foo.txt. I use a Beowulf cluster to run E for a large number of XML files, but I'd like to write a simple job server script in shell (bash) which doesn't overwrite existing txt files. I'm currently doing something like

        #!/bin/sh
        PATTERN="[A-Z]*0[1-2][a-j]"; # this matches foo in all cases
        todo=`ls *.xml | grep $PATTERN -o`;
        isdone=`ls *.txt | grep $PATTERN -o`;
        whatsleft=todo - isdone; # what's the unix magic?
        #tack on the .xml suffix with sed or something
        #and then call the job server;
        jobserve E "$whatsleft";

    and then I don't know how to get the difference between $todo and $isdone. I'd prefer using sort/uniq to something like a for loop with grep inside, but I'm not sure how to do it (pipes? temporary files?)

    As a bonus question, is there a way to do lookahead search in bash grep?

    To clarify: the simplest way to do what I'm asking is (in pseudocode)

        for i in `/bin/ls *.xml`
        do
          replace xml suffix with txt
          if [that file does not exist]
            add to whatsleft list
          end
        done
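
The "todo minus isdone" step is a set difference on file stems. The question asks for a shell answer, so the Python sketch below is only an illustration of the logic; `pending_inputs` and `demo` are hypothetical names, and the sample file names are made up to fit the pattern in the question.

```python
import tempfile
from pathlib import Path


def pending_inputs(directory):
    """Return the .xml files in directory that have no matching .txt output yet."""
    directory = Path(directory)
    # Stems ("foo" for "foo.txt") of outputs that already exist.
    done = {path.stem for path in directory.glob("*.txt")}
    return sorted(path.name for path in directory.glob("*.xml")
                  if path.stem not in done)


def demo():
    """Create a scratch directory with three inputs and one finished output."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("A01a.xml", "A01b.xml", "B02c.xml", "A01a.txt"):
            Path(tmp, name).touch()
        return pending_inputs(tmp)


if __name__ == "__main__":
    print(demo())
```

The returned names already carry the .xml suffix, so they could be handed to the job server as they are.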

    Read the article

  • What is the correct way to implement a massive hierarchical, geographical search for news?

    - by Philip Brocoum
    The company I work for is in the business of sending press releases. We want to make it possible for interested parties to search for press releases based on a number of criteria, the most important being location. For example, someone might search for all news sent to New York City, Massachusetts, or ZIP code 89134, sent from a governmental institution, under the topic of "traffic". Or whatever.

    The problem is that we've sent, literally, hundreds of thousands of press releases, and searching is slow and complex. For example, a press release sent to Queens, NY should show up in the search I mentioned above even though it wasn't specifically sent to New York City, because Queens is a subset of New York City. We may also want to add "and", "or", negation, and text search to the query to create complex searches. These searches also have to be fast enough to function as dynamic RSS feeds.

    I really don't know anything about search theory, or how it's properly done. The way we are getting by right now is using a data mart to store the locations the releases were sent to in a single table. However, because of the subset thing mentioned above, the data mart is gigantic, with millions of rows. And we haven't even implemented cities yet, and there are about 50,000 cities in the United States, which will increase the size of the data mart by so much that I'm afraid it just won't work anymore.

    Anyway, I realize this is not a simple question and there won't be a "do this" answer. However, I'm hoping one of you can point me in the right direction where I can learn about how massive searches are done, because I really know nothing about it, and such a search engine is turning out to be incredibly difficult to make. Thanks! I know there must be a way, because if Google can search the entire internet we must be able to search our own database :-)
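
One way to picture the "subset" requirement is to store the place hierarchy once and expand the searched region into itself plus its descendants at query time, instead of storing every release against every enclosing region. The sketch below is only an illustration under that assumption, not a recommendation for this system; the `PARENT` map, the release records, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical hierarchy: each place points at the region that contains it.
PARENT = {
    "Queens": "New York City",
    "Brooklyn": "New York City",
    "New York City": "New York",
    "Boston": "Massachusetts",
}


def build_children(parent):
    """Invert the child -> parent map into parent -> set of children."""
    children = defaultdict(set)
    for child, par in parent.items():
        children[par].add(child)
    return children


def region_and_descendants(region, children):
    """Collect the region plus everything nested inside it."""
    seen = set()
    stack = [region]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, ()))
    return seen


def search(releases, region, topic, children):
    """Ids of releases on a topic sent to the region or any place inside it."""
    regions = region_and_descendants(region, children)
    return [r["id"] for r in releases
            if r["location"] in regions and r["topic"] == topic]


if __name__ == "__main__":
    releases = [
        {"id": 1, "location": "Queens", "topic": "traffic"},
        {"id": 2, "location": "Boston", "topic": "traffic"},
        {"id": 3, "location": "New York City", "topic": "weather"},
    ]
    print(search(releases, "New York City", "traffic", build_children(PARENT)))
```

A dedicated search engine would normally replace the linear scan over releases with an index; the point here is only that the hierarchy can live in one small table rather than being multiplied into every release row.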

    Read the article

  • SQL Server Search Proper Names Full Text Index vs LIKE + SOUNDEX

    - by Matthew Talbert
    I have a database of names of people that has (currently) 35 million rows. I need to know what is the best method for quickly searching these names. The current system (not designed by me), simply has the first and last name columns indexed and uses "LIKE" queries with the additional option of using SOUNDEX (though I'm not sure this is actually used much). Performance has always been a problem with this system, and so currently the searches are limited to 200 results (which still takes too long to run). So, I have a few questions: Does full text index work well for proper names? If so, what is the best way to query proper names? (CONTAINS, FREETEXT, etc) Is there some other system (like Lucene.net) that would be better? Just for reference, I'm using Fluent NHibernate for data access, so methods that work well with that will be preferred. I'm using SQL Server 2008 currently. EDIT I want to add that I'm very interested in solutions that will deal with things like commonly misspelled names, eg 'smythe', 'smith', as well as first names, eg 'tomas', 'thomas'. 
Query Plan |--Parallelism(Gather Streams) |--Nested Loops(Inner Join, OUTER REFERENCES:([testdb].[dbo].[Test].[Id], [Expr1004]) OPTIMIZED WITH UNORDERED PREFETCH) |--Hash Match(Inner Join, HASH:([testdb].[dbo].[Test].[Id])=([testdb].[dbo].[Test].[Id])) | |--Bitmap(HASH:([testdb].[dbo].[Test].[Id]), DEFINE:([Bitmap1003])) | | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([testdb].[dbo].[Test].[Id])) | | |--Index Seek(OBJECT:([testdb].[dbo].[Test].[IX_Test_LastName]), SEEK:([testdb].[dbo].[Test].[LastName] >= 'WHITDþ' AND [testdb].[dbo].[Test].[LastName] < 'WHITF'), WHERE:([testdb].[dbo].[Test].[LastName] like 'WHITE%') ORDERED FORWARD) | |--Parallelism(Repartition Streams, Hash Partitioning, PARTITION COLUMNS:([testdb].[dbo].[Test].[Id])) | |--Index Seek(OBJECT:([testdb].[dbo].[Test].[IX_Test_FirstName]), SEEK:([testdb].[dbo].[Test].[FirstName] >= 'THOMARþ' AND [testdb].[dbo].[Test].[FirstName] < 'THOMAT'), WHERE:([testdb].[dbo].[Test].[FirstName] like 'THOMAS%' AND PROBE([Bitmap1003],[testdb].[dbo].[Test].[Id],N'[IN ROW]')) ORDERED FORWARD) |--Clustered Index Seek(OBJECT:([testdb].[dbo].[Test].[PK__TEST__3214EC073B95D2F1]), SEEK:([testdb].[dbo].[Test].[Id]=[testdb].[dbo].[Test].[Id]) LOOKUP ORDERED FORWARD) SQL for above: SELECT * FROM testdb.dbo.Test WHERE LastName LIKE 'WHITE%' AND FirstName LIKE 'THOMAS%' Based on advice from Mitch, I created an index like this: CREATE INDEX IX_Test_Name_DOB ON Test (LastName ASC, FirstName ASC, BirthDate ASC) INCLUDE (and here I list the other columns) My searches are now incredibly fast for my typical search (last, first, and birth date).

    Read the article

  • average case running time of linear search algorithm

    - by Brahadeesh
    Hi all. I am trying to derive the average case running time for the deterministic linear search algorithm. The algorithm searches for an element x in an unsorted array A in the order A[1], A[2], A[3]...A[n]. It stops when it finds the element x or proceeds until it reaches the end of the array. I searched on Wikipedia and the answer given was (n+1)/(k+1) where k is the number of times x is present in the array. I approached it in another way and am getting a different answer. Can anyone please give me the correct proof and also let me know what's wrong with my method? E(T)= 1*P(1) + 2*P(2) + 3*P(3) ....+ n*P(n) where P(i) is the probability that the algorithm runs for 'i' time (i.e. compares 'i' elements). P(i)= (n-i)C(k-1) * (n-k)! / n! Here, (n-i)C(k-1) is (n-i) Choose (k-1). As the algorithm has reached the ith step, the rest of k-1 x's must be in the last n-i elements. Hence (n-i)C(k-1). (n-k)! is the total number of ways of arranging the rest non x numbers, and n! is the total number of ways of arranging the n elements in the array. I am not getting (n+1)/(k+1) on simplifying.
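A quick numerical check can help locate the discrepancy. The sketch below assumes k >= 1 and that every placement of the k copies of x among the n positions is equally likely. Under that assumption P(i) = C(n-i, k-1) / C(n, k), which suggests the P(i) in the question is missing a factor of k! (the k copies can be ordered among themselves in k! ways when counting against n! arrangements). The function names are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb  # available in Python 3.8+


def expected_comparisons_formula(n, k):
    """Sum i * P(i), with P(i) = C(n-i, k-1) / C(n, k)."""
    total = Fraction(0)
    # The first x can sit no later than position n - k + 1.
    for i in range(1, n - k + 2):
        total += i * Fraction(comb(n - i, k - 1), comb(n, k))
    return total


def expected_comparisons_bruteforce(n, k):
    """Average the position of the first x over every placement of k x's."""
    placements = list(combinations(range(1, n + 1), k))
    return Fraction(sum(min(p) for p in placements), len(placements))


n, k = 7, 2
print(expected_comparisons_formula(n, k))     # 8/3
print(expected_comparisons_bruteforce(n, k))  # 8/3
print(Fraction(n + 1, k + 1))                 # 8/3
```

Both computations agree with (n+1)/(k+1) for the small cases tried here; this is a sanity check, not a proof.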

    Read the article

  • Python SQLite FTS3 alternatives?

    - by Mike Cialowicz
    Are there any good alternatives to SQLite + FTS3 for python? I'm iterating over a series of text documents, and would like to categorize them according to some text queries. For example, I might want to know if a document mentions the words "rating" or "upgraded" within three words of "buy." The FTS3 syntax for this query is the following: (rating OR upgraded) NEAR/3 buy That's all well and good, but if I use FTS3, this operation seems rather expensive. The process goes something like this: # create an SQLite3 db in memory conn = sqlite3.connect(':memory:') c = conn.cursor() c.execute('CREATE VIRTUAL TABLE fts USING FTS3(content TEXT)') conn.commit() Then, for each document, do something like this: #insert the document text into the fts table, so I can run a query c.execute('insert into fts(content) values (?)', (content,)) conn.commit() # execute my FTS query here, look at the results, etc # remove the document text from the fts table before working on the next document c.execute('delete from fts') conn.commit() This seems rather expensive to me. The other problem I have with SQLite FTS is that it doesn't appear to work with Python 2.5.4. The 'CREATE VIRTUAL TABLE' syntax is unrecognized. This means that I'd have to upgrade to Python 2.6, which means re-testing numerous existing scripts and programs to make sure they work under 2.6. Is there a better way? Perhaps a different library? Something faster? Thank you.
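For a single fixed query run against one document at a time, a plain-Python proximity check may be enough and avoids the insert/delete round trip. The sketch below is an approximation, not a reimplementation of FTS3: `tokenize` and `near` are hypothetical helpers, the tokenizer is a simple regex, and "within three words" is taken to mean at most three tokens between the two terms, which is how NEAR/3 is described in the SQLite documentation as far as I understand it.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def near(tokens, left_terms, right_term, max_gap=3):
    """True if any left term has at most max_gap tokens between it and right_term."""
    left_positions = [i for i, tok in enumerate(tokens) if tok in left_terms]
    right_positions = [i for i, tok in enumerate(tokens) if tok == right_term]
    return any(
        l != r and abs(l - r) - 1 <= max_gap
        for l in left_positions
        for r in right_positions
    )


documents = [
    "The rating was raised to buy",        # 3 tokens between "rating" and "buy"
    "Upgraded the stock to a strong buy",  # 5 tokens between "upgraded" and "buy"
    "Nothing to see here",
]

# Rough equivalent of: (rating OR upgraded) NEAR/3 buy
matches = [near(tokenize(doc), {"rating", "upgraded"}, "buy") for doc in documents]
print(matches)  # [True, False, False]
```

The check is symmetric, so "buy" appearing before "rating" also counts, and it runs on any Python version that ships the `re` module.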

    Read the article

  • PHP: If no Results - Split the Searchrequest and Try to find Parts of the Search

    - by elmaso
    Hello, I want to split the search request into parts if nothing is found. Example: "nelly furtado ft. jimmy jones" returns no results, so try again with nelly, furtado, jimmy, or jones. I have an API URL, and that's the difficult part. Here are some of the actual snippets: $query = urlencode(strip_tags($_GET['search'])); and $found = '0'; if ($source == 'all') { if (!($res = @get_url('http://api.example.com/?key=' . $API . '&phrase=' . $query . '&sort=' . $sort))) { exit('<error>Cannot get requested information.</error>'); } } How can I add an else branch to this snippet, so that if nothing is found it retries with the first word, or the second word? Is this possible? Or can you tell me where I can read about this kind of function? Thank you!
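The fallback described above can be sketched independently of the API: try the full phrase first, and only if that returns nothing, retry with each word and merge the results. In this sketch `search_with_fallback`, `fake_search`, and the catalog entries are all hypothetical stand-ins; in the real code the `search` callable would wrap the HTTP request to the API.

```python
import re


def search_with_fallback(query, search, min_length=3):
    """Search the whole phrase; if nothing is found, search word by word."""
    results = search(query)
    if results:
        return results

    merged, seen = [], set()
    for word in re.findall(r"\w+", query):
        if len(word) < min_length:  # skip noise such as "ft"
            continue
        for item in search(word):
            if item not in seen:    # keep first-seen order, drop duplicates
                seen.add(item)
                merged.append(item)
    return merged


def fake_search(phrase):
    """Stand-in for the real API call; returns a list of result titles."""
    catalog = {
        "nelly": ["track-1", "track-2"],
        "furtado": ["track-2", "track-3"],
    }
    return catalog.get(phrase.lower(), [])


found = search_with_fallback("nelly furtado ft. jimmy jones", fake_search)
print(found)  # ['track-1', 'track-2', 'track-3']
```

Results that match several words appear once, in the order they were first found; ranking them by how many words they matched would be a natural next step.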

    Read the article

  • Exalytics and Oracle Business Intelligence Enterprise Edition (OBIEE) Partner Workshop

    - by mseika
    Workshop Description Oracle Fusion Middleware 11g is the #1 application infrastructure foundation. It enables enterprises to create and run agile and intelligent business applications and maximize IT efficiency by exploiting modern hardware and software architectures. Oracle Exalytics Business Intelligence Machine is the world’s first engineered system specifically designed to deliver high performance analysis, modeling and planning. Built using industry-standard hardware, market-leading business intelligence software and in-memory database technology, Oracle Exalytics is an optimized system that delivers unmatched speed, visualizations and scalability for Business Intelligence and Enterprise Performance Management applications. This FREE hands-on, partner workshop highlights both the hardware and software components that are engineered to work together to deliver Oracle Exalytics - an optimized version of the industry-leading Oracle TimesTen In-Memory Database with analytic extensions, a highly scalable Oracle server designed specifically for in-memory business intelligence, and Oracle’s proven Business Intelligence Foundation with enhanced visualization capabilities and performance optimizations. This workshop will provide hands-on experience with Oracle's latest engineered system. Topics covered will include TimesTen In-Memory Database and the new Summary Advisor for Exalytics, the technical details (including mobile features) of the latest release of visualization enhancements for OBI-EE, and technical updates on Essbase. After taking this course, you will be well prepared to architect, build, demo, and implement an end-to-end Exalytics solution. 
You will also be able to extend your current analytical and enterprise performance management application implementations with numerous Oracle technologies specifically enhanced to take advantage of the compute capacity and in-memory capabilities of Oracle Exalytics. If you are a BI or Data Warehouse Architect, developer or consultant, you don’t want to miss this 3-day workshop. Register Now! Presentations: Exalytics Architectural Overview; Upgrade and Lifecycle Management; Times Ten for Exalytics; Summary Advisor Utility; Essbase and EPM System on Exalytics; Dashboard and Analysis Interactions; OBIEE 11.1.1.6 Features and Advanced Topics. Lab Outline: The labs showcase Oracle Exalytics core components and functionality and provide expertise of Oracle Business Intelligence 11.1.1.6 new features and updates from prior releases. The hands-on activities are based on an Oracle VirtualBox image with software and training samples pre-installed. Lab Environment Setup; Creating and Working with Oracle TimesTen In-Memory Database; Running Summary Advisor Utility; Working with Exalytics Visualization Features – Dashboard and Analysis Interactions. Audience: Oracle Partners; BI and EPM Application Developers and Implementers; System Integrators and Solution Consultants; Data Warehouse Developers; Enterprise Architects. Prerequisites: Experience and understanding of OBIEE 11g is required; previous attendance of Oracle Business Intelligence Foundation Suite Workshop or BIEE 11g Introduction Workshop is highly recommended; good understanding of data warehousing and data modeling for reporting and analysis purpose; strong experience with database technologies preferred. Equipment Requirements: This workshop requires attendees to provide their own laptops for this class. Attendee laptops must meet the following minimum hardware/software requirements: Hardware: Minimum 8GB RAM; 60 GB free space (includes staging); USB 2.0 port (at least one available). It is strongly recommended that you bring a mouse. 
You will be working in a development environment and using the mouse heavily. Software: One of the following operating systems: 64-bit Windows host/laptop OS; 64-bit host/laptop OS with a Windows VM (XP, Server, or Win 7, BIC2g, etc.). Internet Explorer 7.x/8.x or Firefox 3.5.x. WINRAR or 7zip utility to unzip workshop files: Download-able from http://www.win-rar.com/download.html; Download-able from http://www.7zip.com/. Oracle VirtualBox 4.0.2 or higher: Downloadable from http://www.virtualbox.org/wiki/Downloads. CPU virtualization mode needs to be enabled. We will provide guidance on the day of the workshop. Attendees will be given a VirtualBox image containing a pre-installed Oracle Exalytics environment. Schedule: This workshop is 3 days. Times vary by country! 9:00am: Sign-in and technical setup; 9:30am: Workshop starts; 5:00pm: Workshop ends. Oracle Exalytics and Business Intelligence (OBIEE) Workshop, December 11-13, 2012: Oracle BVP, Birmingham, UK. Register Here. Questions? Send email to: [email protected] Oracle Platform Technologies Enablement Services

    Read the article

  • Binary Search Tree - Postorder logic

    - by daveb
    I am looking at implementing a binary search tree. Before I do this, I want to verify my input data in postorder and preorder. I am having trouble working out what the following numbers would be in postorder and preorder. I have the numbers 4, 3, 14, 8, 1, 15, 9, 5, 13, 10, 2, 7, 6, 12, 11, which I intend to insert into an empty binary tree in that order. The order I arrived at for the numbers in POSTORDER is 2, 1, 6, 3, 7, 11, 12, 10, 9, 8, 13, 15, 14, 4. Have I got this right? I was wondering if anyone here would be able to kindly verify whether the postorder sequence I came up with is indeed the correct sequence for my input, i.e. doing left subtree, right subtree and then root. The order I got for preorder (visit root, do left subtree, do right subtree) is 4, 3, 1, 2, 5, 6, 14, 8, 7, 9, 10, 12, 11, 15, 13. I can't be certain I got this right. Very grateful for any verification. Many thanks
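The traversals can be checked mechanically by building the tree and walking it. The sketch below assumes the usual unbalanced BST insertion rule (smaller keys go left, everything else goes right) and uses hypothetical helper names. Under that assumption it produces different sequences from the ones in the question; note also that the postorder in the question lists only 14 of the 15 keys (5 is missing).

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST rooted at root and return the (new) root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def preorder(node):
    """Root, then left subtree, then right subtree."""
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


def postorder(node):
    """Left subtree, then right subtree, then root."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.key]


keys = [4, 3, 14, 8, 1, 15, 9, 5, 13, 10, 2, 7, 6, 12, 11]
root = None
for key in keys:
    root = insert(root, key)

print(preorder(root))
# [4, 3, 1, 2, 14, 8, 5, 7, 6, 9, 13, 10, 12, 11, 15]
print(postorder(root))
# [2, 1, 3, 6, 7, 5, 11, 12, 10, 13, 9, 8, 15, 14, 4]
```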

    Read the article

  • how do i filter my lucene search results?

    - by Andrew Bullock
    Say my requirement is "search for all users by name, who are over 18". If I were using SQL, I might write something like: Select * from [Users] Where ([firstname] like '%' + @searchTerm + '%' OR [lastname] like '%' + @searchTerm + '%') AND [age] >= 18 However, I'm having difficulty translating this into Lucene.NET. This is what I have so far: var parser = new MultiFieldQueryParser(new[] { "firstname", "lastname" }, new StandardAnalyzer()); var luceneQuery = parser.Parse(searchTerm); var query = FullTextSession.CreateFullTextQuery(luceneQuery, typeof(User)); var results = query.List<User>(); How do I add in the "where age >= 18" bit? I've heard about .SetFilter(), but this only accepts LuceneQueries, and not IQueries. If SetFilter is the right thing to use, how do I make the appropriate filter? If not, what do I use and how do I do it? Thanks! P.S. This is a vastly simplified version of what I'm trying to do, for clarity; my WHERE clause is actually a lot more complicated than shown here. In reality I need to check if ids exist in subqueries and check a number of unindexed properties. Any solutions given need to support this. Thanks

    Read the article

  • does lucene search function work in large size document?

    - by shaon-fan
    Hi there, I have a problem when searching with Lucene. Indexing works well even for huge documents, such as a .pst file (the Outlook mail store): it can build an index that includes all the information in the .pst. The only problem is that the document is sometimes very large and contains a great many words. When I search using Lucene, it only seems to process the front part of the indexed file; if a word appears in the back part, it isn't found and there are no hits in the result. But when I split the indexed file into several parts (in a crude way, while debugging) and search each part, it works well. So I want to know how to split the indexed file, and what size limit applies to searching. Cheers, and waiting for replies. ++++++++++++++++++++++++++++++++++++++++++++++++++ Hi there, following what Coady said, I set the max field length to 2^31-1. But the search result still doesn't include what I want. Put simply, I convert the Word document to a string array to analyze it; one Word document has 79680 words, including spaces and symbols. When I search for a certain word, it returns only 300 hits, although there are actually more than 300 occurrences. For the same reason, when I search for a word in the back part of the document, it isn't found. //////////////set the length idexwriter.SetMaxFieldLength(2147483647); ////////////////////search IndexSearcher searcher = new IndexSearcher(Program.Parameters["INDEX_LOCATION"].ToString()); Hits hits = searcher.Search(query); This is my code, the same as everyone else's. I found the problem when I needed to count the hits for every word in a document, and that is how I also found that it couldn't find words in the back part of the document. Please help: is there a searcher length setting somewhere? How did you deal with this problem?

    Read the article

  • Google Image Search Quick Fix

    - by Asian Angel
    Are you tired of unneeded webpage loading and extra link clicking just to access an image found using Google Image Search? Now you can jump directly to the image itself with the clickGOOGLEview extension for Google Chrome. The Problem When you find an image that you like using Google Image Search you always have to go through extra hassle just to get to the image itself. First you have an entire webpage loading in your browser and then you have to click through that irritating “See full size image” link. All that you need is the image, right? Problem Fixed Once you have installed the clickGOOGLEview extension you will absolutely love the result. Find an image that you like, click the link, and there is your new image without any of the hassle or extra link clicking. Big or small having direct access to the image is how it should have been from the beginning. Conclusion The clickGOOGLEview extension does one thing and does it extremely well…it gets you to those images without the extra hassle or additional link clicking. Links Download the clickGOOGLEview extension (Google Chrome Extensions)

    Read the article

  • Sorting data by relevance, from multiple tables

    - by Oden
    Hey, how is it possible to sort data from multiple tables by relevance? My table structure is the following: I have 3 tables in my database. One table contains the names of solar systems, the second the names of planets. There is one more table, which is a connection between solar systems and planets. If I want to get data for a planet which is in the Milky Way, I post this data to the server, and it gives me a multi-dimensional array which contains: the Milky Way, with every planet in it; and every planet whose name contains the string Milky Way (maybe that's a bad example because I don't think there is more than one planet with this name, but the main concept stands). But I want to put the most relevant restaurants at the top of the array. (For the relevance I would check the description of the restaurants or something like that.) So, how would you do that kind of data sorting?

    Read the article

  • Hide a single content block from search engines?

    - by jonas
    A header is automatically added on top of each content URL, but it's not relevant for search and is messing up all the results by being the first line of every page (in the code it's the last line but visually it's the first, which Google is able to notice). Solution 1: You could put the header (the content to exclude from Google searches) in an iframe with a static URL domain.com/header.html and a <meta name="robots" content="noindex" />. Are there drawbacks to this solution? Solution 2: You could deliver it conditionally via Apache mod_rewrite, PHP or JavaScript. Drawback(?): Google does not like it? Will Google ever try pages with a standard user's user agent and compare? Drawback: the hidden content will be missing in the Google cache version as well. Example: add-header.php: <?php $path = $_GET['path']; echo file_get_contents($_SERVER["DOCUMENT_ROOT"].$path); ?> Apache virtual host config: RewriteCond %{HTTP_USER_AGENT} !.*spider.* [NC] RewriteCond %{HTTP_USER_AGENT} !Yahoo.* [NC] RewriteCond %{HTTP_USER_AGENT} !Bing.* [NC] RewriteCond %{HTTP_USER_AGENT} !Yandex.* [NC] RewriteCond %{HTTP_USER_AGENT} !Baidu.* [NC] RewriteCond %{HTTP_USER_AGENT} !.*bot.* [NC] RewriteCond %{SCRIPT_FILENAME} \.htm$ [NC,OR] RewriteCond %{SCRIPT_FILENAME} \.html$ [NC,OR] RewriteCond %{SCRIPT_FILENAME} \.php$ [NC] RewriteRule ^(.*)$ /var/www/add-header.php?path=$1 [L]

    Read the article

  • Refining Search Results [PHP/MySQL]

    - by Dae
    I'm creating a set of search panes that allow users to tweak their results set after submitting a query. We pull commonly occurring values in certain fields from the results and display them in order of their popularity - you've all seen this sort of thing on eBay. So, if a lot of rows in our results were created in 2009, we'll be able to click "2009" and see only rows created in that year. What in your opinion is the most efficient way of applying these filters? My working solution was to discard entries from the results that didn't match the extra arguments, like: while($row = mysql_fetch_assoc($query)) { foreach($_GET as $key => $val) { if($val !== $row[$key]) { continue 2; } } // Output... } This method should hopefully only query the database once in effect, as adding filters doesn't change the query - MySQL can cache and reuse one data set. On the downside it makes pagination a bit of a headache. The obvious alternative would be to build any additional criteria into the initial query, something like: $sql = "SELECT * FROM tbl WHERE MATCH (title, description) AGAINST ('$search_term')"; foreach($_GET as $key => $var) { $sql .= " AND ".$key." = ".$var; } Are there good reasons to do this instead? Or are there better options altogether? Maybe a temporary table? Any thoughts much appreciated!
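If the filters are pushed into the query, the column names and values coming from the request should not be concatenated into the SQL directly. A minimal sketch of the idea, written in Python rather than PHP: column names are checked against a whitelist and values are passed as bound parameters. `ALLOWED_FILTERS`, its column names, and `build_search_query` are hypothetical, and the `%s` placeholder style is the one used by common Python MySQL drivers; other drivers may use a different marker.

```python
# Hypothetical whitelist of columns that users may filter on.
ALLOWED_FILTERS = {"year", "category", "author"}


def build_search_query(search_term, filters):
    """Return (sql, params) for a full-text search narrowed by extra filters."""
    sql = "SELECT * FROM tbl WHERE MATCH (title, description) AGAINST (%s)"
    params = [search_term]
    for column in sorted(filters):
        if column not in ALLOWED_FILTERS:
            continue  # ignore anything that is not a known column
        sql += f" AND {column} = %s"  # column comes from the whitelist only
        params.append(filters[column])
    return sql, params


sql, params = build_search_query(
    "solar", {"year": "2009", "evil; DROP TABLE tbl": "x"}
)
print(sql)
# SELECT * FROM tbl WHERE MATCH (title, description) AGAINST (%s) AND year = %s
print(params)
# ['solar', '2009']
```

The resulting pair would be handed to the driver's `execute(sql, params)`; letting the database apply the filters also keeps `LIMIT`-based pagination straightforward.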

    Read the article
