Search Results

Search found 607 results on 25 pages for 'similarity analyzer'.

How do you efficiently implement a document similarity search system?

- by Björn Lindqvist

How do you implement a "similar items" system for items described by a set of tags? In my database, I have three tables: Article, ArticleTag and Tag. Each Article is related to a number of Tags via a many-to-many relationship. For each Article I want to find the five most similar articles to implement an "if you like this article you will like these too" system. I am familiar with cosine similarity, and using that algorithm works very well, but it is way too slow. For each article, I need to iterate over all articles, calculate the cosine similarity for the article pair and then select the five articles with the highest similarity rating. With 200k articles and 30k tags, it takes me half a minute to calculate the similar articles for a single article. So I need another algorithm that produces roughly as good results as cosine similarity but that can be run in real time and that does not require me to iterate over the whole document corpus each time. Maybe someone can suggest an off-the-shelf solution for this? Most of the search engines I looked at do not enable document similarity searching.
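
One common way to avoid scanning the whole corpus is an inverted index from tag to articles, so that only articles sharing at least one tag are ever scored. The sketch below is an illustration under assumptions, not the asker's schema: `article_tags`, `build_tag_index` and `most_similar` are hypothetical names, tags are treated as unweighted (binary), and everything is held in memory. For binary tag sets, cosine similarity reduces to the number of shared tags divided by the square root of the product of the two tag counts.

```python
from collections import defaultdict
from math import sqrt
import heapq


def build_tag_index(article_tags):
    """Map each tag to the set of article ids that carry it."""
    index = defaultdict(set)
    for article_id, tags in article_tags.items():
        for tag in tags:
            index[tag].add(article_id)
    return index


def most_similar(article_id, article_tags, index, k=5):
    """Return up to k (score, other_id) pairs, best first.

    Only articles that share at least one tag are visited.
    """
    tags = article_tags[article_id]
    shared = defaultdict(int)  # other article id -> number of shared tags
    for tag in tags:
        for other in index[tag]:
            if other != article_id:
                shared[other] += 1
    scored = (
        (count / sqrt(len(tags) * len(article_tags[other])), other)
        for other, count in shared.items()
    )
    return heapq.nlargest(k, scored)


# Hypothetical sample data: article id -> set of tags.
article_tags = {
    1: {"python", "search", "lucene"},
    2: {"python", "search"},
    3: {"cooking"},
    4: {"search", "lucene", "java"},
}
index = build_tag_index(article_tags)
top = most_similar(1, article_tags, index)
```

The cost per query depends on how many articles share the query article's tags rather than on the corpus size, so very common tags dominate the running time; dropping or down-weighting them is a typical refinement.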

Algorithm to find a measurement of similarity between lists.

- by Cubed

Given that I have two lists that each contain a separate subset of a common superset, is there an algorithm to give me a similarity measurement? Example: A = { John, Mary, Kate, Peter } and B = { Peter, James, Mary, Kate }. How similar are these two lists? Note that I do not know all elements of the common superset. Update: I was unclear and I have probably used the word 'set' in a sloppy fashion. My apologies. Clarification: Order is of importance. If identical elements occupy the same position in the list, we have the highest similarity for that element. The similarity decreases the farther apart the identical elements are. The similarity is even lower if the element only exists in one of the lists. I could even add the extra dimension that lower indices are of greater value, so a[1] == b[1] is worth more than a[9] == b[9], but that is mainly because I am curious.
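
The clarified requirements can be turned into a score in several ways; the following is one possible sketch, not an established named metric. It assumes neither list contains duplicates, gives each shared element a contribution of 1 / (1 + distance between its two positions), gives elements found in only one list a contribution of 0, and averages over the union. The function name `positional_similarity` is made up for the example, and the optional "lower indices count more" weighting is left out.

```python
def positional_similarity(a, b):
    """Return a score in [0, 1]: 1.0 for identical lists, 0.0 for disjoint ones."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    union = set(pos_a) | set(pos_b)
    if not union:
        return 1.0  # two empty lists are treated as identical
    total = 0.0
    for item in union:
        if item in pos_a and item in pos_b:
            # Same position contributes 1.0; the contribution shrinks with distance.
            total += 1.0 / (1 + abs(pos_a[item] - pos_b[item]))
        # Items present in only one list contribute nothing.
    return total / len(union)


a = ["John", "Mary", "Kate", "Peter"]
b = ["Peter", "James", "Mary", "Kate"]
score = positional_similarity(a, b)
```

For the example lists, Mary and Kate are each one position apart (0.5 each) and Peter is three positions apart (0.25), so the score is 1.25 / 5 = 0.25.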

Process Manufacturing (OPM) Actual Costing Analyzer Diagnostic Script

- by ChristineS-Oracle

The OPM Actual Costing Analyzer is a script which you can use proactively at any time to review setups and pieces of data which are known to affect either the performance or the accuracy of the OPM Actual Cost process or Lot Costing. Each topic reviewed by this report has been specifically selected because it points to the solution used to resolve at least two Service Requests during a recent 3-month period. You can download this script from Doc ID 1629384.1, OPM Actual Costing Analyzer Diagnostic Script.

AT&T Application Resource Analyzer in NetBeans IDE

- by Geertjan

Here at Øredev in Malmö I met Doug Sillars who does developer outreach for the AT&T Application Resource Optimizer. In this YouTube clip you see Doug explaining how it works and what it can do for optimizing performance of mobile applications. There's a free and open source Android app on GitHub that you can install on Android to collect data and then there's a Java Swing application for analyzing the results. As a plugin in NetBeans IDE, that application shows the Android sources of the Data Collector, as well as the Data Analyzer ready to be used to collect data. Since the ARO Data Analyzer is written in Java and has JPanels defining its UI layer, integrating the user interface wasn't hard. Now working on the Actions, so there'll be a new ARO menu with start/stop data collecting menu items, etc, reusing as much of the original code as possible. That part is actually already working. I started up an Android emulator, then started the data collection process from the IDE. Now need to include the Actions for importing the data into the analyzer, together with a few other related features. A pretty cool feature in ARO is video capture, so that a movie can be made by ARO of all the steps taken on the device during the collection process, which will also be nice to have integrated into the NetBeans plugin. Ultimately, this will be handy for anyone creating Android applications in NetBeans IDE since they'll be able to use AT&T's ARO tool for optimizing the performance of the applications they're developing. It will also be useful for those using the built-in Cordova tools in NetBeans IDE to create iOS applications because ARO is also applicable to analyzing iOS application performance.

Real-time spectrum analyzer with API

- by bobobobo

I'm looking for a C or C++ API that will give me real-time spectrum analysis of a waveform on Windows. I'm not entirely sure how large a sample window it should need to determine frequency content, but the smaller the better. For example, if it can work with a 0.5 second long sample and determine frequency content to the Hz, that would be wicked-awesome.
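
On the window-size question: for a plain discrete Fourier transform, the spacing between frequency bins is the sample rate divided by the number of samples, which equals 1 / (window length in seconds). A 0.5 second window therefore gives 2 Hz bins, and 1 Hz bins need a full second of signal. The sketch below only illustrates that relationship; it is written in Python rather than the C or C++ the question asks for, uses a deliberately naive O(n^2) transform instead of an FFT library, and the sample rate and test tone are made-up values.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Magnitudes of the first n/2 DFT bins (naive O(n^2) transform)."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n // 2)
    ]


sample_rate = 1000   # samples per second (assumed for the example)
duration = 0.5       # window length in seconds
n = int(sample_rate * duration)

# A pure 100 Hz test tone.
tone_hz = 100.0
samples = [math.sin(2 * math.pi * tone_hz * i / sample_rate) for i in range(n)]

magnitudes = dft_magnitudes(samples)
bin_width_hz = sample_rate / n  # equals 1 / duration
peak_bin = max(range(len(magnitudes)), key=magnitudes.__getitem__)
peak_hz = peak_bin * bin_width_hz
```

Here `bin_width_hz` is 2.0 and the peak lands on 100 Hz because the tone falls exactly on a bin. Tones between bins spread across neighbouring bins, which is why windowing functions and peak interpolation are commonly used when finer estimates are needed from short windows.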

Weblog analyzer most useful features

- by phq

There are already a lot of questions asking which analyzer is the best. Here I try to invert the question: instead of asking which analyzer has the best features, I'm asking what the best features are. It is more interesting to separate what an analyzer can do from what is actually worth spending time on. What are the most useful features I should look for in a web server log analyzer? How are they useful, and what problems can they solve?

Proper snowball analyzer configuration when using Grails Searchable Plugin

- by Wirsbro

To improve stemming we want to switch from the default analyzer to snowball. However, we are having a lot of difficulty with the proper settings and would appreciate any help.

Environment:
- Sun's Java 1.6.16
- Grails 1.2.2
- Searchable Plug-In 0.5.5

Config.groovy (have tried both settings):

    compassSettings = ['compass.engine.analyzer.stemmed.type': 'snowball',
                       'compass.engine.analyzer.stemmed.name': 'English']

    compassSettings = ['compass.engine.analyzer.snowball.type': 'snowball',
                       'compass.engine.analyzer.snowball.name': 'English',
                       'compass.engine.analyzer.search.type': 'snowball',
                       'compass.engine.analyzer.search.name': 'English']

Search.groovy (the invocation):

    def searchResult = searchableService.search(params.q, withHighlighter: { highlighter, index, sr ->
        if (!sr.highlights) {
            sr.highlights = []
        }
        try {
            sr.highlights[index] = highlighter.fragments("content")[0..2].join(" ")
        } catch (IndexOutOfBoundsException ex) {
            sr.highlights[index] = highlighter.fragment("content")
        }
    })
    def suggestion = searchableService.suggestQuery(params.q)
    if (suggestion != params.q) {
        searchResult.suggestedQuery = suggestion
    }

Document Similarity: Comparing two documents efficiently

- by seanieb

I have a loop that calculates the similarity between two documents. It collects all the tokens in a document and their scores, and places them in a dictionary. It then compares the dictionaries. This is what I have so far; it works, but it is super slow:

    # Doc A
    cursor1.execute("SELECT token, tfidf_norm FROM index WHERE doc_id = %s", (docid[i][0],))
    doca = cursor1.fetchall()
    # convert the rows to a dictionary
    doca_dic = dict((row[0], row[1]) for row in doca)

    # Doc B
    cursor2.execute("SELECT token, tfidf_norm FROM index WHERE doc_id = %s", (docid[j][0],))
    docb = cursor2.fetchall()
    # convert the rows to a dictionary
    docb_dic = dict((row[0], row[1]) for row in docb)

    # loop through each token in doca and see if one matches in docb
    for x in doca_dic:
        if x in docb_dic:
            # calculate the similarity by summing the products of the tf-idf_norm
            similarity += doca_dic[x] * docb_dic[x]

    print("similarity")
    print(similarity)

I'm pretty new to Python, hence this mess. I need to speed it up; any help would be appreciated. Thanks.
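
Since the scores already live in a database table, one option worth trying is to let the database compute the sum of products with a self-join on the token column, so that only a single number comes back per document pair. The sketch below is an assumption-laden illustration rather than a drop-in fix: it uses an in-memory SQLite database so that it runs on its own, the table is called `doc_index` (a made-up name, chosen because `index` is a reserved word), and SQLite's `?` placeholders replace the `%s` style used in the question. Whether it is faster in practice depends on the database and on having an index covering `doc_id` and `token`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE doc_index (doc_id INTEGER, token TEXT, tfidf_norm REAL)")
conn.execute("CREATE INDEX idx_doc_token ON doc_index (doc_id, token)")
# Hypothetical sample rows: (doc_id, token, normalized tf-idf score).
conn.executemany(
    "INSERT INTO doc_index VALUES (?, ?, ?)",
    [
        (1, "apple", 0.5), (1, "pear", 0.5),
        (2, "apple", 0.5), (2, "plum", 0.25),
    ],
)


def similarity(conn, doc_a, doc_b):
    """Sum of tfidf_norm products over the tokens both documents share."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(a.tfidf_norm * b.tfidf_norm), 0.0)
        FROM doc_index AS a
        JOIN doc_index AS b ON a.token = b.token
        WHERE a.doc_id = ? AND b.doc_id = ?
        """,
        (doc_a, doc_b),
    ).fetchone()
    return row[0]


score = similarity(conn, 1, 2)
```

With the sample rows, only "apple" is shared, so the score is 0.5 * 0.5 = 0.25. If the scores are already normalized per document, as the column name suggests, this sum is the cosine similarity.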

Detecting similar words among n text documents

- by javanes

Hi, I have n documents and want to find the common words that appear in them. For example, I want to be able to say that (n-3) documents include the word "web". I can certainly do this with basic data structures, but there may be an efficient algorithm, or a way to treat the same word with different suffixes as one. Is there any algorithm for such purposes? I am unfamiliar with the data mining world. In general, is there a term for the effort of finding similarities between different documents? If there is, it will make my research easier. Thanks.
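
The number of documents a word appears in is usually called its document frequency, and folding different suffixes of a word together is called stemming. The sketch below counts document frequencies after a deliberately crude suffix stripper; `crude_stem` and its suffix list are hypothetical stand-ins for a real stemmer (such as a Porter or Snowball implementation) and will mis-handle many English words.

```python
from collections import Counter
import re

# Hypothetical, deliberately crude suffix list; a real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def document_frequency(documents):
    """Count, for each stemmed word, how many documents contain it."""
    df = Counter()
    for text in documents:
        tokens = re.findall(r"[a-z]+", text.lower())
        # A set ensures each word is counted once per document.
        df.update({crude_stem(token) for token in tokens})
    return df


docs = [
    "Web servers log requests",
    "The web server is slow",
    "Cooking pasta",
]
df = document_frequency(docs)
```

Here `df["web"]` and `df["server"]` are both 2, because "servers" and "server" reduce to the same stem. Checking which words occur in at least n-3 documents is then a filter over `df`.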

What is the best Apache logs Analyzer?

- by Evgeny

What real-time log analyzer can you suggest for Apache access and error logs? There is a list of web analytics software on Wikipedia, but it would be great to hear opinions from people with experience without having to try all of them. Please don't suggest Google Analytics or any other hosted/JavaScript analytics suites; I am already using them. GA is not real-time and it is missing some data that the logs show, for example 404 errors, script errors, the full query string of the referral, IP addresses, visitor path through the website, etc.

Apache log analyzer which manages time spent to serve the request

- by antispam

I need to monitor performance on my web server (there's an application server in the back) and create reports for senior management. I've enabled %T/%D in my Apache logs and I would like to know if there's an Apache log analyzer or some other tool which parses these values and presents them as charts or reports. I am looking mostly for an integrated solution, not something along the lines of awk+gnuplot scripts.

Windows Server 2008 R2: Introducing the Best Practices Analyzer for AD Domain Services

Windows Server 2008 R2 includes a new Best Practices Analyzer for Active Directory Domain Services, which facilitates the implementation of best practices in your Active Directory Domain Services environment.

Generic log analyzer that produces reports

- by Eugene

About 600 customers use our application. We have very detailed logs for everything that happens in the application, from changes in the data model, memory and CPU/GPU usage to clicks on the UI elements. We want to be able to parse the logs coming from these customers and analyze them to understand how users use our application and what happens internally in the application. Is there a log analyzer that can produce such reports automatically?

Working with the SharePoint Rules Analyzer in SharePoint Foundation 2010

In this article, we will take a look at the features of the SharePoint Health Analyzer in SharePoint Foundation 2010 and some details on extending it by creating custom Health Rules.

New SSIS tool on Codeplex – SSIS Log Analyzer

I stumbled across a new SSIS tool on Codeplex today, the SSIS Log Analyzer which was only released a few days ago. Whilst it is a beta release and currently only supports 2005 (2008 is promised) it looks quite interesting. It seems to be a fancy log viewer, but with some clever features and a nice looking front-end. I’ve only read the documentation so far, but it has graphs and a debug view that shows your package with the colour animations similar to when debugging in BIDS, and everyone knows, the way the pretty colours and numbers change is the best bit! I’ll quote some of the features for you here and then let you make your own mind up, is it useful in the real world?

Option to analyze the logs manually by applying row and column filters over the log data or by using queries to specify more complex criterions.
Automated Performance Analysis which provides a quick graphical look on which tasks spent most time during package execution.
Rerun (debug) the entire sequence of events which happened during package execution showing the flow of control in graphical form, changes in runtime values for each task like execution duration etc.
Support for Auto Analyzers to automatically find out issues and provide suggestions for problems which can be figured out with the help of SSIS logs and/or package.
Option to analyze just log file or log and package together.
Provides a lightweight environment to have a quick look at the package. Opening it in BIDS takes some time as being an authoring environment it does all sorts of validations resulting in some delay.

See http://ssisloganalyzer.codeplex.com/ for more details.

cosine similarity problem

- by jaskirat

Hi, I have calculated the tf-idf values of the terms of document 1 and document 2. Now I don't know how to use these tf-idf values. Basically, I want to find the similarity between two documents (in my case, web pages). Can anybody tell me how to implement cosine similarity and the Jaccard coefficient to find the similarity? C# code would be appreciated. Please help. Thanks.
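
Both measures are short once each document is represented as a mapping from term to tf-idf weight. The sketch below is in Python rather than the C# the question asks for, and the function names and sample weights are made up for illustration. Cosine similarity is the dot product of the two weight vectors divided by the product of their lengths; the Jaccard coefficient here ignores the weights and compares only the sets of terms.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term -> weight vectors."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def jaccard(a, b):
    """Shared terms divided by all terms, ignoring the weights."""
    terms_a, terms_b = set(a), set(b)
    union = terms_a | terms_b
    if not union:
        return 0.0
    return len(terms_a & terms_b) / len(union)


# Hypothetical tf-idf weights for two web pages.
doc1 = {"search": 3.0, "engine": 4.0}
doc2 = {"search": 3.0, "index": 4.0}

cos = cosine_similarity(doc1, doc2)
jac = jaccard(doc1, doc2)
```

For the sample data the dot product is 9 and both vectors have length 5, so the cosine similarity is 9 / 25 = 0.36; one of the three distinct terms is shared, so the Jaccard coefficient is 1/3.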

SEO - page similarity percentage

- by user1360479

Using Similar Page Checker (http://www.webconfs.com/similar-page-checker.php) you can check whether one web page is similar to another. Is there any rule of thumb for how high a percentage Google accepts? That is, at what point does Google consider a page too similar to another one and not index it? I have two pages within the same domain where the "how to order" information is similar, which is why the percentage is about 70. Thanks.

SQL Server 2008 Best Practices Analyzer - keep an eye for the release

- by ssqa.net

What practice do you classify as a best practice? The answer is that it's not rocket science; you don't need any specific formula to satisfy the need! OK, what if a tool can follow those common best practices & perform ...

Real-time Steganography Analyzer Upgraded

StegAlyzerRTS is capable of operating on networks with throughput of 100 Mbps.

SQL Performance Analyzer

Any activity that may impact a statement's execution plan is a candidate for using SPA to investigate the possible consequences - both good and bad. Steve Callan discusses the workflow and provides a working example.

Use different Solr Similarity algo for every search

- by snickernet

Hi guys, is it possible in Solr 1.4 to specify which similarity class to use for each search within a single index? Let's say I have two types of search (keyword and brand). For keyword search, I want to use the DefaultSimilarity class, but for brand search, I want to use my CustomSimilarity class. So far I have been modifying schema.xml to specify a single similarity class, but I now have a requirement to use two different similarity classes. I'll be glad to hear your thoughts on this. Thanks in advance.

Running Best Practice Analyzer on Windows 2012 yields error "Result file has not yet been generated"

- by mhildreth

Whenever I run the Best Practice Analyzer on a Windows 2012 server with IIS installed, I receive the error: "There has been a Best Practice Analyzer error for Model Id 'Microsoft/Windows/WebServer'. The Result file has not yet been generated. Please perform the scan first and try again." I'm doing this from the "Local Server" section of the Server Manager. I'm logged in with a domain credential that has administrative rights on the server. I don't know how to generate the result file or where it would be located. I have 4 servers, all with IIS, and this is happening on all of them. The servers are practically brand new, so there isn't anything really exceptional about their setup. Any suggestions on how to generate the result file? Thanks in advance.

Apache forensic log analyzer under Windows

- by Michael

Is there any freeware Apache forensic log analyzer with a GUI for Windows? I'm talking about ForensicLog output. Thanks.

Is there some algorithm to compare the DOM similarity of different pages?

- by user198729

Does anyone have experience with this?
