text mining - Page 3 - Developer IT

Data Mining project ideas?

- by Andriyev

Hi I am looking for project ideas in the field of data mining. I expect to complete it in a quarter and intend to use C++, Linux as the environment. The course I'm taking aims to build the basics of data mining and covers topics like Classification, Regression-Modeling, Clustering and Association learning. Please point me to some good ideas which I can chew on. cheers

Great data mining quotes

- by Andrei Savu

I'm searching for some data mining related quotes. Can you tell me some of the quotes you like? On the internet I have only found this site: http://www.quotesea.com/quotes/with/data%20mining Thanks.

Text Expansion Awareness for UX Designers: Points to Consider

- by ultan o'broin

Awareness of translated text expansion dynamics is important for enterprise applications UX designers (I am assuming all source text for translation is in English, though apps development can take place in other natural languages too). This consideration goes beyond the standard 'character multiplication' rule and must take into account the avoidance of other layout tricks that a designer might be tempted to try. Follow these guidelines.

For general text expansion, remember the simple rule that the shorter the text is in English, the more room, proportionally, its translation will need. See the examples provided by Richard Ishida of the W3C and you'll get the idea. So, forget the 30 percent or one inch minimum expansion rule of the old Forms days.

Unfortunately, remembering convoluted text expansion rules, expressed as a percentage of the US English character count, can be tough going. Try these:

Up to 10 characters: 100 to 200%
11 to 20 characters: 80 to 100%
21 to 30 characters: 60 to 80%
31 to 50 characters: 40 to 60%
51 to 70 characters: 31 to 40%
Over 70 characters: 30%
(Source: IBM)

So it might be easier to remember a rule that if your English text is less than 20 characters then allow it to double in length (200 percent), and then after that assume an increase by half the length of the text (50%). (Bear in mind that ADF can apply truncation rules on some components in English too.)

(If your text is stored in a database, developers must make sure the table column widths can accommodate the expansion of your text when translated, based on the byte size of the translated characters and not on the number of characters. Use Unicode. One character does not equal one byte in the multilingual enterprise apps world.)

Rely on a graceful transformation of translated text. Let all pages resize dynamically so the text wraps and flows naturally. ADF pages support this already. Think websites.

Don't hard-code alignments. Use Start and End properties on components and not Left or Right.

Don't force alignments of components on the page by using texts of a certain length as spacers. Use proper label positioning and anchoring in ADF components or other technologies.

Remember that an increase in text length means an increase in vertical space too when pages are resized. So don't hard-code vertical heights for any text areas.

Don't be tempted to manually create text or printed reports this way either. They cannot be translated successfully, and are very difficult to maintain in English. Use XML, HTML, RTF and so on. Check out what Oracle BI Publisher offers.

Don't force wrapping by using tricks such as \n or \t characters, HTML BR tags, or forced page breaks. Once the text is translated the alignment will be destroyed. The position of the breaking character or tag would need to be moved anyway, or even removed.

When creating tables, use table components. Don't use manually created tables that rely on word length to maintain column and row alignment. For example, don't use codeblock elements in HTML; use the proper table elements instead. Once translated, the alignment of manually formatted tabular data is destroyed.

Finally, if there is a space restriction, don't use made-up acronyms, abbreviations or some form of daft text speak to save space. Besides being incomprehensible in English, they may need full translations of the shortened words, even if they can be figured out. Use approved or industry standard acronyms according to the UX style rules, not as a space-saving device.

Restricted Real Estate on Mobile Devices

On mobile devices real estate is limited. Using shortened text is fine as long as it is comprehensible. Users in the mobile space prefer brevity too, as they are on the go, performing three-minute tasks, with no time to read lengthy texts. Using fragments, lightening up on unnecessary articles, and getting straight to the point with imperative forms of verbs makes sense on both real estate and user experience grounds.
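The expansion table quoted from IBM can be turned into a small lookup, which may help when checking field widths. The sketch below is only an illustration: the function names are made up, and it assumes the percentages mean additional space on top of the English length (so 200% lets a 4-character label grow to 12). It also shows the point about bytes versus characters using UTF-8.

```python
# Rows of the expansion table: (max English length, min extra %, max extra %).
EXPANSION_TABLE = [
    (10, 100, 200),
    (20, 80, 100),
    (30, 60, 80),
    (50, 40, 60),
    (70, 31, 40),
]
OVER_70 = (30, 30)  # "Over 70 characters: 30%"


def expansion_range(source_text):
    """Return the (min %, max %) extra space suggested for this English text."""
    n = len(source_text)
    for max_len, low, high in EXPANSION_TABLE:
        if n <= max_len:
            return low, high
    return OVER_70


def max_translated_chars(source_text):
    """Worst-case character budget, assuming the % is extra space (an assumption)."""
    n = len(source_text)
    _, high = expansion_range(source_text)
    extra = -(-n * high // 100)  # ceiling division without floats
    return n + extra


def utf8_bytes(text):
    """Storage size in bytes when the text is encoded as UTF-8."""
    return len(text.encode("utf-8"))


if __name__ == "__main__":
    print(max_translated_chars("Save"))        # budget for a 4-character label
    print(len("Größe"), utf8_bytes("Größe"))   # characters vs. bytes differ
```

A database column sized in bytes would need to fit `utf8_bytes` of the translated string, not its character count.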

Python, web log data mining for frequent patterns

- by descent

Hello! I need to develop a tool for web log data mining. Given many sequences of URLs, each requested in a particular user session (retrieved from web-application logs), I need to figure out the patterns of usage and the groups (clusters) of users of the website. I am new to data mining and am searching Google a lot. I found some useful info; for instance, querying Frequent Pattern Mining in Web Log Data seems to point to almost exactly similar studies. So my questions are:
Are there any Python-based tools that do what I need, or at least something similar?
Can the Orange toolkit be of any help?
Can reading the book Programming Collective Intelligence be of any help?
What should I Google for, what should I read, and which relatively simple algorithms are best to use?
I am very limited in time (around a week), so any help would be extremely precious. What I need is to be pointed in the right direction, with advice on how to accomplish the task in the shortest time. Thanks in advance!
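As a rough starting point for the kind of pattern counting described in the question, the sketch below counts contiguous URL sequences that occur in at least a minimum number of sessions. It is a deliberately simple stand-in, not a full frequent-pattern-mining algorithm and not the API of any toolkit mentioned above; the function name, parameters, and sample URLs are all hypothetical.

```python
from collections import Counter


def frequent_paths(sessions, length=2, min_support=2):
    """Return contiguous URL sequences of `length` seen in >= min_support sessions."""
    counts = Counter()
    for urls in sessions:
        seen = set()
        for i in range(len(urls) - length + 1):
            seen.add(tuple(urls[i:i + length]))
        # Count each pattern once per session, so support = number of sessions.
        counts.update(seen)
    return {path: n for path, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Hypothetical sessions, each an ordered list of requested URLs.
    sessions = [
        ["/", "/search", "/item"],
        ["/", "/search", "/cart"],
        ["/item", "/cart"],
    ]
    print(frequent_paths(sessions))
```

Sessions could then be grouped by which frequent paths they contain, which gives a crude first clustering of users.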

Word 2007 - Change default text control text

- by James

When inserting a text box using the developer tab in Word 2007, by default the inserted text box says "Click here to enter text". How can I change this to something different? Thanks.

Online network mining software

- by ron

A year ago I stumbled upon a website which provided an online application for building a network online. For example, I entered some urls and phrases, and it automatically searched them for news, inserted the connections between them, etc. I can't find it now. Do you know such software?

Data Mining Software

- by Mark

I want to harvest some data like this http://www.newcardealers.ca/en/Dealers/List-A.aspx And insert the name, address, phone number, email, etc. into a database. Is there some software I can use that will take a webpage, let me specify some regexes or something, and then spit out all the matched data in a CSV or some format easily insertable into a DB?

editable panel/text box on a webpage with a capability to display HTML text inside it

Hidden text and links appearing just on click for SEO?

- by CamSpy

I am working on a site that has a neat, clean, minimalistic design and layout. Menu items are "hidden" behind an icon; to see them, users need to click on that icon to get a JavaScript-toggled overlay with the list of menu items. Then there are blocks with photos, and users need to click on a small icon/button on each of them to get a block of text shown for that photo. While I don't like such "design" myself, as it makes me click lots of times just to read, I also think that for SEO purposes this model is really wrong. Is such a model bad for SEO? Are there ways to keep a design like this but have "safe" methods of displaying text content on click that will not hurt SEO?

Data Mining In vb.net

- by user369161

Hi please give me a simple sample for data mining in vb.net thanks everyone.

Java Swt Text (SWT.MULTI) append text without scroll

- by mchr

I have a Java SWT GUI with a multiline Text control. I want to append lines of text to the Text control without affecting the position of the cursor within the text box. In particular, the user should be able to scroll and select text at the top of the Text control while new text lines are appended to the bottom. Is this possible?

Installing sublime text plugins all at once

- by James

Is there a way to install all the Sublime Text 2 plugins that you would like to install all at once? In Notepad++, there is a plugin manager which lets you install all the plugins you want by checking the box next to the plugin name and description. I was wondering if there is something like that for Sublime Text. For example, I would like to install Zen Coding, JQuery Package for Sublime Text, Sublime Prefixr, JS Format, SublimeLinter and many other plugins all at once rather than typing each plugin in Package Control and installing them one by one.

Open and scroll through 42 GB text file in Mac OS X

- by Django Johnson

I am running Mac OS X 10.8.4 (Mountain Lion) and I am trying to open and scroll through a 42 GB .XML file. I plan on using an XML parser to parse through it and delete parts, but first I need to know how the document is structured so I can know what parts to save. How can I open this text / XML file and scroll through it so I can get a glimpse of its structure? I tried my default text-editor, text-mate, and that couldn't open it. I tried gEdit and that shows the first 10 or so lines, but then quits after trying to load the rest. I would greatly appreciate any and all suggestions!

What is the best way to implement paginated text editing in Python?

- by W.F

I'm trying to build a formatted text editor in Python. I need the editor to be paginated in edit mode, the same as in all popular word processors: when the user is editing the document, what he/she sees is a representation of the actual, physical page. I've tried looking into PySide but I can't find any ready solution to this, nor can I work out a way to do it myself. I am totally open to new technologies, so if you think Python is not the right choice here I would love to hear about new stuff (especially as I'm this new to UI coding). It only needs to be cross-platform and let me do rapid development (hence me looking for an out-of-the-box solution to this). Please suggest the best way to implement this. Please also note that I am looking for either a ready solution or advice on how to tackle this. Thank you very much!

Software to store frequently used text in PC

- by user15660

Hi, I am looking for free software that can run in the task bar (near the system time) where I can store frequently used text, like my full street address, paths of specific deep folders and files on the computer, etc. This way I can just click the icon, which should pop up a screen from which I can copy the text/string I am looking for. Any ideas? Thanks in advance.

Fraud Detection with the SQL Server Suite Part 1

- by Dejan Sarka

While working on different fraud detection projects, I developed my own approach to the solution for this problem. In my PASS Summit 2013 session I am introducing this approach. I also wrote a whitepaper on the same topic, which was generously reviewed by my friend Matija Lah. In order to spread this knowledge faster, I am starting a series of blog posts which will at the end make the whole whitepaper.

Abstract

With the massive usage of credit cards and web applications for banking and payment processing, the number of fraudulent transactions is growing rapidly and on a global scale. Several fraud detection algorithms are available within a variety of different products. In this paper, we focus on using the Microsoft SQL Server suite for this purpose. In addition, we will explain our original approach to solving the problem by introducing a continuous learning procedure. Our preferred type of service is mentoring; it allows us to perform the work and consulting together with transferring the knowledge onto the customer, thus making it possible for a customer to continue to learn independently. This paper is based on practical experience with different projects covering online banking and credit card usage.

Introduction

A fraud is a criminal or deceptive activity with the intention of achieving financial or some other gain. Fraud can appear in multiple business areas. You can find a detailed overview of the business domains where fraud can take place in Sahin Y., & Duman E. (2011), Detecting Credit Card Fraud by Decision Trees and Support Vector Machines, Proceedings of the International MultiConference of Engineers and Computer Scientists 2011 Vol 1. Hong Kong: IMECS. Dealing with frauds includes fraud prevention and fraud detection. Fraud prevention is a proactive mechanism, which tries to disable frauds by using previous knowledge.
Fraud detection is a reactive mechanism with the goal of detecting suspicious behavior when a fraudster surpasses the fraud prevention mechanism. A fraud detection mechanism checks every transaction and assigns a weight in terms of probability between 0 and 1 that represents a score for evaluating whether a transaction is fraudulent or not. A fraud detection mechanism cannot detect frauds with a probability of 100%; therefore, manual transaction checking must also be available. With fraud detection, this manual part can focus on the most suspicious transactions. This way, an unchanged number of supervisors can detect significantly more frauds than could be achieved with traditional methods of selecting which transactions to check, for example with random sampling. There are two principal data mining techniques available both in general data mining as well as in specific fraud detection techniques: supervised or directed and unsupervised or undirected. Supervised techniques or data mining models use previous knowledge. Typically, existing transactions are marked with a flag denoting whether a particular transaction is fraudulent or not. Customers at some point in time do report frauds, and the transactional system should be capable of accepting such a flag. Supervised data mining algorithms try to explain the value of this flag by using different input variables. When the patterns and rules that lead to frauds are learned through the model training process, they can be used for prediction of the fraud flag on new incoming transactions. Unsupervised techniques analyze data without prior knowledge, without the fraud flag; they try to find transactions which do not resemble other transactions, i.e. outliers. In both cases, there should be more frauds in the data set selected for checking by using the data mining knowledge compared to selecting the data set with simpler methods; this is known as the lift of a model. Typically, we compare the lift with random sampling. 
The supervised methods typically give a much better lift than the unsupervised ones. However, we must use the unsupervised ones when we do not have any previous knowledge. Furthermore, unsupervised methods are useful for checking whether the supervised models are still efficient. Accuracy of the predictions drops over time. Patterns of credit card usage, for example, change over time. In addition, fraudsters continuously learn as well. Therefore, it is important to check the efficiency of the predictive models with the undirected ones. When the difference between the lift of the supervised models and the lift of the unsupervised models drops, it is time to refine the supervised models. However, the unsupervised models can become obsolete as well. It is also important to measure the overall efficiency of both supervised and unsupervised models over time. We can compare the number of predicted frauds with the total number of frauds, which includes predicted and reported occurrences. For measuring behavior across time, specific analytical databases called data warehouses (DW) and on-line analytical processing (OLAP) systems can be employed. By controlling the supervised models with unsupervised ones and by using an OLAP system or DW reports to control both, a continuous learning infrastructure can be established. There are many difficulties in developing a fraud detection system. As has already been mentioned, fraudsters continuously learn, and the patterns change. The exchange of experiences and ideas can be very limited due to privacy concerns. In addition, both data sets and results might be censored, as companies generally do not want to publicly expose actual fraudulent behaviors. Therefore, it can be quite difficult if not impossible to cross-evaluate the models using data from different companies and different business areas. This fact stresses the importance of continuous learning even more.
Finally, the number of frauds in the total number of transactions is small; typically, much less than 1% of transactions are fraudulent. Some predictive data mining algorithms do not give good results when the target state is represented with a very low frequency. Data preparation techniques like oversampling and undersampling can help overcome the shortcomings of many algorithms. The SQL Server suite includes all of the software required to create, deploy, and maintain a fraud detection infrastructure. The Database Engine is the relational database management system (RDBMS), which supports all activity needed for data preparation and for data warehouses. SQL Server Analysis Services (SSAS) supports OLAP and data mining (in version 2012, you need to install SSAS in multidimensional and data mining mode; this was the only mode in previous versions of SSAS, while SSAS 2012 also supports the tabular mode, which does not include data mining). Additional products from the suite can be useful as well. SQL Server Integration Services (SSIS) is a tool for developing extract-transform-load (ETL) applications. SSIS is typically used for loading a DW, and in addition, it can use SSAS data mining models for building intelligent data flows. SQL Server Reporting Services (SSRS) is useful for presenting the results in a variety of reports. Data Quality Services (DQS) mitigates the occasional data cleansing process by maintaining a knowledge base. Master Data Services (MDS) is an application that helps companies maintain a central, authoritative source of their master data, i.e. the most important data to any organization. For an overview of the SQL Server business intelligence (BI) part of the suite that includes Database Engine, SSAS and SSRS, please refer to Veerman E., Lachev T., & Sarka D. (2009). MCTS Self-Paced Training Kit (Exam 70-448): Microsoft® SQL Server® 2008 Business Intelligence Development and Maintenance. MS Press.
For an overview of the enterprise information management (EIM) part that includes SSIS, DQS and MDS, please refer to Sarka D., Lah M., & Jerkic G. (2012). Training Kit (Exam 70-463): Implementing a Data Warehouse with Microsoft® SQL Server® 2012. O'Reilly. For details about SSAS data mining, please refer to MacLennan J., Tang Z., & Crivat B. (2009). Data Mining with Microsoft SQL Server 2008. Wiley. SQL Server Data Mining Add-ins for Office, a free download for Office versions 2007, 2010 and 2013, bring the power of data mining to Excel, enabling advanced analytics in Excel. PowerPivot for Excel, which is also freely downloadable for Excel 2010 and is already included in Excel 2013, brings OLAP functionalities directly into Excel. Together, these tools make it possible for an advanced analyst to build a complete learning infrastructure using a familiar tool. This way, many more people, including employees in subsidiaries, can contribute to the learning process by examining local transactions and quickly identifying new patterns.
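The lift described in the introduction, compared against random sampling, can be illustrated with a few lines of code. This is a minimal sketch rather than anything from the whitepaper or the SQL Server suite: the function name, the sample scores, and the choice to check the top 20% of transactions are all assumptions made for the example.

```python
def lift(scores, is_fraud, fraction):
    """Fraud rate among the top-scored `fraction` of transactions,
    divided by the overall fraud rate (what random sampling would find)."""
    n = len(scores)
    total_frauds = sum(is_fraud)
    if n == 0 or total_frauds == 0:
        raise ValueError("need at least one transaction and one known fraud")
    k = max(1, int(n * fraction))  # number of transactions selected for checking
    ranked = sorted(zip(scores, is_fraud), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(flag for _, flag in ranked[:k]) / k
    base_rate = total_frauds / n
    return top_rate / base_rate


if __name__ == "__main__":
    # Hypothetical model scores (0..1) and reported fraud flags (1 = fraud).
    scores = [0.9, 0.8, 0.1, 0.2, 0.3, 0.05, 0.4, 0.15, 0.25, 0.35]
    flags = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0]
    print(lift(scores, flags, 0.2))
```

A lift above 1 means that checking the highest-scored transactions finds more frauds than checking the same number picked at random; tracking this value over time is one way to notice when a model needs refining.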

How to make a text search template?

- by Flipper

I am not really sure what to call this, but I am looking for a way to have a "template" for my code to go by when searching for text. I am working on a project where a summary for a piece of text is supplied to the user. I want to allow the user to select a piece of text on the page so that the next time they come across a similar page I can find the text. For instance, let's say somebody goes to foxnews.com and selects the article like in the image below. Then whenever they go to any other foxnews.com article I would be able to identify the text for the article and summarize it for them. But an issue I see with this is for a site like Stack Exchange, where you have multiple comments to be selected (like below), which means that I would have to be able to recursively search for all separate pieces of text.

Requirements
Be able to keep pieces of text separate from each other.

Possible Issues
DIVs may not contain ids, classes, or names.
A piece of text may span across multiple DIVs.
How to recognize where an old piece of text ends and a new one begins.
How to store this information for later searching?

Data-mining related forums

- by lmsasu

Which forums are you using for data mining questions? SO is mainly intended for programming, not for DM questions.

Data mining textbook

- by lmsasu

If you followed a DM course, which textbook was used? I know about Data Mining: Practical Machine Learning Tools and Techniques (Second Edition) and this poll. What did you effectively use?

data mining open source software in java

- by pat

Hi, I would just like to know whether there is any open source data mining software written in Java that is less than approximately 3k lines of code. If yes, please give a download link; I need it to do software testing. Thank you.

Data mining google's web search results?

- by cheesebunz

Currently, I have a Google web search. If a user searches for starbucks, I would only want to retrieve the company or product information, not other unrelated links like blog pages. Using JavaScript, is it possible to do so? If yes, how can I do it? I'm kind of a newbie in the data mining part. Thanks! I've added my code for download for clearer understanding: http://www.mediafire.com/?mzgo233kngm

The Best Websites for Downloading and Playing Classic and New Text Adventure Games

- by Lori Kaufman

Before computers could handle graphical games, there were text adventure games. The games are interactive stories, so playing a text adventure game is like being part of a book in which you affect the story. Text adventure games are also referred to as “interactive fiction.” Interactive Fiction (IF) is actually a more accurate term for text adventure games, because these games can cover any topics, such as romances or comedies, not just adventures. They can also simulate real life. Even though computers can now handle intensely graphical games, playing text adventure games can still be fun. It’s like reading a good book and getting lost in the universe of the story, except you become the hero or heroine and affect the ending of the story. We’ve collected some links to websites where you can download classic and new text adventure games or play them online. There are also some free tools available for creating your own text adventure games. We even found a documentary about the evolution of computer adventure games and some articles about the art and craft of developing the original text adventure games.

PHP text formatting: Detecting several symbols in a row

- by ilnur777

I have a rather strange requirement that I really need for my text formatting. Please don't ask me why I did this strange thing! ;-) My PHP script replaces all line breaks "\n" with a special symbol, "|". When I insert text data into the database, the PHP script replaces all line breaks with the symbol "|", and when the script reads text data from the database, it replaces all special symbols "|" with the line break "\n". I want to restrict the text format so that line breaks are cut when more than 2 of them are used between separate pieces of text. Here is an example of the text I want the script to format: this is text... this is text... this is text...this is text...this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... I want to restrict the format to: this is text... this is text... this is text...this is text...this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... this is text... So, in the first example there is only one line break between 2 texts and in the second example there are 3 line breaks between 2 texts. How can I replace runs of more than 2 line break symbols "|" when they are detected in the text? This is the kind of thing I want the script to do:
$text = str_replace("|||", "||", $text);
$text = str_replace("||||", "||", $text);
$text = str_replace("|||||", "||", $text);
$text = str_replace("||||||", "||", $text);
$text = str_replace("|||||||", "||", $text);
...
$text = str_replace("||||||||||", "||", $text);
$text = str_replace("|", "<br>", $text);
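The chain of replacements in the question can be expressed as a single regular expression that matches any run of three or more "|" symbols and shrinks it to two. The sketch below shows the idea in Python; the function names are made up for illustration. The same pattern, `\|{3,}`, should also work with PHP's `preg_replace`.

```python
import re


def collapse_breaks(text, marker="|", max_run=2):
    """Shrink any run of more than `max_run` markers down to exactly `max_run`."""
    # For the defaults this builds the pattern \|{3,}
    pattern = re.escape(marker) + "{" + str(max_run + 1) + ",}"
    return re.sub(pattern, marker * max_run, text)


def to_html(text, marker="|"):
    """Collapse long runs first, then turn each remaining marker into <br>."""
    return collapse_breaks(text, marker).replace(marker, "<br>")


if __name__ == "__main__":
    stored = "first|||||second||third|fourth"
    print(collapse_breaks(stored))
    print(to_html(stored))
```

Because the pattern matches the whole run at once, there is no need to list every possible run length separately.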

Data mining logs to locate a bug

- by gooli

I'm working on a data distribution application which receives data from a source and distributes that data to multiple target applications. After successfully distributing several messages each second for 8 days, it missed a single message and did not deliver it properly to the clients. As I was looking at the logs I tried to find something there that was special about the time the miss happened, either in the data, its rate or some other condition, but I couldn't find anything. Is there any data mining technique I can use to identify how that specific event differs from other events?

XSLT Escape Character not working

- by liveek

I am trying to use escape characters in my text output, as I would like to surround the output in emailData tags. I am using <xsl:text>&lt;emailData&gt;</xsl:text> in the XSLT to ensure that this works. However, because I am using a tool called Cast Iron, for some reason it is not converting the &lt; into < and just spits out &lt;emailData&gt;. You can see an image of it HERE that illustrates the output I am getting. My source code is this. How else could I wrap this in emailData tags?
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> <xsl:output method="text"/> <xsl:template match="header"> <xsl:text>&lt;emailData&gt;</xsl:text> <xsl:text>
</xsl:text> <xsl:text>From: </xsl:text> <xsl:value-of select="from/text()"/> <xsl:text>
</xsl:text> <xsl:text>To: </xsl:text> <xsl:value-of select="to/text()"/> <xsl:text>
</xsl:text> <xsl:text>Subject: </xsl:text> <xsl:value-of select="subject/text()"/> <xsl:text>
</xsl:text> <xsl:text>Content-Type: </xsl:text> <xsl:value-of select="contentType/text()"/> <xsl:text>
</xsl:text> <xsl:text> boundary="</xsl:text> <xsl:value-of select="boundary/text()"/> <xsl:text>"</xsl:text> <xsl:text>
</xsl:text> <xsl:text>MIME-Version: </xsl:text> <xsl:value-of select="mimeVersion/text()"/> </xsl:template> <xsl:template match="email"> <xsl:text>

</xsl:text> <xsl:text>--</xsl:text> <xsl:value-of select="../header/boundary/text()"/> <xsl:text>
</xsl:text> <xsl:text>Content-Type: </xsl:text> <xsl:value-of select="contentTypeBody/text()"/> <xsl:text> charset="us-ascii"</xsl:text> <xsl:text>
</xsl:text> <xsl:text>Content-Transfer-Encoding: </xsl:text> <xsl:value-of select="contentTransfer/text()"/> <xsl:text>

</xsl:text> <xsl:value-of select="body/text()"/> </xsl:template> <xsl:template match="Attachment"> <xsl:for-each select="Attachments"> <xsl:text>

</xsl:text> <xsl:value-of select="../../header/boundary/text()"/> <xsl:text>
</xsl:text> <xsl:text>Content-Type: </xsl:text> <xsl:value-of select="attachmentContentType/text()"/> <xsl:text> name="</xsl:text> <xsl:value-of select="attachmentDescription/text()"/> <xsl:text>"</xsl:text> <xsl:text>
</xsl:text> <xsl:text>Content-Description: </xsl:text> <xsl:value-of select="attachmentDescription/text()"/> <xsl:text>
</xsl:text> <xsl:text>Content-Disposition: attachment; filename="</xsl:text> <xsl:value-of select="atachementDisposition/text()"/> <xsl:text>"</xsl:text> <xsl:text>
</xsl:text> <xsl:text>Content-Transfer-Encoding: </xsl:text> <xsl:value-of select="attachmentContentTransfer/text()"/> <xsl:text>

</xsl:text> <xsl:value-of select="attachementBody/text()"/> <xsl:text>
</xsl:text> <xsl:text>&lt;/emailData&gt;</xsl:text> </xsl:for-each> </xsl:template> <xsl:template match="text()"/> </xsl:stylesheet>

Search Results

Search found 35102 results on 1405 pages for 'text mining'.
