Search Results

Search found 35102 results on 1405 pages for 'text mining'.

Page 61/1405 | < Previous Page | 57 58 59 60 61 62 63 64 65 66 67 68 | Next Page >

OPTICS Clustering algorithm. How to get the best epsilon

- by Marco Galassi

I am implementing a project that needs to cluster geographical points, and the OPTICS algorithm seems to be a very nice solution. It needs just two parameters as input, MinPts and Epsilon: respectively, the minimum number of points needed to consider them a cluster, and the distance value used to decide whether two points can be placed in the same cluster. My problem is that, because the points vary so much, I can't set a fixed epsilon. Just look at the image below. The same point structure at a different scale would give very different results. Suppose I set MinPts = 2 and epsilon = 1 km. On the left, the algorithm would create two clusters (red and blue), but on the right it would create one single cluster containing all of the points (red), whereas I would like to obtain two clusters on the right as well. So my question is: is there any way to calculate the epsilon value dynamically to get this result? Thank you very much, and excuse my poor English. Marco
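One possible way to make epsilon follow the scale of the data is sketched below. It is only an illustration, not part of OPTICS itself: it assumes that a multiple of the median nearest-neighbour distance is an acceptable stand-in for epsilon, and that the points are already in planar coordinates (latitude/longitude pairs would need a geodesic distance instead). The names `scale_relative_epsilon` and `factor` are hypothetical.

```python
import math
from statistics import median

def nearest_neighbour_distances(points):
    """Distance from each point to its closest other point (brute force, O(n^2))."""
    return [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]

def scale_relative_epsilon(points, factor=3.0):
    """Hypothetical heuristic: epsilon as a multiple of the median nearest-neighbour distance."""
    return factor * median(nearest_neighbour_distances(points))

# The same layout at two scales: two tight groups that are far apart.
small = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
large = [(x * 100, y * 100) for x, y in small]

eps_small = scale_relative_epsilon(small)
eps_large = scale_relative_epsilon(large)
print(eps_small, eps_large)
```

Because the heuristic is derived from the data, epsilon grows by the same factor as the coordinates, so the two layouts would be treated alike. Whether a factor of 3 suits real data is an open assumption.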

Architecture for database analytics

- by David Cournapeau

Hi, We have an architecture where we provide each customer with Business Intelligence-like services for their website (internet merchant). Now I need to analyze that data internally (for algorithmic improvement, performance tracking, etc.), and the volumes are potentially quite heavy: we have up to millions of rows per customer per day, and I may want to know how many queries we had in the last month, compared week by week, etc. That is on the order of billions of entries, if not more. The way it is currently done is quite standard: daily scripts scan the databases and generate big CSV files. I don't like this solution for several reasons:

- as is typical with those kinds of scripts, they fall into the write-once, never-touched-again category
- tracking things in "real time" is necessary (we have a separate toolset to query the last few hours at the moment)
- this is slow and non-"agile"

Although I have some experience in dealing with huge datasets for scientific usage, I am a complete beginner as far as traditional RDBMSs go. It seems that using a column-oriented database for analytics could be a solution (the analytics don't need most of the data we have in the app database), but I would like to know what other options are available for this kind of issue.

data structure for counting frequencies in a database table-like format

- by user373312

I was wondering if there is a data structure optimized for counting frequencies against data that is stored in a database table-like format. For example, the data comes in the (comma) delimited format below.

    col1, col2, col3
    x, a, green
    x, b, blue
    ...
    y, c, green

Now I simply want to count the frequency of col1=x, or of col1=x and col3=green. I have been storing the data in a database table, but in my profiling and from empirical observation, the database connection is the bottleneck. I have tried in-memory database solutions too, and that works quite well; the only problems are the memory requirements and the quirky init/destroy calls. Also, I work mainly with Java but have experience with .NET, and was wondering if there is any API for working with "tabular" data in a LINQ-like way from Java. Any help is appreciated.
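As an illustration of the kind of in-memory counting described here, the sketch below uses a hash-based counter keyed by a column value or by a tuple of column values. It is written in Python rather than Java purely for brevity, uses only the three sample rows that are actually shown (the elided rows are left out), and pairs col1 with col3 because that is the column in which "green" appears in the sample.

```python
import csv
import io
from collections import Counter

# Only the rows visible in the sample; the elided "..." rows are not invented.
data = """col1,col2,col3
x,a,green
x,b,blue
y,c,green
"""

rows = list(csv.DictReader(io.StringIO(data)))

# One counter per question: a single column, and a pair of columns.
by_col1 = Counter(row["col1"] for row in rows)
by_col1_col3 = Counter((row["col1"], row["col3"]) for row in rows)

print(by_col1["x"])                  # frequency of col1 = x
print(by_col1_col3[("x", "green")])  # frequency of col1 = x and col3 = green
```

Each counter costs one pass over the data and one dictionary entry per distinct key, so memory grows with the number of distinct combinations rather than with the number of rows.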

Analyze big human database

- by Neir0

Let's say we have a big database of people. Each person has many parameters: age, weight, favorite music, favorite films, education, etc. I want to know how one feature is associated with the other features. For example, if a person has a good education, what does that mean for their musical preferences? Or how do film preferences change with age? I know about association rule algorithms like Apriori, but I don't just want to find association rules; I want to know how one specific feature affects the others. Which keywords should I search for on Google?

Apriori Algorithm- what to do with small min.support?

- by user3707650

I have a question about the table beneath my question. If I am told that the given min.support = 10%, how do I work out the support count to use during the exercise? What I know is that you take the number of transactions (8) and multiply it by the min.support: 8 * (10/100) = 0.8. The problem is that I get this number, 0.8. How can I use it as a support count in this example? It looks like a number that will make me prune every combination set that I build. Please help me!

    TID  A  B  C  D  E  F  G
    10   1  0  1  0  0  0  1
    20   1  1  1  1  0  1  1
    30   0  0  0  0  0  0  1
    40   0  0  1  0  0  1  1
    50   0  0  0  1  1  0  0
    60   0  1  1  0  1  1  0
    70   0  0  0  0  1  1  0
    80   0  0  1  0  1  1  1
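The arithmetic in this question can be checked with a short sketch. It assumes the usual reading of minimum support, namely that an itemset is frequent when its count divided by the number of transactions is at least min.support. Since counts are whole numbers, a threshold of 0.8 is equivalent to a count of at least 1. The variable names are illustrative only.

```python
import math

items = "ABCDEFG"
# One string of 0/1 flags per transaction, copied from the table in the question.
transactions = {
    10: "1010001",
    20: "1111011",
    30: "0000001",
    40: "0010011",
    50: "0001100",
    60: "0110110",
    70: "0000110",
    80: "0010111",
}

min_support = 0.10
# Fractional threshold 8 * 0.10 = 0.8, rounded up to the next whole count.
min_count = math.ceil(min_support * len(transactions))

# Number of transactions containing each single item.
support_count = {
    item: sum(row[i] == "1" for row in transactions.values())
    for i, item in enumerate(items)
}
frequent = [item for item in items if support_count[item] >= min_count]

print(min_count)
print(support_count)
print(frequent)
```

Under that assumption nothing is pruned at the single-item level: every item occurs in at least one transaction, so a threshold this low keeps every itemset that occurs at all.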

What machine learning algorithms can be used in this scenario?

- by ExceptionHandler

My data consists of objects as follows: Obj1 - Color - shape - size - price - ranking. I want to be able to predict which combination of color/shape/size/price is a good combination for getting a high ranking. Even a partial combination could work, e.g. to get a good ranking, the algorithm predicts the best performance for this color and this shape. Something like that. What are the advisable algorithms for such a prediction? Also, if you could briefly explain how I can approach the model building, I would really appreciate it. Say, for example, my data looks like:

    Blue  pentagon  small  $50.00  #5
    Red   Square    large  $30.00  #3

So what is a useful prediction model that I should look at? What algorithm should I try in order to predict, say, that the highest weight is for price, followed by color and then size? What if I wanted to predict in combinations, e.g. that a red small shape is less likely to rank higher than a pink small shape? (In essence, I am trying to combine more than one nominal-valued column to make the prediction.)

Plain Text email support: Is it still needed in 2011?

- by murdoch

For many years I have been building emails, sent out by my webapps, that are multipart with a text part and an HTML part, so that users of plain-text-only email clients can fall back to the text version. However, I have recently been developing a rather complex email that doesn't translate so well to text. So, in 2011, is there really any need to provide a textual alternative? How many people out there are actually still only able to see plain text emails?

Windows Vista language text service problem

- by Azho KG

Hi all, I'm using the English version of Vista and having problems with programs that display Russian characters anywhere. For example, dictionaries don't work for me, since they display Russian characters. I also see just "magic" characters in a text editor (Notepad) when I open a Russian text file. I tried changing the whole Vista interface language to Russian, but that still didn't solve the problem. I CAN read any web page in a browser; that's not a problem. Adding "Russian" in "Text Services and Input Languages" doesn't solve it either. Does anyone know how to solve this? Thanks. My system: 32-bit Windows Vista Home Premium, SP2

Blackberry message text is all white on white cannot read - Outlook 2007

- by johnny

Hi, I have a user who has Outlook 2007. When a certain person sends her an email from their BlackBerry, the text is all white. If you copy all the text, paste it into Word, and change the font color, you can see the email. The recipient is the only person who has trouble with email from this person's BlackBerry; everyone else can see the bb messages fine. Any ideas on what to check? I made sure the theme was set to none and that all selected fonts were installed (I changed it to Arial). All other emails sent to the recipient, from everyone else, are fine. The user who sends from the BlackBerry also has a PC, and when he sends emails from that machine they look fine for the troubled recipient. It is only when sending from the bb to that certain person that we get the white-on-white "invisible" text. Thank you for any help.

Excel - export sheet to fixed-width text file?

- by jkohlhepp

I know that Excel has functionality to import fixed-width text files: it presents a dialog that lets you choose where fields begin and end, and it puts them into columns. Does it also have functionality where, given an existing spreadsheet, you can export to a fixed-width text file? If so, how do I access it? I have tried using Save As and choosing Text File, but it seems to save only as tab-delimited, which doesn't help me. This is Excel 2003, if it matters. Thanks, ~ Justin

layout analysis of text based pdf without ocr

- by fastrack

Before recognizing a PDF, OCR software does document layout analysis to determine which parts are text, tables or images, as shown in the picture below. ![papercrop](http://cache.gawkerassets.com/assets/images/17/2011/07/papercrop.jpg) I want to use some parts of the text while leaving out the others, so software that marks those zones comes in handy. Papercrop does a decent job, but it has a bug of not showing some of the text in the PDF file. OCR software can also do layout analysis, marking out "zones" which I can add or delete, but you have to OCR to do that. Since my PDFs are already text based, I don't want to waste so much time OCRing. So my question is: is there any software that automatically marks out those zones and lets me manipulate them manually, without having to OCR? Thanks! Waiting for your help.

Excel 2010 filter arrow not showing text values

- by DVP

I have an odd problem with a tracker spreadsheet I use. All the columns have a filter, but when you click on the filter arrow it doesn't show a breakdown of all the text values for that column. All it shows is the usual 'sort A to Z/Z to A'; the bottom half of the pop-up, where you normally get a list of text values that you can filter by ticking each one, is blank. It only displays (Select All), which you can tick, but that's pointless, as the column then has all text values selected and hasn't been filtered any further, which is what I need to do.

Gitweb showing opposite colors for added and removed text

- by Maddy

Hi, I have installed gitweb on our servers, and it has started showing the branches and the commit diffs. But the syntax highlighting is reversed for added and removed text: added text should be green and removed text should be red, but I am seeing the opposite. I can hack gitweb.css to get my job done, but I would like to know why this is happening and what the proper fix might be. (If anyone knows good themes for gitweb, please mention them.)

Wrapping text in an opened file in vim

- by TK

I want to soft wrap text in Vim at 90 columns per line. I want a soft wrap so that it doesn't affect the actual text by adding line break characters. Here is what I tried:

    // Opened a file with lots of text and ran the following:
    set wrap
    set tw=90
    set linebreak

Running the commands doesn't change anything about the view at all. It soft wraps at the end of the window. I have used "Soft Wrap" in TextMate via Command-Option-W to get this effect, and want to know how to get it to work in Vim.

Reverse bash console text flow

- by radman

Hi, This is a bit of a weird question and I'm not sure that there is an easy answer to it, but I am very interested in finding a solution. When I work on a Linux machine via a console, I find that I am constantly staring at the bottom of the screen, because once you have executed a bunch of commands the text fills toward the bottom. I find that this is decidedly not good for my neck, and it would be far better if, instead of scrolling to the bottom, the text scrolled to the top. So does anyone out there know if there is a way to reverse the direction in which text appears in a console? (Note that I am aware of the clear command.)

Example, default behaviour:

    user@machine:~$ command 1
    user@machine:~$ command 2
    user@machine:~$ command 3
    user@machine:~$ __active_prompt__

Desired behaviour:

    user@machine:~$ __active_prompt__
    user@machine:~$ command 3
    user@machine:~$ command 2
    user@machine:~$ command 1

Running Kubuntu 10.04 using Konsole. I realise this is an odd question; thanks for any help.

Overwrite text in Windows Notepad

- by Mark Miller

I would like to be able to overwrite text in Windows Notepad. I am using Windows 7 Professional. Ideally I would like to be able to position the cursor next to a string of text and erase that text by pressing the spacebar until the cursor has passed over every character in the string without adding additional spaces to the document. Is that possible? I have tried pressing the 'Insert' key, but that does not help. Nor does using 'Num Lock' and pressing the 'Ins' or '0' key. Unfortunately, I have not been able to find a solution elsewhere on the internet. I do not think I am using notepad++. The application is listed as 'notepad.exe' under 'Properties'. Thank you for any suggestions.

Text comparison utility

- by Aaron

I know this has been asked before, but I have a spin on it, as I have been trying out various free software offerings. I want to rid our department of DiffDoc; the problem is that I am having trouble locating something that will do what we need. WinMerge has been the latest attempt. The problem is simple: one Word doc, and one PDF with a portion of it containing the text to be compared against. Compare them and be done. Raw text, ignore whitespace, ignore carriage returns, etc. Just compare the text and give me the results in some sort of report. NOTE: I have tried ExamDiff, kdiff3, Tortoise, and a few others.

Excel - Avoid cell text to be shown onto next empty cell

- by e-mre

When the text in an Excel cell is too long to be shown in the visible area of that cell, and the cell next to it (the one on the right) is empty, Excel lets the text spill over into the next cell. This is what I want to change: I want to avoid this text overflow. I know I can avoid it by enabling "word wrap" and adjusting the row height, but that is not what I want. I want to change the DEFAULT behavior of Excel so that it shows the value of each cell only in the visible area of that cell. No overflow, no word wrap. Is this possible? (I am using Excel 2010, by the way.)

Emails intended as HTML are received as plain text

- by Jeremy

I'm regularly receiving emails from a well-known public website that display as plain text without line breaks or working hyperlinks. My email client is Thunderbird. The Thunderbird help site doesn't offer an answer, and I'm reluctant to complain to the website if the problem is at my end.

The message source for the headers includes this:

    Content-Type: multipart/alternative; boundary=--boundary_9338_03b8c925-816e-4b55-95c4-b2593da7e5f6

The content in the message source that follows the header is preceded by this:

    ----boundary_9338_03b8c925-816e-4b55-95c4-b2593da7e5f6
    Content-Type: text/html; charset=utf-8
    Content-Transfer-Encoding: base64

The content itself in the message source typically reads like this:

    PCFkb2N0eXBlIGh0bWwgcHVibGljICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9u
    YWwvL0VOIj4NCg0KDQo8aHRtbD4NCjxoZWFkPg0KPG1ldGEgaHR0cC1lcXVpdj0iQ29udGVu
    etc., etc.

And, as I've said, the message in the viewing pane is unadulterated plain text. Can you tell me where it is all going wrong? Thanks.
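To see what the encoded body actually contains, the two quoted base64 lines can be decoded directly. The sketch below uses only those two lines from the question; the rest of the message body is elided there and is not reconstructed here.

```python
import base64

# The two base64 lines quoted from the message source.
encoded_lines = [
    "PCFkb2N0eXBlIGh0bWwgcHVibGljICItLy9XM0MvL0RURCBIVE1MIDQuMCBUcmFuc2l0aW9u",
    "YWwvL0VOIj4NCg0KDQo8aHRtbD4NCjxoZWFkPg0KPG1ldGEgaHR0cC1lcXVpdj0iQ29udGVu",
]

# Base64 ignores the line wrapping, so the lines are joined before decoding.
decoded = base64.b64decode("".join(encoded_lines)).decode("utf-8")
print(decoded)
```

The decoded fragment begins with an HTML 4.0 Transitional doctype followed by `<html>` and `<head>`, which suggests the part really is HTML as its Content-Type says. That narrows the question to how the message is structured or rendered rather than to the body itself.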

Braces (syntax) highlighting in OpenOffice Math formula text editor

- by Oleksandr Bolotov

When you use OpenOffice Math, you see the formula in the upper part and the formula text editor in the lower part, with content almost like this:

    %sigma = 2 %mu %epsilon + %lambda Tr(%epsilon)I

So my questions are:

- How can I replace OpenOffice Math's formula text editor with my own text editor?
- ... or how can I enable brace (syntax) highlighting in the embedded editor?
- ... or are there any extensions for anything like this?

I need this because sometimes there are too many braces and it's hard to tell which braces match each other. Please do not suggest that I use MathType, Mathematica (or anything else) instead of OpenOffice Math (because I'm almost happy with it :)

split large text file without missing data record

- by Santosh

I have a 140 MB text file that contains detailed information about the books in a library. Each book's details follow a standard format in the text file. I need to parse it and insert the data into a database. Parsing the text itself is not the issue; my problem is handling a file this large. So I decided to split the file into smaller files of around 2 MB each, but I can't manually split such a large file into so many pieces. I found the HJSplit tool, which splits the file, but it didn't help either: half of one book's details ends up in one file and the rest in the next. If I split this way, information will be lost. How can I split the large file so that I don't lose any information? Is there any tool that can help me with this?
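A record-aware split can be sketched as follows. The question does not show the record format, so the sketch assumes each book starts on a line that can be recognised by some test. The `BOOK` prefix used in the example is purely hypothetical, as are the function and parameter names.

```python
def chunk_records(lines, max_chars, is_record_start):
    """Group lines into chunks, cutting only on a line that starts a new record.

    A chunk is closed once it has reached max_chars, so it may overshoot
    the limit by up to one record, but no record is ever split in two.
    """
    chunk, size = [], 0
    for line in lines:
        if chunk and size >= max_chars and is_record_start(line):
            yield chunk
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield chunk

# Tiny stand-in for the real file; "BOOK" is a made-up record marker.
lines = [
    "BOOK 1\n", "title: A\n",
    "BOOK 2\n", "title: B\n",
    "BOOK 3\n", "title: C\n",
]
chunks = list(
    chunk_records(lines, max_chars=10, is_record_start=lambda l: l.startswith("BOOK"))
)
print(len(chunks))
```

For a real file, the open file object can be passed as `lines` (it is read one line at a time, so the whole 140 MB never sits in memory) and each yielded chunk written to its own numbered output file, with `max_chars` set to roughly two million.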

Is there an FTP client for Windows 7 with a text-based UI? (not text-prompt based, like ftp.exe)

- by Alan B

Is there such a thing as a text-mode FTP client for Windows 7? By 'text-mode' I mean one that runs in a CMD.EXE window, as opposed to a Windows GUI application. It also needs to be something along the lines of FileZilla, i.e. menu-driven, as opposed to command-entry clients like NcFTP or indeed the built-in one. Edit: to avoid confusion, what I mean is an application similar to the one pictured (ZTreeWin File Manager), which runs from CMD.EXE and uses text characters for its UI within the CMD.EXE window. The built-in FTP client, and things like NcFTP, offer a prompt at which you issue commands. That's not what I'm looking for.

Vim - select text highlighted by search?

- by GorillaSandwich

In vim, I often perform searches to hop to a word or phrase instead of navigating there with h/j/k/l. Then I hit n to hop between occurrences. Say I've got this text: Time flies like an arrow; fruit flies like a banana. - Groucho Marx I type /an arrow and hit enter. That phrase is highlighted, and I jump to it. Now I want to visually select that text, maybe to change it or delete it. (Yes, I'm aware of the :s substitution command.) Since my cursor is at the letter "a" at the beginning of "an arrow," I can hit v, then press e a couple of times to highlight the entire phrase. But I have a feeling there's a shorter and more semantic way. After all, I've already specified the text I'm interested in. How might I compose a command to say "visually select the current search selection?"

How to include a non-breaking hyphen in hyperlink text in Word 2010

- by dunxd

I want to include a URL in a Word document, both as text people can read and as a link they can click. The URL has a hyphen in it, and I don't want the URL to get broken across lines. When I use a regular hyphen, the link works, but the displayed text gets broken. When I use a non-breaking hyphen (Ctrl+Shift+-), Word removes the hyphen from the link. When I try to edit the hyperlink manually, I can't add a non-breaking hyphen to the Text to display field using Ctrl+Shift+-. If I were writing this in HTML I could just do: <a href="http://www.my-link.com/">www.my‑link.com</a> How do I get Word to do the equivalent?

How to auto detect text file encoding?

- by ???

I have many plain text files that were encoded in various charsets. I want to convert them all to UTF-8, but before running iconv I need to know each file's original encoding. Most browsers have an Auto Detect option for encodings, but I can't check the text files one by one because there are too many. Only once I know the original encoding can I convert the text with iconv -f DETECTED_CHARSET -t utf-8. Is there any utility to detect the encoding of plain text files? It doesn't have to be 100% correct, but it should recognize most of them.
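A crude form of detection can be sketched by trying candidate encodings in order and keeping the first one that decodes without error. This is a trial-decoding heuristic, not a statistical detector, and the candidate list below is an assumption that would need to match the charsets actually expected in the files. The function name is hypothetical.

```python
def guess_encoding(data, candidates=("utf-8", "shift_jis", "gb18030", "cp1252")):
    """Return the first candidate encoding that decodes data cleanly, else None.

    Order matters: strict encodings such as UTF-8 go first, permissive
    single-byte encodings last, because the latter accept almost any bytes.
    """
    for name in candidates:
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None

utf8_guess = guess_encoding("héllo".encode("utf-8"))
sjis_guess = guess_encoding("日本語".encode("shift_jis"))
print(utf8_guess, sjis_guess)
```

The result could then be passed to iconv as the `-f` argument. A clean decode only shows that the bytes are valid in that encoding, not that the text is meaningful, so short files or files in closely related charsets can still be misidentified.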

