Search Results

Search found 60391 results on 2416 pages for 'data generation'.

Page 166/2416 | < Previous Page | 162 163 164 165 166 167 168 169 170 171 172 173  | Next Page >

  • How can I perform sentiment analysis on extracted text from online sources?

    - by aniket69
    I'm working on extracting the sentiment from YouTube comments, blogs, news content, Facebook wall posts, and Twitter feeds. I'm looking for an automated way to do this: the two third-party solutions I've found have been AlchemyAPI and RapidMiner. Are these the best way to approach this project, or should I be using something else? Is there a more efficient way to approach sentiment analysis? What techniques have worked for you in a project like this?

    Read the article

  • RAID setup for maximizing data retention and read speed

    - by cat pants
    My goals are simple: maximize data retention safety, and maximize read speeds. My first instinct is to do a a three drive software RAID 1. I have only used fakeraid RAID 1 in the past and it was terrible (would have led to data loss actually if it weren't for backups) Would you say software raid 1 or a cheap actual hardware raid card? OS will be linux. Could I start with a two drive raid 1 and add a third drive on the fly? Can I hot swap? Can I pull one of the drives and throw it into a new machine and be able to read all the data? I do not want a situation where I have a raid card fail and have to try and find the same chipset in order to read my data (which i am assuming can happen) Please clarify any points on which it sounds like I have no idea what I am talking about, as I am admittedly inexperienced here. (My hardest lesson was fakeraid lol) Thanks!

    Read the article

  • Should I always encapsulate an internal data structure entirely?

    - by Prog
    Please consider this class: class ClassA{ private Thing[] things; // stores data // stuff omitted public Thing[] getThings(){ return things; } } This class exposes the array it uses to store data, to any client code interested. I did this in an app I'm working on. I had a ChordProgression class that stores a sequence of Chords (and does some other things). It had a Chord[] getChords() method that returned the array of chords. When the data structure had to change (from an array to an ArrayList), all client code broke. This made me think - maybe the following approach is better: class ClassA{ private Thing[] things; // stores data // stuff omitted public Thing[] getThing(int index){ return things[index]; } public int getDataSize(){ return things.length; } public void setThing(int index, Thing thing){ things[index] = thing; } } Instead of exposing the data structure itself, all of the operations offered by the data structure are now offered directly by the class enclosing it, using public methods that delegate to the data structure. When the data structure changes, only these methods have to change - but after they do, all client code still works. Note that collections more complex than arrays might require the enclosing class to implement even more than three methods just to access the internal data structure. Is this approach common? What do you think of this? What downsides does it have other? Is it reasonable to have the enclosing class implement at least three public methods just to delegate to the inner data structure?

    Read the article

  • Summarising and Bubbling of KPI data

    - by simonsabin
    Something I’m very conscious of when delivering a  BI solution is being able to show the facts in a concise way but also not to hide whats going on. I was reminded of this when I looked at the weather today. Everywhere they are reporting weather warnings for the south east and so I though I’d check on the BBC website http://news.bbc.co.uk/weather/forecast/4281?area=AL5 Looking at that I thought we are going to miss the worst of it, just like a few weeks ago. However from previous experience...(read more)

    Read the article

  • Storing translation data as JSON column

    - by j0ntech
    We're deciding on how to store translations of some descriptions of database items. We could go the traditional way and keep a translations table (and a language table and an object_translation linking table) OR we thought it might be better to just have a Description column that contains JSON like the following: { "EN": "This is the translation in English", "EE" : "See on kirjeldus eesti keeles" } Are there any serious downsides as to why we shouldn't use this? (I haven't seen it being used anywhere else)

    Read the article

  • Unit testing a text index

    - by jplot
    Consider a text index such as a suffix tree or a suffix array supporting Count queries (number of occurrences of a pattern) and Locate queries (the positions of all the occurrences of a pattern) over a given text. How would you go about unit testing such a class ? What I have in mind is to generate a big random string then extract a random substring from this big string and compare the results of both queries with naive implementations (such as string::find). Another idea I have is to find the most frequent substring of length l appearing in the original string (using perhaps a naive method) and use these substrings for testing the index. This isn't the best way, so what would be a good design of the unit tests for a text index ? In case it matters, this is in C++ using google test.

    Read the article

  • Replicating A Volume Of Large Data via Transactional Replication

    During weekend maintenance, members of the support team executed an UPDATE statement against the database on the OLTP Server. This database was a part of Transactional Replication, and once the UPDATE statement was executed the Replication procedure came to a halt with an error message. Satnam Singh decided to work on this case and try to find an efficient solution to rebuild the procedure without significant downtime.

    Read the article

  • How to input data into user defined variables into MySql query

    - by user292791
    Simple Shell script echo "Enter 1 for month of March" echo "Enter 2 for month of April" echo "Enter 3 for month of May" read Month case "$Month" in 1) echo "enter establishment name" read a; mysql -u root -p $a < "March.sql";; 2) echo "enter establishment name" read b; mysql -u root -p $b < "April.sql";; 3) echo "enter establishment name" read c; mysql -u root -p $c < "May.sql";; esac done In this i have three other query files March.sql, April.sql, May.sql. i'm linking this in shell script . Example of .sql file: SELECT DISTINCT substr( a.case_no, 3, 2 ), b.case_type, b.type_name, a.case_no into outfile '/tmp/April.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\r\n' FROM Civil_t AS a, Case_type_t AS b, disposal_proc AS c WHERE substr( a.case_no, 3, 2 ) = b.case_type AND a.date_of_decision BETWEEN '2014-04-01' AND '2014-04-30' AND a.case_no = c.case_no AND a.court_no =1; I have to alter the .sql script every time. Is there any method to read the variables from shell script and use it in mysql. For example:- echo "enter date" read a #input date Now i have read a "date" and i want to use it in March.sql query in where clause. Is there is any method of using this variable in .sql query.

    Read the article

  • Archiving your contact form data.

    I get TONS of email from customer. Over time, this email helps me to determine what areas in our product collection are opportunities for enhancement or improvement. I store the email that comes from my blog contact form in folders and then search through them looking for trends periodically. It occurred to me that, while I need to get the emails because many of them are actionable, it would be great if I could use reporting and analysis tools against the collection. So I whipped together...Did you know that DotNetSlackers also publishes .net articles written by top known .net Authors? We already have over 80 articles in several categories including Silverlight. Take a look: here.

    Read the article

  • Most efficient Implementation a Tree in C++

    - by Topo
    I need to write a tree where each element may have any number of child elements, and because of this each branch of the tree may have any length. The tree is only going to receive elements at first and then it is going to use exclusively for iterating though it's branches in no specific order. The tree will have several million elements and must be fast but also memory efficient. My plan makes a node class to store the elements and the pointers to its children. When the tree is fully constructed, it would be transformed it to an array or something faster and if possible, loaded to the processor's cache. Construction and the search on the tree are two different problems. Can I focus on how to solve each problem on the best way individually? The construction of has to be as fast as possible but it can use memory as it pleases. Then the transformation into a format that give us speed when iterating the tree's branches. This should preferably be an array to avoid going back and forth from RAM to cache in each element of the tree. So the real question is which is the structure to implement a tree to maximize insert speed, how can I transform it to a structure that gives me the best speed and memory?

    Read the article

  • Can a table be both Fact and Dimension

    - by PatFromCanada
    Ok, I am a newbie and don't really think "dimensionally" yet, I have most of my initial schema roughed out but I keep flipping back and forth on one table. I have a Contract table and it has a quantity column (tonnes), and a net price column, which need to be summed up a bunch of different ways, and the contract has lots of foreign keys (producer, commodity, futures month etc.) and dates so it appears to be a fact table. Also the contract is never updated, if that makes a difference. However, we create cash tickets which we use to pay out part or all of the contract and they have a contract ID on them so then the contract looks like a dimension in the cash ticket's star schema. Is this a problem? Any ideas on the process to resolve this, because people don't seem to like the idea of joining two fact tables. Should I put producerId and commodityId on the cash ticket? It would seem really weird not to have a contractID on it.

    Read the article

  • Are "skip deltas" unique to svn?

    - by echinodermata
    The good folks who created the SVN version control system use a structure they refer to as "skip deltas" to store the revision history of files internally. A revision is stored as a delta against an earlier revision. However, revision N is not necessarily stored as a delta against revision N-1, like this: 0 <- 1 <- 2 <- 3 <- 4 <- 5 <- 6 <- 7 <- 8 <- 9 Instead, revision N is stored as a delta against N-f(N), where f(N) is the greatest power of two that divides N: 0 <- 1 2 <- 3 4 <- 5 6 <- 7 0 <------ 2 4 <------ 6 0 <---------------- 4 0 <------------------------------------ 8 <- 9 (Superficially it looks like a skip list but really it's not that similar - for instance, skip deltas are not interested in supporting insertion in the middle of the list.) You can read more about it here. My question is: Do other systems use skip deltas? Were skip deltas known/used/published before SVN, or did the creators of SVN invent it themselves?

    Read the article

  • My data vanished after copying to E drive

    - by pnp
    Of late I had been thinking of having a fresh install of Ubuntu. So I cut-pasted all my required files and folders in my E drive. Then I decided to not to have a fresh install and just let it be. Later, when I booted up in Windows (dual-boot with 12.04 and Windows 7), I found that the files and folders I had cut-pasted from my home account in Ubuntu are just not there. What is even more surprising is that now, when I am back on Ubuntu, those files and folders that should have been there in my E drive are also not there. Is it an Ubuntu issue or a hard drive issue?

    Read the article

  • Content light website and Google - Tell google it's a listings site (as opposed shop, reviews or restaurants)

    - by Doug Firr
    I have a listings style website. Due to the nature of this (listings) the site is content light. Each page is typically less that 50 words but there are many pages. The site in question has had a ton of media coverage and so has some great inbound links from places like Wired, Fast Company, Canada Broadcasting Corporation and many many other bloggers, media websites and recycle related niche authors (It's a recycling site). But Google really ignores it. Traffic from search is very very low - less than 5% of all traffic. I know that using markup you can tell Google whether your site is a restaurant, article, review, shop, local business and a few other categories (https://www.google.com/webmasters/markup-helper/u/0/). Is there a way to tell Google that my site is a listings site? I suspect, but do not know for sure, that part of the problem is that Google simply does not know what my site is? It's a crowdmap where people post curbalerts. The information is useful to people but it is presented in a short, concise way - a pin on a map, a picture and a short description. Adding anything further is not necessary for the site's intended purpose. 1st question - how best to tell the search engines what y site is - listings and not some spammy website? Any recommendations in improving our site's Search presence? You can take a look here if interested: http://tinyurl.com/lxg4hn7

    Read the article

  • Most efficient way to store this collection of moduli and remainders?

    - by Bryan
    I have a huge collection of different moduli and associated with each modulus a fairly large list of remainders. I want to store these values so that I can efficiently determine whether an integer is equivalent to any one of the remainders with respect to any of the moduli (it doesn't matter which, I just want a true/false return). I thought about storing these values as a linked-list of balanced binary trees, but I was wondering if there is a better way? EDIT Perhaps a little more detail would be helpful. As for the size of this structure, it will be holding about 10s of thousands of (prime-1) moduli and associated to each modulus will be a variable amount of remainders. Most moduli will only have one or two remainders associated to it, but a very rare few will have a couple hundred associated to it. This is part of a larger program which handles numbers with a couple thousand (decimal) digits. This program will benefit more from this table being as large as possible and being able to be searched quickly. Here's a small part of the dataset where the moduli are in parentheses and the remainders are comma separated: (46) k = 20 (58) k = 15, 44 (70) k = 57 (102) k = 36, 87 (106) k = 66 (156) k = 20, 59, 98, 137 (190) k = 11, 30, 68, 87, 125, 144, 182 (430) k = 234 (520) k = 152, 282 (576) k = 2, 11, 20, 29, 38, 47, 56, 65, 74, ...(add 9 each time), 569 I had said that the moduli were prime, but I was wrong they are each one below a prime.

    Read the article

  • Archiving your contact form data.

    - by Latest Microsoft Blogs
    I get TONS of email from customer. Over time, this email helps me to determine what areas in our product collection are opportunities for enhancement or improvement. I store the email that comes from my blog contact form in folders and then search through Read More......(read more)

    Read the article

< Previous Page | 162 163 164 165 166 167 168 169 170 171 172 173  | Next Page >