Search Results

Search found 89214 results on 3569 pages for 'code statistics'.

Page 75/3569 | < Previous Page | 71 72 73 74 75 76 77 78 79 80 81 82 | Next Page >

What is a good way to sync code files between computer?

- by erotsppa

I have a need to code from two computers during the week. What is the best way to sync the two computers (mac)? I've started using source control, like SVN. It works pretty good, except sometimes I check in code that I want to sync but they don't compile and it interferes with other people on the team working on the same project. I don't want to use branch. It wouldn't make sense to branch every night when I head home from office.

Read the article
R selecting duplicate rows

- by Matt

Okay, I'm fairly new to R and I've tried to search the documentation for what I need to do but here is the problem. I have a data.frame called heeds.data in the following form (some columns omitted for simplicity) eval.num, eval.count, ... fitness, fitness.mean, green.h.0, green.v.0, offset.0, green.h.1, green.v.1,...green.h.7, green.v.7, offset.7... And I have selected a row meeting the following criteria: best.fitness <- min(heeds.data$fitness.mean[heeds.data$eval.count = 10]) best.row <- heeds.data[heeds.data$fitness.mean == best.fitness] Now, what I want are all of the other rows with that have columns green.h.0 to offset.7 (a contiguous section of columns) equal to the best.row Basically I'm looking for rows that have some of the conditions the same as the "best" row. I thought I could just do this, heeds.best <- heeds.data$fitness[ heeds.data$green.h.0 == best.row$green.h.0 & ... ] But with 24 columns it seems like a stupid method. Looking for something a bit simpler with less manual typing. Thanks!

Read the article
What to use to create bar, line and pie charts with javascript compatible with all major browsers?

- by marcgg

I used to work with flot but it doesn't support pie charts so I'm forced to change. I just saw JS Charts, but their documentation is very obscure regarding cross browser compatibility (I need it to be IE6+ compliant :). Also this will be for commercial use, so I'd rather have something that I can use free of charge jQuery Google chart looks really nice and is well integrated with rails (the framework I'm using) but I'm not sure how good it is. So what do you guys use? What would you recommend keeping in mind that: It will be for commercial use (I can deal with a license, but I'd rather avoid that) It needs to be javascript (no svg, no flash please) It needs to be compatible with IE6+, FF, Chrome, Opera and Safari It needs to be pretty ^^ If it uses jQuery it's even better

Read the article
Screening (multi)collinearity in a regression model

- by aL3xa

I hope that this one is not going to be "ask-and-answer" question... here goes: (multi)collinearity refers to extremely high correlations between predictors in the regression model. How to cure them... well, sometimes you don't need to "cure" collinearity, since it doesn't affect regression model itself, but interpretation of an effect of individual predictors. One way to spot collinearity is to put each predictor as a dependent variable, and other predictors as independent variables, determine R2, and if it's larger than .9 (or .95), we can consider predictor redundant. This is one "method"... what about other approaches? Some of them are time consuming, like excluding predictors from model and watching for b-coefficient changes - they should be noticeably different. Of course, we must always bare in mind specific context/goal of analysis... Sometimes, only remedy is to repeat a research, but right now, I'm interested in various ways of screening redundant predictors when (multi)collinearity occurs in a regression model.

Read the article
Tool used to retrieve code metrics in xUnit Test Patterns?

- by leeand00

I'm reading xUnit Test Patterns by Gerard Meszaros. On one of the pages he refers to some software metrics: While the need to wrap lines to keep them at 65 characters makes this code look even longer than it really is, it is still unnecessarily long. It contains 25 executable statements including initialized declarations, 6 lines of control statements, 4 in-line comments, and 2 lines to declare the test method—giving a total of 37 lines of unwrapped source code. Short of counting the statements to find these metrics, does anybody have any idea if he used a particular tool to calculate the metrics? (If you have any suggestions for tools that will count similar metrics, I'm looking for one that works on Java, Javascript and C++) Thanks!

Read the article
Naive Bayesian classification (spam filtering) - Doubt in one calculation? Which one is right? Plz c

- by Microkernel

Hi guys, I am implementing Naive Bayesian classifier for spam filtering. I have doubt on some calculation. Please clarify me what to do. Here is my question. In this method, you have to calculate P(S|W) - Probability that Message is spam given word W occurs in it. P(W|S) - Probability that word W occurs in a spam message. P(W|H) - Probability that word W occurs in a Ham message. So to calculate P(W|S), should I do (1) (Number of times W occuring in spam)/(total number of times W occurs in all the messages) OR (2) (Number of times word W occurs in Spam)/(Total number of words in the spam message) So, to calculate P(W|S), should I do (1) or (2)? (I thought it to be (2), but I am not sure, so plz clarify me) I am refering http://en.wikipedia.org/wiki/Bayesian_spam_filtering for the info by the way. I got to complete the implementation by this weekend :( Thanks and regards, MicroKernel :) @sth: Hmm... Shouldn't repeated occurrence of word 'W' increase a message's spam score? In the your approach it wouldn't, right?. Lets take a scenario and discuss... Lets say, we have 100 training messages, out of which 50 are spam and 50 are Ham. and say word_count of each message = 100. And lets say, in spam messages word W occurs 5 times in each message and word W occurs 1 time in Ham message. So total number of times W occuring in all the spam message = 5*50 = 250 times. And total number of times W occuring in all Ham messages = 1*50 = 50 times. Total occurance of W in all of the training messages = (250+50) = 300 times. So, in this scenario, how do u calculate P(W|S) and P(W|H) ? Naturally we should expect, P(W|S) P(W|H)??? right. Please share your thought...

Read the article
Screening (multi)collinearity in a reggresion model

- by aL3xa

I hope that this one is not going to be "ask-and-answer" question... here goes: (multi)collinearity refers to extremely high correlations between predictors in the regression model. How to cure them... well, sometimes you don't need to "cure" collinearity, since it doesn't affect regression model itself, but interpretation of an effect of individual predictors. One way to spot collinearity is to put each predictor as a dependent variable, and other predictors as independent variables, determine R2, and if it's larger than .9 (or .95), we can consider predictor redundant. This is one "method"... what about other approaches? Some of them are time consuming, like excluding predictors from model and watching for b-coefficient changes - they should be noticeably different. Of course, we must always bare in mind specific context/goal of analysis... Sometimes, only remedy is to repeat a research, but right now, I'm interested in various ways of screening redundant predictors when (multi)collinearity occurs in a regression model.

Read the article
how to develop a program to minimize errors in human transcription of hand written surveys

- by Alex. S.

I need to develop custom software to do surveys. Questions may be of multiple choice, or free text in a very few cases. I was asked to design a subsystem to check if there is any error in the manual data entry for the multiple choices part. We're trying to speed up the user data entry process and to minimize human input differences between digital forms and the original questionnaires. The surveys are filled with handwritten marks and text by human interviewers, so it's possible to find hard to read marks, or also the user could accidentally select a different value in some question, and we would like to avoid that. The software must include some automatic control to detect possible typing differences. Each answer of the multiple choice questions has the same probability of being selected. This question has two parts: The GUI. The most simple thing I have in mind is to implement the most usable design of the questions display: use of large and readable fonts and space generously the choices. Is there something else? For faster input, I would like to use drop down lists (favoring keyboard over mouse). Given the questions are grouped in sections, I would like to show the answers selected for the questions of that section, but this could slow down the process. Any other ideas? The error checking subsystem. What else can I do to minimize or to check human typos in the multiple choice questions? Is this a solvable problem? is there some statistical methodology to check values that were entered by the users are the same from the hand filled forms? For example, let's suppose the survey has 5 questions, and each has 4 options. Let's say I have n survey forms filled in paper by interviewers, and they're ready to be entered in the software, then how to minimize the accidental differences that can have the manual transcription of the n surveys, without having to double check everything in the 5 questions of the n surveys? My first suggestion is that at the end of the processing of all the hand filled forms, the software could choose some forms randomly to make a double check of the responses in a few instances, but on what criteria can I make this selection? This validation would be enough to cover everything in a significant way? The actual survey is nation level and it has 56 pages with over 200 questions in total, so it will be a lot of hand written pages by many people, and the intention is to reduce the likelihood of errors and to optimize speed in the data entry process. The surveys must filled in paper first, given the complications of taking laptops or handhelds with the interviewers.

Read the article
Any C++ library for Johansen co-integration test ?

- by Faraz

Any Ideas ? Will be highly appreciated.

Read the article
incremental way of counting quantiles for large set of data

- by Gacek

I need to count the quantiles for a large set of data. Let's assume we can get the data only through some portions (i.e. one row of a large matrix). To count the Q3 quantile one need to get all the portions of the data and store it somewhere, then sort it and count the quantile: List<double> allData = new List<double>(); foreach(var row in matrix) // this is only example. In fact the portions of data are not rows of some matrix { allData.AddRange(row); } allData.Sort(); double p = 0.75*allData.Count; int idQ3 = (int)Math.Ceiling(p) - 1; double Q3 = allData[idQ3]; Now, I would like to find a way of counting this without storing the data in some separate variable. The best solution would be to count some parameters od mid-results for first row and then adjust it step by step for next rows. Note: These datasets are really big (ca 5000 elements in each row) The Q3 can be estimated, it doesn't have to be an exact value. I call the portions of data "rows", but they can have different leghts! Usually it varies not so much (+/- few hundred samples) but it varies! This question is similar to this one: http://stackoverflow.com/questions/1058813/on-line-iterator-algorithms-for-estimating-statistical-median-mode-skewness But I need to count quantiles. ALso there are few articles in this topic, i.e.: http://web.cs.wpi.edu/~hofri/medsel.pdf http://portal.acm.org/citation.cfm?id=347195&dl But before I would try to implement these, I wanted to ask you if there are maybe any other, qucker ways of counting the 0.25/0.75 quantiles?

Read the article
Significance in R

- by Gemsie

Ok, this is quite hard to explain, but I'm at a complete loss what to do. I'm a relative newcomer to R and although I can completely admire how powerful it is, I'm not too good at actually using it.... Basically, I have some very contrived data that I need to analyse (it wasn't me who chose this, I can assure you!). I have the right and left hand lengths of lots of people, as well as some numeric data that shows their sociability. Now I would like to know if people who have significantly different lengths of hand are more or less sociable than those who have the same (leading into the research that 'symmetrical' people are more sociable and intelligent, etc. I have got as far as loading the data into R, then I have no idea where to go from there. How on Earth do I start to separate those who are close to symmetrical to those who aren't to then start to do the analysis? Ok, using Sasha's great advice, I did the cor.test and got the following: Pearson's product-moment correlation data: measurements$l.hand - measurements$r.hand and measurements$sociable t = 0.2148, df = 150, p-value = 0.8302 alternative hypothesis: true correlation is not equal to 0 95 percent confidence interval: -0.1420623 0.1762437 sample estimates: cor 0.01753501 I have never used this test before, so am unsure how to intepret it...you wouldn't think I was on my fourth Scientific degree would you?! :(

Read the article
How does one suppress a 404 status code in a page?

- by songdogtech

I've got a WordPress site that includes pages pulled from a different database. The problem is that these other pages return a 404 status code. (The WordPress posts/pages are fine.) The 404'ed pages display fine, and I removed the "Page not Found" text from the title tag in WordPress. But Googlebot and W3C see the 404 header. So: wow does one tell Apache to suppress a 404 status? And will Apache override WordPress's 404 header? Does that make sense? What other info and things should I be looking at? Can I suppress the status code in .htaccess so I don't change WP core files?

Read the article
Discrete problem of probability theory [closed]

- by calejero

A jury consists of 12 persons each of which has, before the trial started, a probability of 0.4 to vote in favor of the defendant's innocence. During the trial, the lawyer has a probability of 0.6 to change the mind of each juror who was biased against the accused. How likely is the defendant to be acquitted if he needs 10 votes in favor?

Read the article
Summarising grouped records in a dataframe in R

- by monch1962

Hello all, I have a data frame in R that looks like this: > TimeOffset, Source, Length > 0 1 1500 > 0.1 1 1000 > 0.2 1 50 > 0.4 2 25 > 0.6 2 3 > 1.1 1 1500 > 1.4 1 18 > 1.6 2 2500 > 1.9 2 18 > 2.1 1 37 > ... and I want to convert it to > TimeOffset, Source, Length > 0.2 1 2550 > 0.6 2 28 > 1.4 1 1518 > 1.9 2 2518 > ... Trying to put this into English, I want to group consecutive records with the same 'Source' together, then printing out a single record per group showing the highest time offset in that group, the source, and the sum of the lengths in that group. The TimeOffset values will always increase. I suspect this is possible in R, but I really don't know where to start. In a pinch I could export the data frame out and do it in e.g. Python, but I'd prefer to stay within R if possible. Thanks in advance for any assistance you can provide

Read the article
Is there a good R API for accessing Google Docs?

- by James Thompson

I'm using R for data analysis, and I'm sharing some data with collaborators via Google docs. Is there a simple interface that I can use to access a R data.frame object to and from a Google Docs spreadsheet? If not, is there a similar API in other languages?

Read the article
Summarising grouped records in a dataframe in R (...again)

- by monch1962

Hello all, (I tried to ask this question earlier today, but later realised I over-simplified the question; the answers I received were correct, but I couldn't use them because of my over-simplification of the problem in the original question. Here's my 2nd attempt...) I have a data frame in R that looks like: "Timestamp", "Source", "Target", "Length", "Content" 0.1 , P1 , P2 , 5 , "ABCDE" 0.2 , P1 , P2 , 3 , "HIJ" 0.4 , P1 , P2 , 4 , "PQRS" 0.5 , P2 , P1 , 2 , "ZY" 0.9 , P2 , P1 , 4 , "SRQP" 1.1 , P1 , P2 , 1 , "B" 1.6 , P1 , P2 , 3 , "DEF" 2.0 , P2 , P1 , 3 , "IJK" ... and I want to convert this to: "StartTime", "EndTime", "Duration", "Source", "Target", "Length", "Content" 0.1 , 0.4 , 0.3 , P1 , P2 , 12 , "ABCDEHIJPQRS" 0.5 , 0.9 , 0.4 , P2 , P1 , 6 , "ZYSRQP" 1.1 , 1.6 , 0.5 , P1 , P2 , 4 , "BDEF" ... Trying to put this into English, I want to group consecutive records with the same 'Source' and 'Target' together, then print out a single record per group showing the StartTime, EndTime & Duration (=EndTime-StartTime) for that group, along with the sum of the Lengths for that group, and a concatenation of the Content (which will all be strings) in that group. The TimeOffset values will always increase throughout the data frame. I had a look at melt/recast and have a feeling that it could be used to solve the problem, but couldn't get my head around the documentation. I suspect it's possible to do this within R, but I really don't know where to start. In a pinch I could export the data frame out and do it in e.g. Python, but I'd prefer to stay within R if possible. Thanks in advance for any assistance you can provide

Read the article
How popular is Groovy/Grails in the corporate world?

- by Patrick O Connell

Are there any figures for its adoption in corporate environments? Does anyone know of large corporations that have adopted it for projects?

Read the article
Determining the popularity of a video with ratings and views

- by user295825

I am about to embark on a new project - a video website. Users will be able to register, and vote on videos by clicking "like" or "dislike", or something to that effect. In any event, it will be a 2-option voting system, not a 5-star system. Every X number of days, I will be generating a "chart" of the most popular videos. So my question is: how should I determine the popularity of a given video? If I went the route of tallying up the videos with the most views, this could have the effect of exceptionally bad videos making it to the of the charts (just because they're so bad). If I go the route of a scoring system based on the amount of "like" and "dislike" votes (eg. 100 like votes, and 50 dislike votes equals a score of 2), videos with few views could appear on the top of the charts. So, what I need to do is a combination of the two. Barring, of course, spammy views and votes. What's your guys' thoughts on the subject?

Read the article
Cumulative Normal Distribution function in objective C

- by William Jockusch

Anyone know of a good implementation of this whose license is compatible with non-free iPhone apps? As suggested in this question, Boost looks absolutely wonderful. But as best I can tell, it is only available in C++. http://stackoverflow.com/questions/2328258/cumulative-normal-distribution-function-in-c

Read the article
Generate random number from an arbitrary weighted list

- by Fernando

Here's what I need to do, I'll be doing this both in PHP and JavaScript. I have a list of numbers that will range from 1 to 300-500 (I haven't set the limit yet). I will be running a drawing were 10 numbers will be picked at random from the given range. Here's the tricky part: I want some numbers to be less likely to be drawn up. A small set of those 300-500 will be flagged as "lucky numbers". For example, out of 100 drawings, most numbers have equal chances of being drawn, except for a few, that will only be picked once every 30-50 drawings. Basically I need to artificially set the probability of certain numbers to be picked while maintaining an even distribution with the rest of the numbers. The only similar thing I've found so far is this question: Generate A Weighted Random Number, the problem being that my spec has considerably more numbers (up to 500) so the weights would get very small and supposedly this could be a problem with that solution (Rejection Sampling). I'm still trying it, though, but I wonder if there other solutions. Math is not my thing so I appreciate any input. Thanks.

Read the article
please help me i want live chat free source code in cakephp or javascript where i able to get this

- by mohan

please help me i want live chat free source code in cakephp or javascript where i able to get this

Read the article
Need a R package for piecewise linear regression ?

- by ablimit

Does anybody aware of a package for "piecewise linear regression" ?

Read the article
What percent of web sites use JavaScript?

- by Claudiu

I'm wondering just how pervasive JavaScript is. This article states that 73% of websites they tested rely on JavaScript for important functionality, but it seems to me that the number must be larger. Have any surveys been done on this topic? Maybe a better way to phrase this question is - are there any sites that don't use JavaScript? EDIT: By 'use', I don't necessarily mean "rely on for important functionality" - that was just the statistic that one article gave.

Read the article
How to generate correlated binary variables

- by jonalm

Dear All I need to generate a series of N random binary variables with a given correlation function. Let x = {x_i} be a series of binary variables (taking the value 0 or 1, i running form 1 to N). The marginal probability is given Pr(x_i = 1) = p, and the values should be correlated in the following way E[ x_i x_j ] = const * |i-j|^-alfa where alfa is a positive number. Is it possible to generate a series like this? preferably in python.

Read the article
Suggest a good book for Quantitative Methods & R Programming

- by Rahul

Hi folks, Please suggest a good book for beginner in Quantitative Methods/Techniques. Adding to this, a good book for beginners in R programming language, used in Quantitative Methods. And I've a few questions about this: ? Should I have to learn the other subjects like Probability, Statics, etc. before learning Quantitative Methods ? Is there any relation between Quantitative Methods & Data Mining

Read the article

< Previous Page | 71 72 73 74 75 76 77 78 79 80 81 82 | Next Page >