rstats - Developer IT

Connect R to Quickbooks

- by Btibert3

Has anyone connected R to QuickBooks? I know there is an ODBC driver that can be bought. Just wondering if anyone has already gone down this road. Any insight will be much appreciated! ~ Brock

Adding multiple vectors in R

- by Elais

I have a problem where I have to add thirty-three integer vectors of equal length from a dataset in R. I know the simple solution would be Vector1 + Vector2 + Vector3 + ... + VectorN, but I am sure there is a way to code this. Also, some vectors have NA in place of integers, so I need a way to skip those. I know this may be very basic, but I am new to this.
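
One possible approach, sketched here with three short made-up vectors standing in for the thirty-three, is to bind the vectors into a matrix and let rowSums() skip the NA entries. The name `vectors` and the sample values are assumptions for illustration only.

```r
# Hypothetical data: three short vectors stand in for the thirty-three.
vectors <- list(
  c(1L, 2L, NA),
  c(4L, NA, 6L),
  c(7L, 8L, 9L)
)

# Bind the vectors as columns, then sum across each row, skipping NAs.
total <- rowSums(do.call(cbind, vectors), na.rm = TRUE)
print(total)
```

Note that a position where every vector holds NA would come out as 0 with this approach, which may or may not be what is wanted.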

Useful keyboard shortcuts and tips for ESS/R

- by ggg

I would like to ask regular ESS/R users which key bindings they use frequently, and for any tips on using ESS/R.

What is the optimal way to run a set of regressions in R?

- by stevejb

Assume that I have sources of data X and Y that are indexable, say matrices, and I want to run a set of independent regressions and store the results. My initial approach would be

    results <- matrix(nrow = nrow(X), ncol = 2)
    for (i in 1:nrow(X)) {
      # one regression per row; store intercept and slope
      results[i, ] <- coefficients(lm(Y[i, ] ~ X[i, ]))
    }

But loops are bad, so I could do it with lapply as

    out <- lapply(1:nrow(X), function(i) {
      coefficients(lm(Y[i, ] ~ X[i, ]))
    })

Is there a better way to do this?
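
A minimal sketch of one alternative, assuming each row of X and Y holds one regression's data: vapply() collects the coefficients straight into a matrix and checks that every fit returns exactly two numbers. The simulated X and Y below are hypothetical.

```r
set.seed(1)
# Hypothetical data: 5 independent regressions with 20 observations each.
X <- matrix(rnorm(100), nrow = 5)
Y <- 2 + 3 * X + matrix(rnorm(100, sd = 0.1), nrow = 5)

# vapply checks that every fit returns exactly two numbers and
# assembles them into a 2 x nrow(X) matrix; t() puts one fit per row.
results <- t(vapply(
  seq_len(nrow(X)),
  function(i) unname(coef(lm(Y[i, ] ~ X[i, ]))),
  numeric(2)
))
colnames(results) <- c("intercept", "slope")
print(results)
```

This is not necessarily faster than the loop; the gain is mainly that the result arrives as a matrix with a guaranteed shape.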

Smooth Error in qplot from ggplot2

- by Jared

I have some data that I am trying to plot faceted by its Type with a smooth (Loess, LM, whatever) superimposed. Generation code is below:

    testFrame <- data.frame(Time=sample(20:60,50,replace=T),
                            Dollars=round(runif(50,0,6)),
                            Type=sample(c("First","Second","Third","Fourth"),50,
                                        replace=T,prob=c(.33,.01,.33,.33)))

I have no problem either making a faceted plot or plotting the smooth, but I cannot do both. The first three lines of code below work fine. The fourth line is where I have trouble:

    qplot(Time,Dollars,data=testFrame,colour=Type)
    qplot(Time,Dollars,data=testFrame,colour=Type) + geom_smooth()
    qplot(Time,Dollars,data=testFrame) + facet_wrap(~Type)
    qplot(Time,Dollars,data=testFrame) + facet_wrap(~Type) + geom_smooth()

It gives the following error:

    Error in `[<-.data.frame`(`*tmp*`, var, value = list(`NA` = NULL)) :
      missing values are not allowed in subscripted assignments of data frames

What am I missing to overlay a smooth in a faceted plot? I could have sworn I had done this before, possibly even with the same data.

how to do introspection in R (stat package)

- by Lebron James

Hi all, I am somewhat new to R, and I have this piece of code which generates a variable whose type I don't know. Is there any introspection facility in R which will tell me which type this variable belongs to? The following illustrates the property of this variable: I am working on linear model selection, and the resource I have is the lm result from another model. Now I want to retrieve the lm call with the command summary(model)$call so that I don't need to hardcode the model structure. However, since I have to change the dataset, I need to do a bit of modification on the "string", but apparently it is not a simple string. I wonder if there is any command similar to string.replace so that I can manipulate this variable from the variable $call. Thanks

    > str<-summary(rdnM)$call
    > str
    lm(formula = y ~ x1, data = rdndat)
    > str[1]
    lm()
    > str[2]
    y ~ x1()
    > str[3]
    rdndat()
    > str[3] <- data
    Warning message:
    In str[3] <- data :
      number of items to replace is not a multiple of replacement length
    > str
    lm(formula = y ~ x1, data = c(10, 20, 30, 40))
    > str<-summary(rdnM)$call
    > str
    lm(formula = y ~ x1, data = rdndat)
    > str[3] <- 'data'
    > str
    lm(formula = y ~ x1, data = "data")
    > str<-summary(rdnM)$call
    > type str
    Error: unexpected symbol in "type str"
    >
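
As a rough illustration, class() and typeof() report what kind of object the call is, and the data argument can be swapped by assigning a symbol rather than a string. The data frames `rdndat` and `newdat` below are made up for the sketch, and `cl` and `refit` are hypothetical names.

```r
# Hypothetical data and model standing in for rdndat and rdnM.
rdndat <- data.frame(x1 = c(1, 2, 3, 4), y = c(11, 19, 31, 39))
newdat <- data.frame(x1 = c(1, 2, 3, 4), y = c(41, 29, 21, 9))
rdnM <- lm(formula = y ~ x1, data = rdndat)

cl <- summary(rdnM)$call
print(class(cl))   # introspection: reports the object's class
print(typeof(cl))  # and its underlying storage type

# A call behaves like a list of its parts, so the data argument can be
# replaced with a symbol naming the new data set.
cl$data <- as.name("newdat")
refit <- eval(cl)
print(coef(refit))
```

Assigning the string 'data' inserts a character constant into the call, which is why the transcript above ends up with data = "data"; as.name() inserts a reference to the object instead.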

Parallel processing in R 2.11 Windows 64-bit using SNOW not quite working

- by Abhijit

I'm running R 2.11 64-bit on a WinXP64 machine with 8 processors. With R 2.10.1 the following code spawned 6 R processes for parallel processing:

    require(foreach)
    require(doSNOW)
    cl = makeCluster(6, type='SOCK')
    registerDoSNOW(cl)
    bl2 = foreach(i=icount(length(unqmrno))) %dopar% {
      (Some code here)
    }
    stopCluster(cl)

When I run the same code in R 2.11 Win64, the 6 R processes are not spawning, and the code hangs. I'm wondering if this is a problem with the port of SNOW to 2.11-64bit, or if any additional code is required on my part. Thanks

Plotting 3-tuple data points in a surface / contour plot using matplotlib

- by morpheous

I have some surface data that is generated by an external program as XYZ values. I want to create the following graphs using matplotlib:

- Surface plot
- Contour plot
- Contour plot overlaid with a surface plot

I have looked at several examples for plotting surfaces and contours in matplotlib. However, in those examples the Z values seem to be a function of X and Y, i.e. Z ~ f(X,Y). I assume that I will somehow need to transform my Z values, but I have not yet seen any example that shows how to do this. So, my question is this: given a set of (X,Y,Z) points, how may I generate surface and contour plots from that data? BTW, just to clarify, I do NOT want to create scatter plots. Also, although I mentioned matplotlib in the title, I am not averse to using rpy(2), if that will allow me to create these charts.

What is an efficient method for partitioning and aggregating intervals from timestamped rows in a da

- by mattrepl

From a data frame with timestamped rows (strptime results), what is the best method for aggregating statistics for intervals? Intervals could be an hour, a day, etc. I've found the aggregate function, but that doesn't help with assigning each row to an interval. I'm planning on adding a column to the data frame that denotes interval and using that with aggregate, but if there's a better solution it'd be great to hear it. Thanks for any pointers!
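
The plan described above (add an interval column, then aggregate) can be sketched with cut(), which has a method for date-times. The data frame `df` and its values are hypothetical, and the sketch assumes the strptime results have been converted with as.POSIXct so they can be stored as a data frame column.

```r
# Hypothetical data: six timestamped readings across two hours.
stamps <- as.POSIXct(
  c("2010-05-01 10:05:00", "2010-05-01 10:40:00", "2010-05-01 10:55:00",
    "2010-05-01 11:10:00", "2010-05-01 11:20:00", "2010-05-01 11:59:00"),
  tz = "UTC"
)
df <- data.frame(time = stamps, value = c(1, 2, 3, 10, 20, 30))

# cut() assigns each timestamp to the start of its interval;
# "hour" could be swapped for "day", "15 min", and so on.
df$interval <- cut(df$time, breaks = "hour")

# aggregate() then summarises the values within each interval.
hourly <- aggregate(value ~ interval, data = df, FUN = mean)
print(hourly)
```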

What is the best way to run a loop of regressions in R?

- by stevejb

Assume that I have sources of data X and Y that are indexable, say matrices, and I want to run a set of independent regressions and store the results. My initial approach would be

    results <- matrix(nrow = nrow(X), ncol = 2)
    for (i in 1:nrow(X)) {
      # one regression per row; store intercept and slope
      results[i, ] <- coefficients(lm(Y[i, ] ~ X[i, ]))
    }

But loops are bad, so I could do it with lapply as

    out <- lapply(1:nrow(X), function(i) {
      coefficients(lm(Y[i, ] ~ X[i, ]))
    })

Is there a better way to do this?
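
When each regression has a single predictor, as in the question, the intercept and slope also have a closed form, so all rows can be handled at once without calling lm() per row. The following is a sketch under that assumption, with simulated data; `closed_form` and `check` are hypothetical names.

```r
set.seed(2)
# Hypothetical data: 3 regressions with 20 observations each.
X <- matrix(rnorm(60), nrow = 3)
Y <- 1 - 2 * X + matrix(rnorm(60, sd = 0.1), nrow = 3)

# Centre each row, then apply the simple-regression formulas:
# slope = sum(xc * yc) / sum(xc^2), intercept = mean(y) - slope * mean(x).
Xc <- X - rowMeans(X)
Yc <- Y - rowMeans(Y)
slope <- rowSums(Xc * Yc) / rowSums(Xc^2)
intercept <- rowMeans(Y) - slope * rowMeans(X)
closed_form <- cbind(intercept, slope)

# Cross-check against lm() for the first row.
check <- coef(lm(Y[1, ] ~ X[1, ]))
print(closed_form)
```

This only returns coefficients; anything else lm() provides (standard errors, residuals) would have to be computed separately.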

can lapply not modify variables in a higher scope

- by stevejb

I often want to do essentially the following:

    mat <- matrix(0,nrow=10,ncol=1)
    lapply(1:10, function(i) { mat[i,] <- rnorm(1,mean=i)})

I would expect mat to have 10 random numbers in it, but instead it still has zeros. (I am not worried about the rnorm part. Clearly there is a right way to do that. I am worried about affecting mat from within an anonymous function of lapply.) Can I not affect matrix mat from inside lapply? Why not? Is there a scoping rule of R that is blocking this?
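
A small sketch of the two usual ways around this: return the values and assign them outside the function, or use the superassignment operator `<<-`. The second matrix, `mat2`, is a hypothetical name added for the comparison.

```r
set.seed(3)
mat <- matrix(0, nrow = 10, ncol = 1)

# Inside the function, `<-` would modify a local copy of mat.
# Returning the values and assigning the result avoids that.
mat[, 1] <- vapply(1:10, function(i) rnorm(1, mean = i), numeric(1))

# `<<-` assigns in the enclosing environment instead, which does
# change the outer object, though it is usually discouraged.
mat2 <- matrix(0, nrow = 10, ncol = 1)
invisible(lapply(1:10, function(i) mat2[i, ] <<- rnorm(1, mean = i)))

print(cbind(mat, mat2))
```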

R library for discrete Markov chain simulation

- by stevejb

Hello, I am looking for something like the 'msm' package, but for discrete Markov chains. For example, if I had a transition matrix defined as such

    Pi <- matrix(c(1/3,1/3,1/3, 0,2/3,1/6, 2/3,0,1/2), nrow=3)

for states A, B, C, how can I simulate a Markov chain according to that transition matrix? Thanks,
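
For a chain this small, a base R loop around sample() may be enough. The sketch below assumes the nine values fill a 3 x 3 matrix column by column, which is the reading under which every row sums to 1; `simulate_chain` is a hypothetical helper, not a function from any package.

```r
set.seed(4)
states <- c("A", "B", "C")
# Assumption: the nine values fill a 3 x 3 matrix column by column,
# which makes every row sum to 1.
Pi <- matrix(c(1/3, 1/3, 1/3,  0, 2/3, 1/6,  2/3, 0, 1/2),
             nrow = 3, dimnames = list(states, states))

simulate_chain <- function(P, n_steps, start) {
  path <- character(n_steps)
  path[1] <- start
  for (t in seq_len(n_steps - 1)) {
    # Draw the next state from the row of the current state.
    path[t + 1] <- sample(colnames(P), size = 1, prob = P[path[t], ])
  }
  path
}

chain <- simulate_chain(Pi, n_steps = 1000, start = "A")
print(table(chain))
```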

editing Rnw in Emacs, gets confused if in math mode or not

- by stevejb

When editing .Rnw files with Emacs, it sometimes gets confused about whether I am in math mode or not. Then the syntax highlighting gets messed up, and C-f-i inserts \textit{} and \mathit{} opposite to how it normally should. It seems like there is some bool storing the state of math mode, and it gets inadvertently flipped. Is there a way I can manually flip it back?

Creating a Large Matrix in ff

- by Ryan Rosario

I am trying to create a huge matrix in ff, and I know that ff is good for this sort of thing. But there is a major problem: the dimensions of the matrix exceed .Machine$integer.max! I am running on a 64-bit machine, using 64-bit R and 64-bit ff. Is there any way to get around this problem? It's been suggested that R is using the MAXINT value from stdint.h. Is there any way to fix this without changing that file and possibly breaking the build?

    > ffMatrix <- ff(vmode="boolean", dim=c(1e10,1e10))
    Error in if (length < 0 || length > .Machine$integer.max) stop("length must be between 1 and .Machine$integer.max") :
      missing value where TRUE/FALSE needed
    In addition: Warning message:
    In ff(vmode = "boolean", dim = c(1e+10, 1e+10)) :
      NAs introduced by coercion
    > 1e+10 > .Machine$integer.max
    [1] TRUE
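
A quick back-of-the-envelope check of the sizes involved, assuming one bit of storage per element for the "boolean" mode; the variable names are hypothetical.

```r
int_max <- .Machine$integer.max      # 2^31 - 1 on current R builds
cells <- 1e10 * 1e10                 # number of elements requested
bytes_needed <- cells / 8            # at one bit per element (assumed)

print(int_max)
print(cells > int_max)
print(bytes_needed / 2^60)           # size in exbibytes
```

Under that assumption, the storage required would be far beyond any single disk, independent of the integer limit on the length.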

Using quantmod from Python - is this possible

- by morpheous

I have just come across quantmod, and I would like to use it from Python. However I am not sure how to use quantmod from a Python script. Has anyone done this before - any ideas or suggestions on how to get started?

Asking ESS and R users for suggestions for elisp codes in .emacs file

- by ggg

I believe that not all R users know elisp. It would be nice if ESS users could share the code from their .emacs files here. Well-commented code would be particularly useful.

how to wrap a function that only takes individual elements to make it take a list

- by stevejb

Hello, Say I have a function handed to me that I cannot change and must use as is. This function takes several objects in the form of oldFunction(object1, object2, object3, ...) where ... are other arguments. I want to write a wrapper to take a list of objects. My idea was this:

    sjb.ListWrapper <- function(myList, ...) {
      lLen <- length(myList)
      myStr <- ""
      for (i in 1:lLen) {
        # build the argument list "myList[[1]],myList[[2]],..." as text
        myStr <- paste(myStr, "myList[[", i, "]],", sep="")
      }
      myCode <- paste("oldFunction(", myStr, "...)")
      # the string must be parsed before it can be evaluated
      eval(parse(text = myCode))
    }

However, the issue is that I want to use this from Sweave and I need the output of oldFunction to be printed. What is the right way to do this? Thanks.
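
One way to avoid building code as a string is do.call(), which calls a function with its arguments taken from a list. In the sketch below, `oldFunction` is a made-up stand-in for the function that cannot be changed, and `listWrapper` is a hypothetical name.

```r
# Hypothetical stand-in for the function that cannot be changed.
oldFunction <- function(object1, object2, object3, ...) {
  paste(object1, object2, object3, ..., sep = "-")
}

# do.call() spreads the list elements out as individual arguments;
# anything passed through ... is appended after them.
listWrapper <- function(myList, ...) {
  do.call(oldFunction, c(myList, list(...)))
}

result <- listWrapper(list("a", "b", "c"), "extra")
print(result)
```

Wrapping the call in an explicit print() is one way to make sure the value appears in the output of a Sweave code chunk.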

What is the simplest method to fill the area under a geom_freqpoly line?

- by mattrepl

The x-axis is time broken up into time intervals. There is an interval column in the data frame that specifies the time for each row. The column is a factor, where each interval is a different factor level. Plotting a histogram or line using geom_histogram and geom_freqpoly works great, but I'd like to have a line, like that provided by geom_freqpoly, with the area filled. Currently I'm using geom_freqpoly like this:

    ggplot(quake.data, aes(interval, fill=tweet.type)) +
      geom_freqpoly(aes(group = tweet.type, colour = tweet.type)) +
      opts(axis.text.x=theme_text(angle=-60, hjust=0, size = 6))

I would prefer to have a filled area, such as provided by geom_density, but without smoothing the line.

UPDATE: geom_area has been suggested. Is there any way to use a ggplot2-generated statistic, such as ..count.., for the geom_area's y-values? Or does the count aggregation need to occur prior to using ggplot2?
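
If the counts do need to be aggregated beforehand, base R's table() can do it in one step. This is only a sketch of that pre-aggregation, with a made-up `quake.data`; the resulting count column could then be supplied as the y-values when plotting.

```r
# Hypothetical data standing in for quake.data.
quake.data <- data.frame(
  interval = factor(c("10:00", "10:00", "10:05", "10:05", "10:05", "10:10")),
  tweet.type = c("reply", "retweet", "reply", "reply", "retweet", "reply")
)

# Count rows for every interval / tweet.type combination, keeping zeros.
counts <- as.data.frame(table(interval = quake.data$interval,
                              tweet.type = quake.data$tweet.type),
                        responseName = "count")
print(counts)
```

Combinations that never occur are kept with a count of 0, which helps a filled area drop to the baseline instead of skipping the interval.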

Is it possible to plot a single density over a discrete variable?

- by mattrepl

The x-axis is time broken up into time intervals. There is an interval column in the data frame that specifies the time for each row. Plotting a histogram or line using geom_histogram and geom_freqpoly works great, but I'd like to use geom_density to get a filled area. Perhaps there is a better way to achieve this. Right now, if I use geom_density, curves are created for each discrete factor level instead of smoothing over all of them.

How to indent a buffer in ESS?

- by ggg

ESS allows us to indent a line and an expression. Is there a key binding for indenting a buffer? If not, can we create it?

decode tinyurl in R to get full url path?

- by Grey Peak

Is there a way to decode tinyURL links in R so that I can see which web pages they actually refer to?

Using R to download zipped data file, extract, and import data

- by Jeromy Anglim

@EZGraphs on Twitter writes: "Lots of online csvs are zipped. Is there a way to download, unzip the archive, and load the data to a data.frame using R? #Rstats" I was also trying to do this today, but ended up just downloading the zip file manually. I tried something like:

    fileName <- "http://www.newcl.org/data/zipfiles/a1.zip"
    con1 <- unz(fileName, filename="a1.dat", open = "r")

but I feel as if I'm a long way off. Any thoughts?
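
A possible shape for this, sketched without having been run against the URL in the question: unz() expects a path to a local zip file, so the archive is first downloaded to a temporary file. `read_zipped` is a hypothetical helper name, and the format of a1.dat is assumed to be something read.table() can parse.

```r
# Sketch: download to a temporary file, then read one member of the archive.
# `url` and `member` are supplied by the caller; nothing is downloaded
# until the function is called.
read_zipped <- function(url, member, ...) {
  tmp <- tempfile(fileext = ".zip")
  on.exit(unlink(tmp))
  download.file(url, destfile = tmp, mode = "wb")  # "wb" keeps the zip intact
  read.table(unz(tmp, member), ...)
}

# Example call (not run here, needs network access):
# a1 <- read_zipped("http://www.newcl.org/data/zipfiles/a1.zip", "a1.dat")
```

Extra arguments such as header = TRUE or sep = "," pass through to read.table() for csv-style files.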

Programming R/Sweave for proper \Sexpr output

- by deoksu

Hi, I'm having a bit of a problem programming R for Sweave, and the #rstats twitter group often points here, so I thought I'd put this question to the SO crowd. I'm an analyst, not a programmer, so go easy on me; this is my first post. Here's the problem: I am drafting a survey report in Sweave with R and would like to report the marginal returns in line using \Sexpr{}. For example, rather than saying:

    Only 14% of respondents said 'X'.

I want to write the report like this:

    Only \Sexpr{p.mean(variable)}$\%$ of respondents said 'X'.

The problem is that Sweave() converts the result of the expression in \Sexpr{} to a character string, which means that the output from the expression in R and the output that appears in my document are different. For example, above I use the function 'p.mean':

    p.mean <- function (x) {
      options(digits=1)
      # weighted.mean() takes its weights through the argument w
      mmm <- weighted.mean(x, w=weight, na.rm=TRUE)
      print(100*mmm)
    }

In R, the output looks like this:

    p.mean(variable)
    >14

but when I use \Sexpr{p.mean(variable)}, I get an unrounded character string (in this case: 13.5857142857143) in my document. I have tried to limit the output of my function to 'digits=1' in the global environment, in the function itself, and in various commands. It only seems to affect what R prints, not the character transformation that is the result of the expression and which eventually prints in the LaTeX file.

    as.character(p.mean(variable))
    >[1] 14
    >[1] "13.5857142857143"

Does anyone know what I can do to limit the digits printed in the LaTeX file, either by reprogramming the R function or with a setting in Sweave or \Sexpr{}? I'd greatly appreciate any help you can give. Thanks, David
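
Since the string conversion works on the returned value rather than on what print() shows, one option is to round inside the function and return the rounded number. The sketch below uses made-up data and passes the weights explicitly; it is an illustration, not the original p.mean.

```r
# Hypothetical data: a 0/1 response and survey weights.
variable <- c(1, 0, 0, 1, 0, 0, 0)
weight   <- c(1, 2, 2, 1, 2, 3, 3)

# Round inside the function and return the value; as.character()
# on an already-rounded number yields the short string.
p.mean <- function(x, w) {
  round(100 * weighted.mean(x, w, na.rm = TRUE))
}

shown <- as.character(p.mean(variable, weight))
print(shown)
```

Where a fixed number of decimals is wanted, returning a string built with sprintf() or formatC() would be an alternative to round().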

R: manipulating data.frames containing strings and booleans.

- by Mike Dewar

Hello. I have a data.frame in R; it's called p. Each element in the data.frame is either TRUE or FALSE. My variable p has, say, m rows and n columns. For every row there is strictly only one TRUE element. It also has column names, which are strings. What I would like to do is the following:

1. Wherever I see a TRUE in p, I would like to replace it with the name of the corresponding column.
2. I would then like to collapse the data.frame, which now contains FALSEs and column names, to a single vector, which will have m elements.

I would like to do this in an R-thonic manner, so as to continue my enlightenment in R and contribute to a world without for-loops. I can do step 1 using the following for loop:

    for (i in seq(length(colnames(p)))) {
      p[p[,i]==TRUE,i]=colnames(p)[i]
    }

but there's no beauty here and I have totally subscribed to this for-loops-in-R-are-probably-wrong mentality. Maybe wrong is too strong, but they're certainly not great. I don't really know how to do step 2. I kind of hoped that the sum of a string and FALSE would return the string, but it doesn't. I kind of hoped I could use an OR operator of some kind but can't quite figure that out (Python responds to False or 'bob' with 'bob'). Hence, yet again, I appeal to you beautiful Rstats people for help!
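
Because each row holds exactly one TRUE, both steps can be folded into one: find the position of the TRUE in every row and use it to index the column names. The example data frame below is made up for the sketch.

```r
# Hypothetical data: exactly one TRUE per row.
p <- data.frame(
  red   = c(TRUE, FALSE, FALSE, FALSE),
  green = c(FALSE, FALSE, TRUE, FALSE),
  blue  = c(FALSE, TRUE, FALSE, TRUE)
)

# which() finds the position of the TRUE in each row; indexing the
# column names by those positions gives one name per row.
labels <- colnames(p)[apply(as.matrix(p), 1, which)]
print(labels)
```

The sketch relies on the one-TRUE-per-row guarantee; rows with zero or several TRUE values would need separate handling.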

Search Results

Search found 24 results on 1 page for 'rstats'.
