In R, how do you get the best fitting equation to a set of data?
- by Matherion
I'm not sure whether R can do this (I assume it can, but maybe that's just because I tend to assume that R can do anything :-)). What I need is to find the best-fitting equation to describe a dataset.
For example, if you have these points:
df = data.frame(x = c(1, 5, 10, 25, 50, 100), y = c(100, 75, 50, 40, 30, 25))
How do you get the best fitting equation? I know that you can get the best fitting curve with:
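fit = loess(y ~ x, data = df)  # local regression fit
plot(df$x, df$y); lines(df$x, predict(fit))  # data points plus the fitted loess curve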
But as I understand it, you can't extract an equation from a loess fit; see "Loess Fit and Resulting Equation".
When I try to build it myself (note, I'm not a mathematician, so this is probably not the ideal approach :-)), I end up with something like:
y.predicted = 12.71 + ( 95 / (( (1 + df$x) ^ .5 ) / 1.3))
Which kind of seems to approximate it, but I can't help thinking that something more elegant probably exists :-)
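To make "kind of seems to approximate it" concrete, here is a minimal sketch that evaluates the hand-built equation at the six data points and lists the residuals. The function name `hand_built` and the columns `y_hat` and `resid` are only illustrative names, not anything from an existing package.

```r
# Data from the question
df <- data.frame(x = c(1, 5, 10, 25, 50, 100),
                 y = c(100, 75, 50, 40, 30, 25))

# The hand-built equation from the question, wrapped in a function
# (hand_built is a made-up name for illustration)
hand_built <- function(x) 12.71 + 95 / (sqrt(1 + x) / 1.3)

df$y_hat <- hand_built(df$x)   # predicted values
df$resid <- df$y - df$y_hat    # observed minus predicted

# Residual sum of squares: smaller means a closer fit
rss_hand <- sum(df$resid^2)

print(round(df, 2))
print(rss_hand)
```

The point with the largest absolute residual is where the formula misses the desired price the most, which is a useful place to look when deciding whether to adjust the equation or the target numbers.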
I have the feeling that fitting a linear or polynomial model wouldn't work either, because the formula seems different from what those models generally use (i.e. this one seems to need divisions, powers, etc.). For example, the approach in "Fitting polynomial model to data in R" gives pretty bad approximations.
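One thing that may help here: `lm()` only needs the model to be linear in its coefficients, not in `x`, so a term such as `1 / sqrt(1 + x)` can be used as a predictor. The sketch below assumes the general shape of the hand-built equation (`a + b / sqrt(1 + x)`) is acceptable and lets `lm()` choose `a` and `b`; it also fits a quadratic polynomial for comparison. The names `fit_root`, `fit_poly`, `rss_root` and `rss_poly` are illustrative, and the choice of this particular shape is an assumption, not a recommendation.

```r
# Data from the question
df <- data.frame(x = c(1, 5, 10, 25, 50, 100),
                 y = c(100, 75, 50, 40, 30, 25))

# Least-squares fit of y = a + b / sqrt(1 + x).
# I() protects the arithmetic so it is evaluated as written.
fit_root <- lm(y ~ I(1 / sqrt(1 + x)), data = df)

# Quadratic polynomial fit, for comparison
fit_poly <- lm(y ~ poly(x, 2), data = df)

# Residual sums of squares of both fits (smaller is closer)
rss_root <- sum(residuals(fit_root)^2)
rss_poly <- sum(residuals(fit_poly)^2)

print(coef(fit_root))   # intercept a and slope b of the fitted equation
print(c(root = rss_root, poly = rss_poly))

# Price for a quantity that is not in the table, e.g. 7 credits
print(predict(fit_root, newdata = data.frame(x = 7)))
```

Because the hand-built equation is a member of this same family (with `a = 12.71` and `b = 95 * 1.3`), the least-squares fit cannot have a larger residual sum of squares than it does on these six points.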
I remember from a long time ago that there exist languages (Matlab may be one of them?) that do this kind of stuff. Can R do this as well, or am I just at the wrong place?
(Background info: basically, what we need to do is find an equation for determining numbers in the second column based on the numbers in the first column; but we decide the numbers ourselves. We have an idea of what we want the curve to look like, but we can adjust these numbers to an equation if we get a better fit. It's about the pricing for a product (a cheaper alternative to current expensive software for qualitative data analysis); the more 'project credits' you buy, the cheaper it should become. Rather than forcing people to buy a given number (e.g. 5 or 10 or 25), it would be nicer to have a formula so people can buy exactly what they need, but of course this requires a formula. We have an idea for some prices we think are ok, but now we need to translate this into an equation.)