Search Results

Search found 86 results on 4 pages for 'binomial coefficients'.

Page 3/4 | < Previous Page | 1 2 3 4  | Next Page >

  • calculate AUC (GAM) in R [migrated]

    - by ahmad
    I used the following script to calculate AUC in R:
        library(mgcv)
        library(ROCR)
        library(AUC)
        data1=read.table("d:\\2005.txt", header=T)
        GAM<-gam(tuna ~ s(chla)+s(sst)+s(ssha), family=binomial, data=data1)
        gampred<- predict(GAM, type="response")
        rp <- prediction(gampred, data1$tuna)
        auc <- performance( rp, "auc")@y.values[[1]]
        auc
        roc <- performance( rp, "tpr", "fpr")
        plot( roc )
    But when I run the script, the result is:
        > rp <- prediction(gampred, data1$tuna)
        Error in prediction(gampred, data1$tuna) : Format of predictions is invalid.
        > auc <- performance( rp, "auc")@y.values[[1]]
        Error in performance(rp, "auc") : object 'rp' not found
        > auc
        [since the assignment above failed, typing auc here prints the source of AUC::auc]
        > roc <- performance( rp, "tpr", "fpr")
        Error in performance(rp, "tpr", "fpr") : object 'rp' not found
        > plot( roc )
        Error in levels(labels) : argument "labels" is missing, with no default
    Can anybody help me to solve this problem? Thank you in advance.
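    A common first culprit for ROCR's "Format of predictions is invalid" error (offered as a hedged sketch, not a diagnosis of this exact dataset) is that predict() on an mgcv model returns a named, array-shaped object rather than the plain numeric vector that ROCR::prediction() expects; the later errors are just knock-on effects of rp never being created. Coercing the predictions usually clears it:
        library(mgcv)
        library(ROCR)
        # coerce mgcv's array-shaped predictions to the plain vector ROCR expects
        gampred <- as.numeric(predict(GAM, type = "response"))
        rp  <- prediction(gampred, data1$tuna)
        auc <- performance(rp, "auc")@y.values[[1]]
        roc <- performance(rp, "tpr", "fpr")
        plot(roc)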

    Read the article

  • Error when trying to create a faceted plot in ggplot2

    - by John Horton
    I am trying to make a faceted plot in ggplot2 of the coefficients on the regressors from two linear models with the same predictors. The data frame I constructed is this:
        > r.together
                   reg         coef        se      y
        1  (Intercept)  5.068608671 0.6990873 Labels
        2     goodTRUE  0.310575129 0.5228815 Labels
        3    indiaTRUE -1.196868662 0.5192330 Labels
        4    moneyTRUE -0.586451273 0.6011257 Labels
        5     maleTRUE -0.157618168 0.5332040 Labels
        6  (Intercept)  4.225580743 0.6010509 Bonus
        7     goodTRUE  1.272760149 0.4524954 Bonus
        8    indiaTRUE -0.829588862 0.4492838 Bonus
        9    moneyTRUE -0.003571476 0.5175601 Bonus
        10    maleTRUE  0.977011737 0.4602726 Bonus
    The "y" column is a label for the model, reg are the regressors, and coef and se are what you would think. I want to plot:
        g <- qplot(reg, coef, facets=.~y, data = r.together) + coord_flip()
    But when I try to display the plot, I get:
        > print(g)
        Error in names(df) <- output : 'names' attribute [2] must be the same length as the vector [1]
    What's strange is that
        qplot(reg, coef, colour=y, data = r.together) + coord_flip()
    plots as you would expect.
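    For what it's worth, a sketch of the same plot written against the full ggplot() interface, which sidesteps qplot's facets= argument entirely (assuming r.together exactly as printed above):
        library(ggplot2)
        g <- ggplot(r.together, aes(x = reg, y = coef)) +
          geom_point() +
          facet_grid(. ~ y) +   # one panel per model label
          coord_flip()
        print(g)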

    Read the article

  • Daubechies-4 Transform in MATLAB

    - by Myx
    Hello: I have a 4x4 matrix which I wish to decompose into 4 frequency bands (LL, HL, LH, HH where L=low, H=high) by using a one-level Daubechies-4 wavelet transform. As a result of the transform, each band should contain 2x2 coefficients. How can I do this in MATLAB? I know that MATLAB has dbaux and dbwavf functions. However, I'm not sure how to use them and I also don't have the wavelet toolbox. Any help is greatly appreciated. Thanks.
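    Not MATLAB, but a hand-rolled sketch of the same one-level transform in R, with the standard D4 taps written out explicitly; the row-then-column structure translates directly to MATLAB, and periodic boundary handling is an assumption on my part:
        d4step <- function(x) {            # one analysis step on an even-length vector
          s3 <- sqrt(3)
          h <- c(1 + s3, 3 + s3, 3 - s3, 1 - s3) / (4 * sqrt(2))  # low-pass taps
          g <- c(h[4], -h[3], h[2], -h[1])                        # high-pass (QMF) taps
          n <- length(x)
          wrap <- function(k) ((k - 1) %% n) + 1                  # periodic extension
          lo <- hi <- numeric(n / 2)
          for (k in seq_len(n / 2)) {
            w <- x[wrap(2 * k - 1 + 0:3)]                         # 4 samples per output
            lo[k] <- sum(h * w)
            hi[k] <- sum(g * w)
          }
          c(lo, hi)
        }
        M <- matrix(rnorm(16), 4, 4)                   # toy 4x4 input
        W <- apply(t(apply(M, 1, d4step)), 2, d4step)  # rows first, then columns
        W[1:2, 1:2]                                    # LL; HL, LH, HH are the other 2x2 blocks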

    Read the article

  • Strange behavior of I() in left-/right-hand side of formula

    - by adibender
        set.seed(98234)
        y <- rnorm(100)
        x <- rnorm(100)
        lm0 <- lm(y ~ x)
        lm1 <- lm(I(y) ~ I(x))
    All work perfectly fine, and I guess we can agree that lm0 is what one would expect to happen. lm1 is equal to lm0 (judging by the coefficients). So are:
        set.seed(98234)
        lm3 <- lm(I(rnorm(100)) ~ rnorm(100))
        set.seed(98234)
        lm4 <- lm(rnorm(100) ~ I(rnorm(100)))
    But when I() is on neither or both sides of the formula I don't get the results from above:
        set.seed(98234)
        lm2 <- lm(I(rnorm(100)) ~ I(rnorm(100)))
        set.seed(98234)
        lm5 <- lm(rnorm(100) ~ rnorm(100))
    Any ideas why?
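    A hedged debugging pointer rather than a full answer: check how many distinct variables the formula machinery actually sees, because identical expressions on the two sides appear (if I am reading model.frame right) to be evaluated only once:
        set.seed(98234)
        mf <- model.frame(rnorm(100) ~ rnorm(100))
        ncol(mf)   # 1 column: both sides share one draw, so lm5 regresses y on itself
        set.seed(98234)
        ncol(model.frame(I(rnorm(100)) ~ rnorm(100)))   # 2 columns: two separate draws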

    Read the article

  • What is the difference between Multiple R-squared and Adjusted R-squared in a single-variate least squares regression?

    - by fmark
    Could someone explain to the statistically naive what the difference between Multiple R-squared and Adjusted R-squared is? I am doing a single-variate regression analysis as follows:
        v.lm <- lm(epm ~ n_days, data=v)
        print(summary(v.lm))
    Results:
        Call:
        lm(formula = epm ~ n_days, data = v)
        Residuals:
            Min      1Q  Median      3Q     Max
        -693.59 -325.79   53.34  302.46  964.95
        Coefficients:
                    Estimate Std. Error t value Pr(>|t|)
        (Intercept)  2550.39      92.15  27.677   <2e-16 ***
        n_days        -13.12       5.39  -2.433   0.0216 *
        ---
        Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
        Residual standard error: 410.1 on 28 degrees of freedom
        Multiple R-squared: 0.1746, Adjusted R-squared: 0.1451
        F-statistic: 5.921 on 1 and 28 DF, p-value: 0.0216
    Apologies for the newbiness of this question.
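    As a worked check of how the two numbers in this output relate (standard formula; n = 30 observations and p = 1 predictor, both read off the 28 degrees of freedom above):
        r2 <- 0.1746; n <- 30; p <- 1
        1 - (1 - r2) * (n - 1) / (n - p - 1)   # 0.1451, the Adjusted R-squared shown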

    Read the article

  • Testing complex entities

    - by Carlos
    I've got a C# form with various controls on it. The form controls an ongoing process, and there are many, many aspects that need to be right for the program to run correctly. Each part can be unit tested (for instance, loading some coefficients, drawing some diagnostics), but I often run into problems that are best described with an example: "If I click here, then here, then change this, then re-open the form, then click here, it crashes or produces an error." I've tried my best to use common code organisation ideas (inheritance, DRY, separation of concerns), but there never seems to be a way to test every single path, and inevitably a form with several controls will have a huge number of ways to execute. What can I read (preferably online) that addresses this kind of issue, and is there a (non-generic) term for it? This isn't a specific problem I'm having, but one that creeps up on me, especially with WinForms.

    Read the article

  • how to apply Discrete wavelet transform on image

    - by abuasis
    I am implementing an Android application that will verify signature images, and I decided to go with the Discrete Wavelet Transform method (Symmlet-8). The method requires applying the discrete wavelet transform, separating the image using a low-pass and a high-pass filter, and retrieving the wavelet transform coefficients. The equations use notation that I can't understand, so I can't do the math easily, and I also don't know how to apply the low-pass and high-pass filters to my x and y points. Is there any tutorial that shows how to apply the discrete wavelet transform to my image, broken out in plain numbers? Thanks a lot in advance.

    Read the article

  • matlab fit exp2

    - by HelloWorld
    I'm unsuccessfully looking for documentation of the fit function using 'exp2' (a sum of 2 exponentials). How to operate the function is clear:
        [curve, gof] = fit(x, y, 'exp2');
    But since there are multiple ways to fit a sum of exponentials, I'm trying to find out what algorithm is used. Particularly, what happens when I fit one exponential (the raw data) with a bit of noise: how is the weight spread across the two exponents? I've simulated several cases, and it seems that it "drops" all the weight on the second set of coefficients, but raw data analysis often shows different behavior. Does anyone have suggestions for documentation?
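    The behavior is easy to reproduce outside MATLAB; a heavily hedged sketch in R using nls (which is not necessarily fit's internal algorithm): when the data really contain only one exponential, the two-term model is not identifiable, so where the weight lands depends on the starting values, and the fit may fail outright:
        set.seed(1)
        x <- seq(0, 5, length.out = 200)
        y <- 3 * exp(-1.2 * x) + rnorm(200, sd = 0.02)   # one true exponent plus noise
        f <- try(nls(y ~ a * exp(b * x) + c * exp(d * x),
                     start = list(a = 2, b = -1, c = 1, d = -2)),
                 silent = TRUE)                   # may not converge, which is the point
        if (!inherits(f, "try-error")) coef(f)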

    Read the article

  • Is there a standard practice for storing default application data?

    - by Rox Wen
    Our application includes a default set of data. The default data includes coefficients and other factors that are unlikely to ever change but still need to be updatable by the user. Currently, the original default data is stored as a populated class within the application. Data updates are stored to an external XML file. This design allows us to include a "reset" feature to restore the original default data. Our rationale for not storing the defaults externally [e.g. in an XML file] was to minimize the risk of their being altered. The overall volume of data doesn't warrant a database. Is there a standard practice for storing "default" application data?

    Read the article

  • Spatial domain to frequency domain

    - by John Elway
    I know about Fourier transforms, but I don't know how to apply them here, and I think that may be over the top. I've given my ideas of the responses, but I really don't know what I'm looking for... Suppose that you form a low-pass spatial filter h(x,y) that averages all eight immediate neighbors of a pixel (x,y) but excludes the pixel itself. a. Find the equivalent frequency domain filter H(u,v). My answer to (a):
        1/8*H(u-1, v-1) + 1/8*H(u-1, v) + 1/8*H(u-1, v+1) +
        1/8*H(u, v-1)   + 0             + 1/8*H(u, v+1)   +
        1/8*H(u+1, v-1) + 1/8*H(u+1, v) + 1/8*H(u+1, v+1)
    Is this the frequency domain? b. Show that your result is again a low-pass filter. Does this have to do with the coefficients being positive?
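    For reference, a worked sketch of (a) under the usual DTFT convention H(u,v) = sum over (m,n) of h(m,n) e^{-j(um+vn)}: the eight taps of 1/8 give H(u,v) = (1/8)[2cos(u) + 2cos(v) + 4cos(u)cos(v)], since the four diagonal taps combine as 2cos(u+v) + 2cos(u-v) = 4cos(u)cos(v). For (b): H(0,0) = 1 while, for example, H(pi,pi) = 0, so the response is largest at DC and attenuates high frequencies, which is low-pass behavior. (The attempted answer above is still the spatial-domain average written out, not its frequency response.)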

    Read the article

  • A Taxonomy of Numerical Methods v1

    - by JoshReuben
    Numerical Analysis – When, What, (but not how). Once you understand the math & know C++, numerical methods are basically blocks of iterative & conditional math code. I found the real trick was seeing the forest for the trees – knowing which method to use for which situation. It's pretty easy to get lost in the details, so I've tried to organize these methods in a way that I can quickly look them up. I've included links to detailed explanations and to C++ code examples. I've tried to classify numerical methods in the following broad categories: Solving Systems of Linear Equations; Solving Non-Linear Equations Iteratively; Interpolation; Curve Fitting; Optimization; Numerical Differentiation & Integration; Solving ODEs; Boundary Problems; Solving Eigenvalue Problems. Enjoy – I did!
    Solving Systems of Linear Equations
    Overview: Solve sets of algebraic equations with x unknowns. The set is commonly in matrix form.
    Gauss-Jordan Elimination: http://en.wikipedia.org/wiki/Gauss%E2%80%93Jordan_elimination (C++: http://www.codekeep.net/snippets/623f1923-e03c-4636-8c92-c9dc7aa0d3c0.aspx)
    Produces the solution of the equations & the coefficient matrix. Efficient, stable. 2 steps:
    · Forward Elimination – matrix decomposition: reduce the set to triangular form (0s below the diagonal) or row echelon form. If degenerate, then there is no solution.
    · Backward Elimination – write the original matrix as the product of its inverse matrix & its reduced row-echelon matrix -> reduce the set to row canonical form & use back-substitution to find the solution to the set.
    Elementary ops for matrix decomposition:
    · Row multiplication
    · Row switching
    · Add multiples of rows to other rows
    Use pivoting to ensure rows are ordered for achieving triangular form.
    LU Decomposition: http://en.wikipedia.org/wiki/LU_decomposition (C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-lu-decomposition-for-solving.html)
    Represent the matrix as a product of lower & upper triangular matrices. A modified version of GJ Elimination. Advantage – can easily apply forward & backward elimination to solve triangular matrices. Techniques:
    · Doolittle Method – sets the L matrix diagonal to unity
    · Crout Method – sets the U matrix diagonal to unity
    Note: both the L & U matrices share the same unity diagonal & can be stored compactly in the same matrix.
    Gauss-Seidel Iteration: http://en.wikipedia.org/wiki/Gauss%E2%80%93Seidel_method (C++: http://www.nr.com/forum/showthread.php?t=722)
    Transform the linear set of equations into a single equation & then use numerical integration (as integration formulas have sums, it is implemented iteratively). An optimization of Gauss-Jacobi: 1.5 times faster, requires a quarter of the iterations to achieve the same tolerance.
    Solving Non-Linear Equations Iteratively
    Overview: Find roots of polynomials – there may be 0, 1 or n solutions for an nth-order polynomial. Use iterative techniques. Iterative methods:
    · used when there are no known analytical techniques
    · require the functions to be continuous & differentiable
    · require an initial seed value – the choice is critical to convergence -> conduct multiple runs with different starting points & then select the best result
    · systematic – iterate until diminishing returns, tolerance or max iteration conditions are met
    · bracketing techniques will always yield convergent solutions; non-bracketing methods may fail to converge
    Incremental method: if a nonlinear function has opposite signs at the 2 ends of a small interval x1 & x2, then there is likely to be a solution in their interval – solutions are detected by evaluating the function over interval steps, looking for a change in sign and adjusting the step size dynamically. Limitations – can miss closely spaced solutions in large intervals, cannot detect degenerate (coinciding) solutions, limited to functions that cross the x-axis, gives false positives for singularities.
    Fixed Point method: http://en.wikipedia.org/wiki/Fixed-point_iteration (C++: http://books.google.co.il/books?id=weYj75E_t6MC&pg=PA79&lpg=PA79&dq=fixed+point+method++c%2B%2B&source=bl&ots=LQ-5P_taoC&sig=lENUUIYBK53tZtTwNfHLy5PEWDk&hl=en&sa=X&ei=wezDUPW1J5DptQaMsIHQCw&redir_esc=y#v=onepage&q=fixed%20point%20method%20%20c%2B%2B&f=false)
    Algebraically rearrange a solution to isolate a variable, then apply the incremental method.
    Bisection method: http://en.wikipedia.org/wiki/Bisection_method (C++: http://numericalcomputing.wordpress.com/category/algorithms/)
    Bracketed – select an initial interval, keep bisecting it at the midpoint into sub-intervals, and then apply the incremental method on smaller & smaller intervals – zoom in. Adv: unaffected by the function gradient -> reliable. Disadv: slow convergence.
    False Position Method: http://en.wikipedia.org/wiki/False_position_method (C++: http://www.dreamincode.net/forums/topic/126100-bisection-and-false-position-methods/)
    Bracketed – select an initial interval & use the relative value of the function at the interval end points to select the next sub-intervals (estimate how far between the end points the solution might be & subdivide based on this).
    Newton-Raphson method: http://en.wikipedia.org/wiki/Newton's_method (C++: http://www-users.cselabs.umn.edu/classes/Summer-2012/csci1113/index.php?page=./newt3)
    Also known as Newton's method. Convenient, efficient. Not bracketed – only a single initial guess is required to start the iteration – requires an analytical expression for the first derivative of the function as input. Evaluates the function & its derivative at each step. Can be extended to the Newton MultiRoot method for solving multiple roots. Can be easily applied to a set of n coupled non-linear equations – conduct a Taylor series expansion of a function, dropping terms of order n, rewrite as a Jacobian matrix of PDs & convert to simultaneous linear equations!
    Secant Method: http://en.wikipedia.org/wiki/Secant_method (C++: http://forum.vcoderz.com/showthread.php?p=205230)
    Unlike N-R, can estimate the first derivative from an initial interval (does not require the root to be bracketed) instead of inputting it. Since the derivative is approximated, it may converge slower; it is fast in practice as it does not have to evaluate the derivative at each step. Similar implementation to the False Position method.
    Birge-Vieta Method: http://mat.iitm.ac.in/home/sryedida/public_html/caimna/transcendental/polynomial%20methods/bv%20method.html (C++: http://books.google.co.il/books?id=cL1boM2uyQwC&pg=SA3-PA51&lpg=SA3-PA51&dq=Birge-Vieta+Method+c%2B%2B&source=bl&ots=QZmnDTK3rC&sig=BPNcHHbpR_DKVoZXrLi4nVXD-gg&hl=en&sa=X&ei=R-_DUK2iNIjzsgbE5ID4Dg&redir_esc=y#v=onepage&q=Birge-Vieta%20Method%20c%2B%2B&f=false)
    Combines Horner's method of polynomial evaluation (transforming into lesser degree polynomials that are more computationally efficient to process) with Newton-Raphson to provide a computational speed-up.
    Interpolation
    Overview: Construct new data points for as close as possible a fit within the range of a discrete set of known points (that were obtained via sampling or experimentation). Use a Taylor series expansion of a function f(x) around a specific value for x.
    Linear Interpolation: http://en.wikipedia.org/wiki/Linear_interpolation (C++: http://www.hamaluik.com/?p=289)
    Straight line between 2 points -> concatenate interpolants between each pair of data points.
    Bilinear Interpolation: http://en.wikipedia.org/wiki/Bilinear_interpolation (C++: http://supercomputingblog.com/graphics/coding-bilinear-interpolation/2/)
    Extension of the linear function for interpolating functions of 2 variables – perform linear interpolation first in one direction, then in the other. Used in image processing – e.g. texture mapping filter. Uses 4 vertices to interpolate a value within a unit cell.
    Lagrange Interpolation: http://en.wikipedia.org/wiki/Lagrange_polynomial (C++: http://www.codecogs.com/code/maths/approximation/interpolation/lagrange.php)
    For polynomials. Requires recomputation of all terms for each distinct x value – can only be applied for a small number of nodes. Numerically unstable.
    Barycentric Interpolation: http://epubs.siam.org/doi/pdf/10.1137/S0036144502417715 (C++: http://www.gamedev.net/topic/621445-barycentric-coordinates-c-code-check/)
    Rearrange the terms in the equation of the Lagrange interpolation by defining weight functions that are independent of the interpolated value of x.
    Newton Divided Difference Interpolation: http://en.wikipedia.org/wiki/Newton_polynomial (C++: http://jee-appy.blogspot.co.il/2011/12/newton-divided-difference-interpolation.html)
    Hermite divided differences: interpolation polynomial approximation for a given set of data points in the NR form – divided differences are used to approximately calculate the various differences. For a given set of 3 data points, fit a quadratic interpolant through the data. Bracketed functions allow Newton divided differences to be calculated recursively, via a difference table.
    Cubic Spline Interpolation: http://en.wikipedia.org/wiki/Spline_interpolation (C++: https://www.marcusbannerman.co.uk/index.php/home/latestarticles/42-articles/96-cubic-spline-class.html)
    A spline is a piecewise polynomial. Provides smoothness – for interpolations with significantly varying data. Use weighted coefficients to bend the function so that it is smooth & its 1st & 2nd derivatives are continuous through the edge points in the interval.
    Curve Fitting
    A generalization of interpolating whereby the given data points may contain noise -> the curve does not necessarily pass through all the points.
    Least Squares Fit: http://en.wikipedia.org/wiki/Least_squares (C++: http://www.ccas.ru/mmes/educat/lab04k/02/least-squares.c)
    Residual – difference between observed value & expected value. The model function is often chosen as a linear combination of the specified functions. Determines: A) the model instance in which the sum of squared residuals has the least value; B) the param values for which the model best fits the data.
    Straight Line Fit: linear correlation between the independent variable and the dependent variable.
    Linear Regression: http://en.wikipedia.org/wiki/Linear_regression (C++: http://www.oocities.org/david_swaim/cpp/linregc.htm)
    Special case of statistically exact extrapolation. Leverages least squares. Given a basis function, the sum of the residuals is determined and the corresponding gradient equation is expressed as a set of normal linear equations in matrix form that can be solved (e.g. using LU Decomposition). Can be weighted – drop the assumption that all errors have the same significance -> the confidence of accuracy is different for each data point. Fit the function closer to points with higher weights.
    Polynomial Fit – use a polynomial basis function.
    Moving Average: http://en.wikipedia.org/wiki/Moving_average (C++: http://www.codeproject.com/Articles/17860/A-Simple-Moving-Average-Algorithm)
    Used for smoothing (cancel fluctuations to highlight longer-term trends & cycles), time series data analysis, signal processing filters. Replace each data point with the average of its neighbors. Can be simple (SMA), weighted (WMA), exponential (EMA). Lags behind the latest data points – extra weight can be given to more recent data points. Weights can decrease arithmetically or exponentially according to distance from the point. Parameters: smoothing factor, period, weight basis.
    Optimization
    Overview: Given a function with multiple variables, find the min (or the max, by minimizing -f(x)). Iterative approach. Efficient, but not necessarily reliable. Conditions: noisy data, constraints, non-linear models. Detection via the sign of the first derivative – the derivative at saddle points will be 0.
    Local minima:
    Bisection method – similar to the method for finding a root of a non-linear equation. Start with an interval that contains a minimum.
    Golden Search method: http://en.wikipedia.org/wiki/Golden_section_search (C++: http://www.codecogs.com/code/maths/optimization/golden.php)
    Bisect intervals according to the golden ratio 0.618... Achieves the reduction with a single function evaluation instead of 2.
    Newton-Raphson Method (as above).
    Brent method: http://en.wikipedia.org/wiki/Brent's_method (C++: http://people.sc.fsu.edu/~jburkardt/cpp_src/brent/brent.cpp)
    Based on quadratic or parabolic interpolation – if the function is smooth & parabolic near the minimum, then a parabola fitted through any 3 points should approximate the minima – fails when the 3 points are collinear, in which case the denominator is 0.
    Simplex Method: http://en.wikipedia.org/wiki/Simplex_algorithm (C++: http://www.codeguru.com/cpp/article.php/c17505/Simplex-Optimization-Algorithm-and-Implemetation-in-C-Programming.htm)
    Find the global minima of any multi-variable function. Direct search – no derivatives required. At each step it maintains a non-degenerate simplex – a convex hull of n+1 vertices. Obtains the minimum for a function with n variables by evaluating the function at n-1 points, iteratively replacing the point of worst result with the point of best result, shrinking the multidimensional simplex around the best point. Point replacement involves expanding & contracting the simplex near the worst value point to determine a better replacement point. Oscillation can be avoided by choosing the 2nd worst result. Restart if it gets stuck. Parameters: contraction & expansion factors.
    Simulated Annealing: http://en.wikipedia.org/wiki/Simulated_annealing (C++: http://code.google.com/p/cppsimulatedannealing/)
    Analogy to heating & cooling metal to strengthen its structure. Stochastic method – apply random permutation search for global minima – avoid entrapment in local minima via hill climbing. Heating schedule – annealing schedule params: temperature, iterations at each temp, temperature delta. Cooling schedule – can be linear, step-wise or exponential.
    Differential Evolution: http://en.wikipedia.org/wiki/Differential_evolution (C++: http://www.amichel.com/de/doc/html/)
    More advanced stochastic methods analogous to biological processes: genetic algorithms, evolution strategies. Parallel direct search method against multiple discrete or continuous variables. Initial population of variable vectors chosen randomly – if the weighted difference vector of 2 vectors yields a lower objective function value then it replaces the comparison vector. Many params: #parents, #variables, step size, crossover constant etc. Convergence is slow – many more function evaluations than simulated annealing.
    Numerical Differentiation
    Overview: 2 approaches to finite difference methods:
    · A) approximate the function via polynomial interpolation, then differentiate
    · B) Taylor series approximation – additionally provides an error estimate
    Finite Difference methods: http://en.wikipedia.org/wiki/Finite_difference_method (C++: http://www.wpi.edu/Pubs/ETD/Available/etd-051807-164436/unrestricted/EAMPADU.pdf)
    Find differences between high order derivative values – approximate differential equations by finite differences at evenly spaced data points. Based on forward & backward Taylor series expansions of f(x) about x plus or minus multiples of delta h. Forward/backward difference – the sums of the series contain even derivatives and the differences of the series contain odd derivatives – coupled equations that can be solved. Provide an approximation of the derivative within O(h^2) accuracy. There is also central difference & extended central difference, which have O(h^4) accuracy.
    Richardson Extrapolation: http://en.wikipedia.org/wiki/Richardson_extrapolation (C++: http://mathscoding.blogspot.co.il/2012/02/introduction-richardson-extrapolation.html)
    A sequence acceleration method applied to finite differences. Fast convergence, high accuracy O(h^4).
    Derivatives via Interpolation: cannot apply the Finite Difference method to discrete data points at uneven intervals – so approximate the derivative of f(x) using the derivative of the interpolant via 3-point Lagrange interpolation. Note: the higher the order of the derivative, the lower the approximation precision.
    Numerical Integration
    Estimate finite & infinite integrals of functions. A more accurate procedure than numerical differentiation. Use when it is not possible to obtain an integral of a function analytically, or when the function is not given and only the data points are.
    Newton-Cotes Methods: http://en.wikipedia.org/wiki/Newton%E2%80%93Cotes_formulas (C++: http://www.siafoo.net/snippet/324)
    For equally spaced data points. Computationally easy – based on local interpolation of n rectangular strip areas that are piecewise fitted to a polynomial to get the sum total area. Evaluate the integrand at n+1 evenly spaced points – approximate the definite integral by a sum. Weights are derived from Lagrange basis polynomials. Leverage the Trapezoidal Rule for the default 2-point formulas, the Simpson 1/3 Rule for 3-point formulas, the Simpson 3/8 Rule for 4-point formulas and Bode's Rule for 5-point formulas. Higher orders obtain more accurate results. The Trapezoidal Rule uses simple area; Simpson's Rule replaces the integrand f(x) with a quadratic polynomial p(x) that uses the same values as f(x) for its end points, but adds a midpoint.
    Romberg Integration: http://en.wikipedia.org/wiki/Romberg's_method (C++: http://code.google.com/p/romberg-integration/downloads/detail?name=romberg.cpp&can=2&q=)
    Combines the trapezoidal rule with Richardson Extrapolation. Evaluates the integrand at equally spaced points. The integrand must have continuous derivatives. Each R(n,m) extrapolation uses a higher order integrand polynomial replacement rule (the zeroth starts with trapezoidal) -> a lower triangular matrix set of equation coefficients where the bottom right term has the most accurate approximation. The process continues until the difference between 2 successive diagonal terms becomes sufficiently small.
    Gaussian Quadrature: http://en.wikipedia.org/wiki/Gaussian_quadrature (C++: http://www.alglib.net/integration/gaussianquadratures.php)
    Data points are chosen to yield the best possible accuracy – requires fewer evaluations. Ability to handle singularities and functions that are difficult to evaluate. The integrand can include a weighting function determined by a set of orthogonal polynomials. Points & weights are selected so that the integrand yields the exact integral if f(x) is a polynomial of degree <= 2n+1. Techniques (basically different weighting functions):
    · Gauss-Legendre Integration: w(x) = 1
    · Gauss-Laguerre Integration: w(x) = e^-x
    · Gauss-Hermite Integration: w(x) = e^-x^2
    · Gauss-Chebyshev Integration: w(x) = 1 / sqrt(1-x^2)
    Solving ODEs
    Use when high order differential equations cannot be solved analytically. Evaluated under boundary conditions. RK for systems – a high order differential equation can always be transformed into a coupled first order system of equations.
    Euler method: http://en.wikipedia.org/wiki/Euler_method (C++: http://rosettacode.org/wiki/Euler_method)
    First order Runge-Kutta method. Simple recursive method – given an initial value, calculate derivative deltas. Unstable & not very accurate (O(h) error) – not used in practice. A first-order method – the local error (truncation error per step) is proportional to the square of the step size, and the global error (error at a given time) is proportional to the step size. In evolving the solution between data points xn & xn+1, it only evaluates derivatives at the beginning of the interval xn -> asymmetric at boundaries.
    Higher order Runge-Kutta: http://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods (C++: http://www.dreamincode.net/code/snippet1441.htm)
    2nd & 4th order RK – introduce parameterized midpoints for more symmetric solutions -> accuracy at higher computational cost. Adaptive RK – RK-Fehlberg – estimate the truncation error at each integration step & automatically adjust the step size to keep the error within prescribed limits. At each step 2 approximations are compared – if they disagree to a specified accuracy, the step size is reduced.
    Boundary Value Problems
    Where solutions of the differential equations are located at 2 different values of the independent variable x -> more difficult, because one cannot just start at the point of initial value – there may not be enough starting conditions available at the end points to produce a unique solution. An n-order equation will require n boundary conditions – need to determine the missing n-1 conditions which cause the given conditions at the other boundary to be satisfied.
    Shooting Method: http://en.wikipedia.org/wiki/Shooting_method (C++: http://ganeshtiwaridotcomdotnp.blogspot.co.il/2009/12/c-c-code-shooting-method-for-solving.html)
    Iteratively guess the missing values for one end & integrate, then inspect the discrepancy with the boundary values of the other end to adjust the estimate. Given the starting boundary values u1 & u2 which contain the root u, solve for u with the false position method (solving the differential equation as an initial value problem via 4th order RK), then use u to solve the differential equations.
    Finite Difference Method: for linear & non-linear systems. Higher order derivatives require more computational steps – some combinations of boundary conditions may not work though. Improve the accuracy by increasing the number of mesh points.
    Solving Eigenvalue Problems
    An eigenvalue can substitute for a matrix when doing matrix multiplication -> convert matrix multiplication into a polynomial eigenvalue problem. For a given set of equations in matrix form, determine what the solution eigenvalue(s) & eigenvectors are. Similar matrices have the same eigenvalues. Use orthogonal similarity transforms to reduce a matrix to diagonal form, from which eigenvalue(s) & eigenvectors can be computed iteratively.
    Jacobi method: http://en.wikipedia.org/wiki/Jacobi_method (C++: http://people.sc.fsu.edu/~jburkardt/classes/acs2_2008/openmp/jacobi/jacobi.html)
    Robust but computationally intense – use for small matrices < 10x10.
    Power Iteration: http://en.wikipedia.org/wiki/Power_iteration
    For any given real symmetric matrix, generate the largest single eigenvalue & its eigenvectors. Simplest method – does not compute a matrix decomposition -> suitable for large, sparse matrices.
    Inverse Iteration: variation of the power iteration method – generates the smallest eigenvalue from the inverse matrix.
    Rayleigh Method: http://en.wikipedia.org/wiki/Rayleigh's_method_of_dimensional_analysis
    Variation of the power iteration method.
    Rayleigh Quotient Method: variation of the inverse iteration method.
    Matrix Tri-diagonalization Method: use the Householder algorithm to reduce an NxN symmetric matrix to a tridiagonal real symmetric matrix via N-2 orthogonal transforms.
    What's Next
    Outside of numerical methods there are lots of different types of algorithms that I've learned over the decades:
    · Data Mining (I covered this briefly in a previous post: http://geekswithblogs.net/JoshReuben/archive/2007/12/31/ssas-dm-algorithms.aspx )
    · Search & Sort
    · Routing
    · Problem Solving
    · Logical Theorem Proving
    · Planning
    · Probabilistic Reasoning
    · Machine Learning
    · Solvers (eg MIP)
    · Bioinformatics (Sequence Alignment, Protein Folding)
    · Quant Finance (I read Wilmott's books – interesting)
    Sooner or later, I'll cover the above topics as well.
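    Since the post links C++ snippets throughout, here is a minimal, language-neutral sketch (written in R) of the Newton-Raphson pattern that much of the root-finding section revolves around:
        newton <- function(f, fp, x0, tol = 1e-12, maxit = 50) {
          x <- x0
          for (i in seq_len(maxit)) {
            step <- f(x) / fp(x)      # requires the analytic first derivative fp
            x <- x - step
            if (abs(step) < tol) break
          }
          x
        }
        newton(function(x) x^2 - 2, function(x) 2 * x, 1)   # 1.414214 ~ sqrt(2)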

    Read the article

  • optimization math computation (multiplication and summing)

    - by wiso
    Suppose you want to compute the sum of the squares of the differences of items: $\sum_{i=1}^{N-1} (x_i - x_{i+1})^2$. The simplest code (the input is std::vector<double> xs, the output is sum2) is:
        double sum2 = 0.;
        double prev = xs[0];
        for (vector<double>::const_iterator i = xs.begin() + 1; i != xs.end(); ++i) {
            sum2 += (prev - (*i)) * (prev - (*i)); // only 1 '-' with compiler optimization
            prev = (*i);
        }
    I hope that the compiler does the optimization in the comment above. If N is the length of xs you have N-1 multiplications and 2N-3 sums (where a "sum" means + or -). Now suppose you know this variable: $sum = x_1^2 + x_N^2 + 2 \sum_{i=2}^{N-1} x_i^2$. Expanding the binomial square: $\sum_{i=1}^{N-1} (x_i - x_{i+1})^2 = sum - 2 \sum_{i=1}^{N-1} x_i x_{i+1}$, so the code becomes:
        double sum2 = 0.;
        double prev = xs[0];
        for (vector<double>::const_iterator i = xs.begin() + 1; i != xs.end(); ++i) {
            sum2 += (*i) * prev;
            prev = (*i);
        }
        sum2 = -sum2 * 2. + sum;
    Here I have N multiplications and N-1 additions. In my case N is about 100. Well, compiling with g++ -O2 I got no speed up (I tried calling the inlined function 2M times). Why?
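    A quick numeric check of the rearrangement (R used purely as a calculator here):
        set.seed(42)
        xs <- rnorm(100); N <- length(xs)
        direct <- sum(diff(xs)^2)                         # sum of (x_i - x_{i+1})^2
        s <- xs[1]^2 + xs[N]^2 + 2 * sum(xs[2:(N - 1)]^2)
        s - 2 * sum(xs[-N] * xs[-1]) - direct             # ~0, so the identity holds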

    Read the article

  • Foreign-key-like merge in R

    - by skyl
    I'm merging a bunch of csv files with 1 row per id/pk/seqn.
        > full = merge(demo, lab13am, by="seqn", all=TRUE)
        > full = merge(full, cdq, by="seqn", all=TRUE)
        > full = merge(full, mcq, by="seqn", all=TRUE)
        > full = merge(full, cfq, by="seqn", all=TRUE)
        > full = merge(full, diq, by="seqn", all=TRUE)
        > print(length(full$ridageyr))
        [1] 9965
        > print(summary(full$ridageyr))
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
           0.00   11.00   19.00   29.73   48.00   85.00
    Everything is great. But I have another file which has multiple rows per id, like:
        "seqn","rxd030","rxd240b","nhcode","rxq250"
        56,2,"","",NA,NA,""
        57,1,"ACETAMINOPHEN","01200",2
        57,1,"BUDESONIDE","08800",1
        58,1,"99999","",NA
    57 has two rows. So if I naively try to merge this file, I have a ton more rows and my data gets all skewed up.
        > full = merge(full, rxq, by="seqn", all=TRUE)
        > print(length(full$ridageyr))
        [1] 15643
        > print(summary(full$ridageyr))
           Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
           0.00   14.00   41.00   40.28   66.00   85.00
    Is there a normal idiomatic way to deal with data like this? Suppose I want a way to make a simple model like:
        MYSPECIAL_FACTOR <- somehow()
        glm(MYSPECIAL_FACTOR ~ full$ridageyr, family=binomial)
    where MYSPECIAL_FACTOR is, say, whether or not rxd240b == "ACETAMINOPHEN" for the observations which are unique by seqn. You can reproduce by running the first bit of this.
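    One idiomatic route (a sketch, with the handling of the rxq columns assumed from the snippet above) is to collapse the many-rows-per-seqn file down to one row per seqn before merging, so the join stays one-to-one:
        # TRUE if any prescription row for that seqn is acetaminophen
        acet <- aggregate(rxd240b ~ seqn, data = rxq,
                          FUN = function(v) any(v == "ACETAMINOPHEN"))
        names(acet)[2] <- "acetaminophen"
        full2 <- merge(full, acet, by = "seqn", all.x = TRUE)
        glm(acetaminophen ~ ridageyr, family = binomial, data = full2)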

    Read the article

  • Accurate least-squares fit algorithm needed

    - by ggkmath
    I've experimented with the two ways of implementing a least-squares fit (LSF) algorithm shown here. The first code is simply the textbook approach, as described by Wolfram's page on LSF. The second code re-arranges the equation to minimize machine errors. Both codes produce similar results for my data. I compared these results with Matlab's p=polyfit(x,y,1) function, using correlation coefficients to measure the "goodness" of fit and compare each of the 3 routines. I observed that while all 3 methods produced good results, at least for my data, Matlab's routine had the best fit (the other 2 routines had similar results to each other). Matlab's p=polyfit(x,y,1) function uses a Vandermonde matrix, V (an n x 2 matrix), and QR factorization to solve the least-squares problem. In Matlab code, it looks like:
        V = [x1,1; x2,1; x3,1; ... xn,1]  % this line is pseudo-code
        [Q,R] = qr(V,0);
        p = R\(Q'*y);                     % performs same as p = V\y
    I'm not a mathematician, so I don't understand why it would be more accurate. Although the difference is slight, in my case I need to obtain the slope from the LSF and multiply it by a large number, so any improvement in accuracy shows up in my results. For reasons I can't get into, I cannot use Matlab's routine in my work. So, I'm wondering if anyone has a more accurate equation-based approach recommendation I could use that is an improvement over the above two approaches, in terms of rounding errors/machine accuracy/etc. Any comments appreciated! Thanks in advance.
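    For comparison, the QR route is easy to reproduce outside Matlab; a sketch in R on toy data. The numerically careful point is that QR never forms the ill-conditioned normal-equations matrix V'V, which is where the textbook approach loses digits:
        x <- c(1, 2, 3, 4); y <- c(1.1, 1.9, 3.2, 3.9)   # toy data
        V <- cbind(x, 1)           # Vandermonde-style design matrix for a line
        p <- qr.coef(qr(V), y)     # slope and intercept via QR, as polyfit does
        p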

    Read the article

  • simple Stata program

    - by Cyrus S
    I am trying to write a simple program to combine coefficient and standard error estimates from a set of regression fits. I run, say, 5 regressions, and store the coefficient(s) and standard error(s) of interest into vectors (Stata matrix objects, actually). Then, I need to do the following: 1. Find the mean value of the coefficient estimates. 2. Combine the standard error estimates according to the formula suggested for combining results from "multiple imputation". The formula is the square root of the formula for "T" on page 6 of the following document: http://bit.ly/b05WX3 I have written Stata code that does this once, but I want to write this as a function (or "program", in Stata speak) that takes as arguments the vector (or matrix, if possible, to combine multiple estimates at once) of regression coefficient estimates and the vector (or matrix) of corresponding standard error estimates, and then generates 1 and 2 above. Here is the code that I wrote (breg is a 1x5 vector of the regression coefficient estimates, and sereg is a 1x5 vector of the associated standard error estimates):
        mat ones = (1,1,1,1,1)
        mat bregmean = (1/5)*(ones*breg')
        scalar bregmean_s = bregmean[1,1]
        mat seregmean = (1/5)*(ones*sereg')
        mat seregbtv = (1/4)*(breg - bregmean#ones)*(breg - bregmean#ones)'
        mat varregmi = (1/5)*(sereg*sereg') + (1+(1/5))*seregbtv
        scalar varregmi_s = varregmi[1,1]
        scalar seregmi = sqrt(varregmi_s)
        disp bregmean_s
        disp seregmi
    This gives the right answer for a single instance. Any pointers would be great! UPDATE: I completed the code for combining estimates in a kXm matrix of coefficients/parameters (k is the number of parameters, m the number of imputations). Code can be found here: http://bit.ly/cXJRw1 Thanks to Tristan and Gabi for the pointers.
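    The combination rule is compact enough to state generically; a sketch in R (not Stata) of Rubin's rules for a vector b of m point estimates and a vector se of their standard errors:
        pool <- function(b, se) {
          m    <- length(b)
          qbar <- mean(b)                   # 1. mean of the coefficient estimates
          ubar <- mean(se^2)                # within-imputation variance
          bvar <- var(b)                    # between-imputation variance
          tvar <- ubar + (1 + 1/m) * bvar   # the "T" of the cited formula
          c(est = qbar, se = sqrt(tvar))    # 2. combined standard error
        }
        pool(c(1.2, 0.9, 1.1, 1.0, 1.3), c(0.30, 0.28, 0.31, 0.29, 0.33))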

    Read the article

  • Why isn't my operator overloading working properly?

    - by Mithrax
    I have the following Polynomial class I'm working on:
        #include <iostream>
        using namespace std;
        class Polynomial {
        private:
            int coef[100]; // array of coefficients:
                           // coef[0] holds the coefficient of x^0,
                           // coef[1] holds x^1, ... coef[n] holds x^n
            int deg;       // degree of polynomial (0 for the zero polynomial)
        public:
            Polynomial() { // default constructor
                for (int i = 0; i < 100; i++) coef[i] = 0;
            }
            void set(int a, int b) { // setter function: coefficient a on x^b
                //coef = new Polynomial[b+1];
                coef[b] = a;
                deg = degree();
            }
            int degree() {
                int d = 0;
                for (int i = 0; i < 100; i++)
                    if (coef[i] != 0) d = i;
                return d;
            }
            void print() {
                for (int i = 99; i >= 0; i--)
                    if (coef[i] != 0) cout << coef[i] << "x^" << i << " ";
            }
            // use Horner's method to compute and return the polynomial evaluated at x
            int evaluate(int x) {
                int p = 0;
                for (int i = deg; i >= 0; i--) p = coef[i] + (x * p);
                return p;
            }
            // differentiate this polynomial and return it
            Polynomial differentiate() {
                if (deg == 0) { Polynomial t; t.set(0, 0); return t; }
                Polynomial deriv; // = new Polynomial(0, deg - 1);
                deriv.deg = deg - 1;
                for (int i = 0; i < deg; i++) deriv.coef[i] = (i + 1) * coef[i + 1];
                return deriv;
            }
            Polynomial operator + (Polynomial b) {
                Polynomial a = *this; // a is the poly on the L.H.S.
                Polynomial c;
                for (int i = 0; i <= a.deg; i++) c.coef[i] += a.coef[i];
                for (int i = 0; i <= b.deg; i++) c.coef[i] += b.coef[i];
                c.deg = c.degree();
                return c;
            }
            Polynomial operator += (Polynomial b) {
                Polynomial a = *this; // a is the poly on the L.H.S.
                Polynomial c;
                for (int i = 0; i <= a.deg; i++) c.coef[i] += a.coef[i];
                for (int i = 0; i <= b.deg; i++) c.coef[i] += b.coef[i];
                c.deg = c.degree();
                for (int i = 0; i < 100; i++) a.coef[i] = c.coef[i];
                a.deg = a.degree();
                return a;
            }
            Polynomial operator -= (Polynomial b) {
                Polynomial a = *this; // a is the poly on the L.H.S.
                Polynomial c;
                for (int i = 0; i <= a.deg; i++) c.coef[i] += a.coef[i];
                for (int i = 0; i <= b.deg; i++) c.coef[i] -= b.coef[i];
                c.deg = c.degree();
                for (int i = 0; i < 100; i++) a.coef[i] = c.coef[i];
                a.deg = a.degree();
                return a;
            }
            Polynomial operator *= (Polynomial b) {
                Polynomial a = *this; // a is the poly on the L.H.S.
                Polynomial c;
                for (int i = 0; i <= a.deg; i++)
                    for (int j = 0; j <= b.deg; j++)
                        c.coef[i+j] += (a.coef[i] * b.coef[j]);
                c.deg = c.degree();
                for (int i = 0; i < 100; i++) a.coef[i] = c.coef[i];
                a.deg = a.degree();
                return a;
            }
            Polynomial operator - (Polynomial b) {
                Polynomial a = *this; // a is the poly on the L.H.S.
                Polynomial c;
                for (int i = 0; i <= a.deg; i++) c.coef[i] += a.coef[i];
                for (int i = 0; i <= b.deg; i++) c.coef[i] -= b.coef[i];
                c.deg = c.degree();
                return c;
            }
            Polynomial operator * (Polynomial b) {
                Polynomial a = *this; // a is the poly on the L.H.S.
                Polynomial c;
                for (int i = 0; i <= a.deg; i++)
                    for (int j = 0; j <= b.deg; j++)
                        c.coef[i+j] += (a.coef[i] * b.coef[j]);
                c.deg = c.degree();
                return c;
            }
        };
        int main() {
            Polynomial a, b, c, d;
            a.set(7, 4);  // 7x^4
            a.set(1, 2);  // x^2
            b.set(6, 3);  // 6x^3
            b.set(-3, 2); // -3x^2
            c = a - b;    // (7x^4 + x^2) - (6x^3 - 3x^2)
            a -= b;
            c.print(); cout << "\n";
            a.print(); cout << "\n";
            c = a * b;    // (7x^4 + x^2) * (6x^3 - 3x^2)
            c.print(); cout << "\n";
            d = c.differentiate().differentiate();
            d.print(); cout << "\n";
            cout << c.evaluate(2); // substitute x with 2
            cin.get();
        }
    Now, I have the "-" operator overloaded and it works fine:
        Polynomial operator - (Polynomial b) {
            Polynomial a = *this; // a is the poly on the L.H.S.
            Polynomial c;
            for (int i = 0; i <= a.deg; i++) c.coef[i] += a.coef[i];
            for (int i = 0; i <= b.deg; i++) c.coef[i] -= b.coef[i];
            c.deg = c.degree();
            return c;
        }
    However, I'm having difficulty with my "-=" operator:
        Polynomial operator -= (Polynomial b) {
            Polynomial a = *this; // a is the poly on the L.H.S.
            Polynomial c;
            for (int i = 0; i <= a.deg; i++) c.coef[i] += a.coef[i];
            for (int i = 0; i <= b.deg; i++) c.coef[i] -= b.coef[i];
            c.deg = c.degree();
            // overwrite value of 'a' with the newly computed 'c' before returning 'a'
            for (int i = 0; i < 100; i++) a.coef[i] = c.coef[i];
            a.deg = a.degree();
            return a;
        }
    I just slightly modified my "-" operator method to overwrite the value in 'a' and return 'a', and just use the 'c' polynomial as a temp. I've put in some debug print statements and I confirm that at the time of computation, both c = a - b; and a -= b; are computed to the same value. However, when I go to print them, their results are different:
        Polynomial a, b;
        a.set(7, 4);  // 7x^4
        a.set(1, 2);  // x^2
        b.set(6, 3);  // 6x^3
        b.set(-3, 2); // -3x^2
        c = a - b;    // (7x^4 + x^2) - (6x^3 - 3x^2)
        a -= b;
        c.print(); cout << "\n";
        a.print(); cout << "\n";
    Result:
        7x^4 -6x^3 4x^2
        7x^4 1x^2
    Why are c = a - b and a -= b giving me different results when I go to print them?

    Read the article

  • Small-o(n^2) implementation of Polynomial Multiplication

    - by AlanTuring
    I'm having a little trouble with this problem listed at the back of my book. I'm currently in the middle of test prep, but I can't seem to locate anything regarding this in the book. Anyone got an idea? A real polynomial of degree n is a function of the form f(x) = a_n x^n + ... + a_1 x + a_0, where a_n, ..., a_1, a_0 are real numbers. In computational situations, such a polynomial is represented by a sequence of its coefficients (a_0, a_1, ..., a_n). Assuming that any two real numbers can be added/multiplied in O(1) time, design an o(n^2)-time algorithm to compute, given two real polynomials f(x) and g(x) both of degree n, the product h(x) = f(x)g(x). Your algorithm should **not** be based on the Fast Fourier Transform (FFT) technique. Please note it needs to be small-o(n^2), which means its complexity must be sub-quadratic. The obvious solution that I keep finding is indeed the FFT, but of course I can't use that. There is another method that I have found called convolution, where you take polynomial A to be a signal and polynomial B to be a filter; A passed through B yields a shifted signal that has been "smoothed" by A, and the resultant is A*B. This is supposed to work in O(n log n) time. Of course I am completely unsure of the implementation. If anyone has any ideas of how to achieve a small-o(n^2) implementation please do share, thanks.
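    One standard non-FFT answer is Karatsuba-style divide and conquer, which multiplies degree-n polynomials in O(n^log2(3)), about O(n^1.585); a sketch on coefficient vectors in R, zero-padding to a power-of-two length for simplicity:
        kara <- function(a, b) {                  # a, b of equal power-of-2 length
          N <- length(a)
          if (N == 1) return(a * b)
          h  <- N / 2
          a0 <- a[1:h]; a1 <- a[(h + 1):N]
          b0 <- b[1:h]; b1 <- b[(h + 1):N]
          z0 <- kara(a0, b0)
          z2 <- kara(a1, b1)
          z1 <- kara(a0 + a1, b0 + b1) - z0 - z2  # 3 half-size products instead of 4
          out <- numeric(2 * N - 1)
          out[1:(2 * h - 1)]       <- z0
          out[(h + 1):(3 * h - 1)] <- out[(h + 1):(3 * h - 1)] + z1
          out[(N + 1):(2 * N - 1)] <- out[(N + 1):(2 * N - 1)] + z2
          out
        }
        polymul <- function(a, b) {               # a, b: coefficients (a_0, a_1, ...)
          la <- length(a); lb <- length(b)
          N  <- 2^ceiling(log2(max(la, lb)))
          a  <- c(a, numeric(N - la)); b <- c(b, numeric(N - lb))
          kara(a, b)[1:(la + lb - 1)]
        }
        polymul(c(3, 0, 1), c(2, 5))   # (3 + x^2)(2 + 5x) -> 6 15 2 5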

    Read the article

  • Exponential regression : p-value and F significance

    - by Saravanan K
    I am new to statistics. I have a set of independent data and dependent data (X, Y) for which I would like to do an exponential regression and obtain its p-value and Significance F (I have already obtained R2 and the coefficients through mathematical calculation). What are the steps, starting from the (X, Y) data, to calculate those values mathematically? I spent a week on the internet studying this but was unable to find the right answer. Often exponential data, y = be^(mx), is first converted to linear data, ln y = mx + ln b. Then a linear regression is done on the converted data, obtaining its p-value etc. Assume we use a statistical tool such as Excel's Analysis ToolPak (Data Analysis : Regression); it will produce a result like the regression output shown in the original post's image (not reproduced here), and I believe the p-value and Significance F there represent the converted linear data and not the original exponential data. Questions: 1. What is the approach/steps used by Excel to get the p-value and Significance F value for the converted linear data shown in that output? It is not clear in their help page or website. 2. Can the p-value and Significance F be mathematically calculated for an exponential regression without using a statistical tool? Can you point me to the right link if this has been answered before?
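    A sketch of the linearization route in R (assumed to mirror what happens when you regress ln y on x in Excel; the reported p-values and Significance F then belong to that linear fit of the transformed data):
        set.seed(1)
        x <- 1:20
        y <- 2.5 * exp(0.3 * x) * exp(rnorm(20, sd = 0.1))  # y = b*e^(mx) with noise
        fit <- lm(log(y) ~ x)    # ln y = ln b + m x
        summary(fit)             # t-based p-values for m and ln(b); F statistic last
        # "Significance F" is the p-value of the overall F test; with a single
        # predictor F = t^2, so it agrees with the slope's p-value.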

    Read the article

  • Generating lognormally distributed random number from mean, coeff of variation

    - by Richie Cotton
    Most functions for generating lognormally distributed random numbers take the mean and standard deviation of the associated normal distribution as parameters. My problem is that I only know the mean and the coefficient of variation of the lognormal distribution. It is reasonably straightforward to derive the parameters I need for the standard functions from what I have: if mu and sigma are the mean and standard deviation of the associated normal distribution, we know that
        coeffOfVar^2 = variance / mean^2
                     = (exp(sigma^2) - 1) * exp(2*mu + sigma^2) / exp(mu + sigma^2/2)^2
                     = exp(sigma^2) - 1
    We can rearrange this to
        sigma = sqrt(log(coeffOfVar^2 + 1))
    We also know that
        mean = exp(mu + sigma^2/2)
    which rearranges to
        mu = log(mean) - sigma^2/2
    Here's my R implementation:
        rlnorm0 <- function(mean, coeffOfVar, n = 1e6)
        {
           sigma <- sqrt(log(coeffOfVar^2 + 1))
           mu <- log(mean) - sigma^2 / 2
           rlnorm(n, mu, sigma)
        }
    It works okay for small coefficients of variation:
        r1 <- rlnorm0(2, 0.5)
        mean(r1)            # 2.000095
        sd(r1) / mean(r1)   # 0.4998437
    But not for larger values:
        r2 <- rlnorm0(2, 50)
        mean(r2)            # 2.048509
        sd(r2) / mean(r2)   # 68.55871
    To check that it wasn't an R-specific issue, I reimplemented it in MATLAB (uses the stats toolbox):
        function y = lognrnd0(mean, coeffOfVar, sizeOut)
           if nargin < 3 || isempty(sizeOut)
              sizeOut = [1e6 1];
           end
           sigma = sqrt(log(coeffOfVar.^2 + 1));
           mu = log(mean) - sigma.^2 ./ 2;
           y = lognrnd(mu, sigma, sizeOut);
        end
        r1 = lognrnd0(2, 0.5);
        mean(r1)            % 2.0013
        std(r1) ./ mean(r1) % 0.5008
        r2 = lognrnd0(2, 50);
        mean(r2)            % 1.9611
        std(r2) ./ mean(r2) % 22.61
    Same problem. The question is, why is this happening? Is it just that the standard deviation is not robust when the variation is that wide? Or have I screwed up somewhere?
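    A hedged observation rather than a fix: the derivation itself checks out analytically, which points at plain sampling error, since with coeffOfVar = 50 the underlying sigma is about 2.8 and the distribution is extremely heavy-tailed:
        sigma <- sqrt(log(50^2 + 1)); mu <- log(2) - sigma^2 / 2
        exp(mu + sigma^2 / 2)                            # theoretical mean: exactly 2
        sqrt(exp(sigma^2) - 1) * exp(mu + sigma^2 / 2)   # theoretical sd: exactly 100
        # estimating an sd of 100 around a mean of 2 from the tail of a lognormal
        # needs far more than 1e6 draws to stabilize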

    Read the article

  • Linear regression confidence intervals in SQL

    - by Matt Howells
    I'm using some fairly straightforward SQL code to calculate the coefficients of regression (intercept and slope) of some (x,y) data points, using least squares. This gives me a nice best-fit line through the data. However we would like to be able to see the 95% and 5% confidence intervals for the line of best fit (the curves below). What these mean is that the true line has 95% probability of being below the upper curve and 95% probability of being above the lower curve. How can I calculate these curves? I have already read wikipedia etc. and done some googling, but I haven't found understandable mathematical equations to be able to calculate this. Edit: here is the essence of what I have right now.
        --sample data
        create table #lr (x real not null, y real not null)
        insert into #lr values (0,1)
        insert into #lr values (4,9)
        insert into #lr values (2,5)
        insert into #lr values (3,7)
        declare @slope real
        declare @intercept real
        --calculate slope and intercept
        select
            @slope = ((count(*) * sum(x*y)) - (sum(x)*sum(y))) /
                     ((count(*) * sum(Power(x,2))) - Power(Sum(x),2)),
            @intercept = avg(y) - ((count(*) * sum(x*y)) - (sum(x)*sum(y))) /
                     ((count(*) * sum(Power(x,2))) - Power(Sum(x),2)) * avg(x)
        from #lr
    Thank you in advance.
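    For reference, the textbook form of the band (assuming the usual normal-error model and that the curves wanted are for the fitted mean): at a point x0 the limits are yhat(x0) +/- t(alpha, n-2) * s * sqrt(1/n + (x0 - xbar)^2 / Sxx), where yhat(x0) = @intercept + @slope * x0, s^2 = SSE / (n-2), Sxx = sum((x_i - xbar)^2), and t(alpha, n-2) is a Student-t quantile (about 2.92 for the 95%/5% curves with the n = 4 sample rows, i.e. 2 degrees of freedom). Every ingredient except the t quantile (n, xbar, Sxx, SSE) is a plain aggregate, so it can be computed in the same SQL pass, with the quantile supplied as a constant or lookup table.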

    Read the article

  • Explaining the forecasts from an ARIMA model

    - by Samik R.
    I am trying to explain to myself the forecasting result from applying an ARIMA model to a time-series dataset. The data is from the M1-Competition; the series is MNB65. For quick reference, I have a google doc spreadsheet with the data. I am trying to fit the data to an ARIMA(1,0,0) model and get the forecasts. I am using R. Here are some output snippets:
        > arima(x, order = c(1,0,0))
        Series: x
        ARIMA(1,0,0) with non-zero mean
        Call: arima(x = x, order = c(1, 0, 0))
        Coefficients:
                 ar1  intercept
              0.9421  12260.298
        s.e.  0.0474    202.717
        > predict(arima(x, order = c(1,0,0)), n.ahead=12)
        $pred
        Time Series:
        Start = 53
        End = 64
        Frequency = 1
        [1] 11757.39 11786.50 11813.92 11839.75 11864.09 11887.02 11908.62 11928.97 11948.15 11966.21 11983.23 11999.27
    I have a few questions: (1) How do I explain that, although the dataset shows a clear downward trend, the forecast from this model trends upward? This also happens for ARIMA(2,0,0), which is the best ARIMA fit for the data using auto.arima (forecast package), and for an ARIMA(1,0,1) model. (2) The intercept value for the ARIMA(1,0,0) model is 12260.298. Shouldn't the intercept satisfy the equation C = mean * (1 - sum(AR coeffs)), in which case the value should be 715.52? I must be missing something basic here. (3) This is clearly a series with non-stationary mean. Why is an AR(2) model still selected as the best model by auto.arima? Could there be an intuitive explanation? Thanks.
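    On question (2), a sketch of the usual resolution (hedged, but this is documented behavior): R's arima() reports the process mean as "intercept", i.e. it fits (x_t - mu) = phi*(x_{t-1} - mu) + e_t, so the textbook constant has to be derived from it:
        mu <- 12260.298; phi <- 0.9421   # read off the output above
        mu * (1 - phi)                   # implied constant c, about 709.9
        # close to the asker's 715.52, which used the sample mean in place of mu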

    Read the article

  • using load instead of other I/O command

    - by Amadou
    How can I modify this program to use the load command (load -ascii) instead of other I/O commands to read the (x,y) pairs?
        n = 0;
        sum_x = 0; sum_y = 0; sum_x2 = 0; sum_xy = 0;
        disp('This program performs a least-squares fit of an');
        disp('input data set to a straight line. Enter the name');
        disp('of the file containing the input (x,y) pairs: ');
        filename = input(' ', 's');
        [fid, msg] = fopen(filename, 'rt');
        if fid < 0
           disp(msg);
        else
           [in, count] = fscanf(fid, '%g %g', 2);
           while ~feof(fid)
              x = in(1);
              y = in(2);
              n = n + 1;
              sum_x = sum_x + x;
              sum_y = sum_y + y;
              sum_x2 = sum_x2 + x.^2;
              sum_xy = sum_xy + x*y;
              [in, count] = fscanf(fid, '%f', [1 2]);
           end
           fclose(fid);
           x_bar = sum_x / n;
           y_bar = sum_y / n;
           slope = (sum_xy - sum_x*y_bar) / (sum_x2 - sum_x*x_bar);
           y_int = y_bar - slope * x_bar;
           fprintf('Regression coefficients for the least-squares line:\n');
           fprintf('Slope (m)     =%12.3f\n', slope);
           fprintf('Intercept (b) =%12.3f\n', y_int);
           fprintf('No of points  =%12d\n', n);
        end

    Read the article

  • Calculating moving average/stdev in SAS?

    - by John
    Hey guys, I included a screenshot to help clarify my problem: http://i40.tinypic.com/mcrnmv.jpg. I'm trying to calculate some kind of moving average and moving standard deviation. The thing is I want to calculate the coefficients of variation (stdev/avg) for the actual value. Normally this is done by calculating the stdev and avg for the past 5 years. However, sometimes there will be observations in my database for which I do not have the information of the past 5 years (maybe only 3, 2, etc.). That's why I want code that will calculate the avg and stdev even if there is no information for the whole 5 years. Also, as you see in the observations, sometimes I have information over more than 5 years; when this is the case I need some kind of moving average which allows me to calculate the avg and stdev for the past 5 years. So if a company has information for 7 years I need some kind of code that will calculate the avg and stdev for, let's say, 1997 (by 1991-1996), 1998 (by 1992-1997) and 1999 (by 1993-1998). As I'm not very familiar with SAS commands, it should look (very, very roughly) like:
        set var;
        if year = i then stdev = stdev(year(i-6) until year(i-1))
                     and average = avg(year(i-6) until year(i-1));
    Or something like this. I really have no clue; I'm going to try and figure it out, but it's worth posting in case I don't find it myself. Thanks!
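    Not SAS, but the windowing logic is easy to pin down first in a few lines of R (hypothetical column names; intended as a translation target for a SAS data step or PROC EXPAND):
        cv5 <- function(v) {                  # v: one company's values in year order
          sapply(seq_along(v), function(i) {
            if (i < 3) return(NA)             # need at least 2 prior years for an sd
            w <- v[max(1, i - 5):(i - 1)]     # up to 5 preceding years, fewer if short
            sd(w) / mean(w)                   # coefficient of variation
          })
        }
        # df$cv <- ave(df$value, df$firm, FUN = cv5)   # apply per company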

    Read the article

  • Simple encryption - Sum of Hashes in C

    - by Dogbert
    I am attempting to demonstrate a simple proof of concept with respect to a vulnerability in a piece of code in a game written in C. Let's say that we want to validate a character login. The login is handled by the user choosing n items (let's just assume n=5 for now) from a graphical menu. The items are all medieval themed, eg:
         _______________________________
        |           |           |       |
        |    Bow    |   Sword   | Staff |
        |-----------|-----------|-------|
        |  Shield   |  Potion   | Gold  |
        |___________|___________|_______|
    The user must click on each item, then choose a number for each item. The validation algorithm then does the following:
    1. Determines which items were selected
    2. Drops each string to lowercase (ie: Bow becomes bow, etc)
    3. Calculates a simple string hash for each string (ie: bow -> b=2, o=15, w=23, sum = 2+15+23 = 40)
    4. Multiplies the hash by the value the user selected for the corresponding item; this new value is called the key
    5. Sums together the keys for each of the selected items; this is the final validation hash
    IMPORTANT: The validator will accept this hash, along with non-zero multiples of it (ie: if the final hash equals 1111, then 2222, 3333, 8888, etc are also valid). So, for example, let's say I select:
        Bow (1), Sword (2), Staff (10), Shield (1), Potion (6)
    The algorithm drops each of these strings to lowercase, calculates their string hashes, multiplies each hash by the number selected for that string, then sums these keys together, eg:
        Final_Validation_Hash = 1*HASH(Bow) + 2*HASH(Sword) + 10*HASH(Staff) + 1*HASH(Shield) + 6*HASH(Potion)
    By application of Euler's Method, I plan to demonstrate that these hashes are not unique, and want to devise a simple application to prove it. In my case, for 5 items, I would essentially be trying to calculate:
        (B)(y) = (A_1)(x_1) + (A_2)(x_2) + (A_3)(x_3) + (A_4)(x_4) + (A_5)(x_5)
    where:
        B is arbitrary
        A_j are the selected coefficients/values for each string/category
        x_j are the hash values for each string/category
        y is the final validation hash (eg: 1111 above)
        B, y, A_j, x_j are all discrete-valued, positive, and non-zero (ie: natural numbers)
    Can someone either assist me in solving this problem or point me to a similar example (ie: code, worked out equations, etc)? I just need to solve the final step (ie: (B)(y) = ...). Thank you all in advance.
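    A brute-force existence check is cheap at this scale; a sketch in R (letter hash a=1..z=26 as described; since non-zero multiples of the target are also accepted, any key-sum with keys %% target == 0 validates):
        h <- function(s) sum(match(strsplit(tolower(s), "")[[1]], letters))
        items  <- sapply(c("bow", "sword", "staff", "shield", "potion"), h)
        target <- sum(c(1, 2, 10, 1, 6) * items)            # the worked example: 1309
        grid <- as.matrix(expand.grid(rep(list(1:10), 5)))  # all selections 1..10
        keys <- grid %*% items
        alt  <- grid[keys %% target == 0, , drop = FALSE]
        nrow(alt)   # every one of these coefficient vectors also passes validation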

    Read the article

  • C++, name collision across different namespace

    - by aaa
    hello. I am baffled by the following name collision:
        namespace mp2 {
            boost::numeric::ublas::matrix_range<M> slice(M& m, const R1& r1, const R2& r2) {
                namespace ublas = boost::numeric::ublas;
                ublas::range r1_(r1.begin(), r1.end()), r2_(r2.begin(), r2.end());
                return ublas::matrix_range<M>(m, r1_, r2_);
            }
            double energy(const Wavefunction &wf) {
                const Wavefunction::matrix& C = wf.coefficients();
                int No = wf.occupied().size();
                foreach (const Basis::MappedShell& P, basis.shells()) {
                    slice(C, range(No), range(P));
    The error from g++ 4.4 is:
        In file included from mp2.cpp:1:
        /usr/include/boost/numeric/ublas/fwd.hpp: In function 'double mp2::energy(const Wavefunction&)':
        /usr/include/boost/numeric/ublas/fwd.hpp:32: error: 'boost::numeric::ublas::slice' is not a function,
        ../../src/mp2/energy.hpp:98: error: conflict with 'template<class M, class R1, class R2> boost::numeric::ublas::matrix_range<M> mp2::slice(M&, const R1&, const R2&)'
        ../../src/mp2/energy.hpp:123: error: in call to 'slice'
        /usr/include/boost/numeric/ublas/fwd.hpp:32: error: 'boost::numeric::ublas::slice' is not a function,
        ../../src/mp2/energy.hpp:98: error: conflict with 'template<class M, class R1, class R2> boost::numeric::ublas::matrix_range<M> mp2::slice(M&, const R1&, const R2&)'
        ../../src/mp2/energy.hpp:129: error: in call to 'slice'
        make: *** [mp2.lo] Error 1
    The relevant ublas segment is:
        namespace boost { namespace numeric { namespace ublas {
            typedef basic_slice<> slice;
    Why does slice in ublas collide with slice in mp2? I am fairly certain there is no using namespace ublas in the code or in the includes. Thank you.

    Read the article
