I'm trying to do some parameter estimation and want to choose parameter estimates that minimize the squared error of a prediction equation over about 30 variables. If the equation were linear, I would just compute the 30 partial derivatives, set them all to zero, and use a linear-equation solver. But unfortunately the equation is nonlinear, and so are its derivatives.
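In the linear case, in other words, I'd be doing something like the following (numpy used purely for illustration; the matrix and data here are invented):

```python
import numpy as np

# Invented linear setup: predictions are A @ x for a 100x30 design
# matrix A, observations are b.  Setting the 30 partial derivatives of
# ||A @ x - b||**2 to zero gives the normal equations
#     (A.T @ A) @ x = A.T @ b,
# which is exactly the kind of system a linear-equation solver handles.
A = np.random.rand(100, 30)
b = np.random.rand(100)
x = np.linalg.solve(A.T @ A, A.T @ b)
```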
If the equation were over a single variable, I would just use Newton's method (also known as Newton-Raphson). The Web is rich in examples and code to implement Newton's method for functions of a single variable.
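For comparison, the one-variable version of what I want is easy to write; minimizing means applying Newton-Raphson to the first derivative (the names and tolerance here are placeholders):

```python
def newton_1d(fprime, fsecond, x0, tol=1e-10, max_iter=100):
    # Repeatedly jump to the stationary point of the local quadratic
    # model: x <- x - f'(x) / f''(x).
    x = x0
    for _ in range(max_iter):
        step = fprime(x) / fsecond(x)
        x = x - step
        if abs(step) < tol:
            break
    return x
```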
Given that I have about 30 variables, how can I program a numeric solution to this problem using Newton's method? I have the equation in closed form and can compute the first and second derivatives, but I don't know quite how to proceed from there. The treatments I have found on the web quickly get into heavy matrix notation; the Wikipedia article is moderately helpful, but I'm having trouble translating it into code.
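From what I can extract from those treatments, the update I'm after is the direct generalization of the scalar step:

    x_new = x_old - inverse(H(x_old)) * g(x_old)

where g is the vector of the 30 first partial derivatives (the gradient) and H is the 30x30 matrix of second partials (the Hessian). It's turning that one line of notation into working code that defeats me.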
Where I expect to break down is in the matrix algebra and matrix inversions. I can invert a matrix by using a linear-equation solver, but I'm worried about getting the right rows and columns, avoiding transposition errors, and so on.
To be quite concrete:
I want to work with tables mapping variables to their values. I can write a function that, given such a table as argument, returns the squared error. I can also create functions that return the partial derivative of the error with respect to any given variable.
I have a reasonable starting estimate for the values in the table, so I'm not worried about convergence.
What I'm not sure how to write is the loop that takes an estimate (a table of values, one per variable), the error function, and the table of partial-derivative functions, and produces a new estimate; my best guess is sketched below.
That loop is what I'd like help with. Any direct help or pointers to good sources will be warmly appreciated.
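To show how far I've gotten, here is the rough shape I've pieced together, written as Python with numpy purely for illustration (my real code isn't numpy; the derivative tables are stand-ins for my closed-form first and second partials). I'm not at all confident that the Hessian assembly has its rows and columns straight, which is exactly the transposition error I'm afraid of:

```python
import numpy as np

def newton_step(estimate, partials, second_partials):
    """One multivariate Newton step.

    estimate        -- dict mapping variable name -> current value
    partials        -- dict mapping name -> function(table) -> dE/d(name)
    second_partials -- dict mapping (name1, name2) -> function(table)
                       returning the corresponding second partial of E

    Solves H @ delta = -g with a linear-equation solver instead of
    explicitly inverting H, then applies the step to the table.
    """
    names = sorted(estimate)  # pin down one ordering of the variables
    g = np.array([partials[n](estimate) for n in names])
    H = np.array([[second_partials[(r, c)](estimate) for c in names]
                  for r in names])
    delta = np.linalg.solve(H, -g)  # Newton step: H @ delta = -g
    return {n: estimate[n] + d for n, d in zip(names, delta)}

def newton(estimate, partials, second_partials, tol=1e-8, max_iter=50):
    """Iterate Newton steps until the largest change falls below tol."""
    for _ in range(max_iter):
        new = newton_step(estimate, partials, second_partials)
        if max(abs(new[n] - estimate[n]) for n in estimate) < tol:
            return new
        estimate = new
    return estimate
```

Is that the right overall structure, and is solving `H @ delta = -g` (rather than inverting H) the right way to bring a linear-equation solver to bear?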
Edit: Since I have the first and second derivatives in closed form, I would like to take advantage of them and avoid more slowly converging methods like simplex searches.