Finding a Local Minimum of a Nonlinear Function

Yihui Xie 2006-04-06

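This started with a question from Nancy, a student in the United States, about finding the minimum of a function, which in turn led to the nlmin function in S-Plus. The S-Plus reference entry for nlmin is pasted below: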

**nlmin** S-PLUS Language Reference


**Find Local Minimum of a Nonlinear Function**


**DESCRIPTION:**


Finds a local minimum of a nonlinear function using a **general quasi-Newton optimizer**.


**USAGE:**

<PRE>
nlmin(f, x, d=rep(1,length(x)), print.level=0, max.fcal=30,
      max.iter=15, init.step=1, rfc.tol=<<see below>>, ckfc=0.1,
      xc.tol=<<see below>>, xf.tol=<<see below>>)
</PRE>



**REQUIRED ARGUMENTS:** 


f 
an S-PLUS function which takes as its only argument a real vector of length p. The function must return a value of storage mode "double". 
x 
a vector of length p used as the starting point for the optimizer. 


**OPTIONAL ARGUMENTS:**


d 
a vector of length p used as a scaling vector for x. The elements of the vector d * x should be in comparable units. Normally d is only specified if the elements of x have differing orders of magnitude. 
print.level= 
if print.level=0, then all output from the optimizer is suppressed. If print.level=1, then a short summary of each iteration is printed. If print.level=2, then a long summary of each iteration is printed. 
max.fcal= 
the maximum number of function evaluations permitted. 
max.iter= 
the maximum number of iterations permitted. 
init.step= 
a bound on the L2-norm of the initial scaled step. This can have a dramatic effect on the performance of the optimizer. 
rfc.tol= 
the relative function convergence tolerance. If the predicted function reduction is no more than rfc.tol times |f|, where |f| is the function value, then the optimizer has converged. The default is max(10^-10, e^(2/3)), where e is machine epsilon. 
ckfc= 
if the predicted function decrease is smaller than ckfc times the actual function decrease, then a check is performed for false convergence. 
xc.tol= 
the relative x-convergence tolerance, where x is the argument of the function being minimized. If a Newton step has a scaled step length smaller than xc.tol, then the optimizer converges. The default is sqrt(e) where e is machine epsilon. 
xf.tol= 
the false convergence tolerance. If relative function convergence and relative x-convergence were not achieved, and a Newton step has scaled step length smaller than xf.tol, then the optimizer appears to be stopped at a non-critical solution. The default is 100 times the machine epsilon. 
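
The three default tolerances above are all defined in terms of machine epsilon, so it can help to see their actual magnitudes. The following is a small sketch in Python, not S-PLUS; the variable names `rfc_tol`, `xc_tol`, and `xf_tol` are ours, and it assumes IEEE double precision.

```python
import sys

# Machine epsilon for IEEE double precision: the "e" in the descriptions above.
eps = sys.float_info.epsilon

# Defaults as stated in the reference text (variable names are illustrative).
rfc_tol = max(1e-10, eps ** (2.0 / 3.0))  # relative function convergence
xc_tol = eps ** 0.5                       # relative x-convergence
xf_tol = 100.0 * eps                      # false convergence

print(f"eps     = {eps:.3e}")
print(f"rfc.tol = {rfc_tol:.3e}")
print(f"xc.tol  = {xc_tol:.3e}")
print(f"xf.tol  = {xf_tol:.3e}")
```

In double precision, e^(2/3) is smaller than 10^-10, so the 10^-10 floor is what determines the default rfc.tol.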


**VALUE:**


a list with the following components: 
x 
the value at which the optimizer has converged 
converged 
if the optimizer has apparently successfully converged to a minimum, then converged is TRUE; otherwise converged is FALSE. 
conv.type 
a character string describing the type of convergence. 

**DETAILS:**

The optimizer is based on a quasi-Newton method using the double dogleg step with the BFGS secant update to the Hessian. See Dennis, Gay, and Welsch (1981) and Dennis and Mei (1979) for details. 
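
To make the description more concrete, here is a minimal sketch of a BFGS quasi-Newton iteration in Python. It is not the nlmin implementation: it swaps the double dogleg trust-region step for a simple backtracking line search, approximates the gradient by central differences, and updates the inverse Hessian directly. The names `num_grad` and `bfgs_min`, the tolerances, and the keys of the returned dictionary (chosen to mirror the VALUE section) are all assumptions made for illustration.

```python
import math

def num_grad(f, x, h=1e-6):
    """Central-difference gradient of f at x (a list of floats)."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2.0 * h))
    return g

def bfgs_min(f, x0, max_iter=100, grad_tol=1e-6):
    """Illustrative BFGS with backtracking line search (not nlmin itself)."""
    n = len(x0)
    x = [float(v) for v in x0]
    # H approximates the INVERSE Hessian; start from the identity.
    H = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = num_grad(f, x)
    for _ in range(max_iter):
        if math.sqrt(sum(gi * gi for gi in g)) < grad_tol:
            return {"x": x, "converged": True,
                    "conv_type": "gradient norm below tolerance"}
        # Quasi-Newton search direction p = -H g.
        p = [-sum(H[i][j] * g[j] for j in range(n)) for i in range(n)]
        slope = sum(gi * pi for gi, pi in zip(g, p))
        # Backtracking line search with the Armijo sufficient-decrease test.
        t, fx = 1.0, f(x)
        while t > 1e-12:
            x_new = [xi + t * pi for xi, pi in zip(x, p)]
            if f(x_new) <= fx + 1e-4 * t * slope:
                break
            t *= 0.5
        else:
            return {"x": x, "converged": False,
                    "conv_type": "line search failed"}
        g_new = num_grad(f, x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = sum(si * yi for si, yi in zip(s, y))
        if sy > 1e-12:  # skip the update if the curvature condition fails
            rho = 1.0 / sy
            Hy = [sum(H[i][j] * y[j] for j in range(n)) for i in range(n)]
            yHy = sum(yi * hi for yi, hi in zip(y, Hy))
            # BFGS secant update of the inverse Hessian approximation.
            for i in range(n):
                for j in range(n):
                    H[i][j] += ((1.0 + rho * yHy) * rho * s[i] * s[j]
                                - rho * (Hy[i] * s[j] + s[i] * Hy[j]))
        x, g = x_new, g_new
    return {"x": x, "converged": False, "conv_type": "iteration limit reached"}

# The first function from the EXAMPLES section, x^2 - 2x + 4, started at 0.
res = bfgs_min(lambda x: x[0] ** 2 - 2.0 * x[0] + 4.0, [0.0])
print(res)
```

For this quadratic the minimizer is x = 1, and the sketch should report convergence after a couple of iterations. A trust-region step such as the double dogleg is generally more robust than this line search when the Hessian approximation is poor.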


**BUGS:**


If the storage mode of the return value of f is "single" or "integer", nlmin will not work properly (but converged will be FALSE). 


**REFERENCES:**


Dennis, J. E., Gay, D. M., and Welsch, R. E. (1981). An adaptive nonlinear least-squares algorithm. ACM Transactions on Mathematical Software 7, 348-383. 


Dennis, J. E. and Mei, H. H. W. (1979). Two new unconstrained optimization algorithms which use function and gradient values. Journal of Optimization Theory and Applications 28, 453-483. 


**SEE ALSO:**


nlminb, arima.mle.


**EXAMPLES:**


# minimize a simple function 
func <- function(x) {x^2-2*x+4} 
min.func <- nlmin(func,0) 

# One way to pass parameters to the function is: 
function() 
{ 
        co <- c(1, 2) 
        assign("co", co, frame = 1) 
        f1 <- function(x) co[1] + co[2] * x^2 
        nlmin(f1, 10) 
} 
# Another way to pass parameters is by setting the default
# value for an argument; e.g. g$y0 in next example:

# nlmin can be used to solve a system of nonlinear equations:
solveNonlinear <- function(f, y0, x, ...){
  # solve f(x) = y0
  # x is vector of initial guesses, same length as y0
  # ... are additional arguments to nlmin (not to f)
  g <- function(x, y0, f) sum((f(x) - y0)^2)
  g$y0 <- y0   # set g's default value for y0
  g$f <- f     # set g's default value for f
  nlmin(g, x, ...)
}
f <- function(x) c(exp(x[1]), sin(x[2]))
solveNonlinear(f, c( 3, .5), c(.2,.2))  # solve exp(x1)=3, sin(x2)=.5
# result$x is c(log(3), asin(.5)) 
g <- function(x) exp(x[1]) * c(1, sin(x[2]))
solveNonlinear(g, c( 3, .5), c(.2,.2))  # solve exp(x1)=3, exp(x1)sin(x2)=.5


# Note that multiple variables (x1 and x2) can be thought of as components
# of a single multivariate variable (x[1] and x[2]), and
# that multiple functions (exp(x1) and exp(x1)*sin(x2))
# correspond to a vector-valued function c(exp(x1), exp(x1)*sin(x2))


# Warning -- solveNonlinear can be fooled; it minimizes a sum of
# squares, and may stop if it believes the SS is not decreasing,
# even if the SS is not zero.
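
The sum-of-squares reformulation behind solveNonlinear, and the warning attached to it, can be sketched outside S-PLUS as well. The Python below is an illustration only: `make_objective` is a hypothetical helper that captures f and y0 in a closure (playing the role of the default-argument trick above), `no_root` is an invented example system with no real solution, and a coarse grid scan stands in for the optimizer.

```python
import math

def make_objective(f, y0):
    """Return g(x) = sum((f(x) - y0)^2); f and y0 are captured by the closure."""
    def g(x):
        return sum((fi - yi) ** 2 for fi, yi in zip(f(x), y0))
    return g

# First system from the examples: exp(x1) = 3, sin(x2) = 0.5.
def f(x):
    return [math.exp(x[0]), math.sin(x[1])]

obj = make_objective(f, [3.0, 0.5])
x_star = [math.log(3.0), math.asin(0.5)]  # the known analytic solution
ss_at_solution = obj(x_star)              # zero up to rounding error

# A system with no real root: x^2 + 1 = 0. The sum of squares still has a
# minimum (at x = 0), but its value there is 1, not 0.
def no_root(x):
    return [x[0] ** 2 + 1.0]

obj2 = make_objective(no_root, [0.0])
ss_min = min(obj2([k / 100.0]) for k in range(-200, 201))  # crude grid scan

print("SS at true solution:", ss_at_solution)
print("smallest SS for the no-root system:", ss_min)
```

Whatever optimizer is used, it seems prudent to evaluate the sum of squares at the returned point and treat the result as a solution only if that value is close to zero.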

One closing remark: statistical computing (numerical computation, numerical analysis) really is a terrifying subject.