source/maxlik/src_FPy/sumsld.f

   1       subroutine sumsl(n, d, x, calcf, calcg, iv, liv, lv, v,
   2      1                  uiparm, urparm, ufparm)
   3 c
   4 c  ***  minimize general unconstrained objective function using   ***
   5 c  ***  analytic gradient and hessian approx. from secant update  ***
   6 c
   7       integer n, liv, lv
   8       integer iv(liv), uiparm(1)
   9       double precision d(n), x(n), v(lv), urparm(1)
  10 c     dimension v(71 + n*(n+15)/2), uiparm(*), urparm(*)
  11       external calcf, calcg, ufparm
  12 c
  13 c  ***  purpose  ***
  14 c
  15 c        this routine interacts with subroutine  sumit  in an attempt
  16 c     to find an n-vector  x*  that minimizes the (unconstrained)
  17 c     objective function computed by  calcf.  (often the  x*  found is
  18 c     a local minimizer rather than a global one.)
  19 c
  20 c--------------------------  parameter usage  --------------------------
  21 c
  22 c n........ (input) the number of variables on which  f  depends, i.e.,
  23 c                  the number of components in  x.
  24 c d........ (input/output) a scale vector such that  d(i)*x(i),
  25 c                  i = 1,2,...,n,  are all in comparable units.
  26 c                  d can strongly affect the behavior of sumsl.
  27 c                  finding the best choice of d is generally a trial-
  28 c                  and-error process.  choosing d so that d(i)*x(i)
  29 c                  has about the same value for all i often works well.
  30 c                  the defaults provided by subroutine deflt (see i
  31 c                  below) require the caller to supply d.
  32 c x........ (input/output) before (initially) calling sumsl, the call-
  33 c                  er should set  x  to an initial guess at  x*.  when
  34 c                  sumsl returns,  x  contains the best point so far
  35 c                  found, i.e., the one that gives the least value so
  36 c                  far seen for  f(x).
  37 c calcf.... (input) a subroutine that, given x, computes f(x).  calcf
  38 c                  must be declared external in the calling program.
  39 c                  it is invoked by
  40 c                       call calcf(n, x, nf, f, uiparm, urparm, ufparm)
  41 c                  when calcf is called, nf is the invocation
  42 c                  count for calcf.  nf is included for possible use
  43 c                  with calcg.  if x is out of bounds (e.g., if it
  44 c                  would cause overflow in computing f(x)), then calcf
  45 c                  should set nf to 0.  this will cause a shorter step
  46 c                  to be attempted.  (if x is in bounds, then calcf
  47 c                  should not change nf.)  the other parameters are as
  48 c                  described above and below.  calcf should not change
  49 c                  n, p, or x.
  50 c calcg.... (input) a subroutine that, given x, computes g(x), the gra-
  51 c                  dient of f at x.  calcg must be declared external in
  52 c                  the calling program.  it is invoked by
  53 c                       call calcg(n, x, nf, g, uiparm, urparm, ufaprm)
  54 c                  when calcg is called, nf is the invocation
  55 c                  count for calcf at the time f(x) was evaluated.  the
  56 c                  x passed to calcg is usually the one passed to calcf
  57 c                  on either its most recent invocation or the one
  58 c                  prior to it.  if calcf saves intermediate results
  59 c                  for use by calcg, then it is possible to tell from
  60 c                  nf whether they are valid for the current x (or
  61 c                  which copy is valid if two copies are kept).  if g
  62 c                  cannot be computed at x, then calcg should set nf to
  63 c                  0.  in this case, sumsl will return with iv(1) = 65.
  64 c                  (if g can be computed at x, then calcg should not
  65 c                  changed nf.)  the other parameters to calcg are as
  66 c                  described above and below.  calcg should not change
  67 c                  n or x.
  68 c iv....... (input/output) an integer value array of length liv (see
  69 c                  below) that helps control the sumsl algorithm and
  70 c                  that is used to store various intermediate quanti-
  71 c                  ties.  of particular interest are the initialization/
  72 c                  return code iv(1) and the entries in iv that control
  73 c                  printing and limit the number of iterations and func-
  74 c                  tion evaluations.  see the section on iv input
  75 c                  values below.
  76 c liv...... (input) length of iv array.  must be at least 60.  if li
  77 c                  is too small, then sumsl returns with iv(1) = 15.
  78 c                  when sumsl returns, the smallest allowed value of
  79 c                  liv is stored in iv(lastiv) -- see the section on
  80 c                  iv output values below.  (this is intended for use
  81 c                  with extensions of sumsl that handle constraints.)
  82 c lv....... (input) length of v array.  must be at least 71+n*(n+15)/2.
  83 c                  (at least 77+n*(n+17)/2 for smsno, at least
  84 c                  78+n*(n+12) for humsl).  if lv is too small, then
  85 c                  sumsl returns with iv(1) = 16.  when sumsl returns,
  86 c                  the smallest allowed value of lv is stored in
  87 c                  iv(lastv) -- see the section on iv output values
  88 c                  below.
  89 c v........ (input/output) a floating-point value array of length l
  90 c                  (see below) that helps control the sumsl algorithm
  91 c                  and that is used to store various intermediate
  92 c                  quantities.  of particular interest are the entries
  93 c                  in v that limit the length of the first step
  94 c                  attempted (lmax0) and specify convergence tolerances
  95 c                  (afctol, lmaxs, rfctol, sctol, xctol, xftol).
  96 c uiparm... (input) user integer parameter array passed without change
  97 c                  to calcf and calcg.
  98 c urparm... (input) user floating-point parameter array passed without
  99 c                  change to calcf and calcg.
 100 c ufparm... (input) user external subroutine or function passed without
 101 c                  change to calcf and calcg.
 102 c
 103 c  ***  iv input values (from subroutine deflt)  ***
 104 c
 105 c iv(1)...  on input, iv(1) should have a value between 0 and 14......
 106 c             0 and 12 mean this is a fresh start.  0 means that
 107 c                  deflt(2, iv, liv, lv, v)
 108 c             is to be called to provide all default values to iv and
 109 c             v.  12 (the value that deflt assigns to iv(1)) means the
 110 c             caller has already called deflt and has possibly changed
 111 c             some iv and/or v entries to non-default values.
 112 c             13 means deflt has been called and that sumsl (and
 113 c             sumit) should only do their storage allocation.  that is,
 114 c             they should set the output components of iv that tell
 115 c             where various subarrays arrays of v begin, such as iv(g)
 116 c             (and, for humsl and humit only, iv(dtol)), and return.
 117 c             14 means that a storage has been allocated (by a call
 118 c             with iv(1) = 13) and that the algorithm should be
 119 c             started.  when called with iv(1) = 13, sumsl returns
 120 c             iv(1) = 14 unless liv or lv is too small (or n is not
 121 c             positive).  default = 12.
 122 c iv(inith).... iv(25) tells whether the hessian approximation h should
 123 c             be initialized.  1 (the default) means sumit should
 124 c             initialize h to the diagonal matrix whose i-th diagonal
 125 c             element is d(i)**2.  0 means the caller has supplied a
 126 c             cholesky factor  l  of the initial hessian approximation
 127 c             h = l*(l**t)  in v, starting at v(iv(lmat)) = v(iv(42))
 128 c             (and stored compactly by rows).  note that iv(lmat) may
 129 c             be initialized by calling sumsl with iv(1) = 13 (see
 130 c             the iv(1) discussion above).  default = 1.
 131 c iv(mxfcal)... iv(17) gives the maximum number of function evaluations
 132 c             (calls on calcf) allowed.  if this number does not suf-
 133 c             fice, then sumsl returns with iv(1) = 9.  default = 200.
 134 c iv(mxiter)... iv(18) gives the maximum number of iterations allowed.
 135 c             it also indirectly limits the number of gradient evalua-
 136 c             tions (calls on calcg) to iv(mxiter) + 1.  if iv(mxiter)
 137 c             iterations do not suffice, then sumsl returns with
 138 c             iv(1) = 10.  default = 150.
 139 c iv(outlev)... iv(19) controls the number and length of iteration sum-
 140 c             mary lines printed (by itsum).  iv(outlev) = 0 means do
 141 c             not print any summary lines.  otherwise, print a summary
 142 c             line after each abs(iv(outlev)) iterations.  if iv(outlev)
 143 c             is positive, then summary lines of length 78 (plus carri-
 144 c             age control) are printed, including the following...  the
 145 c             iteration and function evaluation counts, f = the current
 146 c             function value, relative difference in function values
 147 c             achieved by the latest step (i.e., reldf = (f0-v(f))/f01,
 148 c             where f01 is the maximum of abs(v(f)) and abs(v(f0)) and
 149 c             v(f0) is the function value from the previous itera-
 150 c             tion), the relative function reduction predicted for the
 151 c             step just taken (i.e., preldf = v(preduc) / f01, where
 152 c             v(preduc) is described below), the scaled relative change
 153 c             in x (see v(reldx) below), the step parameter for the
 154 c             step just taken (stppar = 0 means a full newton step,
 155 c             between 0 and 1 means a relaxed newton step, between 1
 156 c             and 2 means a double dogleg step, greater than 2 means
 157 c             a scaled down cauchy step -- see subroutine dbldog), the
 158 c             2-norm of the scale vector d times the step just taken
 159 c             (see v(dstnrm) below), and npreldf, i.e.,
 160 c             v(nreduc)/f01, where v(nreduc) is described below -- if
 161 c             npreldf is positive, then it is the relative function
 162 c             reduction predicted for a newton step (one with
 163 c             stppar = 0).  if npreldf is negative, then it is the
 164 c             negative of the relative function reduction predicted
 165 c             for a step computed with step bound v(lmaxs) for use in
 166 c             testing for singular convergence.
 167 c                  if iv(outlev) is negative, then lines of length 50
 168 c             are printed, including only the first 6 items listed
 169 c             above (through reldx).
 170 c             default = 1.
 171 c iv(parprt)... iv(20) = 1 means print any nondefault v values on a
 172 c             fresh start or any changed v values on a restart.
 173 c             iv(parprt) = 0 means skip this printing.  default = 1.
 174 c iv(prunit)... iv(21) is the output unit number on which all printing
 175 c             is done.  iv(prunit) = 0 means suppress all printing.
 176 c             default = standard output unit (unit 6 on most systems).
 177 c iv(solprt)... iv(22) = 1 means print out the value of x returned (as
 178 c             well as the gradient and the scale vector d).
 179 c             iv(solprt) = 0 means skip this printing.  default = 1.
 180 c iv(statpr)... iv(23) = 1 means print summary statistics upon return-
 181 c             ing.  these consist of the function value, the scaled
 182 c             relative change in x caused by the most recent step (see
 183 c             v(reldx) below), the number of function and gradient
 184 c             evaluations (calls on calcf and calcg), and the relative
 185 c             function reductions predicted for the last step taken and
 186 c             for a newton step (or perhaps a step bounded by v(lmaxs)
 187 c             -- see the descriptions of preldf and npreldf under
 188 c             iv(outlev) above).
 189 c             iv(statpr) = 0 means skip this printing.
 190 c             iv(statpr) = -1 means skip this printing as well as that
 191 c             of the one-line termination reason message.  default = 1.
 192 c iv(x0prt).... iv(24) = 1 means print the initial x and scale vector d
 193 c             (on a fresh start only).  iv(x0prt) = 0 means skip this
 194 c             printing.  default = 1.
 195 c
 196 c  ***  (selected) iv output values  ***
 197 c
 198 c iv(1)........ on output, iv(1) is a return code....
 199 c             3 = x-convergence.  the scaled relative difference (see
 200 c                  v(reldx)) between the current parameter vector x and
 201 c                  a locally optimal parameter vector is very likely at
 202 c                  most v(xctol).
 203 c             4 = relative function convergence.  the relative differ-
 204 c                  ence between the current function value and its lo-
 205 c                  cally optimal value is very likely at most v(rfctol).
 206 c             5 = both x- and relative function convergence (i.e., the
 207 c                  conditions for iv(1) = 3 and iv(1) = 4 both hold).
 208 c             6 = absolute function convergence.  the current function
 209 c                  value is at most v(afctol) in absolute value.
 210 c             7 = singular convergence.  the hessian near the current
 211 c                  iterate appears to be singular or nearly so, and a
 212 c                  step of length at most v(lmaxs) is unlikely to yield
 213 c                  a relative function decrease of more than v(sctol).
 214 c             8 = false convergence.  the iterates appear to be converg-
 215 c                  ing to a noncritical point.  this may mean that the
 216 c                  convergence tolerances (v(afctol), v(rfctol),
 217 c                  v(xctol)) are too small for the accuracy to which
 218 c                  the function and gradient are being computed, that
 219 c                  there is an error in computing the gradient, or that
 220 c                  the function or gradient is discontinuous near x.
 221 c             9 = function evaluation limit reached without other con-
 222 c                  vergence (see iv(mxfcal)).
 223 c            10 = iteration limit reached without other convergence
 224 c                  (see iv(mxiter)).
 225 c            11 = stopx returned .true. (external interrupt).  see the
 226 c                  usage notes below.
 227 c            14 = storage has been allocated (after a call with
 228 c                  iv(1) = 13).
 229 c            17 = restart attempted with n changed.
 230 c            18 = d has a negative component and iv(dtype) .le. 0.
 231 c            19...43 = v(iv(1)) is out of range.
 232 c            63 = f(x) cannot be computed at the initial x.
 233 c            64 = bad parameters passed to assess (which should not
 234 c                  occur).
 235 c            65 = the gradient could not be computed at x (see calcg
 236 c                  above).
 237 c            67 = bad first parameter to deflt.
 238 c            80 = iv(1) was out of range.
 239 c            81 = n is not positive.
 240 c iv(g)........ iv(28) is the starting subscript in v of the current
 241 c             gradient vector (the one corresponding to x).
 242 c iv(lastiv)... iv(44) is the least acceptable value of liv.  (it is
 243 c             only set if liv is at least 44.)
 244 c iv(lastv).... iv(45) is the least acceptable value of lv.  (it is
 245 c             only set if liv is large enough, at least iv(lastiv).)
 246 c iv(nfcall)... iv(6) is the number of calls so far made on calcf (i.e.,
 247 c             function evaluations).
 248 c iv(ngcall)... iv(30) is the number of gradient evaluations (calls on
 249 c             calcg).
 250 c iv(niter).... iv(31) is the number of iterations performed.
 251 c
 252 c  ***  (selected) v input values (from subroutine deflt)  ***
 253 c
 254 c v(bias)..... v(43) is the bias parameter used in subroutine dbldog --
 255 c             see that subroutine for details.  default = 0.8.
 256 c v(afctol)... v(31) is the absolute function convergence tolerance.
 257 c             if sumsl finds a point where the function value is less
 258 c             than v(afctol) in absolute value, and if sumsl does not
 259 c             return with iv(1) = 3, 4, or 5, then it returns with
 260 c             iv(1) = 6.  this test can be turned off by setting
 261 c             v(afctol) to zero.  default = max(10**-20, machep**2),
 262 c             where machep is the unit roundoff.
 263 c v(dinit).... v(38), if nonnegative, is the value to which the scale
 264 c             vector d is initialized.  default = -1.
 265 c v(lmax0).... v(35) gives the maximum 2-norm allowed for d times the
 266 c             very first step that sumsl attempts.  this parameter can
 267 c             markedly affect the performance of sumsl.
 268 c v(lmaxs).... v(36) is used in testing for singular convergence -- if
 269 c             the function reduction predicted for a step of length
 270 c             bounded by v(lmaxs) is at most v(sctol) * abs(f0), where
 271 c             f0  is the function value at the start of the current
 272 c             iteration, and if sumsl does not return with iv(1) = 3,
 273 c             4, 5, or 6, then it returns with iv(1) = 7.  default = 1.
 274 c v(rfctol)... v(32) is the relative function convergence tolerance.
 275 c             if the current model predicts a maximum possible function
 276 c             reduction (see v(nreduc)) of at most v(rfctol)*abs(f0)
 277 c             at the start of the current iteration, where  f0  is the
 278 c             then current function value, and if the last step attempt-
 279 c             ed achieved no more than twice the predicted function
 280 c             decrease, then sumsl returns with iv(1) = 4 (or 5).
 281 c             default = max(10**-10, machep**(2/3)), where machep is
 282 c             the unit roundoff.
 283 c v(sctol).... v(37) is the singular convergence tolerance -- see the
 284 c             description of v(lmaxs) above.
 285 c v(tuner1)... v(26) helps decide when to check for false convergence.
 286 c             this is done if the actual function decrease from the
 287 c             current step is no more than v(tuner1) times its predict-
 288 c             ed value.  default = 0.1.
 289 c v(xctol).... v(33) is the x-convergence tolerance.  if a newton step
 290 c             (see v(nreduc)) is tried that has v(reldx) .le. v(xctol)
 291 c             and if this step yields at most twice the predicted func-
 292 c             tion decrease, then sumsl returns with iv(1) = 3 (or 5).
 293 c             (see the description of v(reldx) below.)
 294 c             default = machep**0.5, where machep is the unit roundoff.
 295 c v(xftol).... v(34) is the false convergence tolerance.  if a step is
 296 c             tried that gives no more than v(tuner1) times the predict-
 297 c             ed function decrease and that has v(reldx) .le. v(xftol),
 298 c             and if sumsl does not return with iv(1) = 3, 4, 5, 6, or
 299 c             7, then it returns with iv(1) = 8.  (see the description
 300 c             of v(reldx) below.)  default = 100*machep, where
 301 c             machep is the unit roundoff.
 302 c v(*)........ deflt supplies to v a number of tuning constants, with
 303 c             which it should ordinarily be unnecessary to tinker.  see
 304 c             section 17 of version 2.2 of the nl2sol usage summary
 305 c             (i.e., the appendix to ref. 1) for details on v(i),
 306 c             i = decfac, incfac, phmnfc, phmxfc, rdfcmn, rdfcmx,
 307 c             tuner2, tuner3, tuner4, tuner5.
 308 c
 309 c  ***  (selected) v output values  ***
 310 c
 311 c v(dgnorm)... v(1) is the 2-norm of (diag(d)**-1)*g, where g is the
 312 c             most recently computed gradient.
 313 c v(dstnrm)... v(2) is the 2-norm of diag(d)*step, where step is the
 314 c             current step.
 315 c v(f)........ v(10) is the current function value.
 316 c v(f0)....... v(13) is the function value at the start of the current
 317 c             iteration.
 318 c v(nreduc)... v(6), if positive, is the maximum function reduction
 319 c             possible according to the current model, i.e., the func-
 320 c             tion reduction predicted for a newton step (i.e.,
 321 c             step = -h**-1 * g,  where  g  is the current gradient and
 322 c             h is the current hessian approximation).
 323 c                  if v(nreduc) is negative, then it is the negative of
 324 c             the function reduction predicted for a step computed with
 325 c             a step bound of v(lmaxs) for use in testing for singular
 326 c             convergence.
 327 c v(preduc)... v(7) is the function reduction predicted (by the current
 328 c             quadratic model) for the current step.  this (divided by
 329 c             v(f0)) is used in testing for relative function
 330 c             convergence.
 331 c v(reldx).... v(17) is the scaled relative change in x caused by the
 332 c             current step, computed as
 333 c                  max(abs(d(i)*(x(i)-x0(i)), 1 .le. i .le. p) /
 334 c                     max(d(i)*(abs(x(i))+abs(x0(i))), 1 .le. i .le. p),
 335 c             where x = x0 + step.
 336 c
 337 c-------------------------------  notes  -------------------------------
 338 c
 339 c  ***  algorithm notes  ***
 340 c
 341 c        this routine uses a hessian approximation computed from the
 342 c     bfgs update (see ref 3).  only a cholesky factor of the hessian
 343 c     approximation is stored, and this is updated using ideas from
 344 c     ref. 4.  steps are computed by the double dogleg scheme described
 345 c     in ref. 2.  the steps are assessed as in ref. 1.
 346 c
 347 c  ***  usage notes  ***
 348 c
 349 c        after a return with iv(1) .le. 11, it is possible to restart,
 350 c     i.e., to change some of the iv and v input values described above
 351 c     and continue the algorithm from the point where it was interrupt-
 352 c     ed.  iv(1) should not be changed, nor should any entries of i
 353 c     and v other than the input values (those supplied by deflt).
 354 c        those who do not wish to write a calcg which computes the
 355 c     gradient analytically should call smsno rather than sumsl.
 356 c     smsno uses finite differences to compute an approximate gradient.
 357 c        those who would prefer to provide f and g (the function and
 358 c     gradient) by reverse communication rather than by writing subrou-
 359 c     tines calcf and calcg may call on sumit directly.  see the com-
 360 c     ments at the beginning of sumit.
 361 c        those who use sumsl interactively may wish to supply their
 362 c     own stopx function, which should return .true. if the break key
 363 c     has been pressed since stopx was last invoked.  this makes it
 364 c     possible to externally interrupt sumsl (which will return with
 365 c     iv(1) = 11 if stopx returns .true.).
 366 c        storage for g is allocated at the end of v.  thus the caller
 367 c     may make v longer than specified above and may allow calcg to use
 368 c     elements of g beyond the first n as scratch storage.
 369 c
 370 c  ***  portability notes  ***
 371 c
 372 c        the sumsl distribution tape contains both single- and double-
 373 c     precision versions of the sumsl source code, so it should be un-
 374 c     necessary to change precisions.
 375 c        only the functions imdcon and rmdcon contain machine-dependent
 376 c     constants.  to change from one machine to another, it should
 377 c     suffice to change the (few) relevant lines in these functions.
 378 c        intrinsic functions are explicitly declared.  on certain com-
 379 c     puters (e.g. univac), it may be necessary to comment out these
 380 c     declarations.  so that this may be done automatically by a simple
 381 c     program, such declarations are preceded by a comment having c/+
 382 c     in columns 1-3 and blanks in columns 4-72 and are followed by
 383 c     a comment having c/ in columns 1 and 2 and blanks in columns 3-72.
 384 c        the sumsl source code is expressed in 1966 ansi standard
 385 c     fortran.  it may be converted to fortran 77 by commenting out all
 386 c     lines that fall between a line having c/6 in columns 1-3 and a
 387 c     line having c/7 in columns 1-3 and by removing (i.e., replacing
 388 c     by a blank) the c in column 1 of the lines that follow the c/7
 389 c     line and precede a line having c/ in columns 1-2 and blanks in
 390 c     columns 3-72.  these changes convert some data statements into
 391 c     parameter statements, convert some variables from real to
 392 c     character*4, and make the data statements that initialize these
 393 c     variables use character strings delimited by primes instead
 394 c     of hollerith constants.  (such variables and data statements
 395 c     appear only in modules itsum and parck.  parameter statements
 396 c     appear nearly everywhere.)  these changes also add save state-
 397 c     ments for variables given machine-dependent constants by rmdcon.
 398 c
 399 c  ***  references  ***
 400 c
 401 c 1.  dennis, j.e., gay, d.m., and welsch, r.e. (1981), algorithm 573 --
 402 c             an adaptive nonlinear least-squares algorithm, acm trans.
 403 c             math. software 7, pp. 369-383.
 404 c
 405 c 2.  dennis, j.e., and mei, h.h.w. (1979), two new unconstrained opti-
 406 c             mization algorithms which use function and gradient
 407 c             values, j. optim. theory applic. 28, pp. 453-482.
 408 c
 409 c 3.  dennis, j.e., and more, j.j. (1977), quasi-newton methods, motiva-
 410 c             tion and theory, siam rev. 19, pp. 46-89.
 411 c
 412 c 4.  goldfarb, d. (1976), factorized variable metric methods for uncon-
 413 c             strained optimization, math. comput. 30, pp. 796-811.
 414 c
 415 c  ***  general  ***
 416 c
 417 c     coded by david m. gay (winter 1980).  revised summer 1982.
 418 c     this subroutine was written in connection with research
 419 c     supported in part by the national science foundation under
 420 c     grants mcs-7600324, dcr75-10143, 76-14311dss, mcs76-11989,
 421 c     and mcs-7906671.
 422 c.
 423 c
 424 c----------------------------  declarations  ---------------------------
 425 c
 426       external deflt, sumit
 427 c
 428 c deflt... supplies default iv and v input components.
 429 c sumit... reverse-communication routine that carries out sumsl algo-
 430 c             rithm.
 431 c
 432       integer g1, iv1, nf
 433       double precision f
 434 c
 435 c  ***  subscripts for iv   ***
 436 c
 437       integer nextv, nfcall, nfgcal, g, toobig, vneed
 438 c
 439 c/6
 440 c     data nextv/47/, nfcall/6/, nfgcal/7/, g/28/, toobig/2/, vneed/4/
 441 c/7
 442       parameter (nextv=47, nfcall=6, nfgcal=7, g=28, toobig=2, vneed=4)
 443 c/
 444 c
 445 c+++++++++++++++++++++++++++++++  body  ++++++++++++++++++++++++++++++++
 446 c
 447       if (iv(1) .eq. 0) call deflt(2, iv, liv, lv, v)
 448       iv1 = iv(1)
 449       if (iv1 .eq. 12 .or. iv1 .eq. 13) iv(vneed) = iv(vneed) + n
 450       if (iv1 .eq. 14) go to 10
 451       if (iv1 .gt. 2 .and. iv1 .lt. 12) go to 10
 452       g1 = 1
 453       if (iv1 .eq. 12) iv(1) = 13
 454       go to 20
 455 c
 456  10   g1 = iv(g)
 457 c
 458  20   call sumit(d, f, v(g1), iv, liv, lv, n, v, x)
 459 c      write (2,*) "after sumit",iv
 460 c      call flush(2)
 461       if (iv(1) - 2) 30, 40, 50
 462 c
 463  30   nf = iv(nfcall)
 464       call calcf(n, x, nf, f, uiparm, urparm, ufparm)
 465 c      write (2,*) "after calcf nf",nf
 466 c      call flush(2)
 467       if (nf .le. 0) iv(toobig) = 1
 468       go to 20
 469 c
 470  40   call calcg(n, x, iv(nfgcal), v(g1), uiparm, urparm, ufparm)
 471       go to 20
 472 c
 473  50   if (iv(1) .ne. 14) go to 999
 474 c
 475 c  ***  storage allocation
 476 c
 477       iv(g) = iv(nextv)
 478       iv(nextv) = iv(g) + n
 479       if (iv1 .ne. 13) go to 10
 480 c
 481  999  return
 482 c  ***  last card of sumsl follows  ***
 483       end
 484       subroutine sumit(d, fx, g, iv, liv, lv, n, v, x)
 485 c
 486 c  ***  carry out sumsl (unconstrained minimization) iterations, using
 487 c  ***  double-dogleg/bfgs steps.
 488 c
 489 c  ***  parameter declarations  ***
 490 c
 491       integer liv, lv, n
 492       integer iv(liv)
 493       double precision d(n), fx, g(n), v(lv), x(n)
 494 c
 495 c--------------------------  parameter usage  --------------------------
 496 c
 497 c d.... scale vector.
 498 c fx... function value.
 499 c g.... gradient vector.
 500 c iv... integer value array.
 501 c liv.. length of iv (at least 60).
 502 c lv... length of v (at least 71 + n*(n+13)/2).
 503 c n.... number of variables (components in x and g).
 504 c v.... floating-point value array.
 505 c x.... vector of parameters to be optimized.
 506 c
 507 c  ***  discussion  ***
 508 c
 509 c        parameters iv, n, v, and x are the same as the corresponding
 510 c     ones to sumsl (which see), except that v can be shorter (since
 511 c     the part of v that sumsl uses for storing g is not needed).
 512 c     moreover, compared with sumsl, iv(1) may have the two additional
 513 c     output values 1 and 2, which are explained below, as is the use
 514 c     of iv(toobig) and iv(nfgcal).  the value iv(g), which is an
 515 c     output value from sumsl (and smsno), is not referenced by
 516 c     sumit or the subroutines it calls.
 517 c        fx and g need not have been initialized when sumit is called
 518 c     with iv(1) = 12, 13, or 14.
 519 c
 520 c iv(1) = 1 means the caller should set fx to f(x), the function value
 521 c             at x, and call sumit again, having changed none of the
 522 c             other parameters.  an exception occurs if f(x) cannot be
 523 c             (e.g. if overflow would occur), which may happen because
 524 c             of an oversized step.  in this case the caller should set
 525 c             iv(toobig) = iv(2) to 1, which will cause sumit to ig-
 526 c             nore fx and try a smaller step.  the parameter nf that
 527 c             sumsl passes to calcf (for possible use by calcg) is a
 528 c             copy of iv(nfcall) = iv(6).
 529 c iv(1) = 2 means the caller should set g to g(x), the gradient vector
 530 c             of f at x, and call sumit again, having changed none of
 531 c             the other parameters except possibly the scale vector d
 532 c             when iv(dtype) = 0.  the parameter nf that sumsl passes
 533 c             to calcg is iv(nfgcal) = iv(7).  if g(x) cannot be
 534 c             evaluated, then the caller may set iv(nfgcal) to 0, in
 535 c             which case sumit will return with iv(1) = 65.
 536 c.
 537 c  ***  general  ***
 538 c
 539 c     coded by david m. gay (december 1979).  revised sept. 1982.
 540 c     this subroutine was written in connection with research supported
 541 c     in part by the national science foundation under grants
 542 c     mcs-7600324 and mcs-7906671.
 543 c
 544 c        (see sumsl for references.)
 545 c
 546 c+++++++++++++++++++++++++++  declarations  ++++++++++++++++++++++++++++
 547 c
 548 c  ***  local variables  ***
 549 c
 550       integer dg1, dummy, g01, i, k, l, lstgst, nwtst1, step1,
 551      1        temp1, w, x01, z
 552       double precision t
 553 c
 554 c     ***  constants  ***
 555 c
 556       double precision half, negone, one, onep2, zero
 557 c
 558 c  ***  no intrinsic functions  ***
 559 c
 560 c  ***  external functions and subroutines  ***
 561 c
 562       external assst, dbdog, deflt, dotprd, itsum, litvmu, livmul,
 563      1         ltvmul, lupdat, lvmul, parck, reldst, stopx, vaxpy,
 564      2         vcopy, vscopy, vvmulp, v2norm, wzbfgs
 565       logical stopx
 566       double precision dotprd, reldst, v2norm
 567 c
 568 c assst.... assesses candidate step.
 569 c dbdog.... computes double-dogleg (candidate) step.
 570 c deflt.... supplies default iv and v input components.
 571 c dotprd... returns inner product of two vectors.
 572 c itsum.... prints iteration summary and info on initial and final x.
 573 c litvmu... multiplies inverse transpose of lower triangle times vector.
 574 c livmul... multiplies inverse of lower triangle times vector.
 575 c ltvmul... multiplies transpose of lower triangle times vector.
 576 c lupdt.... updates cholesky factor of hessian approximation.
 577 c lvmul.... multiplies lower triangle times vector.
 578 c parck.... checks validity of input iv and v values.
 579 c reldst... computes v(reldx) = relative step size.
 580 c stopx.... returns .true. if the break key has been pressed.
 581 c vaxpy.... computes scalar times one vector plus another.
 582 c vcopy.... copies one vector to another.
 583 c vscopy... sets all elements of a vector to a scalar.
 584 c vvmulp... multiplies vector by vector raised to power (componentwise).
 585 c v2norm... returns the 2-norm of a vector.
 586 c wzbfgs... computes w and z for lupdat corresponding to bfgs update.
 587 c
 588 c  ***  subscripts for iv and v  ***
 589 c
 590       integer afctol
 591       integer cnvcod, dg, dgnorm, dinit, dstnrm, dst0, f, f0, fdif,
 592      1        gthg, gtstep, g0, incfac, inith, irc, kagqt, lmat, lmax0,
 593      2        lmaxs, mode, model, mxfcal, mxiter, nextv, nfcall, nfgcal,
 594      3        ngcall, niter, nreduc, nwtstp, preduc, radfac, radinc,
 595      4        radius, rad0, reldx, restor, step, stglim, stlstg, toobig,
 596      5        tuner4, tuner5, vneed, xirc, x0
 597 c
 598 c  ***  iv subscript values  ***
 599 c
 600 c/6
 601 c     data cnvcod/55/, dg/37/, g0/48/, inith/25/, irc/29/, kagqt/33/,
 602 c    1     mode/35/, model/5/, mxfcal/17/, mxiter/18/, nfcall/6/,
 603 c    2     nfgcal/7/, ngcall/30/, niter/31/, nwtstp/34/, radinc/8/,
 604 c    3     restor/9/, step/40/, stglim/11/, stlstg/41/, toobig/2/,
 605 c    4     vneed/4/, xirc/13/, x0/43/
 606 c/7
 607       parameter (cnvcod=55, dg=37, g0=48, inith=25, irc=29, kagqt=33,
 608      1           mode=35, model=5, mxfcal=17, mxiter=18, nfcall=6,
 609      2           nfgcal=7, ngcall=30, niter=31, nwtstp=34, radinc=8,
 610      3           restor=9, step=40, stglim=11, stlstg=41, toobig=2,
 611      4           vneed=4, xirc=13, x0=43)
 612 c/
 613 c
 614 c  ***  v subscript values  ***
 615 c
 616 c/6
 617 c     data afctol/31/
 618 c     data dgnorm/1/, dinit/38/, dstnrm/2/, dst0/3/, f/10/, f0/13/,
 619 c    1     fdif/11/, gthg/44/, gtstep/4/, incfac/23/, lmat/42/,
 620 c    2     lmax0/35/, lmaxs/36/, nextv/47/, nreduc/6/, preduc/7/,
 621 c    3     radfac/16/, radius/8/, rad0/9/, reldx/17/, tuner4/29/,
 622 c    4     tuner5/30/
 623 c/7
 624       parameter (afctol=31)
 625       parameter (dgnorm=1, dinit=38, dstnrm=2, dst0=3, f=10, f0=13,
 626      1           fdif=11, gthg=44, gtstep=4, incfac=23, lmat=42,
 627      2           lmax0=35, lmaxs=36, nextv=47, nreduc=6, preduc=7,
 628      3           radfac=16, radius=8, rad0=9, reldx=17, tuner4=29,
 629      4           tuner5=30)
 630 c/
 631 c
 632 c/6
 633 c     data half/0.5d+0/, negone/-1.d+0/, one/1.d+0/, onep2/1.2d+0/,
 634 c    1     zero/0.d+0/
 635 c/7
 636       parameter (half=0.5d+0, negone=-1.d+0, one=1.d+0, onep2=1.2d+0,
 637      1           zero=0.d+0)
 638 c/
 639 c
 640 c+++++++++++++++++++++++++++++++  body  ++++++++++++++++++++++++++++++++
 641 c
 642 C Following SAVE statement inserted.
 643       save l
 644 c      write (2,*) "calling sumit"
 645 c      call flush(2)
 646       i = iv(1)
 647       if (i .eq. 1) go to 50
 648       if (i .eq. 2) go to 60
 649 c
 650 c  ***  check validity of iv and v input values  ***
 651 c
 652       if (iv(1) .eq. 0) call deflt(2, iv, liv, lv, v)
 653       if (iv(1) .eq. 12 .or. iv(1) .eq. 13)
 654      1     iv(vneed) = iv(vneed) + n*(n+13)/2
 655       call parck(2, d, iv, liv, lv, n, v)
 656       i = iv(1) - 2
 657       if (i .gt. 12) go to 999
 658       go to (180, 180, 180, 180, 180, 180, 120, 90, 120, 10, 10, 20), i
 659 c
 660 c  ***  storage allocation  ***
 661 c
 662 10    l = iv(lmat)
 663       iv(x0) = l + n*(n+1)/2
 664       iv(step) = iv(x0) + n
 665       iv(stlstg) = iv(step) + n
 666       iv(g0) = iv(stlstg) + n
 667       iv(nwtstp) = iv(g0) + n
 668       iv(dg) = iv(nwtstp) + n
 669       iv(nextv) = iv(dg) + n
 670       if (iv(1) .ne. 13) go to 20
 671          iv(1) = 14
 672          go to 999
 673 c
 674 c  ***  initialization  ***
 675 c
 676  20   iv(niter) = 0
 677       iv(nfcall) = 1
 678       iv(ngcall) = 1
 679       iv(nfgcal) = 1
 680       iv(mode) = -1
 681       iv(model) = 1
 682       iv(stglim) = 1
 683       iv(toobig) = 0
 684       iv(cnvcod) = 0
 685       iv(radinc) = 0
 686       v(rad0) = zero
 687       if (v(dinit) .ge. zero) call vscopy(n, d, v(dinit))
 688       if (iv(inith) .ne. 1) go to 40
 689 c
 690 c     ***  set the initial hessian approximation to diag(d)**-2  ***
 691 c
 692          l = iv(lmat)
 693          call vscopy(n*(n+1)/2, v(l), zero)
 694          k = l - 1
 695          do 30 i = 1, n
 696               k = k + i
 697               t = d(i)
 698               if (t .le. zero) t = one
 699               v(k) = t
 700  30           continue
 701 c
 702 c  ***  compute initial function value  ***
 703 c
 704  40   iv(1) = 1
 705       go to 999
 706 c
 707  50   v(f) = fx
 708       if (iv(mode) .ge. 0) go to 180
 709       iv(1) = 2
 710       if (iv(toobig) .eq. 0) go to 999
 711          iv(1) = 63
 712          go to 300
 713 c
 714 c  ***  make sure gradient could be computed  ***
 715 c
 716  60   if (iv(nfgcal) .ne. 0) go to 70
 717          iv(1) = 65
 718          go to 300
 719 c
 720  70   dg1 = iv(dg)
 721       call vvmulp(n, v(dg1), g, d, -1)
 722       v(dgnorm) = v2norm(n, v(dg1))
 723 c
 724 c  ***  test norm of gradient  ***
 725 c
 726       if (v(dgnorm) .gt. v(afctol)) go to 75
 727       iv(irc) = 10
 728       iv(cnvcod) = iv(irc) - 4
 729 c
 730  75   if (iv(cnvcod) .ne. 0) go to 290
 731       if (iv(mode) .eq. 0) go to 250
 732 c
 733 c  ***  allow first step to have scaled 2-norm at most v(lmax0)  ***
 734 c
 735       v(radius) = v(lmax0)
 736 c
 737       iv(mode) = 0
 738 c
 739 c
 740 c-----------------------------  main loop  -----------------------------
 741 c
 742 c
 743 c  ***  print iteration summary, check iteration limit  ***
 744 c
 745  80   call itsum(d, g, iv, liv, lv, n, v, x)
 746  90   k = iv(niter)
 747       if (k .lt. iv(mxiter)) go to 100
 748          iv(1) = 10
 749          go to 300
 750 c
 751 c  ***  update radius  ***
 752 c
 753  100  iv(niter) = k + 1
 754       if(k.gt.0)v(radius) = v(radfac) * v(dstnrm)
 755 c
 756 c  ***  initialize for start of next iteration  ***
 757 c
 758       g01 = iv(g0)
 759       x01 = iv(x0)
 760       v(f0) = v(f)
 761       iv(irc) = 4
 762       iv(kagqt) = -1
 763 c
 764 c     ***  copy x to x0, g to g0  ***
 765 c
 766       call vcopy(n, v(x01), x)
 767       call vcopy(n, v(g01), g)
 768 c
 769 c  ***  check stopx and function evaluation limit  ***
 770 c
 771 C AL 4/30/95
 772 c      dummy=iv(niter)
 773 c      write (2,*) "dummy",dummy
 774  110  if (.not. stopx(iv(niter))) go to 130
 775          iv(1) = 11
 776          go to 140
 777 c
 778 c     ***  come here when restarting after func. eval. limit or stopx.
 779 c
 780  120  if (v(f) .ge. v(f0)) go to 130
 781          v(radfac) = one
 782          k = iv(niter)
 783          go to 100
 784 c
 785  130  if (iv(nfcall) .lt. iv(mxfcal)) go to 150
 786          iv(1) = 9
 787  140     if (v(f) .ge. v(f0)) go to 300
 788 c
 789 c        ***  in case of stopx or function evaluation limit with
 790 c        ***  improved v(f), evaluate the gradient at x.
 791 c
 792               iv(cnvcod) = iv(1)
 793               go to 240
 794 c
 795 c. . . . . . . . . . . . .  compute candidate step  . . . . . . . . . .
 796 c
 797  150  step1 = iv(step)
 798       dg1 = iv(dg)
 799       nwtst1 = iv(nwtstp)
 800       if (iv(kagqt) .ge. 0) go to 160
 801          l = iv(lmat)
 802          call livmul(n, v(nwtst1), v(l), g)
 803          v(nreduc) = half * dotprd(n, v(nwtst1), v(nwtst1))
 804          call litvmu(n, v(nwtst1), v(l), v(nwtst1))
 805          call vvmulp(n, v(step1), v(nwtst1), d, 1)
 806          v(dst0) = v2norm(n, v(step1))
 807          call vvmulp(n, v(dg1), v(dg1), d, -1)
 808          call ltvmul(n, v(step1), v(l), v(dg1))
 809          v(gthg) = v2norm(n, v(step1))
 810          iv(kagqt) = 0
 811  160  call dbdog(v(dg1), lv, n, v(nwtst1), v(step1), v)
 812       if (iv(irc) .eq. 6) go to 180
 813 c
 814 c  ***  check whether evaluating f(x0 + step) looks worthwhile  ***
 815 c
 816       if (v(dstnrm) .le. zero) go to 180
 817       if (iv(irc) .ne. 5) go to 170
 818       if (v(radfac) .le. one) go to 170
 819       if (v(preduc) .le. onep2 * v(fdif)) go to 180
 820 c
 821 c  ***  compute f(x0 + step)  ***
 822 c
 823  170  x01 = iv(x0)
 824       step1 = iv(step)
 825       call vaxpy(n, x, one, v(step1), v(x01))
 826       iv(nfcall) = iv(nfcall) + 1
 827       iv(1) = 1
 828       iv(toobig) = 0
 829       go to 999
 830 c
 831 c. . . . . . . . . . . . .  assess candidate step  . . . . . . . . . . .
 832 c
 833  180  x01 = iv(x0)
 834       v(reldx) = reldst(n, d, x, v(x01))
 835       call assst(iv, liv, lv, v)
 836       step1 = iv(step)
 837       lstgst = iv(stlstg)
 838       if (iv(restor) .eq. 1) call vcopy(n, x, v(x01))
 839       if (iv(restor) .eq. 2) call vcopy(n, v(lstgst), v(step1))
 840       if (iv(restor) .ne. 3) go to 190
 841          call vcopy(n, v(step1), v(lstgst))
 842          call vaxpy(n, x, one, v(step1), v(x01))
 843          v(reldx) = reldst(n, d, x, v(x01))
 844 c
 845  190  k = iv(irc)
 846       go to (200,230,230,230,200,210,220,220,220,220,220,220,280,250), k
 847 c
 848 c     ***  recompute step with changed radius  ***
 849 c
 850  200     v(radius) = v(radfac) * v(dstnrm)
 851          go to 110
 852 c
 853 c  ***  compute step of length v(lmaxs) for singular convergence test.
 854 c
 855  210  v(radius) = v(lmaxs)
 856       go to 150
 857 c
 858 c  ***  convergence or false convergence  ***
 859 c
 860  220  iv(cnvcod) = k - 4
 861       if (v(f) .ge. v(f0)) go to 290
 862          if (iv(xirc) .eq. 14) go to 290
 863               iv(xirc) = 14
 864 c
 865 c. . . . . . . . . . . .  process acceptable step  . . . . . . . . . . .
 866 c
 867  230  if (iv(irc) .ne. 3) go to 240
 868          step1 = iv(step)
 869          temp1 = iv(stlstg)
 870 c
 871 c     ***  set  temp1 = hessian * step  for use in gradient tests  ***
 872 c
 873          l = iv(lmat)
 874          call ltvmul(n, v(temp1), v(l), v(step1))
 875          call lvmul(n, v(temp1), v(l), v(temp1))
 876 c
 877 c  ***  compute gradient  ***
 878 c
 879  240  iv(ngcall) = iv(ngcall) + 1
 880       iv(1) = 2
 881       go to 999
 882 c
 883 c  ***  initializations -- g0 = g - g0, etc.  ***
 884 c
 885  250  g01 = iv(g0)
 886       call vaxpy(n, v(g01), negone, v(g01), g)
 887       step1 = iv(step)
 888       temp1 = iv(stlstg)
 889       if (iv(irc) .ne. 3) go to 270
 890 c
 891 c  ***  set v(radfac) by gradient tests  ***
 892 c
 893 c     ***  set  temp1 = diag(d)**-1 * (hessian*step + (g(x0)-g(x)))  ***
 894 c
 895          call vaxpy(n, v(temp1), negone, v(g01), v(temp1))
 896          call vvmulp(n, v(temp1), v(temp1), d, -1)
 897 c
 898 c        ***  do gradient tests  ***
 899 c
 900          if (v2norm(n, v(temp1)) .le. v(dgnorm) * v(tuner4))
 901      1                  go to 260
 902               if (dotprd(n, g, v(step1))
 903      1                  .ge. v(gtstep) * v(tuner5))  go to 270
 904  260               v(radfac) = v(incfac)
 905 c
 906 c  ***  update h, loop  ***
 907 c
 908  270  w = iv(nwtstp)
 909       z = iv(x0)
 910       l = iv(lmat)
 911       call wzbfgs(v(l), n, v(step1), v(w), v(g01), v(z))
 912 c
 913 c     ** use the n-vectors starting at v(step1) and v(g01) for scratch..
 914       call lupdat(v(temp1), v(step1), v(l), v(g01), v(l), n, v(w), v(z))
 915       iv(1) = 2
 916       go to 80
 917 c
 918 c. . . . . . . . . . . . . .  misc. details  . . . . . . . . . . . . . .
 919 c
 920 c  ***  bad parameters to assess  ***
 921 c
 922  280  iv(1) = 64
 923       go to 300
 924 c
 925 c  ***  print summary of final iteration and other requested items  ***
 926 c
 927  290  iv(1) = iv(cnvcod)
 928       iv(cnvcod) = 0
 929  300  call itsum(d, g, iv, liv, lv, n, v, x)
 930 c
 931  999  return
 932 c
 933 c  ***  last line of sumit follows  ***
 934       end
 935       subroutine dbdog(dig, lv, n, nwtstp, step, v)
 936 c
 937 c  ***  compute double dogleg step  ***
 938 c
 939 c  ***  parameter declarations  ***
 940 c
 941       integer lv, n
 942       double precision dig(n), nwtstp(n), step(n), v(lv)
 943 c
 944 c  ***  purpose  ***
 945 c
 946 c        this subroutine computes a candidate step (for use in an uncon-
 947 c     strained minimization code) by the double dogleg algorithm of
 948 c     dennis and mei (ref. 1), which is a variation on powell*s dogleg
 949 c     scheme (ref. 2, p. 95).
 950 c
 951 c--------------------------  parameter usage  --------------------------
 952 c
 953 c    dig (input) diag(d)**-2 * g -- see algorithm notes.
 954 c      g (input) the current gradient vector.
 955 c     lv (input) length of v.
 956 c      n (input) number of components in  dig, g, nwtstp,  and  step.
 957 c nwtstp (input) negative newton step -- see algorithm notes.
 958 c   step (output) the computed step.
 959 c      v (i/o) values array, the following components of which are
 960 c             used here...
 961 c v(bias)   (input) bias for relaxed newton step, which is v(bias) of
 962 c             the way from the full newton to the fully relaxed newton
 963 c             step.  recommended value = 0.8 .
 964 c v(dgnorm) (input) 2-norm of diag(d)**-1 * g -- see algorithm notes.
 965 c v(dstnrm) (output) 2-norm of diag(d) * step, which is v(radius)
 966 c             unless v(stppar) = 0 -- see algorithm notes.
 967 c v(dst0) (input) 2-norm of diag(d) * nwtstp -- see algorithm notes.
 968 c v(grdfac) (output) the coefficient of  dig  in the step returned --
 969 c             step(i) = v(grdfac)*dig(i) + v(nwtfac)*nwtstp(i).
 970 c v(gthg)   (input) square-root of (dig**t) * (hessian) * dig -- see
 971 c             algorithm notes.
 972 c v(gtstep) (output) inner product between g and step.
 973 c v(nreduc) (output) function reduction predicted for the full newton
 974 c             step.
 975 c v(nwtfac) (output) the coefficient of  nwtstp  in the step returned --
 976 c             see v(grdfac) above.
 977 c v(preduc) (output) function reduction predicted for the step returned.
 978 c v(radius) (input) the trust region radius.  d times the step returned
 979 c             has 2-norm v(radius) unless v(stppar) = 0.
 980 c v(stppar) (output) code telling how step was computed... 0 means a
 981 c             full newton step.  between 0 and 1 means v(stppar) of the
 982 c             way from the newton to the relaxed newton step.  between
 983 c             1 and 2 means a true double dogleg step, v(stppar) - 1 of
 984 c             the way from the relaxed newton to the cauchy step.
 985 c             greater than 2 means 1 / (v(stppar) - 1) times the cauchy
 986 c             step.
 987 c
 988 c-------------------------------  notes  -------------------------------
 989 c
 990 c  ***  algorithm notes  ***
 991 c
 992 c        let  g  and  h  be the current gradient and hessian approxima-
 993 c     tion respectively and let d be the current scale vector.  this
 994 c     routine assumes dig = diag(d)**-2 * g  and  nwtstp = h**-1 * g.
 995 c     the step computed is the same one would get by replacing g and h
 996 c     by  diag(d)**-1 * g  and  diag(d)**-1 * h * diag(d)**-1,
 997 c     computing step, and translating step back to the original
 998 c     variables, i.e., premultiplying it by diag(d)**-1.
 999 c
1000 c  ***  references  ***
1001 c
1002 c 1.  dennis, j.e., and mei, h.h.w. (1979), two new unconstrained opti-
1003 c             mization algorithms which use function and gradient
1004 c             values, j. optim. theory applic. 28, pp. 453-482.
1005 c 2. powell, m.j.d. (1970), a hybrid method for non-linear equations,
1006 c             in numerical methods for non-linear equations, edited by
1007 c             p. rabinowitz, gordon and breach, london.
1008 c
1009 c  ***  general  ***
1010 c
1011 c     coded by david m. gay.
1012 c     this subroutine was written in connection with research supported
1013 c     by the national science foundation under grants mcs-7600324 and
1014 c     mcs-7906671.
1015 c
1016 c------------------------  external quantities  ------------------------
1017 c
1018 c  ***  functions and subroutines called  ***
1019 c
1020       external dotprd, v2norm
1021       double precision dotprd, v2norm
1022 c
1023 c dotprd... returns inner product of two vectors.
1024 c v2norm... returns 2-norm of a vector.
1025 c
1026 c  ***  intrinsic functions  ***
1027 c/+
1028       double precision dsqrt
1029 c/
1030 c--------------------------  local variables  --------------------------
1031 c
1032       integer i
1033       double precision cfact, cnorm, ctrnwt, ghinvg, femnsq, gnorm,
1034      1                 nwtnrm, relax, rlambd, t, t1, t2
1035       double precision half, one, two, zero
1036 c
1037 c  ***  v subscripts  ***
1038 c
1039       integer bias, dgnorm, dstnrm, dst0, grdfac, gthg, gtstep,
1040      1        nreduc, nwtfac, preduc, radius, stppar
1041 c
1042 c  ***  data initializations  ***
1043 c
1044 c/6
1045 c     data half/0.5d+0/, one/1.d+0/, two/2.d+0/, zero/0.d+0/
1046 c/7
1047       parameter (half=0.5d+0, one=1.d+0, two=2.d+0, zero=0.d+0)
1048 c/
1049 c
1050 c/6
1051 c     data bias/43/, dgnorm/1/, dstnrm/2/, dst0/3/, grdfac/45/,
1052 c    1     gthg/44/, gtstep/4/, nreduc/6/, nwtfac/46/, preduc/7/,
1053 c    2     radius/8/, stppar/5/
1054 c/7
1055       parameter (bias=43, dgnorm=1, dstnrm=2, dst0=3, grdfac=45,
1056      1           gthg=44, gtstep=4, nreduc=6, nwtfac=46, preduc=7,
1057      2           radius=8, stppar=5)
1058 c/
1059 c
1060 c+++++++++++++++++++++++++++++++  body  ++++++++++++++++++++++++++++++++
1061 c
1062       nwtnrm = v(dst0)
1063       rlambd = one
1064       if (nwtnrm .gt. zero) rlambd = v(radius) / nwtnrm
1065       gnorm = v(dgnorm)
1066       ghinvg = two * v(nreduc)
1067       v(grdfac) = zero
1068       v(nwtfac) = zero
1069       if (rlambd .lt. one) go to 30
1070 c
1071 c        ***  the newton step is inside the trust region  ***
1072 c
1073          v(stppar) = zero
1074          v(dstnrm) = nwtnrm
1075          v(gtstep) = -ghinvg
1076          v(preduc) = v(nreduc)
1077          v(nwtfac) = -one
1078          do 20 i = 1, n
1079  20           step(i) = -nwtstp(i)
1080          go to 999
1081 c
1082  30   v(dstnrm) = v(radius)
1083       cfact = (gnorm / v(gthg))**2
1084 c     ***  cauchy step = -cfact * g.
1085       cnorm = gnorm * cfact
1086       relax = one - v(bias) * (one - gnorm*cnorm/ghinvg)
1087       if (rlambd .lt. relax) go to 50
1088 c
1089 c        ***  step is between relaxed newton and full newton steps  ***
1090 c
1091          v(stppar)  =  one  -  (rlambd - relax) / (one - relax)
1092          t = -rlambd
1093          v(gtstep) = t * ghinvg
1094          v(preduc) = rlambd * (one - half*rlambd) * ghinvg
1095          v(nwtfac) = t
1096          do 40 i = 1, n
1097  40           step(i) = t * nwtstp(i)
1098          go to 999
1099 c
1100  50   if (cnorm .lt. v(radius)) go to 70
1101 c
1102 c        ***  the cauchy step lies outside the trust region --
1103 c        ***  step = scaled cauchy step  ***
1104 c
1105          t = -v(radius) / gnorm
1106          v(grdfac) = t
1107          v(stppar) = one  +  cnorm / v(radius)
1108          v(gtstep) = -v(radius) * gnorm
1109       v(preduc) = v(radius)*(gnorm - half*v(radius)*(v(gthg)/gnorm)**2)
1110          do 60 i = 1, n
1111  60           step(i) = t * dig(i)
1112          go to 999
1113 c
1114 c     ***  compute dogleg step between cauchy and relaxed newton  ***
1115 c     ***  femur = relaxed newton step minus cauchy step  ***
1116 c
1117  70   ctrnwt = cfact * relax * ghinvg / gnorm
1118 c     *** ctrnwt = inner prod. of cauchy and relaxed newton steps,
1119 c     *** scaled by gnorm**-1.
1120       t1 = ctrnwt - gnorm*cfact**2
1121 c     ***  t1 = inner prod. of femur and cauchy step, scaled by
1122 c     ***  gnorm**-1.
1123       t2 = v(radius)*(v(radius)/gnorm) - gnorm*cfact**2
1124       t = relax * nwtnrm
1125       femnsq = (t/gnorm)*t - ctrnwt - t1
1126 c     ***  femnsq = square of 2-norm of femur, scaled by gnorm**-1.
1127       t = t2 / (t1 + dsqrt(t1**2 + femnsq*t2))
1128 c     ***  dogleg step  =  cauchy step  +  t * femur.
1129       t1 = (t - one) * cfact
1130       v(grdfac) = t1
1131       t2 = -t * relax
1132       v(nwtfac) = t2
1133       v(stppar) = two - t
1134       v(gtstep) = t1*gnorm**2 + t2*ghinvg
1135       v(preduc) = -t1*gnorm * ((t2 + one)*gnorm)
1136      1                 - t2 * (one + half*t2)*ghinvg
1137      2                  - half * (v(gthg)*t1)**2
1138       do 80 i = 1, n
1139  80      step(i) = t1*dig(i) + t2*nwtstp(i)
1140 c
1141  999  return
1142 c  ***  last line of dbdog follows  ***
1143       end
1144       subroutine ltvmul(n, x, l, y)
1145 c
1146 c  ***  compute  x = (l**t)*y, where  l  is an  n x n  lower
1147 c  ***  triangular matrix stored compactly by rows.  x and y may
1148 c  ***  occupy the same storage.  ***
1149 c
1150       integer n
1151 cal   double precision x(n), l(1), y(n)
1152       double precision x(n), l(n*(n+1)/2), y(n)
1153 c     dimension l(n*(n+1)/2)
1154       integer i, ij, i0, j
1155       double precision yi, zero
1156 c/6
1157 c     data zero/0.d+0/
1158 c/7
1159       parameter (zero=0.d+0)
1160 c/
1161 c
1162       i0 = 0
1163       do 20 i = 1, n
1164          yi = y(i)
1165          x(i) = zero
1166          do 10 j = 1, i
1167               ij = i0 + j
1168               x(j) = x(j) + yi*l(ij)
1169  10           continue
1170          i0 = i0 + i
1171  20      continue
1172  999  return
1173 c  ***  last card of ltvmul follows  ***
1174       end
1175       subroutine lupdat(beta, gamma, l, lambda, lplus, n, w, z)
1176 c
1177 c  ***  compute lplus = secant update of l  ***
1178 c
1179 c  ***  parameter declarations  ***
1180 c
1181       integer n
1182 cal   double precision beta(n), gamma(n), l(1), lambda(n), lplus(1),
1183       double precision beta(n), gamma(n), l(n*(n+1)/2), lambda(n),
1184      1   lplus(n*(n+1)/2),w(n), z(n)
1185 c     dimension l(n*(n+1)/2), lplus(n*(n+1)/2)
1186 c
1187 c--------------------------  parameter usage  --------------------------
1188 c
1189 c   beta = scratch vector.
1190 c  gamma = scratch vector.
1191 c      l (input) lower triangular matrix, stored rowwise.
1192 c lambda = scratch vector.
1193 c  lplus (output) lower triangular matrix, stored rowwise, which may
1194 c             occupy the same storage as  l.
1195 c      n (input) length of vector parameters and order of matrices.
1196 c      w (input, destroyed on output) right singular vector of rank 1
1197 c             correction to  l.
1198 c      z (input, destroyed on output) left singular vector of rank 1
1199 c             correction to  l.
1200 c
1201 c-------------------------------  notes  -------------------------------
1202 c
1203 c  ***  application and usage restrictions  ***
1204 c
1205 c        this routine updates the cholesky factor  l  of a symmetric
1206 c     positive definite matrix to which a secant update is being
1207 c     applied -- it computes a cholesky factor  lplus  of
1208 c     l * (i + z*w**t) * (i + w*z**t) * l**t.  it is assumed that  w
1209 c     and  z  have been chosen so that the updated matrix is strictly
1210 c     positive definite.
1211 c
1212 c  ***  algorithm notes  ***
1213 c
1214 c        this code uses recurrence 3 of ref. 1 (with d(j) = 1 for all j)
1215 c     to compute  lplus  of the form  l * (i + z*w**t) * q,  where  q
1216 c     is an orthogonal matrix that makes the result lower triangular.
1217 c        lplus may have some negative diagonal elements.
1218 c
1219 c  ***  references  ***
1220 c
1221 c 1.  goldfarb, d. (1976), factorized variable metric methods for uncon-
1222 c             strained optimization, math. comput. 30, pp. 796-811.
1223 c
1224 c  ***  general  ***
1225 c
1226 c     coded by david m. gay (fall 1979).
1227 c     this subroutine was written in connection with research supported
1228 c     by the national science foundation under grants mcs-7600324 and
1229 c     mcs-7906671.
1230 c
1231 c------------------------  external quantities  ------------------------
1232 c
1233 c  ***  intrinsic functions  ***
1234 c/+
1235       double precision dsqrt
1236 c/
1237 c--------------------------  local variables  --------------------------
1238 c
1239       integer i, ij, j, jj, jp1, k, nm1, np1
1240       double precision a, b, bj, eta, gj, lj, lij, ljj, nu, s, theta,
1241      1                 wj, zj
1242       double precision one, zero
1243 c
1244 c  ***  data initializations  ***
1245 c
1246 c/6
1247 c     data one/1.d+0/, zero/0.d+0/
1248 c/7
1249       parameter (one=1.d+0, zero=0.d+0)
1250 c/
1251 c
1252 c+++++++++++++++++++++++++++++++  body  ++++++++++++++++++++++++++++++++
1253 c
1254       nu = one
1255       eta = zero
1256       if (n .le. 1) go to 30
1257       nm1 = n - 1
1258 c
1259 c  ***  temporarily store s(j) = sum over k = j+1 to n of w(k)**2 in
1260 c  ***  lambda(j).
1261 c
1262       s = zero
1263       do 10 i = 1, nm1
1264          j = n - i
1265          s = s + w(j+1)**2
1266          lambda(j) = s
1267  10      continue
1268 c
1269 c  ***  compute lambda, gamma, and beta by goldfarb*s recurrence 3.
1270 c
1271       do 20 j = 1, nm1
1272          wj = w(j)
1273          a = nu*z(j) - eta*wj
1274          theta = one + a*wj
1275          s = a*lambda(j)
1276          lj = dsqrt(theta**2 + a*s)
1277          if (theta .gt. zero) lj = -lj
1278          lambda(j) = lj
1279          b = theta*wj + s
1280          gamma(j) = b * nu / lj
1281          beta(j) = (a - b*eta) / lj
1282          nu = -nu / lj
1283          eta = -(eta + (a**2)/(theta - lj)) / lj
1284  20      continue
1285  30   lambda(n) = one + (nu*z(n) - eta*w(n))*w(n)
1286 c
1287 c  ***  update l, gradually overwriting  w  and  z  with  l*w  and  l*z.
1288 c
1289       np1 = n + 1
1290       jj = n * (n + 1) / 2
1291       do 60 k = 1, n
1292          j = np1 - k
1293          lj = lambda(j)
1294          ljj = l(jj)
1295          lplus(jj) = lj * ljj
1296          wj = w(j)
1297          w(j) = ljj * wj
1298          zj = z(j)
1299          z(j) = ljj * zj
1300          if (k .eq. 1) go to 50
1301          bj = beta(j)
1302          gj = gamma(j)
1303          ij = jj + j
1304          jp1 = j + 1
1305          do 40 i = jp1, n
1306               lij = l(ij)
1307               lplus(ij) = lj*lij + bj*w(i) + gj*z(i)
1308               w(i) = w(i) + lij*wj
1309               z(i) = z(i) + lij*zj
1310               ij = ij + i
1311  40           continue
1312  50      jj = jj - j
1313  60      continue
1314 c
1315  999  return
1316 c  ***  last card of lupdat follows  ***
1317       end
1318       subroutine lvmul(n, x, l, y)
1319 c
1320 c  ***  compute  x = l*y, where  l  is an  n x n  lower triangular
1321 c  ***  matrix stored compactly by rows.  x and y may occupy the same
1322 c  ***  storage.  ***
1323 c
1324       integer n
1325 cal   double precision x(n), l(1), y(n)
1326       double precision x(n), l(n*(n+1)/2), y(n)
1327 c     dimension l(n*(n+1)/2)
1328       integer i, ii, ij, i0, j, np1
1329       double precision t, zero
1330 c/6
1331 c     data zero/0.d+0/
1332 c/7
1333       parameter (zero=0.d+0)
1334 c/
1335 c
1336       np1 = n + 1
1337       i0 = n*(n+1)/2
1338       do 20 ii = 1, n
1339          i = np1 - ii
1340          i0 = i0 - i
1341          t = zero
1342          do 10 j = 1, i
1343               ij = i0 + j
1344               t = t + l(ij)*y(j)
1345  10           continue
1346          x(i) = t
1347  20      continue
1348  999  return
1349 c  ***  last card of lvmul follows  ***
1350       end
1351       subroutine vvmulp(n, x, y, z, k)
1352 c
1353 c ***  set x(i) = y(i) * z(i)**k, 1 .le. i .le. n (for k = 1 or -1)  ***
1354 c
1355       integer n, k
1356       double precision x(n), y(n), z(n)
1357       integer i
1358 c
1359       if (k .ge. 0) go to 20
1360       do 10 i = 1, n
1361  10      x(i) = y(i) / z(i)
1362       go to 999
1363 c
1364  20   do 30 i = 1, n
1365  30      x(i) = y(i) * z(i)
1366  999  return
1367 c  ***  last card of vvmulp follows  ***
1368       end
1369       subroutine wzbfgs (l, n, s, w, y, z)
1370 c
1371 c  ***  compute  y  and  z  for  lupdat  corresponding to bfgs update.
1372 c
1373       integer n
1374 cal   double precision l(1), s(n), w(n), y(n), z(n)
1375       double precision l(n*(n+1)/2), s(n), w(n), y(n), z(n)
1376 c     dimension l(n*(n+1)/2)
1377 c
1378 c--------------------------  parameter usage  --------------------------
1379 c
1380 c l (i/o) cholesky factor of hessian, a lower triang. matrix stored
1381 c             compactly by rows.
1382 c n (input) order of  l  and length of  s,  w,  y,  z.
1383 c s (input) the step just taken.
1384 c w (output) right singular vector of rank 1 correction to l.
1385 c y (input) change in gradients corresponding to s.
1386 c z (output) left singular vector of rank 1 correction to l.
1387 c
1388 c-------------------------------  notes  -------------------------------
1389 c
1390 c  ***  algorithm notes  ***
1391 c
1392 c        when  s  is computed in certain ways, e.g. by  gqtstp  or
1393 c     dbldog,  it is possible to save n**2/2 operations since  (l**t)*s
1394 c     or  l*(l**t)*s is then known.
1395 c        if the bfgs update to l*(l**t) would reduce its determinant to
1396 c     less than eps times its old value, then this routine in effect
1397 c     replaces  y  by  theta*y + (1 - theta)*l*(l**t)*s,  where  theta
1398 c     (between 0 and 1) is chosen to make the reduction factor = eps.
1399 c
1400 c  ***  general  ***
1401 c
1402 c     coded by david m. gay (fall 1979).
1403 c     this subroutine was written in connection with research supported
1404 c     by the national science foundation under grants mcs-7600324 and
1405 c     mcs-7906671.
1406 c
1407 c------------------------  external quantities  ------------------------
1408 c
1409 c  ***  functions and subroutines called  ***
1410 c
1411       external dotprd, livmul, ltvmul
1412       double precision dotprd
1413 c dotprd returns inner product of two vectors.
1414 c livmul multiplies l**-1 times a vector.
1415 c ltvmul multiplies l**t times a vector.
1416 c
1417 c  ***  intrinsic functions  ***
1418 c/+
1419       double precision dsqrt
1420 c/
1421 c--------------------------  local variables  --------------------------
1422 c
1423       integer i
1424       double precision cs, cy, eps, epsrt, one, shs, ys, theta
1425 c
1426 c  ***  data initializations  ***
1427 c
1428 c/6
1429 c     data eps/0.1d+0/, one/1.d+0/
1430 c/7
1431       parameter (eps=0.1d+0, one=1.d+0)
1432 c/
1433 c
1434 c+++++++++++++++++++++++++++++++  body  ++++++++++++++++++++++++++++++++
1435 c
1436       call ltvmul(n, w, l, s)
1437       shs = dotprd(n, w, w)
1438       ys = dotprd(n, y, s)
1439       if (ys .ge. eps*shs) go to 10
1440          theta = (one - eps) * shs / (shs - ys)
1441          epsrt = dsqrt(eps)
1442          cy = theta / (shs * epsrt)
1443          cs = (one + (theta-one)/epsrt) / shs
1444          go to 20
1445  10   cy = one / (dsqrt(ys) * dsqrt(shs))
1446       cs = one / shs
1447  20   call livmul(n, z, l, y)
1448       do 30 i = 1, n
1449  30      z(i) = cy * z(i)  -  cs * w(i)
1450 c
1451  999  return
1452 c  ***  last card of wzbfgs follows  ***
1453       end