Optimization with Analytic Gradient Computations

This GAUSS optimization example demonstrates nonlinear least squares estimation. The data for this example is taken from G.P.Y. Clarke, “Approximate Confidence Limits for a Parameter Function in Nonlinear Regression”, JASA, 82:221-230, 1987.

The key feature of this example is the use of user-defined analytical first derivatives.

Code for optimization

The example:

  • Creates a data matrix.

  • Defines an objective function that takes 5 inputs (the parameter vector, x, y, a scaling constant, and the indicator vector ind).

  • Specifies analytical computations for the gradients.
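
The model fit to the data is a three-parameter exponential decay curve,

    y = b3 + b2*exp(-b1*x) + e

where b1 controls the rate of decay, b2 the scale, and b3 the asymptote that the grass yield approaches over time.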

new;
cls;

// Make optmt library available
library optmt;

/*
** data  -  weight of cut grass from 10 randomly sited quadrats taken
**          each week for 13 weeks from a grazing pasture.
*/
y = { 3.183,
    3.059,
    2.871,
    2.622,
    2.541,
    2.184,
    2.110,
    2.075,
    2.018,
    1.903,
    1.770,
    1.762,
    1.550 };

x = seqa(1, 1, 13);

// Residual degrees of freedom (13 observations - 3 parameters),
// used to scale the objective and, later, the covariance matrix
n = 10;
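
// (Optional) Quick visual check of the raw data before fitting;
// plotXY is GAUSS's basic XY plotting function
plotXY(x, y);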

// The Mitcherlitz Equation
proc Mitcherlitz(b, x);

    // Calculate and return Mitcherlitz equation
    retp(b[3] + b[2]*exp(-b[1]*x));
endp;

// The Gradient
proc dMitcherlitz(b, x);
    local w;

    w = exp(-b[1]*x);
    retp(((-b[2]*x.*w)~w~ones(rows(w), 1)));
endp;
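
// Each column returned by dMitcherlitz is the partial derivative
// of the Mitcherlitz curve with respect to one parameter:
//   column 1: d/db[1] = -b[2]*x.*exp(-b[1]*x)
//   column 2: d/db[2] =  exp(-b[1]*x)
//   column 3: d/db[3] =  1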


// The objective function
proc fct(b, x, y, nobs, ind);
    local dev;
    struct modelResults mm;

    dev = y - Mitcherlitz(b,x);

    // If first element of 'ind' is non-zero,
    // calculate and return objective function
    if ind[1];
        mm.function = dev'dev/nobs;
    endif;

    // If the second element of 'ind' is non-zero,
    // calculate and return gradient
    if ind[2];
        mm.gradient = -2*(dMitcherlitz(b,x)'dev)/nobs;
    endif;

    retp(mm);
endp;

// 3x1 vector of starting values
b0 = { 1, 1, 0 };
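
// (Illustrative sanity check) Evaluate the objective once at the
// starting values, requesting both the function value and the
// analytic gradient via the 'ind' vector, just as optmt does
struct modelResults chk;
chk = fct(b0, x, y, n, 1|1);
print "Objective at starting values:" chk.function;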

// Declare 'out' to be an optmtResults structure
// to hold the results of the optimization
struct optmtResults out;

// Optimize function
out = optmt(&fct, b0, x, y, n);

// Print summary report of optimization results
call optmtPrt(out);

The code prints results to the Command Window.
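
One way to gain confidence in hand-coded derivatives is to compare them against GAUSS's numerical differentiation function, gradp(). A minimal sketch, run after the script above; the wrapper procedure mitchJac is introduced here purely for illustration:

proc mitchJac(b);
    // Single-argument wrapper around the curve, as gradp() requires;
    // 'x' is taken from the workspace
    retp(Mitcherlitz(b, x));
endp;

// Largest absolute discrepancy between the numerical and
// analytic Jacobians at the starting values (should be near zero)
print maxc(maxc(abs(gradp(&mitchJac, b0) - dMitcherlitz(b0, x))));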

Results

Convergence details

The first portion of the results provides details about convergence and performance.

Return code    =    0
Function value =    0.00535
Convergence    :    normal convergence

These results indicate that the optimization converged normally, with a return code of 0. Any return code other than 0 indicates an issue with convergence; the exact meaning of each return code can be found in the optmt() documentation.
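
Because the return code is also stored in the results structure, a script can branch on it rather than reading the printed report. A short sketch, assuming the retcode member name from the optmtResults documentation:

// Warn if the optimization did not converge normally
if out.retcode /= 0;
    print "optmt did not converge. Return code:" out.retcode;
endif;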

Parameter estimates

The next section of the results reports the parameter estimates and the associated gradients.

Parameters  Estimates   Gradient
---------------------------------------------------------------------
x[1,1]      0.1031      0.0000
x[2,1]      2.5190      0.0000
x[3,1]      0.9631      0.0000

In this example, the gradients for all parameters are zero, as is expected at or near an optimum.

Computation time

Number of iterations    28
Minutes to convergence     0.00013

Extension

Though the printed results do not include the standard errors of the parameters or other key inference statistics, these can easily be computed using the dMitcherlitz function and the results stored in the optmtResults structure.

// Extract estimated parameters
bhat = pvGetParvector(out.par);

// Jacobian of the Mitcherlitz curve at the estimates
grad = dMitcherlitz(bhat,x);

// Covariance matrix of the parameters: sigma^2 * (J'J)^-1,
// where out.fct already equals SSE divided by the 10 degrees
// of freedom, i.e. the estimate of sigma^2
cov = out.fct*invpd(grad'*grad);

// Print results
print;
print "standard errors of parameters";
sd = sqrt(diag(cov));
print sd';
print;
print "Correlation matrix of parameters";
print cov./sd./sd';
print;
print "t-statistics";
print (bhat./sd)';

The results printed are:

standard errors of parameters
     0.025504306       0.26577209       0.32158853

Correlation matrix of parameters

       1.0000000     -0.92330571       0.98410596
     -0.92330571       1.0000000      -0.97238973
      0.98410596     -0.97238973        1.0000000

t-statistics
      4.0406689        9.4780502        2.9948710
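
Since Clarke's paper is concerned with approximate confidence limits, a natural next step is to turn these standard errors into intervals. A minimal sketch of symmetric 95% limits, reusing the 10 residual degrees of freedom from above (cdfTci() is GAUSS's inverse complementary Student's t cdf):

// Two-sided 95% critical value with n = 10 degrees of freedom
tcrit = cdfTci(0.025, n);

// Lower and upper limits for each parameter
print "95% confidence limits (lower ~ upper)";
print (bhat - tcrit*sd)~(bhat + tcrit*sd);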