esys.escript.minimizer Package

Generic minimization algorithms

Classes

class esys.escript.minimizer.AbstractMinimizer(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)

Base class for function minimization methods.

__init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)

Initializes a new minimizer for a given cost function.

Parameters:
  • F (CostFunction) – the cost function to be minimized

  • m_tol (float) – relative tolerance for solution m for termination of iteration

  • grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration

  • iterMax (int) – maximum number of iterations.

Default m_tol:

1e-4

Default grad_tol:

1e-8

Default iterMax:

300

doCallback(**args)

The callback function is called with the following arguments:

Key iterCount:

iteration count

Key m:

current solution

Key dm:

last solution increment

Key norm_m:

norm of current solution m

Key Fm:

value of cost function for m

Key gradFm:

gradient for m

Parameters:
  • args_m (tuple) – arguments for m

  • failed (bool) – set if the step was unsuccessful.

getCostFunction()

returns the cost function to be minimized

Return type:

CostFunction

getLineSearch()

returns the LineSearch object being used.

getOptions()

returns a dictionary of the solver options.

Return type:

dictionary

getResult()

Returns the result of the minimization.

Return type:

m-type

logSummary()

Outputs a summary of the completed minimization process to the logger.

run(m0)

Executes the minimization algorithm for F starting with the initial guess m0.

Returns:

the result of the minimization

setCallback(callback)

Sets a callback function to be called after every iteration.

def callback(iterCount, m, norm_m, dm, Fm, gradFm, norm_grad_Fm, failed, args)

with iteration count iterCount, current approximation m, last update dm, cost function value Fm, and gradient gradFm; see method AbstractMinimizer.doCallback.
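A minimal sketch of such a callback, assuming solver is an existing minimizer instance; the argument names follow the signature documented above, and the **kwargs catch-all is a defensive assumption:

    def my_callback(iterCount, m, norm_m, dm, Fm, gradFm, norm_grad_Fm, failed, args, **kwargs):
        # print iteration progress; all values are supplied by doCallback
        print("iter %d: F(m) = %s, |grad F| = %s, failed = %s"
              % (iterCount, Fm, norm_grad_Fm, failed))

    solver.setCallback(my_callback)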

setCostFunction(F)

set the cost function to be minimized

Parameters:

F (CostFunction) – the cost function to be minimized

setIterMax(iterMax=300)

Sets the maximum number of iterations before the minimizer terminates.

Parameters:

iterMax (int) – maximum number of iterations.

Default iterMax:

300

setOptions(**opts)

sets options for the LBFGS solver. Use solver.setOptions(key=value).

Key m_tol:

relative tolerance for solution m for termination of iteration

Default m_tol:

1e-4

Key grad_tol:

tolerance for gradient relative to initial cost function value for termination of iteration

Default grad_tol:

1e-8

Key truncation:

sets the number of previous LBFGS iterations to keep

Default truncation:

30

Key restart:

restart after this many iteration steps

Default restart:

60

Key iterMax:

maximum number of iterations

Default iterMax:

300

Key relAlphaMin:

minimal step size relative to search direction. The value should be chosen such that at any iteration step F(m + alpha * p) is just discriminable from F(m) for any alpha > relAlphaMin * |m|/|p|.

Default relAlphaMin:

1e-8

Key initialAlpha:

initial step size alpha in the line search. Typically alpha=1 is a good initial value, but a larger or smaller initial value may help to get the iteration started when only an approximation of the Hessian is available.

Default initialAlpha:
Key scaleSearchDirection:

if set, the search direction is rescaled using an estimate of the norm of the Hessian

Default scaleSearchDirection:

True

Example of usage:

    cf = DerivedCostFunction()
    solver = MinimizerLBFGS(F=cf, m_tol=1e-5, grad_tol=1e-5, iterMax=300)
    solver.setOptions(truncation=20)
    solver.getLineSearch().setOptions(zoom_relChangeMin=1e-7)
    solver.run(initial_m)
    result = solver.getResult()

setTolerance(m_tol=0.0001, grad_tol=0.0001)

Sets the tolerances for the stopping criterion. The minimizer stops when an appropriate norm falls below the given tolerance.

Parameters:
  • m_tol (float) – relative tolerance for solution m for termination of iteration

  • grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration

Default m_tol:

1e-4

Default grad_tol:

1e-8

class esys.escript.minimizer.CostFunction

A cost function F(m) that can be minimized (base class). Near a solution m in the search space the cost function is approximated in direction p as

F(m+p) ~ F(m) + <grad F(m), p> + 1/2 <H p, p>

where F(m) is the value of the function at m, grad F is the gradient, H is the Hessian and <.,.> is the dual product. These four procedures are defined in this class.

The class distinguishes between the representation of the solution m (m-type) and the cost function gradient (g-type).

Example of usage:

    class DerivedCostFunction(CostFunction):
        # overwrite getArguments, getValue, getGradient, etc.
        ...

    cf = DerivedCostFunction()
    # ... calculate m ...
    args = cf.getArguments(m)   # this could be potentially expensive!
    f = cf.getValue(m, *args)
    # ... it could be required to update m without using the gradient ...
    # ... but then ...
    gf = cf.getGradient(m, *args)

Use the “AndCount” versions (e.g. getValueAndCount) if you want to count calls.
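To make the pattern concrete, here is a minimal sketch of a subclass, under the assumption that both m-type and g-type are numpy arrays (escript applications would typically use Data objects instead); the name QuadraticCostFunction and the target attribute are illustrative only:

    import numpy as np
    from esys.escript.minimizer import CostFunction

    class QuadraticCostFunction(CostFunction):
        """Sketch: F(m) = 1/2 <m - target, m - target>."""
        def __init__(self, target):
            super().__init__()
            self.target = np.asarray(target, dtype=float)

        def getDualProduct(self, m, r):
            return float(np.dot(m, r))

        def getNorm(self, m):
            return float(np.max(np.abs(m)))   # plays the role of 'Lsup'

        def getArguments(self, m):
            return (m - self.target,)         # residual shared by value and gradient

        def getValue(self, m, residual):
            return 0.5 * float(np.dot(residual, residual))

        def getGradient(self, m, residual):
            return residual                   # grad F(m) = m - target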

__init__()

Constructor. Initializes logger.

getArguments(m)

returns precalculated values that are shared in the calculation of F(m) and grad F(m) and the Hessian operator.

Note:

Overwrite this method to implement a cost function.

Note:

The tuple returned by this call will be passed back to this CostFunction in other calls (e.g. getGradientAndCount). Its contents are not specified at this level because no code, other than the CostFunction which created it, will be interacting with it. That is, the implementor can put whatever information they find useful in it.

Parameters:

m (m-type) – location of derivative

Returns:

pre-calculated arguments

Return type:

tuple


getArgumentsAndCount(m)

returns precalculated values that are shared in the calculation of F(m) and grad F(m) and the Hessian operator. The default implementation returns an empty tuple.

Calling this method updates the call statistics.

Parameters:

m (m-type) – location of derivative

Return type:

tuple

Note:

The tuple returned by this call will be passed back to this CostFunction in other calls (eg: getGradient). Its contents are not specified at this level because no code, other than the CostFunction which created it, will be interacting with it. That is, the implementor can put whatever information they find useful in it.

getDualProduct(m, r)

returns the dual product of m and r

Return type:

float

Note:

Overwrite this method to implement a cost function.

getDualProductAndCount(m, r)

returns the dual product <.,.> of r and m.

Calling this method updates the call statistics.

Return type:

float

getGradient(m, *args)

returns the gradient of F at m using the precalculated values for m.

Parameters:
  • m (m-type) – location of derivative

  • args – pre-calculated values for m from getArguments()
Return type:

g-type

Note:

Overwrite this method to implement a cost function.

getGradientAndCount(m, *args)

returns the gradient of F at m using the precalculated values for m.

Calling this method updates the call statistics.

Parameters:
  • m (m-type) – location of derivative

  • args – pre-calculated values for m from getArgumentsAndCount()
Return type:

g-type

getInverseHessianApproximation(r, m, *args, initializeHessian=True)

returns an approximate evaluation p of the inverse of the Hessian operator of the cost function for a given gradient r, i.e. H p = r.

Note:

by default this method returns r. In this case it is assumed that m-type == g-type.

Note:

Overwrite this method to implement a cost function.

Parameters:
  • r (g-type) – a given gradient

  • initializeHessian (bool) – indicates if the Hessian operator should be initialized using m. If this method provides an approximation only and building a new Hessian approximation is expensive, it is typically more efficient to update the Hessian operator only occasionally, namely when initializeHessian is True on input. If the Hessian should be updated in each step, the value of initializeHessian can be ignored.

Returns:

new search direction p.

Return type:

m-type
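Where a cheap approximation of the Hessian is available, overriding this method acts as a preconditioner for the search direction. A minimal sketch, assuming numpy arrays and a hypothetical diagonal Hessian model held in self.hessian_diag (neither the attribute nor the helper is part of the API):

    def getInverseHessianApproximation(self, r, m, *args, initializeHessian=True):
        if initializeHessian:
            # refresh the (illustrative) diagonal Hessian model at m
            self.hessian_diag = self._estimateDiagonal(m)   # hypothetical helper
        return r / self.hessian_diag                        # solves H p = r approximately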

getInverseHessianApproximationAndCount(r, m, *args, initializeHessian=True)

returns an evaluation p of the inverse of the Hessian operator of the cost function for a given gradient r: H p = r

Calling this method updates the call statistics.

Parameters:
  • r (g-type) – a given gradient

  • initializeHessian (bool) – indicates if the Hessian operator should be initialized using m. If updating the Hessian is expensive it should only be done when initializeHessian is True. If this method provides an approximation only and building a new Hessian approximation is expensive, it is typically more efficient to update the Hessian operator only occasionally.

Returns:

new search direction p.

Return type:

m-type

getNorm(m)

returns the norm of m. Typically, this is the ‘Lsup’ function.

Return type:

float

Note:

Overwrite this method to implement a cost function.

getNormAndCount(m)

returns the norm of m.

Calling this method updates the call statistics.

Return type:

float

getSqueezeFactor(m, p, *args)

The new solution is calculated as m + a*p with a > 0. This function allows an upper bound for a to be provided, to make sure that m + a*p is valid, typically to avoid overflow when the cost function is evaluated. The solver will take action to make sure that the value of a is not too small.

Parameters:
  • m (m-type) – a solution approximation

  • p – an increment to the solution

  • args – pre-calculated values for m from getArgumentsAndCount()

Return type:

positive float or None

Note:

Overwrite this method to implement a cost function.
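As an illustration, a sketch of an override that caps the step so that no component of m + a*p exceeds a bound in magnitude; numpy arrays and the constant CAP are assumptions:

    import numpy as np

    def getSqueezeFactor(self, m, p, *args):
        CAP = 1.0e4                                # illustrative overflow bound
        pmax = float(np.max(np.abs(p)))
        if pmax == 0.0:
            return None                            # no restriction needed
        bound = (CAP - float(np.max(np.abs(m)))) / pmax
        return bound if bound > 0 else None

Since max|m + a*p| <= max|m| + a * max|p|, any a below this bound keeps the new solution within the cap.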

getStatistics()

returns the call statistics as a string.

getValue(m, *args)

returns the value F(m) using the precalculated values for m.

Parameters:
  • m (m-type) – a solution approximation

  • args – pre-calculated values for m from getArguments()
Return type:

float or valid argument for sum

Note:

Overwrite this method to implement a cost function.

getValueAndCount(m, *args)

returns the value F(m) using the precalculated values for m.

Calling this method updates the call statistics.

Parameters:
  • m (m-type) – a solution approximation

  • args – pre-calculated values for m from getArgumentsAndCount()
Return type:

float or valid argument for sum

resetStatistics()

resets all counters

class esys.escript.minimizer.CostFunction1DEvaluationFactory(m, p, costfunction=None)

This class manages the interface of a CostFunction implementing F to the 1D projection Phi(a) = F(m+a*p). The idea is to use an EvalutedPhi object to hold the values of F, grad F and args for a given value of a.

__init__(m, p, costfunction=None)

sets up the factory for offset m and direction p using the cost function costfunction.

Parameters:
  • m (m-type) – offset

  • p (m-type) – direction

  • costfunction (CostFunction) – a cost function object

getArgs(a)

returns the arguments for m+a*p.

Parameters:

a (float) – scaling factor of search direction

Returns:

arguments appropriate for CostFunction

getEvaluation(a=1)

returns a container holding Phi(a) = F(m+a*p), grad F(m+a*p) and the respective arguments.

Parameters:

a (float) – scaling factor of search direction

Returns:

instance of EvalutedPhi

getGrad(a, gradF=None, args=None)

returns the value Phi'(a) = <grad F(m+a*p), p>. If args is None, args is calculated. If the gradient gradF is None, the gradient is calculated (using args).

Parameters:
  • a (float) – scaling factor of search direction

  • args (arguments appropriate for CostFunction or None) – arguments appropriate for CostFunction

  • gradF (appropriate for CostFunction or None) – gradient of the cost function for m+a*p

Returns:

derivative of Phi at a

getValue(a, args=None)

returns the value F(m+a*p). If args is None, args is calculated.

Parameters:
  • a (float) – scaling factor of search direction

  • args (arguments appropriate for CostFunction or None) – arguments appropriate for CostFunction

Returns:

value of F and corresponding args
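A short sketch of how the factory is typically used, assuming cf, m and p already exist:

    factory = CostFunction1DEvaluationFactory(m, p, costfunction=cf)
    phi = factory.getEvaluation(a=1.0)   # container for Phi(1) = F(m + p)
    value = phi.getVal()                 # F(m + p), calculated on demand
    slope = phi.getDiff()                # Phi'(1) = <grad F(m + p), p>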

class esys.escript.minimizer.EvalutedPhi(factory, alpha=1.0, args=None, valF=None, gradF=None)

This class provides access to the value Phi(alpha) = F(m+alpha*p) for a CostFunction. The interface is handled by a factory class that holds m, p and the cost function, whose methods are called via the factory when needed. The main purpose of this class is to provide a simple mechanism for managing the values of Phi and its derivative Phi' (=gradPhi) while avoiding a recalculation of the arguments and the full gradient once Phi has been optimized via line search and zoom.

__init__(factory, alpha=1.0, args=None, valF=None, gradF=None)
Parameters:
  • factory (CostFunction1DEvaluationFactory or similar) – factory class

  • alpha (float) – length factor

  • args (appropriate for CostFunction) – argument for m+alpha*p if known. Is held at self.args

  • valF (float) – value of Phi(alpha)=F(m+alpha*p) if known. Is held at self.valF

  • gradF (appropriate for CostFunction) – gradient of CostFunction for m+alpha*p if known. Is held at self.gradF

getDiff()

returns the value of Phi'(alpha) = <grad F(m+alpha*p), p>.

getVal()

returns the value of Phi(alpha) = F(m+alpha*p). In contrast to accessing self.valF directly, the value is calculated if self.valF is None.

class esys.escript.minimizer.LineSearch(logger=None)

Line search method to minimize F(m+alpha*p). The iteration is terminated when the strong Wolfe conditions are fulfilled.

See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation.

Note: the line search returns a new search direction scale alpha that satisfies the strong Wolfe conditions:

(W1) sufficient decrease condition: Phi(alpha) <= Phi(0) + c1 * alpha * Phi'(0)

(W2) curvature condition: |Phi'(alpha)| <= c2 * |Phi'(0)|

The first step is to find an alpha_lo and alpha_hi such that the following conditions hold:

(Z1) [alpha_lo, alpha_hi] contains an alpha fulfilling (W1)+(W2)

(Z2) alpha_lo gives the smallest value of Phi amongst the alphas generated so far that satisfy the sufficient decrease condition (W1)

(Z3) alpha_hi is chosen so that Phi'(alpha_lo) * (alpha_hi - alpha_lo) < 0, that is, alpha_hi > alpha_lo if Phi'(alpha_lo) < 0 and alpha_hi < alpha_lo if Phi'(alpha_lo) > 0

Condition (Z1) is fulfilled if one of the following conditions is satisfied:

(E1) alpha_hi violates the sufficient decrease condition (W1)

(E2) Phi(alpha_hi) >= Phi(alpha_lo)

(E3) Phi'(alpha_lo) * (alpha_hi - alpha_lo) < 0

__init__(logger=None)

initialize the line search solver

Parameters:

logger (logging.Logger) – logger to be used. If not set ‘esys.minimizer’ is used.

getOptions()

returns a dictionary of the line search options.

Return type:

dictionary

run(phi_0, alpha=1.0)

Line search method to minimize F(m+alpha*p). The iteration is terminated when the strong Wolfe conditions are fulfilled. See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation.

Parameters:
  • phi_0 – EvalutedPhi instance for alpha=0

  • alpha – initial step length. If grad_Fm is properly scaled alpha=1 is a reasonable starting value.

Returns:

alpha

setOptions(**opts)

set options for the line search.

Key alphaMin:

minimum search length

Default alphaMin:

1e-20

Key alphaMax:

maximum search length

Default alphaMax:

5000

Key overStepFactor:

factor by which to increase the step size in the line search

Default overStepFactor:

2.

Key iterMax:

maximum number of line search iterations

Default iterMax:

25

Key c1:

sufficient decrease condition factor c1

Default c1:

1e-4

Key c2:

curvature condition factor c2

Default c2:

0.9

Key inter_order:

order of the interpolation used for line search (1, 2 or 3)

Default inter_order:

3

Key inter_iterMax:

maximum number of iteration steps when minimizing the interpolated cost function

Default inter_iterMax:

100

Key inter_tol:

tolerance when minimizing the interpolated cost function

Default inter_tol:

1.

Key zoom_iterMax:

maximum number of zoom iterations

Default zoom_iterMax:

20

Key phiEpsilon:

tolerance for greater-than checks of cost function values

Default phiEpsilon:

np.sqrt(EPSILON)

Key alphaOffset:

minimal relative distance of new alpha from boundaries

Default alphaOffset:

1e-4

Key alphaWidthMin:

minimal relative distance of new alpha from boundaries

Default alphaWidthMin:

np.sqrt(EPSILON)

Key zoom_reductionMin:

minimal reduction of the search interval length between zoom steps

Default zoom_reductionMin:

0.66
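For example, the line search of an existing solver can be tuned as follows (a sketch, assuming solver is a MinimizerLBFGS instance; only option names documented above are used):

    ls = solver.getLineSearch()
    ls.setOptions(c1=1e-4, c2=0.9, iterMax=25, inter_order=3)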

zoom(phi_lo, phi_hi, phi_0)

Helper function for the line search which tries to tighten the range phi_lo.alpha ... phi_hi.alpha such that phi_lo and phi_hi fulfill one of these conditions (which need to be fulfilled at entry):

(Z1) [alpha_lo, alpha_hi] contains an alpha fulfilling (W1)+(W2). This condition holds if one of the following is satisfied:

(E1) alpha_hi violates the sufficient decrease condition (W1)

(E2) Phi(alpha_hi) >= Phi(alpha_lo)

(E3) Phi'(alpha_lo) * (alpha_hi - alpha_lo) < 0

(Z2) alpha_lo gives the smallest value of Phi amongst the alphas generated so far that satisfy the sufficient decrease condition (W1)

(Z3) alpha_hi is chosen so that Phi'(alpha_lo) * (alpha_hi - alpha_lo) < 0

See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation. Interpolation options are linear, quadratic, cubic or Newton interpolation.

class esys.escript.minimizer.LineSearchAlphaMaxError

Exception thrown by the line search if the maximum alpha is reached.

__init__(*args, **kwargs)
class esys.escript.minimizer.LineSearchAlphaMinError

Exception thrown by the line search if the minimum alpha is reached.

__init__(*args, **kwargs)
class esys.escript.minimizer.LineSearchInterpolationBreakDownError

Exception thrown if the interpolation in the line search breaks down.

__init__(*args, **kwargs)
class esys.escript.minimizer.LineSearchIterMaxReachedError

Exception thrown if the line search reaches the maximum iteration count.

__init__(*args, **kwargs)
class esys.escript.minimizer.LineSearchSearchDirectionError

Exception thrown if the search direction is not a descent direction.

__init__(*args, **kwargs)
class esys.escript.minimizer.LineSearchTerminationError

Exception thrown if the line search was unsuccessful.

__init__(*args, **kwargs)
class esys.escript.minimizer.MinimizerException

This is a generic exception thrown by a minimizer.

__init__(*args, **kwargs)
class esys.escript.minimizer.MinimizerIterationIncurableBreakDown

Exception thrown if the iteration scheme encountered an incurable breakdown.

__init__(*args, **kwargs)
class esys.escript.minimizer.MinimizerLBFGS(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)

Minimizer that uses the limited-memory Broyden-Fletcher-Goldfarb-Shanno method. See Chapter 6 of ‘Numerical Optimization’ by J. Nocedal for an explanation.

__init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)

Initializes a new minimizer for a given cost function.

Parameters:
  • F (CostFunction) – the cost function to be minimized

  • m_tol (float) – relative tolerance for solution m for termination of iteration

  • grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration

  • iterMax (int) – maximum number of iterations.

Default m_tol:

1e-4

Default grad_tol:

1e-8

Default iterMax:

300

run(m)
Parameters:

m (m-type) – initial guess

Returns:

solution

Return type:

m-type

class esys.escript.minimizer.MinimizerMaxIterReached

Exception thrown if the maximum number of iteration steps is reached.

__init__(*args, **kwargs)
class esys.escript.minimizer.MinimizerNLCG(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)

Minimizer that uses the nonlinear conjugate gradient method (Fletcher-Reeves variant).

__init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)

Initializes a new minimizer for a given cost function.

Parameters:
  • F (CostFunction) – the cost function to be minimized

  • m_tol (float) – relative tolerance for solution m for termination of iteration

  • grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration

  • iterMax (int) – maximum number of iterations.

Default m_tol:

1e-4

Default grad_tol:

1e-8

Default iterMax:

300

run(x)
The callback function is called with the following arguments:

  • k – iteration number

  • x – current estimate

  • Fm – value of cost function at x

  • grad_Fm – gradient of cost function at x

  • gnorm – norm of grad_Fm (stopping criterion)
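A minimal sketch of a callback matching the argument list above (the exact keyword handling is an assumption, hence the defensive **kwargs):

    def nlcg_callback(k, x, Fm, grad_Fm, gnorm, **kwargs):
        print("NLCG iter %d: F = %s, |grad F| = %s" % (k, Fm, gnorm))

    solver = MinimizerNLCG(F=cf, m_tol=1e-4)
    solver.setCallback(nlcg_callback)
    result = solver.run(x0)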

Functions

esys.escript.minimizer.muchLarger(x, y, vareps=0.0)

returns True if x > y and |x-y| > vareps * max(|x|,|y|)

esys.escript.minimizer.muchSmaller(x, y, vareps=0.0)

returns True if x < y and |x-y| > vareps * max(|x|,|y|)
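A sketch of the documented semantics, as an illustration rather than the library source:

    def muchLarger(x, y, vareps=0.0):
        # True only if x exceeds y by more than vareps relative to the magnitudes
        return x > y and abs(x - y) > vareps * max(abs(x), abs(y))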

Others

  • EPSILON
