esys.escript.minimizer Package¶
Generic minimization algorithms
Classes¶
- class esys.escript.minimizer.AbstractMinimizer(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Base class for function minimization methods.
- __init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Initializes a new minimizer for a given cost function.
- Parameters:
F (CostFunction) – the cost function to be minimized
m_tol (float) – relative tolerance for solution m for termination of iteration
grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration
iterMax (int) – maximum number of iterations
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- Default iterMax:
300
- doCallback(**args)¶
The callback function is called with the following arguments:
- Key iterCount:
iteration count
- Key m:
current solution
- Key dm:
last solution increment
- Key norm_m:
norm of current solution m
- Key Fm:
value of cost function for m
- Key gradFm:
gradient for m
- Parameters:
args_m (tuple) – arguments for m
failed (bool) – set if the step was unsuccessful.
- getCostFunction()¶
return the cost function to be minimized
- Return type:
CostFunction
- getLineSearch()¶
returns the LineSearch object being used.
- getOptions()¶
returns a dictionary of LBFGS options.
- Return type:
dictionary
- getResult()¶
Returns the result of the minimization.
- Return type:
m-type
- logSummary()¶
Outputs a summary of the completed minimization process to the logger.
- run(m0)¶
Executes the minimization algorithm for F starting with the initial guess m0.
- Returns:
the result of the minimization
- setCallback(callback)¶
Sets a callback function to be called after every iteration.
def callback(iterCount, m, norm_m, dm, Fm, gradFm, norm_grad_Fm, failed, args)
with iteration count iterCount, current approximation m, last update dm, cost function value Fm and gradient gradFm; see method AbstractMinimizer.doCallback.
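A minimal sketch of a matching callback (the print format is illustrative only; solver is assumed to be an AbstractMinimizer instance)::

    def callback(iterCount, m, norm_m, dm, Fm, gradFm, norm_grad_Fm, failed, args):
        # report progress of the minimizer after each iteration
        status = "failed" if failed else "ok"
        print(f"iteration {iterCount}: F(m)={Fm:.6e}, norm(m)={norm_m:.4e} [{status}]")

    solver.setCallback(callback)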
- setCostFunction(F)¶
set the cost function to be minimized
- Parameters:
F (CostFunction) – the cost function to be minimized
- setIterMax(iterMax=300)¶
Sets the maximum number of iterations before the minimizer terminates.
- Parameters:
iterMax (int) – maximum number of iterations
- Default iterMax:
300
- setOptions(**opts)¶
Sets options for LBFGS. Use solver.setOptions(key=value).
- Key m_tol:
relative tolerance for solution
m for termination of iteration
- Default m_tol:
1e-4
- Key grad_tol:
tolerance for gradient relative to initial cost function value for termination of iteration
- Default grad_tol:
1e-4
- Key truncation:
sets the number of previous LBFGS iterations to keep
- Type truncation:
int
- Default truncation:
30
- Key restart:
restart after this many iteration steps
- Type restart:
int
- Default restart:
60
- Key iterMax:
maximum number of iterations
- Type iterMax:
int
- Default iterMax:
300
- Key relAlphaMin:
minimal step size relative to search direction
- Default relAlphaMin:
1e-8
- Key initialAlpha:
initial step size alpha in line search. Typically alpha=1 is a good initial value, but a larger or smaller initial value may help to get the iteration started when only an approximation of the Hessian is available.
- Default initialAlpha:
- Key scaleSearchDirection:
if set the search direction is rescaled using an estimation of the norm of the Hessian
- Default scaleSearchDirection:
True
- Example of usage::

    cf = DerivedCostFunction()
    solver = MinimizerLBFGS(F=cf, m_tol=1e-5, grad_tol=1e-5, iterMax=300)
    solver.setOptions(truncation=20)
    solver.getLineSearch().setOptions(zoom_relChangeMin=1e-7)
    solver.run(initial_m)
    result = solver.getResult()
- setTolerance(m_tol=0.0001, grad_tol=0.0001)¶
Sets the tolerance for the stopping criterion. The minimizer stops when an appropriate norm is less than
m_tol.
- Parameters:
m_tol (float) – relative tolerance for solution m for termination of iteration
grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- class esys.escript.minimizer.CostFunction¶
A cost function F(m) that can be minimized (base class). Near a solution m in the search space the cost function is approximated in direction p as
F(m+p) ~ F(m) + <grad F(m), p> + 1/2 <H p, p>
where F(m) is the value of the function at m, grad F is the gradient, H is the Hessian and <.,.> is the dual product. These four procedures are defined in this class.
The class distinguishes between the representation of the solution m (m-type) and the cost function gradient (g-type).
- Example of usage:
    # class DerivedCostFunction(CostFunction)
    # overwrite: getArguments, etc.
    cf = DerivedCostFunction()
    # ... calculate m ...
    args = cf.getArguments(m)   # this could be potentially expensive!
    f = cf.getValue(m, *args)
    # ... it could be required to update m without using the gradient ...
    # ... but then ...
    gf = cf.getGradient(m, *args)
Use the “AndCount” versions (e.g. getValueAndCount) if you want to count calls.
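As an illustration, a minimal sketch of a derived cost function, assuming plain numpy arrays serve as both m-type and g-type; the class QuadraticCostFunction is hypothetical and not part of the module::

    import numpy as np
    from esys.escript.minimizer import CostFunction

    class QuadraticCostFunction(CostFunction):
        # F(m) = 1/2 <A m, m> - <b, m> with numpy arrays as m-type and g-type
        def __init__(self, A, b):
            super().__init__()
            self.A, self.b = A, b

        def getArguments(self, m):
            # A m is shared by getValue and getGradient
            return (self.A.dot(m),)

        def getValue(self, m, *args):
            Am = args[0]
            return 0.5 * m.dot(Am) - self.b.dot(m)

        def getGradient(self, m, *args):
            Am = args[0]
            return Am - self.b

        def getDualProduct(self, m, r):
            return float(np.dot(m, r))

        def getNorm(self, m):
            # the 'Lsup' norm suggested by getNorm below
            return float(np.max(np.abs(m)))

        def getInverseHessianApproximation(self, r, m, *args, initializeHessian=True):
            # for this quadratic the Hessian is A, so solve A p = r exactly
            return np.linalg.solve(self.A, r)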
- __init__()¶
Constructor. Initializes logger.
- getArguments(m)¶
returns precalculated values that are shared in the calculation of F(m) and grad F(m) and the Hessian operator.
- Note:
Overwrite this method to implement a cost function.
- Note:
The tuple returned by this call will be passed back to this CostFunction in other calls (eg: getGradientAndCount). Its contents are not specified at this level because no code, other than the CostFunction which created it, will be interacting with it. That is, the implementor can put whatever information they find useful in it.
- Parameters:
m (m-type) – location of derivative
- Returns:
pre-calculated arguments
- Return type:
tuple
- getArgumentsAndCount(m)¶
returns precalculated values that are shared in the calculation of F(m) and grad F(m) and the Hessian operator. The default implementation returns an empty tuple.
Calling this method updates the call statistics.
- Parameters:
m (m-type) – location of derivative
- Return type:
tuple
- Note:
The tuple returned by this call will be passed back to this CostFunction in other calls (eg: getGradient). Its contents are not specified at this level because no code, other than the CostFunction which created it, will be interacting with it. That is, the implementor can put whatever information they find useful in it.
- getDualProduct(m, r)¶
returns the dual product of m and r
- Return type:
float
- Note:
Overwrite this method to implement a cost function.
- getDualProductAndCount(m, r)¶
returns the dual product <.,.> of r and m.
Calling this method updates the call statistics.
- Return type:
float
- getGradient(m, *args)¶
returns the gradient of F at m using the precalculated values for m.
- Parameters:
m (m-type) – location of derivative
args – pre-calculated values for m from getArgumentsAndCount()
- Return type:
g-type
- Note:
Overwrite this method to implement a cost function.
- getGradientAndCount(m, *args)¶
returns the gradient of F at m using the precalculated values for m.
Calling this method updates the call statistics.
- Parameters:
m (m-type) – location of derivative
args – pre-calculated values for m from getArgumentsAndCount()
- Return type:
g-type
- getInverseHessianApproximation(r, m, *args, initializeHessian=True)¶
returns an approximate evaluation p of the inverse of the Hessian operator of the cost function for a given gradient r: H p = r
- Note:
by default this method returns r. In this case it is assumed that m-type == g-type.
- Note:
Overwrite this method to implement a cost function.
- Parameters:
r (g-type) – a given gradient
- Param initializeHessian:
indicates if the Hessian operator should be initialized using m. If this method provides an approximation only and building the new Hessian approximation is expensive, it is typically more efficient to update the Hessian operator only occasionally, when initializeHessian is True on input. If the Hessian should be updated in each step, ignore the value of initializeHessian.
- Returns:
new search direction p.
- Return type:
m-type
- getInverseHessianApproximationAndCount(r, m, *args, initializeHessian=True)¶
returns an evaluation p of the inverse of the Hessian operator of the cost function for a given gradient r: H p = r
Calling this method updates the call statistics.
- Parameters:
r (g-type) – a given gradient
args – pre-calculated values for m from getArgumentsAndCount()
- Param initializeHessian:
indicates if the Hessian operator should be initialized using m. If updating the Hessian is expensive, it should only be done when initializeHessian is True. If this method provides an approximation only and building the new Hessian approximation is expensive, it is typically more efficient to update the Hessian operator only occasionally.
- Returns:
new search direction p.
- Return type:
m-type
- getNorm(m)¶
returns the norm of m. Typically, this is the ‘Lsup’ function.
- Return type:
float
- Note:
Overwrite this method to implement a cost function.
- getNormAndCount(m)¶
returns the norm of m.
Calling this method updates the call statistics.
- Return type:
float
- getSqueezeFactor(m, p, *args)¶
The new solution is calculated as m + a*p with a > 0. This function allows one to provide an upper bound for a to make sure that m + a*p is valid, typically to avoid overflow when the cost function is evaluated. The solver will take action to make sure that the value of a is not too small.
- Parameters:
m (m-type) – a solution approximation
p – an increment to the solution
args – pre-calculated values for m from getArgumentsAndCount()
- Return type:
positive float or None
- Note:
Overwrite this method to implement a cost function.
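A hedged sketch of an override, assuming a numpy m-type; the bound is purely illustrative::

    import numpy as np

    # inside a CostFunction subclass (a sketch)
    def getSqueezeFactor(self, m, p, *args):
        # cap a so that no component of the increment a*p exceeds BOUND,
        # e.g. to keep exp() in the cost function finite
        BOUND = 1e2
        pmax = np.max(np.abs(p))
        if pmax > 0.:
            return BOUND / pmax
        return None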
- getStatistics()¶
returns the call statistics as a string.
- getValue(m, *args)¶
returns the value F(m) using the precalculated values for m.
- Parameters:
m (m-type) – a solution approximation
args – pre-calculated values for m from getArgumentsAndCount()
- Return type:
float or valid argument for sum
- Note:
Overwrite this method to implement a cost function.
- getValueAndCount(m, *args)¶
returns the value F(m) using the precalculated values for m.
Calling this method updates the call statistics.
- Parameters:
m (m-type) – a solution approximation
args – pre-calculated values for m from getArgumentsAndCount()
- Return type:
float or valid argument for sum
- resetStatistics()¶
resets all counters
- class esys.escript.minimizer.CostFunction1DEvaluationFactory(m, p, costfunction=None)¶
This class manages the interface of a CostFunction implementing F to the 1D projection Phi(a) = F(m + a*p). The idea is to use an EvalutedPhi object to hold the values of F, grad F and args for a given value.
- __init__(m, p, costfunction=None)¶
sets up the factory for offset m and direction p, using the cost function costfunction.
- Parameters:
m (m-type) – offset
p (m-type) – direction
costfunction (CostFunction) – a cost function object
- getArgs(a)¶
returns the arguments for m + a*p
- Parameters:
a (float) – scaling factor of search direction
- Returns:
arguments appropriate for CostFunction
- getEvaluation(a=1)¶
returns a container holding Phi(a) = F(m + a*p), grad F(m + a*p) and the respective arguments
- Parameters:
a (float) – scaling factor of search direction
- Returns:
instance of EvalutedPhi
- getGrad(a, gradF=None, args=None)¶
returns the value Phi’(a) = <grad F(m + a*p), p>. If args is None, args is calculated. If the gradient gradF is None, the gradient is calculated (using args).
- Parameters:
a (float) – scaling factor of search direction
args (arguments appropriate for CostFunction or None) – arguments appropriate for CostFunction
gradF (appropriate for CostFunction or None) – gradient of CostFunction for m + alpha*p
- Returns:
derivative of Phi at value a
- getValue(a, args=None)¶
returns the value F(m + a*p). If args is None, args is calculated.
- Parameters:
a (float) – scaling factor of search direction
args (arguments appropriate for CostFunction or None) – arguments appropriate for CostFunction
- Returns:
value of F and corresponding args.
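A brief usage sketch; cf, m and p are assumed to exist (a CostFunction, an offset and a direction)::

    factory = CostFunction1DEvaluationFactory(m, p, costfunction=cf)
    phi1 = factory.getEvaluation(1.)      # holder for Phi(1) = F(m + p)
    print(phi1.getVal(), phi1.getDiff())  # Phi(1) and Phi'(1), computed on demand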
- class esys.escript.minimizer.EvalutedPhi(factory, alpha=1.0, args=None, valF=None, gradF=None)¶
this class provides access to the value Phi = F(m + a*p) for a CostFunction. The interface is handled by a factory class that holds m, p and the cost function; methods of the latter are called via the factory when needed. The main purpose of this class is to provide a simple mechanism to manage the values of Phi and its derivative Phi’ (= gradPhi) while avoiding a recalculation of the arguments and full gradient once Phi has been optimized via line search and zoom.
- __init__(factory, alpha=1.0, args=None, valF=None, gradF=None)¶
- Parameters:
factory (CostFunction1DEvaluationFactory or similar) – factory class
alpha (float) – length factor
args (appropriate for CostFunction) – argument for m+alpha*p if known. Is held at self.args
valF (float) – value of Phi(alpha)=F(m+alpha*p) if known. Is held at self.valF
gradF (appropriate for CostFunction) – gradient of CostFunction for m+alpha*p if known. Is held at self.gradF
- getDiff()¶
return the value of Phi’(alpha) = <p, grad F(m+alpha*p)>.
- getVal()¶
return the value of Phi(alpha) = F(m+alpha*p). Unlike reading self.valF directly, the value is calculated if self.valF is None.
- class esys.escript.minimizer.LineSearch(logger=None)¶
Line search method to minimize F(m+alpha*p). The iteration is terminated when the strong Wolfe conditions are fulfilled.
See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation.
- Note: the line search returns a new search direction scale alpha that satisfies the strong Wolfe conditions:
(W1) sufficient decrease condition: Phi(alpha) <= Phi(0) + c1 * alpha * Phi’(0)
(W2) curvature condition: abs(Phi’(alpha)) <= c2 * abs(Phi’(0))
- The first step is to find an alpha_lo and alpha_hi such that the following conditions hold:
(Z1) [alpha_lo, alpha_hi] contains an alpha fulfilling (W1)+(W2)
(Z2) alpha_lo gives the smallest value of Phi amongst the alphas generated so far that satisfy the sufficient decrease condition (W1)
(Z3) alpha_hi is chosen so that Phi′(alpha_lo)(alpha_hi − alpha_lo) < 0, that is, alpha_hi > alpha_lo if Phi′(alpha_lo) < 0 and alpha_hi < alpha_lo if Phi′(alpha_lo) > 0
- Condition (Z1) is fulfilled if one of the following conditions is satisfied:
(E1) alpha_hi violates the sufficient decrease condition (W1)
(E2) Phi(alpha_hi) >= Phi(alpha_lo)
(E3) Phi′(alpha_lo)(alpha_hi − alpha_lo) < 0.
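To make (W1) and (W2) concrete, a standalone helper (not part of the module) that tests both conditions for given values Phi(0), Phi'(0), Phi(alpha) and Phi'(alpha); the defaults c1=1e-4 and c2=0.9 match the setOptions defaults below::

    def satisfies_strong_wolfe(phi0, dphi0, phi_a, dphi_a, alpha, c1=1e-4, c2=0.9):
        # (W1) sufficient decrease
        w1 = phi_a <= phi0 + c1 * alpha * dphi0
        # (W2) curvature
        w2 = abs(dphi_a) <= c2 * abs(dphi0)
        return w1 and w2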
- __init__(logger=None)¶
initialize the line search solver
- Parameters:
logger (logging.Logger) – logger to be used. If not set, ‘esys.minimizer’ is used.
- getOptions()¶
returns a dictionary of line search options.
- Return type:
dictionary
- run(phi_0, alpha=1.0)¶
Line search method to minimize F(m+alpha*p). The iteration is terminated when the strong Wolfe conditions are fulfilled. See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation.
- Parameters:
phi_0 – EvalutedPhi for alpha=0
alpha – initial step length. If grad_Fm is properly scaled, alpha=1 is a reasonable starting value.
- Returns:
alpha
- setOptions(**opts)¶
set options for the line search.
- Key alphaMin:
minimum search length
- Default alphaMin:
1e-20
- Key alphaMax:
maximum search length
- Default alphaMax:
5000
- Key overStepFactor:
factor to increase step size in line search
- Type overStepFactor:
float
- Default overStepFactor:
2.
- Key iterMax:
maximum number of line search iterations
- Type iterMax:
int
- Default iterMax:
25
- Key c1:
sufficient decrease condition factor c1
- Type c1:
float
- Default c1:
1e-4
- Key c2:
curvature condition factor c2
- Type c2:
float
- Default c2:
0.9
- Key inter_order:
order of the interpolation used for line search
- Type inter_order:
1, 2 or 3
- Default inter_order:
3
- Key inter_iterMax:
maximum number of iteration steps when minimizing the interpolated cost function
- Type inter_iterMax:
int
- Default inter_iterMax:
100
- Key inter_tol:
tolerance when minimizing the interpolated cost function
- Type inter_tol:
float
- Default inter_tol:
1.
- Key zoom_iterMax:
maximum number of zoom iterations
- Type zoom_iterMax:
int
- Default zoom_iterMax:
20
- Key phiEpsilon:
tolerance for the greater-than check of cost function values
- Type phiEpsilon:
float
- Default phiEpsilon:
np.sqrt(EPSILON)
- Key alphaOffset:
minimal relative distance of new alpha from boundaries
- Type alphaOffset:
float
- Default alphaOffset:
1e-4
- Key alphaWidthMin:
minimal relative distance of new alpha from boundaries
- Type alphaWidthMin:
float
- Default alphaWidthMin:
np.sqrt(EPSILON)
- Key zoom_reductionMin:
minimal reduction of search interval length between zoom steps
- Type zoom_reductionMin:
float
- Default zoom_reductionMin:
0.66
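For example, line search options are typically adjusted through the owning minimizer (a sketch; solver is assumed to be an AbstractMinimizer instance, and the option names are taken from the list above)::

    ls = solver.getLineSearch()
    ls.setOptions(c1=1e-4, c2=0.9, iterMax=25, inter_order=3)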
- zoom(phi_lo, phi_hi, phi_0)¶
Helper function for line_search which tries to tighten the range phi_lo.alpha…phi_hi.alpha such that phi_lo, phi_hi fulfill one of these conditions (which needs to be fulfilled at entry):
- (Z1) [alpha_lo, alpha_hi] contains an alpha fulfilling (W1)+(W2). This condition holds if
(E1) alpha_hi violates the sufficient decrease condition (W1)
(E2) Phi(alpha_hi) >= Phi(alpha_lo)
(E3) Phi′(alpha_lo)(alpha_hi − alpha_lo) < 0.
- (Z2) alpha_lo gives the smallest value of Phi amongst the alphas generated so far that satisfy the sufficient decrease condition (W1)
- (Z3) alpha_hi is chosen so that Phi′(alpha_lo)(alpha_hi − alpha_lo) < 0
See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation. Interpolation options are linear, quadratic, cubic or Newton interpolation.
- class esys.escript.minimizer.LineSearchAlphaMaxError¶
Exception thrown by the line search if the maximum alpha is reached.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchAlphaMinError¶
Exception thrown by the line search if the minimum alpha is reached.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchInterpolationBreakDownError¶
Exception thrown if the interpolation breaks down.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchIterMaxReachedError¶
Exception thrown if the line search reaches the maximum iteration count.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchSearchDirectionError¶
Exception thrown if the search direction is not a descent direction.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchTerminationError¶
Exception thrown if the line search was unsuccessful
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.MinimizerException¶
This is a generic exception thrown by a minimizer.
- __init__(*args, **kwargs)¶
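A hedged sketch of catching minimizer exceptions; solver and m0 are assumed from the examples above, the exception names are from this page, and whether getResult() holds a usable iterate after a failed run is an assumption to verify::

    from esys.escript.minimizer import MinimizerException, MinimizerMaxIterReached

    try:
        result = solver.run(m0)
    except MinimizerMaxIterReached:
        # fall back to the last iterate kept by the solver (assumption)
        result = solver.getResult()
    except MinimizerException as e:
        raise RuntimeError("minimization failed") from e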
- class esys.escript.minimizer.MinimizerIterationIncurableBreakDown¶
Exception thrown if the iteration scheme encountered an incurable breakdown.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.MinimizerLBFGS(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Minimizer that uses the limited-memory Broyden-Fletcher-Goldfarb-Shanno method. See Chapter 6 of ‘Numerical Optimization’ by J. Nocedal for an explanation.
- __init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Initializes a new minimizer for a given cost function.
- Parameters:
F (CostFunction) – the cost function to be minimized
m_tol (float) – relative tolerance for solution m for termination of iteration
grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration
iterMax (int) – maximum number of iterations
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- Default iterMax:
300
- run(m)¶
- Parameters:
m (m-type) – initial guess
- Returns:
solution
- Return type:
m-type
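A usage sketch continuing the hypothetical QuadraticCostFunction from the CostFunction section above::

    import numpy as np
    from esys.escript.minimizer import MinimizerLBFGS

    A = np.array([[3., 1.], [1., 2.]])
    b = np.array([1., 1.])
    cf = QuadraticCostFunction(A, b)

    solver = MinimizerLBFGS(F=cf, m_tol=1e-6, iterMax=100)
    m = solver.run(np.zeros(2))
    print(m, np.linalg.solve(A, b))  # should agree at convergence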
- class esys.escript.minimizer.MinimizerMaxIterReached¶
Exception thrown if the maximum number of iteration steps is reached.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.MinimizerNLCG(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Minimizer that uses the nonlinear conjugate gradient method (Fletcher-Reeves variant).
- __init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Initializes a new minimizer for a given cost function.
- Parameters:
F (CostFunction) – the cost function to be minimized
m_tol (float) – relative tolerance for solution m for termination of iteration
grad_tol (float) – tolerance for gradient relative to initial cost function value for termination of iteration
iterMax (int) – maximum number of iterations
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- Default iterMax:
300
- run(x)¶
- The callback function is called with the following arguments:
k – iteration number
x – current estimate
Fm – value of cost function at x
grad_Fm – gradient of cost function at x
gnorm – norm of grad_Fm (stopping criterion)
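A usage sketch, with cf and x0 assumed as in the earlier examples; note that the callback argument list quoted above differs from AbstractMinimizer.setCallback, so check the installed source before relying on a callback::

    from esys.escript.minimizer import MinimizerNLCG

    solver = MinimizerNLCG(F=cf, m_tol=1e-5, grad_tol=1e-8, iterMax=300)
    x = solver.run(x0)           # x0: initial guess of m-type
    print(solver.getResult())    # the solution kept by the minimizer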
Functions¶
- esys.escript.minimizer.muchLarger(x, y, vareps=0.0)¶
Others¶
EPSILON