esys.escript.minimizer Package¶
Generic minimization algorithms
Classes¶
- class esys.escript.minimizer.AbstractMinimizer(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Base class for function minimization methods.
- __init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Initializes a new minimizer for a given cost function.
- Parameters:
F (
CostFunction) – the cost function to be minimizedm_tol (
float) – relative tolerance for solutionmfor termination of iterationgrad_tol (
float) – tolerance for gradient relative to initial costfunction value for termination of iterationiterMax (
int) – maximium number of iterations.
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- Default iterMax:
300
- doCallback(**args)¶
The callback function is called with the following arguments:
- Key iterCount:
iteration count
- Key m:
current solution
- Key dm:
last solution incerement
- Norm_m:
norm of current solution
m- Key Fm:
value of costs function for
m- Key gradFm:
gradient for
m- Parameters:
args_m (
tuple) – arguments formfailed (
bool) – set if the step was unsuccessful.
- getCostFunction()¶
return the cost function to be minimized
- Return type:
- getLineSearch()¶
returns the
LineSearchobject being used.
- getOptions()¶
returns a dictionary of LBFGS options rtype: dictionary
- getResult()¶
Returns the result of the minimization. :rtype: m-type
- logSummary()¶
Outputs a summary of the completed minimization process to the logger.
- run(m0)¶
Executes the minimization algorithm for f starting with the initial guess
m0.- Returns:
the result of the minimization
- setCallback(callback)¶
Sets a callback function to be called after every iteration.
def callback(iterCount, m, norm_m, dm, Fm, gradFm, norm_grad_Fm, failed, args)
with iteration count
iterCount, current approximationm, last updatedm, costfunction valueFm, gradientgradFm, see methodAbstractMinimizer.doCallback
- setCostFunction(F)¶
set the cost function to be minimized
- Parameters:
F (
CostFunction) – the cost function to be minimized
- setIterMax(iterMax=300)¶
Sets the maximum number of iterations before the minimizer terminates.
- Parameters:
iterMax (
int) – maximium number of iterations.- Default iterMax:
300
- setOptions(**opts)¶
setOptions for LBFGS. use solver.setOptions( key = value)
- Key m_tol:
relative tolerance for solution
mfor termination of iteration- Default m_tol:
1e-4
- Key grad_tol:
tolerance for gradient relative to initial cost function value for termination of iteration
- Default grad_tol:
1e-4
- Key truncation:
sets the number of previous LBFGS iterations to keep
:type truncation :
int:default truncation: 30 :key restart: restart after this many iteration steps. :type restart:int:default restart: 60 :key iterMax: maximium number of iterations. :type iterMax:int:default iterMax: 300 :key relAlphaMin: minimal step size relative to search direction. The value should be chosen such that at any iteration stepF(m + alpha * p)is just discriminable fromF(m)for anyalpha > relAlphaMin * norm(m)/norm(p). :type relAlphaMin:float:default relAlphaMin: 1e-8 :key initialAlpha: initial step size alpha in line search. Typically alpha=1 is a good initial value but a larger or smaller initial value may help to get the iteration started when only an approximation of the Hessian is available. :type initialAlpha:float:default initialAlpha: 1. :key scaleSearchDirection: if set the search direction is rescaled using an estimation of the norm of the Hessian :type scaleSearchDirection:bool:default scaleSearchDirection: TrueExample of usage:
cf = DerivedCostFunction() solver = MinimizerLBFGS(J=cf, m_tol=1e-5, grad_tol=1e-5, iterMax=300) solver.setOptions(truncation=20) solver.getLineSearch().setOptions(zoom_relChangeMin=1e-7) solver.run(initial_m) result = solver.getResult()
- setTolerance(m_tol=0.0001, grad_tol=0.0001)¶
Sets the tolerance for the stopping criterion. The minimizer stops when an appropriate norm is less than
m_tol.- Parameters:
m_tol (
float) – relative tolerance for solutionmfor termination of iterationgrad_tol (
float) – tolerance for gradient relative to initial costfunction value for termination of iteration
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- class esys.escript.minimizer.CostFunction¶
A cost function F(m) that can be minimized (base class). Near a solution m in the search space the cost function is approximated in direction p as
F(m+p) ~ F(m) + <grad F(m), p> + < H p, p>
where F(m) is the value of the function at m, grad F is the gradient, H is the Hessian and <.,.> is dual product. These four procedures are defined in this class.
The class distinguishes between the representation of the solution m (m-type) and the cost function gradient (g-type).
Example of usage:
class DerivedCostFunction(CostFunction): # overwrite: getArguments, etc. pass cf = DerivedCostFunction() # ... calculate m ... args = cf.getArguments(m) # this could be potentially expensive! f = cf.getValue(m, *args) # ... it could be required to update m without using the gradient... # ... but then ... gf = cf.getGradient(m, *args)
Use the “AndCount” versions (e.g. getValueAndCount) if you want to count calls.
- __init__()¶
Constructor. Initializes logger.
- getArguments(m)¶
returns precalculated values that are shared in the calculation of F(m) and grad F(m) and the Hessian operator.
- Note:
Overwrite this method to implement a cost function.
- Note:
The tuple returned by this call will be passed back to this
CostFunctionin other calls(eg:getGradientAndCount). Its contents are not specified at this level because no code, other than theCostFunctionwhich created it, will be interacting with it. That is, the implementor can put whatever information they find useful in it.- Parameters:
m (m-type) – location of derivative
- Returns:
pre-calculated arguments
- Return type:
tuple- Note:
Overwrite this method to implement a cost function.
- getArgumentsAndCount(m)¶
returns precalculated values that are shared in the calculation of F(m) and grad F(m) and the Hessian operator. The default implementation returns an empty tuple.
When calling this method the calling statistics is updated.
- Parameters:
m (m-type) – location of derivative
- Return type:
tuple- Note:
The tuple returned by this call will be passed back to this
CostFunctionin other calls (eg:getGradient). Its contents are not specified at this level because no code, other than theCostFunctionwhich created it, will be interacting with it. That is, the implementor can put whatever information they find useful in it.
- getDualProduct(m, r)¶
returns the dual product of
mandr- Return type:
float- Note:
Overwrite this method to implement a cost function.
- getDualProductAndCount(m, r)¶
returns the dual product <.,.> of
randm.When calling this method the calling statistics is updated.
- Return type:
float
- getGradient(m, *args)¶
returns the gradient of F at m using the precalculated values for m.
- Parameters:
m (m-type) – location of derivative
args – pre-calculated values for
mfromgetArgumentsAndCount()
- Return type:
g-type
- Note:
Overwrite this method to implement a cost function.
- getGradientAndCount(m, *args)¶
returns the gradient of F at m using the precalculated values for m.
When calling this method the calling statistics is updated.
- Parameters:
m (m-type) – location of derivative
args – pre-calculated values for
mfromgetArgumentsAndCount()
- Return type:
g-type
- getInverseHessianApproximation(r, m, *args, initializeHessian=True)¶
returns an approximate evaluation p of the inverse of the Hessian operator of the cost function for a given gradient r: H p = r :note: by default this method is returning r. In this case it is assumed that m-type==g-type :note: Overwrite this method to implement a cost function.
- Parameters:
r (g-type) – a given gradient
initializeHessian (bool) – indicates if the Hessian operator should be initialized using
m. If this method provides an approximation only and building the new Hessian approximation is expensive it is typically more efficient to update the Hessian operator occasionally only when on input initializeHessian isTrue. If the Hessian should be updated in each step ignore the value ofinitializeHessian.
- Returns:
new search direction p.
- Return type:
m-type
- getInverseHessianApproximationAndCount(r, m, *args, initializeHessian=True)¶
returns an evaluation p of the inverse of the Hessian operator of the cost function for a given gradient r: H p = r
When calling this method the calling statistics is updated.
- Parameters:
r (g-type) – a given gradient
initializeHessian (bool) – indicates if the Hessian operator should be initialized using
m. If updating the Hessian is expensive it should only be done when initializeHessian is True. If this method provides an approximation only and building the new Hessian approximation is expensive it is typically more efficient to update the Hessian operator occasionally.args – pre-calculated values for
mfromgetArgumentsAndCount()
- Returns:
new search direction p.
- Return type:
m-type
- getNorm(m)¶
returns the norm of
m. Typically, this is the ‘Lsup’ function.- Return type:
float- Note:
Overwrite this method to implement a cost function.
- getNormAndCount(m)¶
returns the norm of
m.When calling this method the calling statistics is updated.
- Return type:
float
- getSqueezeFactor(m, p, *args)¶
The new solution is calculated as m+a*p with a>0. This function allows to provide an upper bound for a to make sure that m+a*p is valid typically to avoid overflow when the cost function is evaluated. the solver will take action to make sure that the value of a is not too small.
- Parameters:
m (m-type) – a solution approximation
p – an increment to the solution
args – pre-calculated values for
mfromgetArgumentsAndCount()
- Return type:
positive
floator None- Note:
Overwrite this method to implement a cost function.
- getStatistics()¶
Returns the call statistics as a string.
- Returns:
formatted string with counts of all function calls
- Return type:
str
- getValue(m, *args)¶
returns the value F(m) using the precalculated values for m.
- Parameters:
m (m-type) – a solution approximation
args – pre-calculated values for
mfromgetArgumentsAndCount()
- Return type:
floator valid argument forsum- Note:
Overwrite this method to implement a cost function.
- getValueAndCount(m, *args)¶
returns the value F(m) using the precalculated values for m.
When calling this method the calling statistics is updated.
- Parameters:
m (m-type) – a solution approximation
args – pre-calculated values for
mfromgetArgumentsAndCount()
- Return type:
floator valid argument forsum
- resetStatistics()¶
resets all counters
- class esys.escript.minimizer.CostFunction1DEvaluationFactory(m, p, costfunction=None)¶
This class manages the interface of
Costfunctionimplementing F to the 1D projection Phi(a)=F(m+a*p). The idea is to use aEvaluatePhiobject to hold the values of F, gradF and args for a given value.- __init__(m, p, costfunction=None)¶
sets up the factory for offset
mand directionpto use costfunctionCostFunction. :param m: offset :type m: m-type :param p: direction :type p: m-type :param costfunction: a cost function object :type costfunction:CostFunction
- getArgs(a)¶
returns the arguments for m+a*p :param a: scaling factor of search direction :type a:
float:returns args: arguments appropriate forCostfunction
- getEvaluation(a=1)¶
Creates a container for Phi(a)=F(m+a*p), gradF(m+a*p) and the respective arguments.
- Parameters:
a (
float) – scaling factor of search direction- Returns:
container holding evaluation state
- Return type:
- getGrad(a, gradF=None, args=None)¶
returns value Phi’(a)=<grad F(m+a*p), p>. If the arguments args==None, args is calculated. If the gradient gradF==None, the gradient is calculated (using args). :param a: scaling factor of search direction :type a:
float:param args: arguments appropriate forCostfunction:type args: arguments appropriate forCostfunctionorNone:param gradF: gradient ofCostFunctionform+alpha*p:type gradF: appropriate forCostfunctionor None :returns: derivate of Phi at value a
- getValue(a, args=None)¶
returns value F(m+a*p). If the arguments args==None, args is calculated :param a: scaling factor of search direction :type a:
float:param args: arguments appropriate forCostfunction:type args: arguments appropriate forCostfunctionorNone:returns: value of F and corresponding args.
- class esys.escript.minimizer.EvalutedPhi(factory, alpha=1.0, args=None, valF=None, gradF=None)¶
this class provides an access to value Phi=F(m+a*p) for a
CostFunctionThe interface is handeled by a factory class that holds m, p and the costfunction where methods of the latter are called via the factory when needed. Main purpose of this class is to maintain a simple mechanism to manage the values of Phi and its derivative Phi’ (=gradPhi) but avoiding a recalculation of the arguments and full gradient once the Phi has been optimized via line serach and zoom.- __init__(factory, alpha=1.0, args=None, valF=None, gradF=None)¶
- Parameters:
factory (
CostFunction1DEvaluationFactoryor similar) – factory classalpha (
float) – length factorargs (appropriate for
Costfunction) – argument form+alpha*pif known. Is held at self.argsvalF (
float) – value of Phi(alpha)=F(m+alpha*p) if known. Is held at self.valFgradF (appropriate for
Costfunction) – gradient ofCostFunctionform+alpha*pif known. is held at self.gradF
- getDiff()¶
Returns the value for Phi’(alpha)=<p, grad(F(m+alpha*p))>.
- Returns:
the derivative of Phi at alpha
- Return type:
float
- getVal()¶
Returns the value for Phi(alpha)=F(m+alpha*p).
In contrast to using self.valF directly, the value is calculated if self.valF is None.
- Returns:
the function value at alpha
- Return type:
float
- class esys.escript.minimizer.LineSearch(logger=None)¶
Line search method to minimize F(m+alpha*p).
The iteration is terminated when the strong Wolfe conditions is fulfilled. See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation.
The line search returns a new search direction scale alpha that satisfies the strong Wolfe condition:
(W1) sufficient decrease condition:
Phi(alpha) <= Phi(0) + c1 * alpha * phi'(0)(W2) curvature condition:
abs(Phi'(alpha)) <= c2 * abs(Phi'(0))
The first step is to find an alpha_lo and alpha_hi such that the following conditions hold:
(Z1) [alpha_lo, alpha_hi] contains an alpha fulfilling (W1)+(W2)
(Z2) alpha_lo is giving the smallest value for Phi amongst alphas generated so far and that are satisfying the sufficient decrease condition (W1)
(Z3) alpha_hi is chosen so that
Phi'(alpha_lo)*(alpha_hi - alpha_lo) < 0, that means alpha_hi > alpha_lo if Phi’(alpha_lo) < 0 and alpha_hi < alpha_lo if Phi’(alpha_lo) > 0
Condition (Z1) is fulfilled if one of the following conditions is satisfied:
(E1) alpha_hi violates the sufficient decrease condition (W1)
(E2) Phi(alpha_hi) >= Phi(alpha_lo)
(E3) Phi’(alpha_lo)*(alpha_hi - alpha_lo) < 0
- __init__(logger=None)¶
initialize the line search solver
- Parameters:
logger (
logging.Logger) – logger to be used. If not set ‘esys.minimizer’ is used.
- getOptions()¶
returns a dictionary of LBFGS options rtype: dictionary
- run(phi_0, alpha=1.0)¶
Line search method to minimize F(m+alpha*p) The iteration is terminated when the strong Wolfe conditions is fulfilled. See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation.
- Parameters:
phi_0 –
EvaluatePhifor alpha=0alpha – initial step length. If grad_Fm is properly scaled alpha=1 is a reasonable starting value.
- Returns:
alpha
- setOptions(**opts)¶
Set options for the line search.
- Key alphaMin:
minimum search length
- Default alphaMin:
1e-20
- Key alphaMax:
maximum search length
- Default alphaMax:
5000
:key overStepFactor : factor to increase step size in line search :type overStepFactor: float :default overStepFactor: 2. :key iterMax: maximum number of line search iterations :type iterMax: int :default iterMax: 25 :key c1: sufficient decrease condition factor c1 :type c1: float :default c1: 1e-4 :key c2: curvature condition factor c2 :type c2: float :default c2: 0.9 :key inter_order: order of the interpolation used for line search :type inter_order: 1,2,3 :default inter_order: 3 :key inter_iterMax: maximum number of iteration steps to when minimizing interploted cost function :type inter_iterMax: int :default inter_iterMax: 100 :key inter_tol: tolerance to when minimizing interploted cost function :type inter_tol: float :default inter_tol: 1. :key zoom_iterMax: maximum number of zoom iterations :type zoom_iterMax: int :default zoom_iterMax: 20
:key phiEpsilon : tolerance for
greater thancheck of cost function values :type phiEpsilon: float :default phiEpsilon:np.sqrt(EPSILON):key alphaOffset : minimal relative distance of new alpha from boundaries :type alphaOffset : float :default alphaOffset : 1e-4
:key alphaWidthMin : minimal relative distance of new alpha from boundaries :type alphaWidthMin: float :default alphaWidthMin:
np.sqrt(EPSILON):key zoom_reductionMin: minimal reduction search interval length between zoom steps :type zoom_reductionMin : float :default zoom_reductionMin :0.66
- zoom(phi_lo, phi_hi, phi_0)¶
Helper function for
line_searchbelow which tries to tighten the range phi_lo.alpha…phi_hi.alpha such that phi_lo, phi_hi fulfill one of these conditions (which needs to be fulfilled at entry):(Z1) [alpha_lo, alpha_hi] contains an alpha fulfilling (W1)+(W2). This condition holds if:
(E1) alpha_hi violates the sufficient decrease condition (W1)
(E2) Phi(alpha_hi) >= Phi(alpha_lo)
(E3) Phi’(alpha_lo)*(alpha_hi - alpha_lo) < 0
(Z2) alpha_lo is giving the smallest value for Phi amongst alphas generated so far and that are satisfying the sufficient decrease condition (W1)
(Z3) alpha_hi is chosen so that
Phi'(alpha_lo)*(alpha_hi - alpha_lo) < 0
See Chapter 3 of ‘Numerical Optimization’ by J. Nocedal for an explanation. Interpolation options are linear, quadratic, cubic or Newton interpolation.
- class esys.escript.minimizer.LineSearchAlphaMaxError¶
Exception thrown if the line search if maximum alpha is reached.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchAlphaMinError¶
Exception thrown if the line search if minimum alpha is reached.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchInterpolationBreakDownError¶
Interpolation break down
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchIterMaxReachedError¶
Exception thrown if the line search reaches maximum iteration count
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchSearchDirectionError¶
Exception thrown if The search direction is not a descent direction.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.LineSearchTerminationError¶
Exception thrown if the line search was unsuccessful
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.MinimizerException¶
This is a generic exception thrown by a minimizer.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.MinimizerIterationIncurableBreakDown¶
Exception thrown if the iteration scheme encountered an incurable breakdown.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.MinimizerLBFGS(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Minimizer that uses the limited-memory Broyden-Fletcher-Goldfarb-Shanno method. See Chapter 6 of ‘Numerical Optimization’ by J. Nocedal for an explanation.
- __init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Initializes a new minimizer for a given cost function.
- Parameters:
F (
CostFunction) – the cost function to be minimizedm_tol (
float) – relative tolerance for solutionmfor termination of iterationgrad_tol (
float) – tolerance for gradient relative to initial costfunction value for termination of iterationiterMax (
int) – maximium number of iterations.
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- Default iterMax:
300
- run(m)¶
- Parameters:
m (m-type) – initial guess
- Returns:
solution
- Return type:
m-type
- class esys.escript.minimizer.MinimizerMaxIterReached¶
Exception thrown if the maximum number of iteration steps is reached.
- __init__(*args, **kwargs)¶
- class esys.escript.minimizer.MinimizerNLCG(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Minimizer that uses the nonlinear conjugate gradient method (Fletcher-Reeves variant).
- __init__(F=None, m_tol=0.0001, grad_tol=1e-08, iterMax=300, logger=None)¶
Initializes a new minimizer for a given cost function.
- Parameters:
F (
CostFunction) – the cost function to be minimizedm_tol (
float) – relative tolerance for solutionmfor termination of iterationgrad_tol (
float) – tolerance for gradient relative to initial costfunction value for termination of iterationiterMax (
int) – maximium number of iterations.
- Default m_tol:
1e-4
- Default grad_tol:
1e-8
- Default iterMax:
300
- run(x)¶
- The callback function is called with the following arguments:
k - iteration number x - current estimate Fm - value of cost function at x grad_Fm - gradient of cost function at x gnorm - norm of grad_Fm (stopping criterion)
Functions¶
- esys.escript.minimizer.muchLarger(x, y, vareps=0.0)¶
Tests if x is much larger than y with tolerance.
Returns True if
x > yandabs(x-y) > vareps * max(abs(x), abs(y)).- Parameters:
x (
float) – first valuey (
float) – second valuevareps (
float) – relative tolerance
- Returns:
True if x is much larger than y
- Return type:
bool
- esys.escript.minimizer.muchSmaller(x, y, vareps=0.0)¶
Tests if x is much smaller than y with tolerance.
Returns True if
x < yandabs(x-y) > vareps * max(abs(x), abs(y)).- Parameters:
x (
float) – first valuey (
float) – second valuevareps (
float) – relative tolerance
- Returns:
True if x is much smaller than y
- Return type:
bool
Others¶
EPSILON