MC++
A convex relaxation \(f^{\rm cv}\) of a function \(f\) on the convex domain \(D\) is a function that is (i) convex on \(D\) and (ii) underestimates \(f\) on \(D\). Likewise, a concave relaxation \(f^{\rm cc}\) of a function \(f\) on the convex domain \(D\) is a function that is (i) concave on \(D\) and (ii) overestimates \(f\) on \(D\). McCormick's technique [McCormick, 1976] provides a means for computing pairs of convex/concave relaxations of a multivariate function on interval domains provided that this function is factorable and that the intrinsic univariate functions in its factored form have known convex/concave envelopes or, at least, relaxations.
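As a concrete instance of these definitions, the sketch below evaluates the classical McCormick relaxations of a single bilinear term \(xy\) on a box. It is a standalone illustration that does not use MC++; the names `Relaxation` and `bilinear_relaxation` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Values of a convex underestimator (cv) and a concave overestimator (cc) at one point.
struct Relaxation {
  double cv;
  double cc;
};

// McCormick relaxations of the product x*y for x in [xL,xU] and y in [yL,yU].
Relaxation bilinear_relaxation(double x, double xL, double xU,
                               double y, double yL, double yU) {
  assert(xL <= x && x <= xU && yL <= y && y <= yU);
  // Convex underestimator: pointwise maximum of two affine functions,
  // obtained from (x-xL)(y-yL) >= 0 and (xU-x)(yU-y) >= 0.
  const double cv = std::max(xL * y + x * yL - xL * yL,
                             xU * y + x * yU - xU * yU);
  // Concave overestimator: pointwise minimum of two affine functions,
  // obtained from (xU-x)(y-yL) >= 0 and (x-xL)(yU-y) >= 0.
  const double cc = std::min(xU * y + x * yL - xU * yL,
                             xL * y + x * yU - xL * yU);
  return {cv, cc};
}
```

For example, at \((x,y)=(0,1)\) on \([-2,1]\times[-1,2]\) this gives a convex value of \(-1\) and a concave value of \(2\), which bracket \(xy=0\). At a vertex of the box both relaxations coincide with the product itself.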
The class mc::McCormick provides an implementation of the McCormick relaxation technique and its recent extensions; see [McCormick, 1976; Scott et al., 2011; Tsoukalas & Mitsos, 2012; Wechsung & Barton, 2013]. mc::McCormick also has the capability to propagate subgradients for these relaxations, which are guaranteed to exist in the interior of the domain of definition of any convex/concave function. This propagation is similar in essence to the forward mode of automatic differentiation; see [Mitsos et al., 2009]. We note that mc::McCormick is not a verified implementation in the sense that rounding errors are not accounted for in computing convex/concave bounds and subgradients.
The implementation of mc::McCormick relies on the operator/function overloading mechanism of C++. This makes the computation of convex/concave relaxations both simple and intuitive, similar to computing function values in real arithmetic or function bounds in interval arithmetic (see Non-Verified Interval Arithmetic for Factorable Functions). Moreover, mc::McCormick can be used as the template parameter of other classes of MC++, for instance mc::TModel and mc::TVar. Likewise, mc::McCormick can be used as the template parameter of the classes fadbad::F, fadbad::B and fadbad::T of FADBAD++ for computing McCormick relaxations and subgradients of the partial derivatives or the Taylor coefficients of a factorable function (see How do I compute McCormick relaxations of the partial derivatives or the Taylor coefficients of a factorable function using FADBAD++?).
mc::McCormick itself is templated in the type used to propagate the supporting interval bounds. By default, mc::McCormick can be used with the non-verified interval type mc::Interval of MC++. For reliability, however, it is strongly recommended to use verified interval arithmetic such as PROFIL or FILIB++. We note that Taylor models as provided by the classes mc::TModel and mc::TVar can also be used as the template parameter (see Taylor Model Arithmetic for Factorable Functions).
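To make the overloading idea concrete, here is a minimal standalone sketch in which one templated expression is evaluated either in real arithmetic or in a toy interval arithmetic. The `Interval` struct below is a hypothetical stand-in, not mc::Interval, and it performs no outward rounding. The function used is \(f(x,y)=x(\exp(x)-y)^2\) on \([-2,1]\times[-1,2]\), the example treated later on this page.

```cpp
#include <algorithm>
#include <cmath>

// Minimal non-verified interval type (hypothetical; not mc::Interval).
struct Interval {
  double lo;
  double hi;
};

Interval operator-(const Interval& a, const Interval& b) {
  return {a.lo - b.hi, a.hi - b.lo};
}

Interval operator*(const Interval& a, const Interval& b) {
  const double p[4] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
  return {*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}

// exp is increasing, so the end points map to the end points.
Interval exp(const Interval& a) { return {std::exp(a.lo), std::exp(a.hi)}; }

// Square with its exact range: the minimum is 0 when the interval contains 0.
Interval sqr(const Interval& a) {
  const double l2 = a.lo * a.lo, h2 = a.hi * a.hi;
  const double lo = (a.lo <= 0.0 && 0.0 <= a.hi) ? 0.0 : std::min(l2, h2);
  return {lo, std::max(l2, h2)};
}

double sqr(double a) { return a * a; }

// f(x,y) = x (exp(x) - y)^2; the argument type selects the arithmetic.
template <typename T>
T f(const T& x, const T& y) {
  using std::exp;  // doubles use std::exp, Interval is found by argument-dependent lookup
  return x * sqr(exp(x) - y);
}
```

Calling `f(Interval{-2.0, 1.0}, Interval{-1.0, 2.0})` yields approximately \([-27.6512, 13.8256]\), which agrees with the supporting interval bounds printed in the example output further down, while `f(-1.0, 0.0)` returns the point value \(-e^{-2}\).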
Examples of McCormick relaxations constructed with mc::McCormick are shown on the left plot of the figure below for the factorable function \(f(x)=\cos(x^2)\,\sin(x^{-3})\) for \(x\in [\frac{\pi}{6},\frac{\pi}{3}]\). Also shown on the right plot are the affine relaxations constructed from a subgradient at \(\frac{\pi}{4}\) of the McCormick relaxations of \(f\) on \([\frac{\pi}{6},\frac{\pi}{3}]\).
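[Figure. Left: McCormick relaxations of \(f\) on \([\frac{\pi}{6},\frac{\pi}{3}]\). Right: affine relaxations constructed from a subgradient at \(\frac{\pi}{4}\).]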
Suppose one wants to compute McCormick relaxations of the real-valued function \(f(x,y)=x(\exp(x)-y)^2\) for \((x,y)\in [-2,1]\times[-1,2]\), at the point \((x,y)=(0,1)\). For simplicity, the supporting interval bounds are calculated using the default interval type mc::Interval here:
First, the variables \(x\) and \(y\) are defined as follows:
Essentially, the first line means that X is a variable of type mc::McCormick, belonging to the interval \([-2,1]\), and whose current value is \(0\). The same holds for the McCormick variable Y, which belongs to the interval \([-1,2]\) and has a value of \(1\).
Having defined the variables, McCormick relaxations of \(f(x,y)=x(\exp(x)-y)^2\) on \([-2,1]\times[-1,2]\) at \((0,1)\) are simply computed as:
These relaxations can be displayed to the standard output as:
which produces the following output:
f relaxations at (0,1): [ -2.76512e+01 : 1.38256e+01 ] [ -1.38256e+01 : 8.52245e+00 ]
Here, the first pair of bounds corresponds to the supporting interval bounds, as obtained with mc::Interval, which are valid over the entire domain \([-2,1]\times[-1,2]\). The second pair gives the values of the convex and concave relaxations at the selected point \((0,1)\). In order to describe the convex and concave relaxations on the entire range \([-2,1]\times[-1,2]\), it would be necessary to repeat the computations at different points. The current point can be set/modified by using the method mc::McCormick::c, for instance at \((-1,0)\):
producing the output:
f relaxations at (-1,0): [ -2.76512e+01 : 1.38256e+01 ] [ -1.75603e+01 : 8.78014e+00 ]
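The two printed lines can be sanity-checked against the definitions: at each point, the relaxation values should bracket the true function value and lie within the interval bounds. The following standalone sketch does this with the numbers copied from the outputs above. The names `PrintedRelaxation` and `is_consistent` are hypothetical, and the tolerance only absorbs the rounding of the printed digits.

```cpp
#include <cmath>

// f(x,y) = x (exp(x) - y)^2, the example function of this section.
double f(double x, double y) {
  const double d = std::exp(x) - y;
  return x * d * d;
}

// Numbers copied from one line of printed output: [ lb : ub ] [ cv : cc ].
struct PrintedRelaxation {
  double lb, ub;  // supporting interval bounds, valid on the whole box
  double cv, cc;  // convex/concave relaxation values at the current point
};

// True if lb <= cv <= f <= cc <= ub, up to the rounding of the printed digits.
bool is_consistent(const PrintedRelaxation& r, double fval, double tol = 1e-4) {
  return r.lb <= r.cv + tol && r.cv <= fval + tol &&
         fval <= r.cc + tol && r.cc <= r.ub + tol;
}
```

With `PrintedRelaxation{-27.6512, 13.8256, -13.8256, 8.52245}` and \(f(0,1)=0\), or with `PrintedRelaxation{-27.6512, 13.8256, -17.5603, 8.78014}` and \(f(-1,0)=-e^{-2}\), the check succeeds.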
The values of the McCormick convex and concave relaxations of \(f(x,y)\) can be retrieved, respectively, as:
Likewise, the lower and upper bounds of the supporting interval bounds can be retrieved as:
Computing a subgradient of a McCormick relaxation requires specification of the independent variables via the .sub method, prior to evaluating the function in mc::McCormick type. Continuing the previous example, the function has two independent variables \(x\) and \(y\). Defining \(x\) and \(y\) as the subgradient components \(0\) and \(1\) (indexing in C/C++ starts at 0 by convention!), respectively, is done as follows:
Similar to above, the McCormick convex and concave relaxations of \(f(x,y)\) at \((-1,0)\) along with subgradients of these relaxations are computed as:
producing the output:
f relaxations and subgradients at (-1,0): [ -2.76512e+01 : 1.38256e+01 ] [ -1.75603e+01 : 8.78014e+00 ] [ (-3.19186e+00, 3.70723e+00) : ( 1.59593e+00,-1.85362e+00) ]
The additional information displayed corresponds to, respectively, a subgradient of the McCormick convex underestimator at (-1,0) and a subgradient of the McCormick concave overestimator at (-1,0). In turn, these subgradients can be used to construct affine relaxations on the current range, or passed to a bundle solver to locate the actual minimum or maximum of the McCormick relaxations.
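As a sketch of the first use, the affine relaxations induced by these subgradients can be written down directly from the printed numbers: value at \((-1,0)\) plus subgradient times displacement. The code below is standalone and does not call MC++. The `Affine` struct and the grid check are hypothetical helpers, and the tolerance accounts for the five-digit rounding of the printed values.

```cpp
#include <cmath>

// f(x,y) = x (exp(x) - y)^2, the example function of this section.
double f(double x, double y) {
  const double d = std::exp(x) - y;
  return x * d * d;
}

// Affine function a(x,y) = value + s[0]*(x - x0) + s[1]*(y - y0).
struct Affine {
  double x0, y0;  // reference point
  double value;   // relaxation value at the reference point
  double s[2];    // subgradient at the reference point
  double operator()(double x, double y) const {
    return value + s[0] * (x - x0) + s[1] * (y - y0);
  }
};

// Values copied from the printed output at (-1,0).
const Affine under = {-1.0, 0.0, -17.5603, {-3.19186, 3.70723}};
const Affine over = {-1.0, 0.0, 8.78014, {1.59593, -1.85362}};

// Checks under <= f <= over on an (n+1)-by-(n+1) grid over [-2,1] x [-1,2].
bool brackets_f_on_grid(int n, double tol) {
  for (int i = 0; i <= n; ++i)
    for (int j = 0; j <= n; ++j) {
      const double x = -2.0 + 3.0 * i / n, y = -1.0 + 3.0 * j / n;
      if (under(x, y) > f(x, y) + tol || over(x, y) < f(x, y) - tol)
        return false;
    }
  return true;
}
```

The affine underestimator is nearly tight at the vertex \((-2,2)\) and the affine overestimator at \((1,-1)\), which is why a small tolerance is needed when working from rounded output.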
The subgradients of the McCormick relaxations of \(f(x,y)\) at the current point can be retrieved as follows:
or, alternatively, componentwise as:
Directional subgradients can be propagated too. In the case that subgradients are to be computed along the direction (1,-1) for both the convex and concave relaxations, we define:
producing the output:
f relaxations and subgradients along direction (1,-1) at (-1,0): [ -2.76512e+01 : 1.38256e+01 ] [ -1.75603e+01 : 8.78014e+00 ] [ (-6.89910e+00) : ( 3.44955e+00) ]
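The printed directional values agree, to the displayed precision, with the inner product of the full subgradients reported earlier and the direction (1,-1). A small standalone check (the helper `directional` is hypothetical and not part of MC++):

```cpp
#include <cmath>
#include <cstddef>

// Inner product of a subgradient s with a direction d, both of length n.
double directional(const double* s, const double* d, std::size_t n) {
  double sum = 0.0;
  for (std::size_t i = 0; i < n; ++i) sum += s[i] * d[i];
  return sum;
}

// Subgradients at (-1,0) copied from the earlier output, and the direction (1,-1).
const double cvsub[2] = {-3.19186, 3.70723};
const double ccsub[2] = {1.59593, -1.85362};
const double dir[2] = {1.0, -1.0};

// True if a and b agree up to the rounding of the printed digits.
bool agrees(double a, double b) { return std::fabs(a - b) < 1e-4; }
```

Here `directional(cvsub, dir, 2)` evaluates to about \(-6.89909\) and `directional(ccsub, dir, 2)` to about \(3.44955\), matching the last bracket of the output.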
Now suppose one wants to compute McCormick relaxations not only for a given factorable function, but also for its partial derivatives. Continuing the previous example, the partial derivatives of \(f(x,y)=x(\exp(x)-y)^2\) with respect to its independent variables \(x\) and \(y\) can be obtained via automatic differentiation (AD), either forward or reverse AD. This can be done for instance using the classes fadbad::F and fadbad::B of FADBAD++.
Considering forward AD first, we include the following header files:
The variables are initialized and the derivatives and subgradients are seeded as follows:
As previously, the McCormick convex and concave relaxations of \(f\), \(\frac{\partial f}{\partial x}\), and \(\frac{\partial f}{\partial y}\) at \((-1,0)\) on the range \([-2,1]\times[-1,2]\), along with subgradients of these relaxations, are computed as:
producing the output:
f relaxations and subgradients at (-1,0): [ -2.76512e+01 : 1.38256e+01 ] [ -1.75603e+01 : 8.78014e+00 ] [ (-3.19186e+00, 3.70723e+00) : ( 1.59593e+00,-1.85362e+00) ]
df/dx relaxations and subgradients at (-1,0): [ -4.04294e+01 : 3.41004e+01 ] [ -2.33469e+01 : 2.90549e+01 ] [ (-2.31383e+01,-1.94418e-01) : ( 1.59593e+00,-1.85362e+00) ]
df/dy relaxations and subgradients at (-1,0): [ -7.45866e+00 : 1.48731e+01 ] [ -5.96505e+00 : 7.71460e+00 ] [ (-5.96505e+00,-4.00000e+00) : ( 7.17326e+00,-4.00000e+00) ]
Relaxations of the partial derivatives can also be computed using the backward mode of AD, which requires the additional header file:
then initialize and seed new variables and compute the function as follows:
producing the output:
f relaxations and subgradients at (-1,0): [ -2.76512e+01 : 1.38256e+01 ] [ -1.75603e+01 : 8.78014e+00 ] [ (-3.19186e+00, 3.70723e+00) : ( 1.59593e+00,-1.85362e+00) ]
df/dx relaxations and subgradients at (-1,0): [ -4.04294e+01 : 3.41004e+01 ] [ -1.37142e+01 : 1.60092e+01 ] [ (-1.35056e+01,-1.94418e-01) : ( 8.82498e+00,-1.31228e+00) ]
df/dy relaxations and subgradients at (-1,0): [ -7.45866e+00 : 1.48731e+01 ] [ -5.96505e+00 : 7.71460e+00 ] [ (-5.96505e+00,-4.00000e+00) : ( 7.17326e+00,-4.00000e+00) ]
It is noteworthy that the bounds, McCormick relaxations and subgradients for the partial derivatives computed with the forward and backward modes, although valid, may not be the same, since the computations involve different sequences of operations or even different operations. In the previous examples, for instance, forward and backward AD produce identical interval bounds for \(\frac{\partial f}{\partial x}\) and \(\frac{\partial f}{\partial y}\), yet significantly tighter McCormick relaxations are obtained with backward AD for \(\frac{\partial f}{\partial x}\) at \((-1,0)\).
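Both sets of relaxation values can be checked against the exact partial derivatives, which are available in closed form for this example: \(\frac{\partial f}{\partial x}=(e^x-y)^2+2xe^x(e^x-y)\) and \(\frac{\partial f}{\partial y}=-2x(e^x-y)\). The standalone sketch below uses the numbers copied from the two outputs above; the helper names are hypothetical.

```cpp
#include <cmath>

// Exact partial derivatives of f(x,y) = x (exp(x) - y)^2.
double dfdx(double x, double y) {
  const double e = std::exp(x), d = e - y;
  return d * d + 2.0 * x * e * d;
}
double dfdy(double x, double y) {
  return -2.0 * x * (std::exp(x) - y);
}

// Relaxation values [cv : cc] of df/dx at (-1,0), copied from the outputs above.
const double fwd_cv = -23.3469, fwd_cc = 29.0549;  // forward AD
const double bwd_cv = -13.7142, bwd_cc = 16.0092;  // backward AD

// True if v lies between the convex and concave relaxation values.
bool contains(double cv, double cc, double v) { return cv <= v && v <= cc; }

// Width of the relaxation at the current point; smaller means tighter.
double gap(double cv, double cc) { return cc - cv; }
```

At \((-1,0)\) the exact value of \(\frac{\partial f}{\partial x}\) is \(-e^{-2}\approx-0.135\), which lies inside both pairs of relaxation values. The gap is about 52.4 for forward AD and about 29.7 for backward AD.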
Another use of FADBAD++ involves computing McCormick relaxations of the Taylor coefficients in the Taylor expansion of a factorable function in a given direction up to a certain order. Suppose we want to compute McCormick relaxations of the Taylor coefficients up to order 5 of \(f(x,y)=x(\exp(x)-y)^2\) in the direction \((1,0)\), i.e. the direction of \(x\). This information can be computed by using the class fadbad::T, which requires the following header file:
The variables are initialized and the derivatives and subgradients are seeded as follows:
producing the output:
d^0f/dx^0 relaxations and subgradients at (-1,0): [ -2.76512e+01 : 1.38256e+01 ] [ -1.75603e+01 : 8.78014e+00 ] [ (-3.19186e+00, 3.70723e+00) : ( 1.59593e+00,-1.85362e+00) ]
d^1f/dx^1 relaxations and subgradients at (-1,0): [ -4.04294e+01 : 3.41004e+01 ] [ -2.33469e+01 : 2.90549e+01 ] [ (-2.31383e+01,-1.94418e-01) : ( 1.59593e+00,-1.85362e+00) ]
d^2f/dx^2 relaxations and subgradients at (-1,0): [ -4.51302e+01 : 3.77111e+01 ] [ -1.97846e+01 : 2.25846e+01 ] [ (-1.97113e+01, 0.00000e+00) : ( 7.36023e+00,-4.06006e-01) ]
d^3f/dx^3 relaxations and subgradients at (-1,0): [ -2.65667e+01 : 2.82546e+01 ] [ -1.02662e+01 : 1.27412e+01 ] [ (-1.00820e+01,-4.51118e-02) : ( 7.66644e+00,-1.80447e-01) ]
d^4f/dx^4 relaxations and subgradients at (-1,0): [ -1.19764e+01 : 1.59107e+01 ] [ -4.29280e+00 : 6.13261e+00 ] [ (-4.25007e+00,-2.25559e-02) : ( 4.86086e+00,-5.63897e-02) ]
d^5f/dx^5 relaxations and subgradients at (-1,0): [ -4.44315e+00 : 7.16828e+00 ] [ -1.49744e+00 : 2.55611e+00 ] [ (-1.44773e+00,-6.76676e-03) : ( 2.29932e+00,-1.35335e-02) ]
The zeroth Taylor coefficient corresponds to the function \(f\) itself. It can also be checked that the relaxations of the first Taylor coefficient of \(f\) match those obtained with forward AD for \(\frac{\partial f}{\partial x}\).
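For reference, the exact point values of these Taylor coefficients at \((-1,0)\) can be obtained with a few lines of truncated power-series arithmetic, which is the real-valued analogue of what a Taylor AD type propagates. The sketch below is standalone and uses neither FADBAD++ nor MC++; `Series`, `mul` and `taylor_in_x` are hypothetical names. Recall that the coefficient of order \(k\) is the \(k\)-th derivative divided by \(k!\).

```cpp
#include <array>
#include <cmath>
#include <cstddef>

const std::size_t N = 6;               // orders 0..5
typedef std::array<double, N> Series;  // coefficients of a truncated power series in t

// Truncated Cauchy product of two series.
Series mul(const Series& a, const Series& b) {
  Series c{};
  for (std::size_t i = 0; i < N; ++i)
    for (std::size_t j = 0; i + j < N; ++j) c[i + j] += a[i] * b[j];
  return c;
}

// Taylor coefficients in t of f(x0 + t, y0), with f(x,y) = x (exp(x) - y)^2.
Series taylor_in_x(double x0, double y0) {
  Series x{};  // the series of x0 + t
  x[0] = x0;
  x[1] = 1.0;
  Series d{};  // exp(x0 + t) - y0, using exp(x0 + t) = exp(x0) * sum_k t^k / k!
  double term = std::exp(x0);
  for (std::size_t k = 0; k < N; ++k) {
    d[k] = term;
    term /= static_cast<double>(k + 1);
  }
  d[0] -= y0;
  return mul(x, mul(d, d));
}
```

At \((-1,0)\) the coefficients of order 0 and 1 both equal \(-e^{-2}\), the coefficient of order 2 vanishes, and every coefficient up to order 5 falls between the corresponding convex and concave relaxation values printed above.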
Naturally, the classes fadbad::F, fadbad::B and fadbad::T can be nested to produce relaxations of higher-order derivative information.
As well as overloading the usual functions exp, log, sqr, sqrt, pow, inv, cos, sin, tan, acos, asin, atan, erf, erfc, min, max, fabs, mc::McCormick also defines the following functions:

- fstep(x) and bstep(x), implementing the forward step function (switching value from 0 to 1 for x>=0) and the backward step function (switching value from 1 to 0 for x>=0). These functions can be used to model a variety of discontinuous functions as first proposed in [Wechsung & Barton, 2012].
- ltcond(x,y,z) and gtcond(x,y,z), similar in essence to fstep(x) and bstep(x), and implementing disjunctions of the form { y if x<=0; z otherwise } and { y if x>=0; z otherwise }, respectively.
- inter(x,y,z), computing the intersection \(x = y\cap z\) and returning true/false if the intersection is nonempty/empty.
- hull(x,y), computing convex/concave relaxations of the union \(x\cup y\).

The class mc::McCormick has a public static member called mc::McCormick::options that can be used to set/modify the options; e.g.,
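A minimal sketch of this static-options pattern follows, using a hypothetical stand-in class `MockMcCormick` rather than MC++ itself. The field names and defaults are taken from the options table on this page; treating `machprec()` as machine epsilon is an assumption. With MC++, the same kind of assignment would presumably be made on the options member of the instantiated mc::McCormick type.

```cpp
#include <limits>

// Hypothetical stand-in for the relaxation class; NOT the MC++ implementation.
template <typename T>
class MockMcCormick {
public:
  struct Options {
    bool ENVEL_USE = true;
    double ENVEL_TOL = 1e-10;
    int ENVEL_MAXIT = 100;
    bool MVCOMP_USE = false;
    // Assumption: machprec() denotes the machine epsilon of double.
    double MVCOMP_TOL = 1e1 * std::numeric_limits<double>::epsilon();
    unsigned int DISPLAY_DIGITS = 5;
  };
  // One options object shared by all variables of this instantiation.
  static Options options;
};

template <typename T>
typename MockMcCormick<T>::Options MockMcCormick<T>::options;

// Example: trade speed for tighter relaxations before any evaluation.
void tighten_relaxations() {
  typedef MockMcCormick<double> MC;
  MC::options.MVCOMP_USE = true;
  MC::options.ENVEL_TOL = 1e-12;
}
```

Because the member is static, a change made this way applies to every subsequent computation with that template instantiation, not to a single variable.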
The available options are the following:
Name | Type | Default | Description |
---|---|---|---|
ENVEL_USE | bool | true | Whether to compute convex/concave envelopes for the neither-convex-nor-concave univariate functions such as odd power terms, sin, cos, asin, acos, tan, atan, erf, erfc. This provides tighter McCormick relaxations, but it is more time consuming. Junction points are computed using the Newton or secant method first, then the more robust golden section search method if unsuccessful. |
ENVEL_TOL | double | 1e-10 | Termination tolerance for the determination of junction points in convex/concave envelopes of univariate terms. |
ENVEL_MAXIT | int | 100 | Maximum number of iterations for the determination of junction points in convex/concave envelopes of univariate terms. |
MVCOMP_USE | bool | false | Whether to use Tsoukalas & Mitsos's multivariate composition result for min/max, product, and division terms; see [Tsoukalas & Mitsos, 2012]. This provides tighter McCormick relaxations, but it is more time consuming. |
MVCOMP_TOL | double | 1e1*machprec() | Tolerance for equality test in subgradient propagation for product terms with Tsoukalas & Mitsos's multivariate composition result; see [Tsoukalas & Mitsos, 2012]. |
DISPLAY_DIGITS | unsigned int | 5 | Number of digits in output stream |
Errors are managed based on the exception handling mechanism of the C++ language. Each time an error is encountered, a class object of type mc::McCormick::Exceptions is thrown, which contains the type of error. It is the user's responsibility to test whether an exception was thrown during a McCormick relaxation, and then make the appropriate changes. Should an exception be thrown and not caught by the calling program, the execution will stop.
Possible errors encountered during the computation of a McCormick relaxation are:
Number | Description |
---|---|
1 | Division by zero |
2 | Inverse with zero in range |
3 | Log with negative values in range |
4 | Square-root with nonpositive values in range |
5 | Inverse sine or cosine with values outside of \([-1,1]\) range |
6 | Tangent with values outside of \([-\frac{\pi}{2}+k\pi,\frac{\pi}{2}+k\pi]\) range |
-1 | Inconsistent size of subgradient between two mc::McCormick variables |
-2 | Failed to compute the convex or concave envelope of a univariate term |
-3 | Failed to propagate subgradients for a product term with Tsoukalas & Mitsos's multivariate composition result |
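The calling side of this mechanism can be sketched as follows. The exception type `RelaxationError`, the `Bounds` struct and both functions are hypothetical stand-ins, not the MC++ classes. The sketch only mimics error number 2 from the table above on a plain interval inverse.

```cpp
// Hypothetical stand-in for the exception object; NOT mc::McCormick::Exceptions.
struct RelaxationError {
  int code;  // error number as listed in the table above
};

struct Bounds {
  double lo, hi;
};

// Interval inverse 1/x; throws error 2 when zero lies in the range.
Bounds inverse(const Bounds& x) {
  if (x.lo <= 0.0 && 0.0 <= x.hi) throw RelaxationError{2};
  return {1.0 / x.hi, 1.0 / x.lo};
}

// Caller-side handling: returns 0 on success, otherwise the caught error number.
int try_inverse(const Bounds& x, Bounds& result) {
  try {
    result = inverse(x);
    return 0;
  } catch (const RelaxationError& err) {
    return err.code;  // the caller may, e.g., shrink or split the domain and retry
  }
}
```

Calling `try_inverse` on \([1,2]\) succeeds and stores \([0.5,1]\), whereas calling it on \([-1,1]\) returns 2 without terminating the program.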