MC++
Originally introduced by McCormick [McCormick, 1976] for the development of a convex/concave relaxation arithmetic, factorable functions cover an extremely inclusive class of functions which can be represented finitely on a computer by means of a code list or a computational graph involving atom operations. These are typically unary and binary operations within a library of atom operators, which can be based for example on the C-code library math.h. Besides convex/concave relaxations, factorable functions find applications in automatic differentiation (AD) [Naumann, 2009] as well as in interval analysis [Moore et al., 2009] and Taylor model arithmetic [Neumaier, 2002].
Factorable functions can be represented using directed acyclic graphs (DAGs), whose nodes are subexpressions and whose directed edges are computational flows [Schichl & Neumaier, 2005]. Compared to tree-based representations, DAGs offer the essential advantage of more accurately handling the influence of subexpressions shared by several functions during evaluation.
The classes mc::FFGraph, mc::FFVar and mc::FFOp defined in ffunc.hpp implement such a DAG construction for factorable functions. They also provide a basis for their manipulation, including differentiation and Taylor expansion, as well as their evaluation, in particular with the types mc::McCormick, mc::Specbnd, mc::TVar and mc::CVar of MC++.
For illustration, suppose we want to construct a DAG for the factorable function \({\bf f}:\mathbb{R}^4\to\mathbb{R}^2\) defined by
\begin{align*} {\bf f} = \left(\begin{array}{c} x_2x_3-x_0\\ x_0(\exp(x_2x_3)+3.0)^4+x_1\end{array}\right) \end{align*}
The constructions require the header file ffunc.hpp to be included:
An environment mc::FFGraph is first defined for recording the factorable function DAG. All four variables of type mc::FFVar participating in that function are then defined in the environment using the method mc::FFVar::set:
The two components of the factorable function can be defined next:
The last line displays the following information about the factorable function DAG:
DAG VARIABLES:
  X0 => { Z1 Z7 }
  X1 => { Z8 }
  X2 => { Z0 }
  X3 => { Z0 }
DAG INTERMEDIATES:
  Z0 <= X2 * X3 => { Z1 Z2 }
  Z1 <= Z0 - X0 => { }
  Z2 <= EXP( Z0 ) => { Z4 }
  Z4 <= Z2 + Z3 => { Z6 }
  Z6 <= POW( Z4, Z5 ) => { Z7 }
  Z7 <= X0 * Z6 => { Z8 }
  Z8 <= X1 + Z7 => { }
  Z5 <= 4(I) => { Z6 }
  Z3 <= 3(D) => { Z4 }
Observe that 9 auxiliary variables, \(z_0,\ldots,z_8\), have been created in the DAG, which correspond to the various unary and binary operations in the factorable function expression, as well as the (integer or real) participating constants. Observe, in particular, that the common sub-expression \(x_2x_3\) is detected here; that is, the intermediate \(z_0\) is reused to obtain both subsequent auxiliary variables \(z_1\) and \(z_2\).
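The detection of shared subexpressions can be illustrated independently of MC++. The following is a minimal sketch, not the mc::FFGraph implementation: the type `MiniDag` and the functions `intern` and `build_example` are hypothetical names introduced here, and the lookup key is assumed to be the triple (operation, left operand, right operand). Requesting an operation that was already recorded returns the existing node instead of creating a new one.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical miniature DAG, unrelated to the internals of mc::FFGraph.
struct MiniDag {
  struct Node { std::string op; int lhs; int rhs; };  // operand indices, -1 if unused
  std::vector<Node> nodes;
  std::map<std::tuple<std::string, int, int>, int> index;  // (op, lhs, rhs) -> node id

  // Return the existing node for (op, lhs, rhs), or append a new one.
  int intern(const std::string& op, int lhs = -1, int rhs = -1) {
    auto key = std::make_tuple(op, lhs, rhs);
    auto it = index.find(key);
    if (it != index.end()) return it->second;
    nodes.push_back({op, lhs, rhs});
    int id = static_cast<int>(nodes.size()) - 1;
    index.emplace(key, id);
    return id;
  }
};

// Record both components of the example function; return the number of nodes.
std::size_t build_example(MiniDag& dag, int& f0, int& f1) {
  int x0 = dag.intern("X0"), x1 = dag.intern("X1");
  int x2 = dag.intern("X2"), x3 = dag.intern("X3");
  f0 = dag.intern("-", dag.intern("*", x2, x3), x0);    // x2*x3 - x0
  int e = dag.intern("EXP", dag.intern("*", x2, x3));   // the product node is reused
  int s = dag.intern("+", e, dag.intern("3(D)"));
  int p = dag.intern("POW", s, dag.intern("4(I)"));
  f1 = dag.intern("+", x1, dag.intern("*", x0, p));     // x1 + x0*(exp(x2*x3)+3)^4
  return dag.nodes.size();
}
```

In this sketch the example function yields 13 nodes, that is, the 4 variables plus 9 intermediates, which agrees with the count reported above. The key is purely syntactic, so `x3*x2` would not be matched with `x2*x3`.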
At this point, the member function mc::FFGraph::subgraph can be used to generate the subgraph of a DAG corresponding to a given subset of dependent variables. This function returns a list of const pointers mc::FFOp* to the operations participating in the subgraph in order of appearance, which can then be displayed using the member function mc::FFGraph::output:
Here, the first line generates and displays a subgraph of both components of \({\bf f}\), whereas the second line generates and displays a subgraph of the first component \(f_0\) only:
FACTORS IN SUBGRAPH:
  X2 <= VARIABLE
  X3 <= VARIABLE
  Z0 <= X2 * X3
  X0 <= VARIABLE
  Z1 <= Z0 - X0
  X1 <= VARIABLE
  Z2 <= EXP( Z0 )
  Z3 <= 3(D)
  Z4 <= Z2 + Z3
  Z5 <= 4(I)
  Z6 <= POW( Z4, Z5 )
  Z7 <= X0 * Z6
  Z8 <= X1 + Z7
FACTORS IN SUBGRAPH:
  X2 <= VARIABLE
  X3 <= VARIABLE
  Z0 <= X2 * X3
  X0 <= VARIABLE
  Z1 <= Z0 - X0
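The ordering of the factors in such a listing is consistent with a depth-first traversal that lists every operand before the operation using it. The sketch below shows that idea under stated assumptions: `Op`, `visit` and `subgraph` are hypothetical names, nodes are identified by integer indices, and nothing here is taken from the MC++ sources.

```cpp
#include <vector>

// Hypothetical operation record: indices of its operands (empty for variables/constants).
struct Op { std::vector<int> operands; };

// Depth-first, post-order visit: operands are listed before the operation using them.
void visit(const std::vector<Op>& ops, int node,
           std::vector<bool>& seen, std::vector<int>& order) {
  if (seen[node]) return;
  seen[node] = true;
  for (int arg : ops[node].operands) visit(ops, arg, seen, order);
  order.push_back(node);
}

// Operations needed to evaluate the given dependents, in a valid evaluation order.
std::vector<int> subgraph(const std::vector<Op>& ops, const std::vector<int>& dependents) {
  std::vector<bool> seen(ops.size(), false);
  std::vector<int> order;
  for (int dep : dependents) visit(ops, dep, seen, order);
  return order;
}
```

With nodes 0 to 3 standing for X0 to X3, node 4 for Z0 = X2 * X3 and node 5 for Z1 = Z0 - X0, requesting node 5 yields the order 2, 3, 4, 0, 5, which mirrors the second listing above (X2, X3, Z0, X0, Z1).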
The obtained subgraphs can also be depicted using the (open source) graph plotting program DOT. The dot files F.dot and F0.dot can be generated for both subgraphs as follows [which requires the header file fstream]:
The graphs can be visualized, e.g., after generating SVG files using the command line as:
$ dot -Tsvg -O F.dot; display F.dot.svg
$ dot -Tsvg -O F0.dot; display F0.dot.svg
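MC++ writes the dot scripts itself. Purely to illustrate the kind of text the `dot` program consumes, the following sketch emits a directed graph in the DOT language from a list of node names and edges. The function `to_dot` is a hypothetical helper, and its output is not claimed to match the scripts generated by MC++.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Write a directed graph in the DOT language: one "a -> b;" statement per edge.
std::string to_dot(const std::vector<std::string>& names,
                   const std::vector<std::pair<int, int>>& edges) {
  std::ostringstream os;
  os << "digraph G {\n";
  for (const auto& e : edges)
    os << "  " << names[e.first] << " -> " << names[e.second] << ";\n";
  os << "}\n";
  return os.str();
}
```

For the operation Z0 = X2 * X3, the names {X2, X3, Z0} with edges (0,2) and (1,2) produce a three-node graph in which both variables point to Z0.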
[Figure: graph for file F.dot | graph for file F0.dot]
Derivatives of a factorable function in mc::FFGraph can be obtained with the methods mc::FFGraph::FAD and mc::FFGraph::BAD, which implement the forward and reverse mode of automatic differentiation (AD), respectively. It should be noted that mc::FFGraph does not implement these AD methods per se, but uses the classes fadbad::F and fadbad::B as part of FADBAD++.
In the forward mode of AD, for instance, entries of the Jacobian matrix of the factorable function \(f\) considered in the previous section can be added to the DAG as follows:
The last line displays the following information about the DAG of the factorable function and its Jacobian:
DAG VARIABLES:
  X0 => { Z1 Z7 Z17 Z18 }
  X1 => { Z8 }
  X2 => { Z0 Z10 }
  X3 => { Z0 Z9 }
DAG INTERMEDIATES:
  Z0 <= X2 * X3 => { Z1 Z2 }
  Z1 <= Z0 - X0 => { }
  Z2 <= EXP( Z0 ) => { Z4 Z9 Z10 }
  Z4 <= Z2 + Z3 => { Z6 Z12 }
  Z6 <= POW( Z4, Z5 ) => { Z7 }
  Z7 <= X0 * Z6 => { Z8 }
  Z8 <= X1 + Z7 => { }
  Z9 <= X3 * Z2 => { Z15 }
  Z10 <= X2 * Z2 => { Z16 }
  Z12 <= POW( Z4, Z11 ) => { Z14 }
  Z14 <= Z12 * Z13 => { Z15 Z16 }
  Z15 <= Z9 * Z14 => { Z17 }
  Z16 <= Z10 * Z14 => { Z18 }
  Z17 <= X0 * Z15 => { }
  Z18 <= X0 * Z16 => { }
  Z11 <= 3(I) => { Z12 }
  Z5 <= 4(I) => { Z6 }
  Z19 <= -1(D) => { }
  Z20 <= 0(D) => { }
  Z21 <= 1(D) => { }
  Z3 <= 3(D) => { Z4 }
  Z13 <= 4(D) => { Z14 }
Observe that 13 extra auxiliary variables, \(z_9,\ldots,z_{21}\), have been created in the DAG after the application of forward AD. Moreover, the function mc::FFGraph::FAD returns a vector of pointers to the dependent variables representing the entries \(\frac{\partial f_i}{\partial x_j}\) of the Jacobian matrix of \(f\) in the DAG, ordered row-wise as \(\frac{\partial f_1}{\partial x_1},\ldots,\frac{\partial f_1}{\partial x_n},\frac{\partial f_2}{\partial x_1},\ldots,\frac{\partial f_2}{\partial x_n},\ldots\).
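The forward mode itself can be sketched without FADBAD++ by propagating a (value, derivative) pair through each operation. The example below is an illustration only: `Dual`, the overloads acting on it, and `jacobian` are hypothetical names, and the result is a numerical Jacobian at a point rather than a DAG. The entries are stored one function after another, that is, all partial derivatives of \(f_0\) first, followed by those of \(f_1\).

```cpp
#include <array>
#include <cmath>

// Minimal dual number: value plus derivative along one seed direction.
struct Dual { double val; double der; };

Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.der + b.der}; }
Dual operator-(Dual a, Dual b) { return {a.val - b.val, a.der - b.der}; }
Dual operator*(Dual a, Dual b) { return {a.val * b.val, a.der * b.val + a.val * b.der}; }
Dual exp(Dual a) { double e = std::exp(a.val); return {e, e * a.der}; }
Dual pow(Dual a, int n) { return {std::pow(a.val, n), n * std::pow(a.val, n - 1) * a.der}; }

// Jacobian of the example function, with J[4*i + j] = d f_i / d x_j.
std::array<double, 8> jacobian(const std::array<double, 4>& x) {
  std::array<double, 8> J{};
  for (int j = 0; j < 4; ++j) {
    Dual X[4];
    for (int k = 0; k < 4; ++k) X[k] = {x[k], k == j ? 1.0 : 0.0};  // seed direction j
    Dual z0 = X[2] * X[3];                                          // shared subexpression
    Dual f0 = z0 - X[0];
    Dual f1 = X[0] * pow(exp(z0) + Dual{3.0, 0.0}, 4) + X[1];
    J[j] = f0.der;
    J[4 + j] = f1.der;
  }
  return J;
}
```

One sweep per independent variable is needed, which is why the cost of the forward mode grows with the number of variables.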
As previously, subgraphs can be constructed for all or part of the derivatives, and dot files can be generated for these subgraphs too, e.g.:
The first subgraph created above corresponds to both components of the factorable function \(f\) as well as all eight components of its Jacobian matrix \(\frac{\partial {\bf f}}{\partial {\bf x}}\); the second subgraph is for the Jacobian element \(\frac{\partial f_1}{\partial x_3}\). The corresponding graphs are shown below.
[Figure: graphs for the two subgraphs obtained with forward AD]
The reverse mode of AD can be applied likewise using the method mc::FFGraph::BAD instead of mc::FFGraph::FAD, everything else remaining the same. The corresponding graphs are shown below. Note that, in contrast to forward AD, the reverse mode only requires 11 extra auxiliary variables to construct the DAG of the Jacobian matrix.
[Figure: graphs for the two subgraphs obtained with reverse AD]
Consider a dynamic system of the form
\begin{align*} \dot{\bf x}(t,{\bf p}) \;=\; {\bf f}({\bf x}(t,{\bf p}),{\bf p})\,, \end{align*}
where \({\bf x} \in \mathbb{R}^{n_x}\) denotes the state variables, and \({\bf p} \in \mathbb{R}^{n_p}\) a set of (time-invariant) parameters. Assuming that the right-hand side function \({\bf f}\) is sufficiently often continuously differentiable on \(\mathbb{R}^{n_x}\times\mathbb{R}^{n_p}\), a \(q\)th-order Taylor expansion in time of the ODE solutions \({\bf x}(\cdot,{\bf p})\) at a given time \(t\) reads:
\begin{align*} {\bf x}(t+h,{\bf p}) \;=\; \sum_{i=0}^q h^i {\boldsymbol\phi}_i({\bf x}(t,{\bf p}),{\bf p}) + \mathcal{O}(h^{q+1}) \,, \end{align*}
where \({\boldsymbol\phi}_0,\ldots,{\boldsymbol\phi}_q\) denote the Taylor coefficients of the solution, defined recursively as
\begin{align*} {\boldsymbol\phi}_0({\bf x},{\bf p}) \;:=\; {\bf x} \qquad \text{and} \qquad {\boldsymbol\phi}_i({\bf x},{\bf p}) \;:=\; \frac{1}{i} \frac{\partial{\boldsymbol\phi}_{i-1}}{\partial {\bf x}}({\bf x},{\bf p})\, {\bf f}({\bf x},{\bf p}) \quad\text{for $i\geq 1$} \,. \end{align*}
DAGs for these Taylor coefficients can be generated using the method mc::FFGraph::TAD, which relies upon the class fadbad::T of FADBAD++.
As a simple illustrative example, consider the scalar linear ODE \(\dot{x}(t) = x(t)\), whose solutions are given by \(x(t+h) = \exp(h)x(t)\). Accordingly, the desired Taylor coefficients are:
\begin{align*} \phi_i(x) \;:=\; \frac{1}{i!}x \quad\text{for all $i\geq 0$} \,. \end{align*}
A DAG of these Taylor coefficients can be generated by mc::FFGraph as follows:
The resulting DAG of the Taylor coefficients up to order 10 is shown below.
DAG VARIABLES:
  X0 => { Z1 }
DAG INTERMEDIATES:
  Z1 <= X0 * Z0 => { Z3 }
  Z3 <= Z1 * Z2 => { Z5 }
  Z5 <= Z3 * Z4 => { Z7 }
  Z7 <= Z5 * Z6 => { Z9 }
  Z9 <= Z7 * Z8 => { Z11 }
  Z11 <= Z9 * Z10 => { Z13 }
  Z13 <= Z11 * Z12 => { Z15 }
  Z15 <= Z13 * Z14 => { Z17 }
  Z17 <= Z15 * Z16 => { }
  Z16 <= 0.1(D) => { Z17 }
  Z14 <= 0.111111(D) => { Z15 }
  Z12 <= 0.125(D) => { Z13 }
  Z10 <= 0.142857(D) => { Z11 }
  Z8 <= 0.166667(D) => { Z9 }
  Z6 <= 0.2(D) => { Z7 }
  Z4 <= 0.25(D) => { Z5 }
  Z2 <= 0.333333(D) => { Z3 }
  Z0 <= 0.5(D) => { Z1 }
FACTORS IN SUBGRAPH:
  X0 <= VARIABLE
  Z0 <= 0.5(D)
  Z1 <= X0 * Z0
  Z2 <= 0.333333(D)
  Z3 <= Z1 * Z2
  Z4 <= 0.25(D)
  Z5 <= Z3 * Z4
  Z6 <= 0.2(D)
  Z7 <= Z5 * Z6
  Z8 <= 0.166667(D)
  Z9 <= Z7 * Z8
  Z10 <= 0.142857(D)
  Z11 <= Z9 * Z10
  Z12 <= 0.125(D)
  Z13 <= Z11 * Z12
  Z14 <= 0.111111(D)
  Z15 <= Z13 * Z14
  Z16 <= 0.1(D)
  Z17 <= Z15 * Z16
[Figure: graph of the Taylor coefficient DAG]
Naturally, the resulting DAG of Taylor coefficients can be differentiated using mc::FFGraph::FAD or mc::FFGraph::BAD in turn, or evaluated in any compatible arithmetic as explained next.
Having created the DAG of a factorable function as well as its derivatives or Taylor coefficients, one can evaluate these functions in any compatible arithmetic using the method mc::FFGraph::eval.
Coming back to our initial example, suppose that we want to compute interval bounds on the second derivatives of the factorable function
\begin{align*} {\bf f} = \left(\begin{array}{c} x_2x_3-x_0\\ x_0(\exp(x_2x_3)+3.0)^4+x_1\end{array}\right) \end{align*}
with \(x_0\in[0,0.5]\), \(x_1\in[1,2]\), \(x_2\in[-1,-0.8]\), and \(x_3\in[0.5,1]\). For simplicity, the default interval type mc::Interval of MC++ is used here:
First, a DAG of the second-order derivatives of \(f\) is constructed as explained above—here using forward AD twice:
In a second step, the DAG of second-order derivatives is evaluated in interval arithmetic as follows:
The DAG evaluation can be carried out in Taylor model arithmetic likewise:
These evaluations produce the following results:
Entry | Evaluation in Interval Arithmetic | Evaluation in Taylor Model Arithmetic |
---|---|---|
d2FdX2(0) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(1) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(2) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(3) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(4) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(5) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(6) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(7) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(8) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(9) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(10) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(11) | [ 1.00000e+00 : 1.00000e+00 ] | [ 1.00000e+00 : 1.00000e+00 ] |
d2FdX2(12) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(13) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(14) | [ 1.00000e+00 : 1.00000e+00 ] | [ 1.00000e+00 : 1.00000e+00 ] |
d2FdX2(15) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(16) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(17) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(18) | [ 2.81064e+01 : 1.32573e+02 ] | [ 5.19577e+01 : 7.71431e+01 ] |
d2FdX2(19) | [ -1.32573e+02 : -4.49702e+01 ] | [ -1.14333e+02 : -5.32913e+01 ] |
d2FdX2(20) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(21) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(22) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(23) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(24) | [ 2.81064e+01 : 1.32573e+02 ] | [ 5.19577e+01 : 7.71431e+01 ] |
d2FdX2(25) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(26) | [ 0.00000e+00 : 1.02604e+02 ] | [ -1.97513e+01 : 5.36111e+01 ] |
d2FdX2(27) | [ -6.62258e+01 : 4.80507e+01 ] | [ -1.92705e+01 : 2.57043e+01 ] |
d2FdX2(28) | [ -1.32573e+02 : -4.49702e+01 ] | [ -1.14333e+02 : -5.32913e+01 ] |
d2FdX2(29) | [ 0.00000e+00 : 0.00000e+00 ] | [ 0.00000e+00 : 0.00000e+00 ] |
d2FdX2(30) | [ -6.62258e+01 : 4.80507e+01 ] | [ -1.92681e+01 : 2.57022e+01 ] |
d2FdX2(31) | [ 0.00000e+00 : 1.02604e+02 ] | [ -3.07240e+01 : 8.59907e+01 ] |
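As a cross-check that does not rely on MC++, one of the nonzero interval bounds can be reproduced by hand. Differentiating \(f_1\) with respect to \(x_0\) and then \(x_2\) gives \(4(\exp(x_2x_3)+3)^3\exp(x_2x_3)\,x_3\), which the sketch below evaluates with a minimal interval type. The names `Interval` and `bound_d2f1_dx0dx2` are hypothetical, and outward rounding is ignored, so the bounds are not rigorous in floating point, unlike those of a proper interval library.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical closed interval [lo, hi]; rounding errors are ignored in this sketch.
struct Interval { double lo; double hi; };

Interval operator+(Interval a, Interval b) { return {a.lo + b.lo, a.hi + b.hi}; }
Interval operator*(Interval a, Interval b) {
  const double p[4] = {a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi};
  return {*std::min_element(p, p + 4), *std::max_element(p, p + 4)};
}
Interval exp(Interval a) { return {std::exp(a.lo), std::exp(a.hi)}; }  // exp is increasing

// Bounds on d2 f_1 / (dx0 dx2) = 4 (exp(x2 x3) + 3)^3 exp(x2 x3) x3 over a box.
Interval bound_d2f1_dx0dx2(Interval x2, Interval x3) {
  const Interval e = exp(x2 * x3);
  const Interval s = e + Interval{3.0, 3.0};
  return Interval{4.0, 4.0} * (s * s * s) * e * x3;  // s > 0, so s*s*s is tight here
}
```

For \(x_2\in[-1,-0.8]\) and \(x_3\in[0.5,1]\) this evaluates to approximately \([28.106, 132.573]\), in agreement with the interval reported for d2FdX2(18).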
Errors are managed based on the exception handling mechanism of the C++ language. Each time an error is encountered, a class object of type mc::FFGraph::Exceptions is thrown, which contains the type of error. It is the user's responsibility to test whether an exception was thrown during the creation/manipulation of a DAG, and then make the appropriate changes. Should an exception be thrown and not caught by the calling program, the execution will abort.
Possible errors encountered during the creation/manipulation of a DAG are:
Number | Description |
---|---|
1 | Invalid mc::FFGraph* pointer in initialization of an mc::FFVar variable |
2 | Operation between variables linked to different DAGs |
3 | Subgraph cannot be evaluated because an independent variable is missing |
-1 | Internal Error |
-33 | Feature not yet implemented in mc::FFGraph |
Further exceptions can be thrown by the underlying arithmetic used for the evaluation of a DAG.
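The catch-and-inspect pattern described above can be sketched with a stand-in exception class. Everything in this example is hypothetical: `DagError`, its enumerators, the accessor `ierr` and the function `combine` are illustrative names rather than the interface of mc::FFGraph::Exceptions. Only the numeric codes are taken from the table above.

```cpp
// Hypothetical stand-in for an exception class carrying an integer error code.
class DagError {
 public:
  enum Type {
    InvalidGraphPointer = 1,
    MixedGraphs = 2,
    MissingVariable = 3,
    Internal = -1,
    NotImplemented = -33
  };
  explicit DagError(Type t) : type_(t) {}
  int ierr() const { return type_; }  // numeric code of the error
 private:
  Type type_;
};

// Hypothetical operation that fails when its operands belong to different graphs.
int combine(int graph_a, int graph_b) {
  if (graph_a != graph_b) throw DagError(DagError::MixedGraphs);
  return graph_a;
}

// Run the operation and translate a thrown error into its code (0 on success).
int try_combine(int graph_a, int graph_b) {
  try {
    combine(graph_a, graph_b);
    return 0;
  } catch (const DagError& e) {
    return e.ierr();
  }
}
```

Catching by const reference and branching on the code lets the calling program decide whether to repair the DAG construction or to abort.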