Study notes from Convex Optimization by Stephen Boyd, Lieven Vandenberghe.
1. Mathematical optimization
qquadA mathematical optimization problem has the following form
2. Convex optimization
2.1 Least squares
Inweighted least-squares, the weighted least-squares cost is minimized:
Another technique in least-squares isregularization, in which extra terms are added to the cost function. In the simplest case, a positive multiple of the sum of squares of the variables is added to the cost function:
2.2 Linear programming
3. Nonlinear optimization
qquadNonlinear optimization problem means that the objective or constraint functions are not linear, but not known to be convex. Sadly, there are no effective methods for solving the general nonlinear prgramming problem. Even simple looking problems with as few as ten variables can be extremely challenging, while problems with a few hundreds of variables can be intractable. Methods for the general nonlinear programming problem therefore take several different aproaches, each of which involves some compromise.
3.1 Local optimization
qquadIn local optimization, the compromise is to give up seeking the optimal xxx, which minimizes the objective over all feasible points. Instead we seek a point that is only locally optimal, which means that it minimizes the objective function among feasiable points that are near it, but is not guaranteed to have a lower objective value than all other feasible points. A large fraction of the research on general nonlinear programming has focused on methods for local optimization, which as a consequence are well developed.
qquadLocal optimization methods can be fast, can handle large-scale problems, and are widely applicable, since they only require differentiability of the objective and constraint functions. As a result, local optimization methods are widely used in applications where there is value in finding a good poing, if not the very best.
qquadThere are several disadvantages of local optimization methods, beyond not finding the true, globally optimal solution. The methods require an initial guess for the optimization variable. This initial guess or starting point is critical, and can greatly affect the objective value of the local solution obtained. Little information is provided about how far from globally optimal the local solution is. Local optimization methods are often sensitive to algorithm parameter values, which may need to be adjusted for a particular problem, or family of problems.
qquadUsing a local optimization method is trickier than solving a least-squares problem, linear programming, or convex optimization problem. It involves experimenting with the choice of algorithm, adjusting algorithm parameters, and finding a good enough initial guess (when one instance is to be solved) or a method for producing enough initial guess (when a family of problems is to be solved).
3.2 Global optimization
qquadIn global optimization, the true global solution of the optimization problem is found, the compromise is efficiency. The worst-case complexity of global optimization methods grows exponentially with the problem sizes nnn and mmm. The hope is that in practice, for the particular problem instances encountered, the method is far faster. While this favorable situation does occur, it is not typical. Even small problems, with a few tens of variables, can take a very long time to solve.
qquadGlobal optimization is used for problems with a small number of variables, where computing time is not critical, and the value of finding the true global solution is very high. A local optimization method can rapidly find a set of parameter values that is bad, but not guaranteed to be the absolute worst possible. If a local optimization method finds parameter values that yield unacceptable performance, it has succeeded in determining that the system is not reliable. But a local optimization method cannot certify the system as reliable, it can only fail to find bad parameter values. A global optimization method, in contrast, will find the absolute worst values of the parameters, and if the associated performance is acceptable, can certify the system as safe. The cost is computation time, which can be very large, even for a relatively small number of parameters. But it may be worth it in cases where the value of certifying the performance is high, or the cost of being wrong about the reliability or safety is high.
4. Role of convex optimization in nonconvex problems
Convex optimization also plays an important role in problems that are not convex.
4.1 Initializatoin for local optimization
One obvious use is to combine convex optimization with a local optimization method. Starting with a nonconvex problem, we first find an approximate, but convex, formulation of the problem. By solving this approximate problem, which can be done easily and without an initial guess, we obtain the the exact solution to the approximate convex problem. This point is then used as the starting poing for a local optimization method, applied to the original nonconvex problem.
4.2 Convex heuristics for nonconvex optimization
qquadConvex optimization is the basis for several heuristics for solving nonconvex problems.
qquadOne interesting example is the problem of finding a sparse vector xxx that satisfies some constraints. While this is a difficult combinatorial problem, there are some simple heuristics, based on convex optimization, that often find fairly sparse solutions.
qquadAnother broad example is given by randomized algorithms, in which an approximate solution to a nonconvex problem is found by drawing some number of candidates from a probability distribution, and taking the best one found as the approximate solution.
4.3 Bounds for global optimization
qquadMany methods for global optimization require a cheaply computable lower bound on the optimal value of the nonconvex problem… Two standard methods for doing this are based on convex optimization. In relaxation, each nonconvex constraint is replaced with a looser, but convex, constraint. In Lagrangian relaxation, the Lagrangian dual problem is solved. This problem is convex, and provides a lower bound on the optimal value of the nonconvex problem.