A Multistart Scatter Search Heuristic for Smooth NLP

and MINLP Problems

 

Zsolt Ugray1, Leon Lasdon2, John Plummer2, Fred Glover3, Jim Kelly4 and Rafael Marti5

 

1The A. Gary Anderson Graduate School of Management, University of California, Riverside, CA, 92521-0203.

2Management Science and Information Systems Department, McCombs School of Business, The University of Texas at Austin, Austin, TX, 78712, USA

3Graduate School of Business, University of  Colorado, Boulder, CO, 80309

4OptTek Systems, Inc, 1919 7th St, Boulder CO, 80302

5Departament D’estadistica i Investigacio Operativa, Universitat de Valencia, 46100 BURJASSOT, Valencia, Spain.

zsolt.ugray@ucr.edu,lasdon@mail.utexas.edu,jcplummer@mail.utexas.edu, fred.glover@colorado.edu,Kelly@OptTek.com,rafael.marti@uv.es

 

 


 

Submitted to INFORMS Journal on Computing, July 25, 2002.

 

The algorithm described here, called OptQuest/Multistart or OQMS, is a heuristic designed to find global optima for pure and mixed integer nonlinear problems with many constraints and variables, where all problem functions are differentiable with respect to the continuous variables. It uses OptQuest, a commercial implementation of scatter search developed by OptTek Systems, Inc., to provide starting points for any gradient-based local NLP solver. This solver seeks a local solution from a subset of these points, holding discrete variables fixed. The procedure is motivated by our desire to combine the superior accuracy and feasibility-seeking behavior of gradient-based local NLP solvers with the global optimization abilities of OptQuest. Computational results include 155 smooth NLP and MINLP problems due to Floudas et al., most with both linear and nonlinear constraints, coded in the GAMS modeling language. Some are quite large for global optimization, with over 100 variables and many constraints. Global solutions to almost all problems are found in a small number of local solver calls, often one or two.

(Global Optimization; Multistart Heuristic; Mixed Integer Nonlinear programming; Scatter Search; Gradient Methods)

 

1.   Introduction

 

This paper describes OQMS, a multistart heuristic algorithm designed to find global optima of smooth constrained nonlinear programs (NLPs) and mixed integer nonlinear programs (MINLPs).  It uses the OptQuest Callable Library (OCL) implementation of Scatter Search [Laguna and Marti, 2000] to generate trial points, which are candidate starting points for a local NLP solver.  These are filtered to provide a smaller subset from which the solver attempts to find a local optimum. Our GAMS implementation can use any GAMS NLP solver, and the stand-alone version uses the generalized reduced gradient NLP solver LSGRG2 [Smith and Lasdon, 1993].

  The most general problem this algorithm can solve has the form

                                                                   minimize f(x,y)                                                 (1)

subject to the nonlinear constraints

                                                                   gl ≤ G(x, y) ≤ gu                                                (2)

the linear constraints

                                                                   l ≤ A1x + A2y ≤ u                                                (3)

                                                                   x ∈ S,  y ∈ Y                                                    (4)

where x is an n-dimensional vector of continuous decision variables, y is a p-dimensional vector of discrete decision variables, and the vectors gl, gu, l, and u contain upper and lower bounds for the nonlinear and linear constraints respectively. The matrices A1 and A2 are m2 by n and m2 by p respectively, and contain the coefficients of any linear constraints. The set S is defined by simple bounds on x, and we assume that it is closed and bounded, i.e., that each component of x has a finite upper and lower bound. This is required by the OptQuest scatter search procedure. The set Y is assumed to be finite, and is often the set of all p-dimensional binary or integer vectors y. The objective function f and the m1-dimensional vector of constraint functions G are assumed to have continuous first partial derivatives at all points in S × Y. This is necessary so that a gradient-based local NLP solver can be applied to the relaxed NLP sub-problems formed from (1)-(4) by allowing the y variables to be continuous.
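To make the problem form concrete, the feasibility conditions (2)-(4) can be sketched in code. This is an illustration only; the function and variable names below (feasible, G, A1, A2, xlb, xub) are our assumptions, not part of OptQuest or any solver interface.

```python
def feasible(x, y, G, gl, gu, A1, A2, l, u, xlb, xub, Y, tol=1e-8):
    """Sketch: does (x, y) satisfy constraints (2)-(4)?  All names illustrative."""
    g = G(x, y)                                   # nonlinear constraint values G(x, y)
    lin = [sum(a * xi for a, xi in zip(row1, x)) +
           sum(a * yi for a, yi in zip(row2, y))
           for row1, row2 in zip(A1, A2)]         # A1 x + A2 y
    return (all(lo - tol <= gi <= hi + tol for gi, lo, hi in zip(g, gl, gu))    # (2)
            and all(lo - tol <= v <= hi + tol for v, lo, hi in zip(lin, l, u))  # (3)
            and all(lo <= xi <= hi for xi, lo, hi in zip(x, xlb, xub))          # x in S
            and tuple(y) in Y)                                                  # y in Y
```

Relaxing the problem for a local NLP solver amounts to dropping the final membership test `tuple(y) in Y` and treating y as bounded continuous variables.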

 

2.   Multi-start algorithms for global optimization

In this section, which reviews past work on multi-start algorithms, we focus on unconstrained problems where there are no discrete variables, since to the best of our knowledge multi-start algorithms have been investigated theoretically only in this context. These problems have the form of (1)-(4) with no y variables and no constraints except the bounds x ∈ S in (4).

All global minima of f are assumed to occur in the interior of S. By multi-start we mean any algorithm that attempts to find a global solution by starting a local NLP solver, denoted by L, from multiple starting points in S. The most basic multi-start method generates uniformly distributed points in S, and starts L from each of these. This converges to a global solution with probability one as the number of points approaches infinity--in fact, the best of the starting points converges as well. However, this procedure is very inefficient because the same local solution is located many times. A convergent procedure that largely overcomes this difficulty is called multi-level single linkage (MLSL) [Rinnooy Kan and Timmer, 1987]. MLSL uses a simple rule to exclude some potential starting points. A uniformly distributed sample of N points in S is generated, and the objective, f, is evaluated at each point. The points are sorted according to their f values, and the qN best points are retained, where q is an algorithm parameter between 0 and 1. L is started from each point of this reduced sample, except if there is another sample point within a certain critical distance that has a lower f value. L is also not started from sample points that are too near the boundary of S, or too close to a previously discovered local minimum. Then, N additional uniformly distributed points are generated, and the procedure is applied to the union of these points and those retained from previous iterations. The critical distance referred to above decreases each time a new set of sample points is added. The authors show that, if the sampling continues indefinitely, each local minimum of f will be located, but the total number of local searches is finite with probability one. They also develop Bayesian stopping rules, which incorporate assumptions about the costs and potential benefits of further function evaluations, to determine when to stop the procedure.
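The MLSL starting-point rule can be sketched as follows. This is a simplified illustration of one pass of the rule; the boundary and known-minima tests and the shrinking critical distance are omitted, and all function names are ours.

```python
import math

def mlsl_start_indices(sample, f, q, r_crit):
    """Sketch of one pass of the MLSL rule: keep the qN best sample points,
    then start L only at kept points with no better kept point within the
    critical distance r_crit."""
    vals = [f(x) for x in sample]
    # retain the qN best points (at least one)
    kept = sorted(range(len(sample)), key=lambda i: vals[i])[: max(1, int(q * len(sample)))]
    starts = []
    for i in kept:
        better_near = any(
            vals[j] < vals[i] and math.dist(sample[i], sample[j]) < r_crit
            for j in kept if j != i)
        if not better_near:
            starts.append(i)
    return starts
```

In the full method this pass is repeated on a growing sample with a decreasing r_crit, which is why earlier rejection decisions can later be reversed.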

When the critical distance decreases, a point from which L was previously not started may become a starting point in the next cycle. Hence all sample points generated must be saved. This also makes the choice of the sample size, N, important, since too small a sample leads to many revised decisions, while too large a sample will cause L to be started many times. Random Linkage (RL) multi-start algorithms introduced by [Locatelli and Schoen, 1999] retain the good convergence properties of MLSL, and do not require that past starting decisions be revised. Uniformly distributed points are generated one at a time, and L is started from each point with a probability given by a nondecreasing function φ(d), where d is the distance from the current sample point to the closest of the previous sample points with a better function value. Assumptions on this function that give RL methods the same theoretical properties as MLSL are derived in the above reference.

Recently, Fylstra et al. have implemented a version of MLSL that can solve constrained problems [Frontline Systems, Inc., 2000].  See also www.frontsys.com.  Limited to problems with no discrete variables y, it uses the L1 exact penalty function P1, defined as

                                           P1(x, w) = f(x) + Σi wi viol(gi(x))                                  (5)

where the wi are nonnegative penalty weights and the vector g has been extended to include the linear constraints (3).  The function viol(gi(x)) is equal to the absolute amount by which the ith constraint is violated at the point x.  It is well known (see [Nash and Sofer, 1996]) that if x* is a local optimum of (1)-(4), u* is a corresponding optimal multiplier vector, the second order sufficiency conditions are satisfied at x*, and

                                                     wi > |ui*|,  i = 1, ..., m                                                   (6)

then x* is a local unconstrained minimum of P1.  If (1)-(4) has several local minima, and each wi is larger than the maximum of all absolute multipliers for constraint i over all these optima, then P1 has a local minimum at each of these local constrained minima.  Even though P1 is not a differentiable function of x, MLSL can be applied to it, and when a randomly generated trial point satisfies the MLSL criterion to be a starting point, any local solver for the smooth NLP problem can be started from that point.  The local solver need not make any reference to the exact penalty function P1, whose only role is to provide function values to MLSL.  We will use P1 in the same way in our OQMS algorithm.  We are not aware of any theoretical investigations of this extended MLSL procedure, so it must currently be regarded as a heuristic.
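As an illustration, the exact penalty function (5) might be computed as below. The helper name p1 and the representation of each constraint as a range constraint with lower and upper bounds are our assumptions.

```python
def p1(x, f, g, gl, gu, w):
    """Sketch of the L1 exact penalty function (5): the objective plus
    nonnegative weights w_i times the absolute violation of each constraint
    (linear constraints assumed folded into g, as in the text)."""
    total = f(x)
    for gi, lo, hi, wi in zip(g(x), gl, gu, w):
        viol = max(lo - gi, 0.0) + max(gi - hi, 0.0)  # amount constraint i is violated
        total += wi * viol
    return total
```

At a feasible point every violation term is zero, so p1 reduces to the objective; condition (6) guarantees that constrained local minima are also local minima of p1.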

 

3.   The OQMS Algorithm

3.1  The Global Phase - Scatter Search

Scatter Search (SS) is a population-based meta-heuristic algorithm devised to intelligently perform a search on the problem domain [Glover, 1998].  It operates on a set of solutions called the reference set or population.  Elements of the population are maintained and updated from iteration to iteration.  Scatter Search differs from other population-based evolutionary heuristics like Genetic Algorithms (GAs) mainly in its emphasis on generating new elements of the population mostly by deterministic combinations of previous members of the population, as opposed to the more extensive use of randomization.  SS is founded on strategies that were proposed as augmentations to GAs only more than a decade after those strategies debuted in Scatter Search.  It embodies principles and strategies that are still not emulated by other evolutionary methods and prove to be advantageous for solving a variety of complex optimization problems.

   A summary of the OptQuest [Laguna and Marti, 2001] and [Laguna and Marti, 2000] implementation of SS follows.  The problem to be solved has the form (1)-(4) but, to simplify the explanation, we assume there are no y variables.

3.1.1   Steps of Scatter Search

1. Initialize: size of reference set = b, initial point = x0, input upper and lower bounds on variables and constraint functions, and the coefficients of any linear constraints.  Create an initial set of three points, R0: all variables equal to their lower bounds, all variables equal to their upper bounds, and all variables equal to the mid-point between their bounds.  If an initial point has been provided, add it to R0.

2. Given R0, use a diversification generation method to augment it with additional points, creating an initial diverse reference set, R, of cardinality b.  Optionally, map the elements of R into points that satisfy the linear constraints.

3. Evaluate the objective f and the nonlinear constraint functions G at each point in R, and evaluate a penalty function POQ, equal to the objective plus a penalty weight times the maximum percentage violation of the nonlinear constraints. POQ is used as the quality measure of a population point. 

While (stopping criteria are not satisfied)

While (some distinct pair of points in R has not been processed)

4. Select a new pair of points in R

5. Use a solution combination method to produce a small number of trial solutions from this pair of points.  Optionally, map each trial point into the closest point that satisfies the linear constraints and variable bounds.

6. At each (mapped) trial solution, evaluate the objective f and nonlinear constraint functions G, and form the penalty function, POQ .

Endwhile

7. Update the reference set.

8. If the reference set has changed, return to step 4.  Otherwise, restart the procedure by selecting a subset (typically the best half) of the best points in the reference set to be retained as the set R0, and return to step 2.

Endwhile

3.1.2   Description of the Scatter Search Steps

Step 1 generates the starting points to create the initial reference set R0.  The 3 points always appearing in this set are the vectors x whose elements are all set to the upper bounds, all set to the lower bounds, and all set to the midpoints of the bounds.  If an initial point is provided for the problem, it is also added to R0 as a fourth point.

Step 2 generates the remaining points of the initial reference set R.  The diversification generation method begins by generating nr > b random points in S, using a stratified sampling procedure described in [Laguna and Marti, 2000].  It then creates the reference set, R, by adding to R0 the random point farthest from its nearest neighbor in R0, and repeating this process until the set has cardinality b.  If the problem has linear constraints and the points selected are infeasible for these linear constraints, they are first projected onto the convex polyhedron defined by the linear constraints and then added to R0.  This is done by finding the point in this polyhedron that is closest (using the L1 norm) to the infeasible point by solving a linear program.  The result of this step is a diverse reference set that satisfies the linear constraints of the problem.
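The farthest-from-nearest-neighbor selection in this step can be sketched as follows. The stratified sampling that produces the candidate pool and the projection onto the linear constraints are omitted, and the function name diversify is ours.

```python
import math

def diversify(r0, candidates, b):
    """Sketch of the diversification generation step: grow the initial set
    R0 to cardinality b by repeatedly adding the candidate point farthest
    from its nearest neighbor in the current set."""
    ref = list(r0)
    pool = list(candidates)
    while len(ref) < b and pool:
        # candidate whose distance to its nearest neighbor in ref is largest
        best = max(pool, key=lambda c: min(math.dist(c, r) for r in ref))
        pool.remove(best)
        ref.append(best)
    return ref
```

The max-min criterion is what makes the resulting reference set diverse: each added point is, by construction, as far as possible from everything already chosen.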

  The initial population resulting from this procedure for a reference set of size b = 10 is shown in Figure 3.1, which uses data from a 2 variable unconstrained problem due to [Dixon and Szego, 1975] called the six-hump camelback function.  The objective to be minimized is

                              f(x) = 4x1^2 - 2.1x1^4 + (1/3)x1^6 + x1x2 - 4x2^2 + 4x2^4                              (7)

This is the problem EX8_2_5 from a large set of problems described in [Floudas et al., 1999].  Problems from this set are used as test problems for OQMS, and will be discussed in detail later.  The problem has upper bounds of 10.0 and lower bounds of –10.0 on both variables, and has 6 local minima, all lying well within these bounds (see Figure 3.2 for their location), plus a stationary point at the origin that is neither a local minimum nor a maximum.  The initial set R0 is the three points (0,0), (10,10), (-10,-10), where (0,0) is user-supplied and the other two are the vectors of upper and lower bounds respectively.

Figure 3.1:  Initial Population for the Six-Hump Camelback Function

Figure 3.2:  Locally Optimal Solutions for the Six-Hump Camelback Function

 

Step 3 ranks the points in the reference set based on their quality, measured by a penalty function POQ which is equal to the objective plus a penalty weight times the maximum percentage violation of the violated nonlinear constraints.  The penalty function POQ is not the same as the exact penalty function P1 described in (5), and is not exact.  It is used because Lagrange multiplier information is assumed not to be available.  Multipliers may not even exist if the problem is non-smooth or has discrete variables, and OptQuest is designed to solve such problems as well.

  Steps 4 and 5 create new trial solutions by selecting 2 “parent” points from the reference set and performing the solution combination method on them.  (If the option to always satisfy linear constraints is selected, the trial solutions are projected onto the linear constraints.) To illustrate how the combination method currently implemented in OptQuest works, Figure 3.3 demonstrates the generation of new trial points from the 3 best points of the initial population for the six-hump camelback function.

Figure 3.3:  Trial Points for Six-Hump Camelback Function

 

  The three points shown as diamonds and labeled x1, x2, x3 (x1 is the origin) have the lowest objective values in the initial population.  The two lines in the figure are determined by the pairs of points (x1, x2) and (x1, x3).  Focusing on the line (x1, x2), let

d = (x2 - x1)/2

v1 = x1 - d

v2 = x1

v3 = x1 + d

v4 = x2

v5 = x2 + d.

Thus v3 is at the midpoint of this line, and v1 and v5 extend it beyond x1 and x2.  These points are shown as black or white squares in the figure.  The points v1 and v2 can be used to define a hyper-rectangle whose vertices are the set

                                        {x : xi = v1i or xi = v2i, i = 1, ..., n}.

Thus the four pairs of points (vi, v(i+1)) for i = 1, ..., 4 define 4 rectangles, three of which are shown on the line determined by x1 and x2.  OptQuest generates one randomly distributed trial point in each rectangle, and these points are shown as triangles.  Three more trial points are generated in the same way starting with the points (x1, x3).  These points lie “close” to the lines, but are not on them.
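A minimal sketch of this combination scheme for one parent pair follows. It is our reading of Figure 3.3, not OptQuest's actual implementation, and the function name combine is ours.

```python
import random

def combine(x1, x2, rng=random):
    """Sketch of the solution combination step: from parents x1 and x2,
    form v1..v5 along their line, then sample one uniformly distributed
    trial point in each of the 4 hyper-rectangles defined by consecutive
    pairs (vi, v(i+1))."""
    d = [(b - a) / 2 for a, b in zip(x1, x2)]
    v = [
        [a - di for a, di in zip(x1, d)],  # v1 = x1 - d
        list(x1),                          # v2 = x1
        [a + di for a, di in zip(x1, d)],  # v3 = x1 + d
        list(x2),                          # v4 = x2
        [b + di for b, di in zip(x2, d)],  # v5 = x2 + d
    ]
    trials = []
    for lo, hi in zip(v[:-1], v[1:]):
        # one uniform point in the box with opposite corners lo and hi
        trials.append([min(a, b) + rng.random() * abs(b - a)
                       for a, b in zip(lo, hi)])
    return trials
```

Because the sampled points live in boxes whose diagonals lie along the parent line, they land near the line but, as the text notes, not on it.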

If there are discrete variables, the above process produces trial points whose values for these variables are not in the finite set of allowed values.  These components are rounded to an allowable value using generalized rounding processes, i.e., processes where the rounding of each successive variable depends on the outcomes of previous roundings.

  The full set of 144 trial points generated from the initial population of this example is shown in Figure 3.4. 

Figure 3.4:  Initial Population and 144 Trial Points for Six-Hump Camelback Problem

 

  The ten white diamond points are the members of the initial reference set, while the dark squares are the trial points, generated as described earlier.  These are well scattered over the region defined by the bounds.

In Step 6 the objective f and nonlinear constraints G are evaluated and the penalty function POQ is calculated.  OptQuest considers f and G to be black boxes, and it is the responsibility of the user to provide the evaluation and return the corresponding values.

In step 7, after all trial points have been evaluated, the reference set is updated by replacing the population that generated the trial points by the best b points of the union of the trial points and the initial reference set, where best is determined by the OptQuest penalty function POQ.  This is an aggressive update, emphasizing solution quality over diversity. This updated reference set, used to generate trial points from iteration 155 onward, is shown in Figure 3.5.  The ten population points cluster in the region about the origin where the six local optima are located, so the next set of 144 trial points will lie within a slight expansion of this region. These trial points thus have much better objective values than those generated by the initial population, as we illustrate later in section 6.
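The aggressive update itself is simple; a sketch follows, with POQ abstracted as any quality callable (smaller is better). The function name is ours.

```python
def update_reference_set(ref, trials, poq, b):
    """Sketch of the aggressive step-7 update: keep the best b points of
    the union of the current reference set and the trial points, ranked by
    the penalty function POQ."""
    return sorted(list(ref) + list(trials), key=poq)[:b]
```

Since only quality enters the ranking, repeated updates concentrate the reference set around the best-known region, which is exactly the clustering behavior visible in Figure 3.5.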

 

Figure 3.5:  Second Reference Set for Six-Hump Camelback Function

 

    When the diversity and quality of the reference set are considered equally important, a different updating method is suggested in [Laguna and Marti, 2001].  In this variation, the reference set is split into 2 halves.  The first half is created and maintained the same way as described earlier, focusing on the quality of the points.  The other half contains diverse points.  If a solution does not qualify to enter the first half of the reference set based on its quality, a test is performed to determine whether it fits the diversity criterion.  That is, if the new point’s minimum distance to the points in the second half of the reference set is larger than that of some point already in that half, the new point replaces it.  With this method the dynamic preservation of diversity is assured for the reference set.

If the updated reference set is unchanged, step 8 forces a return to Step 2, where a new diverse reference set is created.  When this occurs, a number (typically b/2) of the best points from the current population are retained and newly generated points replace the rest.

 

3.2  The Local Phase – Gradient-Based NLP Solvers

There are many papers and texts discussing gradient-based NLP solvers, e.g., [Nash and Sofer, 1996], [Nocedal and Wright, 1999], [Edgar, Himmelblau, and Lasdon, 2001].  These solve problems of the form (1)-(4), but with no discrete (y) variables.  They require a starting point as input, and use values and gradients of the problem functions to generate a sequence of points which, under fairly general smoothness and regularity conditions, converges to a local optimum.  The main classes of algorithms in widespread use today are Successive Quadratic Programming (SQP) and Generalized Reduced Gradient (GRG); see [Edgar, Himmelblau, and Lasdon, 2001, Chapter 8].  The algorithm implemented in the widely used MINOS solver [Murtagh and Saunders, 1982] is similar to SQP.  If there are nonlinear constraints, SQP and MINOS generate a sequence of points that usually violate the nonlinear constraints, with the violations decreasing to within a specified feasibility tolerance as the sequence converges to a local optimum.  GRG algorithms have a simplex-like phase 1-phase 2 structure.  Phase 1 begins with the given starting point and, if it is not feasible, attempts to find a feasible point by minimizing the sum of constraint violations.  If this effort terminates with some constraints violated, the problem is assumed to be infeasible.  However, this local optimum of the phase 1 objective may not be global, so a feasible point may exist.  If a feasible point is found, phase 2 uses it as its starting point, and proceeds to minimize the true objective.  Both phases consist of a sequence of linesearches, each of which produces a feasible point with an objective value no worse (and usually better) than its predecessor.

  Several good commercially available implementations of GRG and SQP solvers exist; see [Nash, 1998] for a review.  As with any numerical analysis software, a local NLP solver can fail to find a local solution from a specified starting point.  The problem may be too badly conditioned, badly scaled, or too large for the solver, causing it to terminate at a point (feasible or infeasible) which is not locally optimal. While the reliability of the best current NLP solvers is quite high, these difficulties occurred in our computational testing, and we discuss this in more detail later.

  Let L be a local NLP solver capable of solving (1)-(4), and assume that L converges to a local optimum for any starting point.  Let L(x0) be the locally optimal solution found by L starting from x0, and let xi*, i = 1,2,...,nloc, be all the local optima of the problem.  The basin of attraction of the ith local optimum relative to L, denoted by B(xi*), is the set of all starting points in S from which the sequence of points generated by L converges to xi*.  Formally:

                                                     B(xi*) = {x0 ∈ S : L(x0) = xi*}

   One measure of difficulty of a global optimization problem with a unique global solution x* is the volume of B(x*) divided by the volume of the rectangle S, i.e., the relative volume of B(x*).  The problem is trivial if this relative volume is 1, as it is for convex programs, and problem difficulty increases as this relative volume approaches zero.
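For small examples this relative volume can be estimated by Monte Carlo sampling; a sketch follows, with L abstracted as any callable mapping a starting point to the local optimum it converges to. All names are ours.

```python
import random

def relative_basin_volume(L, x_star, bounds, n_samples, tol=1e-6):
    """Monte Carlo sketch of the relative volume of B(x*): the fraction of
    uniformly sampled starting points in the rectangle S from which solver
    L converges to x_star."""
    hits = 0
    for _ in range(n_samples):
        x0 = [random.uniform(lo, hi) for lo, hi in bounds]  # uniform point in S
        xl = L(x0)
        if all(abs(a - b) < tol for a, b in zip(xl, x_star)):
            hits += 1
    return hits / n_samples
```

An estimate near 1 indicates an easy (effectively convex) problem; an estimate near 0 indicates that random restarts alone would rarely reach the global optimum.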

 

3.3  Comparing Search Methods and Gradient-Based NLP Solvers

For smooth problems, the relative advantages of a search method like Scatter Search over a gradient-based NLP solver are its ability to locate an approximation to a good local solution (often the global optimum), and the fact that it can handle discrete variables. Gradient-based NLP solvers converge to the “nearest” local solution, and have no facilities for discrete variables, unless they are imbedded in a rounding heuristic or branch-and-bound method. Relative disadvantages of search methods are their limited accuracy, and their weak abilities to deal with equality constraints (more generally, narrow feasible regions). They find it difficult to satisfy many nonlinear constraints to high accuracy, but this is a strength of gradient-based NLP solvers. Search methods also require an excessive number of iterations to find approximations to local or global optima accurate to more than 2 or 3 significant figures, while gradient-based solvers usually achieve 4 to 8-digit accuracy rapidly.

  The motivation for combining search and gradient-based solvers in a multi-start procedure is to achieve the advantages of both while avoiding the disadvantages of either.  Surprisingly, we have been unable to locate any published efforts in this direction, besides the Frontline extended MLSL method discussed in Section 2.

 

3.4      The OQMS Algorithm

A pseudo-code description of the simplest OQMS algorithm follows:

 

INITIALIZATION

Read_Problem_Parameters (n, p, bounds, starting point);

Setup_OptQuest_Parameters (size, iteration limits, population, accuracy, variables, bounds, constraints);

Initialize_OptQuest_Population;

 

STAGE 1: INITIAL OPTQUEST ITERATIONS AND FIRST L CALL

 

WHILE (unevaluated trial points from initial population remain)

DO {

            Get (trial solution from OptQuest);

            Evaluate (objective and nonlinear constraint values at trial solution);

            Put (trial solution, objective and constraint values to OptQuest database);

} ENDDO

Get_Best_Point_from_OptQuest_database (starting point);

Call_L (starting point, local solution);

threshold = P1 value of local solution;

 

STAGE 2: MAIN ITERATIVE LOOP

 

WHILE (stopping criteria not met)

DO {

            Get (trial solution from OptQuest);

            Evaluate (objective and nonlinear constraint values at trial solution);

            Put (trial solution, objective and constraint values to OptQuest database);

Calculate_Penalty_Function (trial solution);

            IF (distance and merit filter criteria are satisfied) THEN {

 

                        Call_L (trial solution, local solution);

                        Analyze_Solution (L Terminating Condition);

                        Update_Local_Solutions_Found;

                        Update_Largest_Lagrange_Multipliers_Found;

 

            }

ELSE IF (P1 > threshold for waitcycle consecutive iterations) {increase threshold}

} ENDDO

 

After initialization, there are two main stages.  In the “initial OptQuest iterations” stage, the objective and constraint values at all trial points generated by the initial OptQuest population (including the population points themselves) are evaluated, and these values are returned to OptQuest, which computes its penalty function, POQ, at each point.  The point with the best POQ value is selected, and L is started from this point.  If there are any discrete variables, y, they are fixed at their current values during the L solution process. Figure 3.4 shows a graph of these trial points for a two variable unconstrained problem.  In general, they are scattered within the rectangle defined by the bounds on the variables, so choosing the best corresponds to performing a coarse search over this rectangle.  If the best point falls inside the basin of attraction of the global optimum relative to L (as it often does), then if the subsequent L call is successful, it will find a global optimum. This call also determines optimal Lagrange multiplier values, u*, for the constraints.  These are used to determine initial values for the penalty weights, wi, satisfying (6), which are used in the exact penalty function, P1, defined in (5). All local optima found are stored in a linked list, along with the associated Lagrange multipliers and objective values.  Whenever a new local optimum is found, the penalty weights are updated so that (6) is satisfied over all known local optima.

The main iterative loop of stage 2 obtains trial points from OptQuest, and starts L from the subset of these points determined by two filters.  The distance filter helps insure that these starting points are diverse, in the sense that they are not too close to any previously found local solution.  Its goal is to prevent L from starting more than once within the basin of attraction of any local optimum, so it plays the same role as the rule in the MLSL algorithm of Section 2, which does not start at a point if it is within a critical distance of a better point. When a local solution is found, it is stored in a linked list, ordered by its objective value, as is the Euclidean distance between it and the starting point that led to it.  If a local solution is located more than once, the maximum of these distances, maxdist, is updated and stored.  For each trial point, t, if the distance between t and any local solution already found is less than distfactor*maxdist, L is not started from the point, and we obtain the next trial solution from OptQuest.

This distance filter implicitly assumes that the attraction basins are spherical, with radii at least maxdist. The default value of distfactor is 0.75, and it can be set to any positive value. As distfactor approaches zero, the filtering effect vanishes, as would be appropriate if there were many closely spaced local solutions.  As it increases, the filtering effect increases until eventually L is never started in stage 2. 
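A sketch of the distance filter under this spherical-basin assumption follows; the function name and the bookkeeping structure are ours.

```python
import math

def passes_distance_filter(t, local_solutions, distfactor=0.75):
    """Sketch of the distance filter: reject trial point t if it lies
    within distfactor * maxdist of any known local solution, where maxdist
    is the largest start-to-solution distance recorded for that solution.
    local_solutions: list of (solution point, maxdist) pairs."""
    return all(math.dist(t, sol) >= distfactor * maxdist
               for sol, maxdist in local_solutions)
```

With distfactor near zero every trial point passes; increasing it widens the rejection spheres around known optima, matching the behavior described in the text.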

The merit filter helps insure that the L starting points have high quality, by not starting from candidate points whose exact penalty function value P1 in (5) is greater than a threshold. This threshold is set initially to the P1 value of the best candidate point found in the first stage of the algorithm.  If trial points are rejected by this test for more than waitcycle consecutive iterations, the threshold is increased by the updating rule:

                                 threshold ← threshold + threshfactor*(1.0 + abs(threshold))

where the default value of threshfactor is 0.2 and that for waitcycle is 20.  The additive 1.0 term is included so that threshold increases by at least threshfactor when its current value is near zero. When a trial point is accepted by the merit filter, threshold is decreased by setting it to the P1 value of that point.
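The two threshold updates (relaxation after waitcycle consecutive rejections, tightening on acceptance) can be sketched as follows; the state dictionary is our bookkeeping device, not part of OQMS.

```python
def merit_filter(p1_value, state, waitcycle=20, threshfactor=0.2):
    """Sketch of the merit filter.  Accept a trial point when its exact
    penalty value falls below the threshold (then tighten the threshold to
    that value); after waitcycle consecutive rejections, relax the threshold
    by the rule in the text.  state = {'threshold': ..., 'rejects': ...}."""
    if p1_value < state['threshold']:
        state['threshold'] = p1_value   # accepted: tighten
        state['rejects'] = 0
        return True
    state['rejects'] += 1
    if state['rejects'] >= waitcycle:   # relax after too many rejections
        state['threshold'] += threshfactor * (1.0 + abs(state['threshold']))
        state['rejects'] = 0
    return False
```

This give-and-take produces the sawtooth threshold trace visible in Figure 3.7: the threshold creeps up during rejection runs and snaps down whenever a point is accepted.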

The combined effect of these 2 filters is that L is started at only a few percent of the OptQuest trial points, yet global optimal solutions are found for a very high percentage of the test problems.  Some insight is gained by examining Figure 3.6, which shows the stationary point at the origin and the 6 local minima of the 2 variable six-hump camelback function defined in (7) as dark squares, labeled with their objective value.  The ten points from which OQMS starts the local solver are shown as nine white diamonds, plus the origin. The local minima occur in pairs with equal objective value, located symmetrically about the origin.  There were 144 trial points generated in the “initial OptQuest iterations” stage, and these, plus the 10 points in the initial population, are shown in Figure 3.4.  The best of these 154 points is the population point (0,0), so this becomes the first starting point for the local solver.  This happens to be a stationary point of f, so it satisfies the optimality test (that the norm of the gradient of the objective be less than the optimality tolerance), and the local solver terminates there. The next local solver start is at iteration 201, and this locates the global optimum at (.0898, -.7127), which is located two times. The other global optimum at (-.0898, .7127) is found first at iteration 268, and is located 6 times.

Figure 6:  Local Optima and 10 L Starting Points for Six-Hump Camelback Function



Figure 7:  Objective and Threshold Values for Six-Hump Camelback Function for Iterations 155 to 407

 

The limit on total OQMS iterations in this run was 1000. L was started at only 9 of the 846 OptQuest trial points generated in the main iterative loop of stage 2. All but 2 of the starting points are in the basin of attraction of one of the two global optima.  This is mainly due to the merit filter. In particular, the threshold values are always less than 1.6071, so no starts are ever made in the basin of attraction of the two local optima with this objective value.  The merit filter alone rejected 498 points, the distance filter alone 57, and both rejected 281.

Figure 7 illustrates the dynamics of the merit filtering process for iterations 155 to 407 of this problem, displaying the objective values of the trial points as white diamonds and the threshold values as dark lines.  All objective values greater than 2.0 are plotted as 2.0.

The initial threshold value is zero, and it is raised twice to a level of 0.44 at iteration 201, where the trial point objective value of  -0.29 falls below it. L is then started and locates the global optimum at (.0898, -.7127), and the threshold is reset to –0.29.  This cycle then repeats. Nine of the ten L starts are made in the 252 iterations shown in the graph. In this span, there are 12 points where the merit filter allows a start and the threshold is decreased, but L is not started at three of these because the distance filter rejects them.

Figure 8 shows the same information for iterations 408 to 1000.  There is only one L start in this span.  This is not due to a lack of high-quality trial points: there are more good points than before, many with values near or equal to –1.0310 (the global minimum is –1.0316), and the merit threshold is usually –1.0310 as well. Each time this threshold is raised, the merit filter accepts one of the next trial points, but 51 of the 52 accepted points are too near one of the two global optima, and they are rejected by the distance filter.
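The distance filter test at work here can be sketched as follows. This is an illustration only: the spherical basin model and the parameter name distfactor follow the paper, but the function and the basin radii used in the example are invented:

```python
import math

def distance_filter_accepts(x_trial, local_solutions, distfactor=0.75):
    """Spherical basin-of-attraction model: reject a trial point that lies
    within distfactor * maxdist of any previously found local solution,
    where maxdist is the largest distance the local solver traveled to
    reach that solution.  distfactor = 0.75 is the base-case value."""
    for x_local, maxdist in local_solutions:
        if math.dist(x_trial, x_local) < distfactor * maxdist:
            return False
    return True

# Once the two global optima are known (illustrative basin radii of 1.0),
# trial points near either of them are rejected:
sols = [((0.0898, -0.7127), 1.0), ((-0.0898, 0.7127), 1.0)]
print(distance_filter_accepts((0.1, -0.7), sols))   # False
```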

 

This simple example illustrates a number of important points:

1.  Setting the bounds on the continuous or discrete variables to be too large in magnitude is likely to slow the OQMS algorithm (or any search algorithm) and may lead to a poorer final solution.  In the above example, if the variable bounds had been [-2,2] rather than [-10,10], the trial points generated by the initial population would have had much lower objective values.  OptQuest can overcome this when the initial population is updated.

2.  L found a highly accurate approximation to the global solution of this unconstrained problem at its second call.  OptQuest alone would have taken many more iterations to achieve this accuracy.

3.  The best trial point generated by the initial population may not have as good an objective value as those generated from the second or succeeding ones, especially if the variable bounds are too large.  Using the best “first generation” point as the initial L starting point may not lead to as good a local solution as if some “second generation” points had been considered.  For this reason our base case computational results use a first stage of 200 OptQuest trial points, which in this example would include all 144 first generation points and 56 from the second generation.

Figure 8:  Objective and Threshold Values for Six-Hump Camelback Function for Iterations 408 to 1000

 

 

3.5      Filtering Logic for Problems with Discrete Variables

 

The filtering logic described above must be extended when there are discrete variables (the y variables in the problem statement (1)-(4)).  There are two distinct modes: (1) OptQuest is aware of both the x and y variables and of all problem constraints, and (2) OptQuest is aware only of the y variables and of the constraints involving y alone.  Tests thus far do not show conclusively that either mode is preferred, so both modes are described and comparisons are presented later.

In mode one, when a trial point (xt, yt) provided by OptQuest passes the two filtering tests, it is passed to the local solver L; xt acts as a starting point and is changed by L, but the yt values are held fixed.  Each new set of yt values defines a different NLP for L to solve, say NLP(yt), with its own set of local minima in x space, so both filters must be made specific to NLP(yt). For the distance filter, it is irrelevant whether xt is close to any previously found local minima (in x space) that correspond to problems NLP(y) with y different from yt.  Hence the distance filter is based on the distance from xt to local minima of NLP(yt) only. Similarly, the tests and threshold values in the merit filter must be specific to the problem NLP(yt) currently being solved. However, the weights w in the exact penalty function used in the merit filter are based on the maximum absolute multipliers over all local optima for all vectors yt, because these weights are large enough to ensure that the penalty function is exact for all problems NLP(y).

Therefore, in stage 2 of the algorithm, the exact penalty function is evaluated at each trial point (xt, yt), and L is started at (xt, yt) if this value is smaller than the current threshold for NLP(yt). This threshold is initialized to plus infinity, so if the values yt have not occurred in a previous stage 2 trial point, L will be called at this point.  This leads to many more local solver calls in problems with discrete variables, as we show later in the computational results sections.
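The per-y bookkeeping this implies can be sketched as follows. The data structures are illustrative (the class and its fields are hypothetical, not the OQMS implementation); the point is that both filters are keyed by the discrete vector y:

```python
import math

class DiscreteFilters:
    """Mode-one filtering sketch: per-y merit thresholds and local minima.

    Each distinct discrete vector y defines its own subproblem NLP(y), so
    the merit threshold and the distance filter's list of local minima are
    stored separately per y.  A y never seen before has threshold +infinity,
    so the local solver is always called the first time that y appears.
    """

    def __init__(self, distfactor=0.75):
        self.threshold = {}   # tuple(y) -> merit threshold for NLP(y)
        self.minima = {}      # tuple(y) -> [(x_local, maxdist), ...]
        self.distfactor = distfactor

    def passes(self, x, y, penalty_value):
        key = tuple(y)
        if penalty_value >= self.threshold.get(key, math.inf):
            return False                        # merit filter for NLP(y)
        for x_local, maxdist in self.minima.get(key, []):
            if math.dist(x, x_local) < self.distfactor * maxdist:
                return False                    # distance filter for NLP(y) only
        return True
```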

In mode two, OptQuest presents candidate y vectors only to L, and these are held fixed while L finds corresponding (locally) optimal x values.  The starting values for x can be chosen to minimize computational effort.  We are experimenting with an option that obtains all trial points for the current population at once, sorts them by their distance from each other, and calls L in that sorted order, starting each call of L from the previous optimum.  The expectation is that y's which are close to one another will have optimal x's with the same property, so the previous optimum will be a good starting point.
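One simple way to realize such an ordering is a greedy nearest-neighbor pass over the y vectors. This is a sketch under our own assumptions (the function is hypothetical, and `solve_nlp` in the comment stands for a call to L with y fixed), not the option as implemented:

```python
import math

def nearest_neighbor_order(y_points):
    """Greedy ordering: start from the first y vector, then repeatedly
    visit the unvisited vector nearest to the current one, so consecutive
    NLP(y) subproblems are as similar as possible and warm starts from the
    previous optimum are likely to pay off."""
    remaining = list(y_points)
    order = [remaining.pop(0)]
    while remaining:
        last = order[-1]
        nxt = min(remaining, key=lambda y: math.dist(y, last))
        remaining.remove(nxt)
        order.append(nxt)
    return order

# Each call of L would then start from the previous call's optimum:
#   x_start = x_default
#   for y in nearest_neighbor_order(trial_ys):
#       x_start, fval = solve_nlp(y, x_start)   # hypothetical local solve
```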

In mode two, there is no stage 1, and L must be called at each y vector that has not been produced previously. As a result, the local solver call where the best value is found typically comes later than in mode one.  On the other hand, OptQuest's effort is reduced, since it processes a much smaller problem, and the information returned to it by the local solver (the optimal objective value over the continuous variables) is of much higher quality than in the base case (the penalized objective value at OptQuest's trial point).

An important option involves the return of information from the local solver to OptQuest, which is absent in the above procedure; i.e., local solutions found by the local solver are not returned to OptQuest.  Such solutions are generally of very high quality and might aid the search process if they were incorporated into the OptQuest population, because at least a subset would likely be retained there.  However, this should be done so as to preserve the diversity of the population.

 

We discuss this option further in Section 9.

 

4.   Computational Results

The algorithm described in the previous section has been implemented as a callable C-language function.  In this form, the user supplies a C function that evaluates the objective and constraint functions, an optional routine that evaluates their first partial derivatives (finite difference approximations are used otherwise), and a calling program that supplies problem size, bounds, and an initial point, and invokes the algorithm.  All algorithm parameters and options are set initially to default values, and any changes are specified in an options settings file.  Any parameter not referred to in the settings file takes its default value.  The local NLP solver is the LSGRG2 implementation of [Smith and Lasdon, 1993].  We have also developed an interface between our C implementation and the GAMS algebraic modeling language (see www.gams.com), using C library routines provided by GAMS Development Company.  The user function routine is replaced by one that calls the GAMS interpreter, and a special derivative routine accesses and evaluates expressions developed by GAMS for first derivatives of all nonlinear problem functions.  GAMS identifies all linear terms in each function and supplies their coefficients separately, thus identifying all linear constraints. This enables us to invoke the OptQuest option which maps each trial point into a point that satisfies the linear constraints.  The derivative information supplied by GAMS significantly enhances the performance of the local solver, since only non-constant derivatives are re-evaluated, and these are always available to full machine precision.  As mentioned earlier, this GAMS version can call any GAMS NLP solver.

For our computational experiments we used the large set of global optimization test problems coded in GAMS from [Floudas et al., 1999].  Table 1 shows the characteristics of 142 individual and 2 groups of problems.

 

 

 

Table 1:  Floudas Test Problem Set Characteristics

    Most problems arise from chemical engineering, but some are from other sources.  Most are small, but a few have over 100 variables and comparable numbers of constraints, and 13 have both continuous and discrete variables.  Almost all of the problems without discrete variables have local solutions distinct from the global solution, and the majority of problems have constraints.  Sometimes all constraints are linear, as with the concave quadratic programs of series EX2_1_x, but many problems have nonlinear constraints, and these are often the source of the non-convexities.  The best-known objective value and (in most cases) the corresponding variable values are provided in [Floudas, et al., 1999].  The symbol N in the rows for the series EX8_6_1 and EX8_6_2 is the number of particles in a cluster whose equilibrium configuration is sought via potential energy minimization.  Each particle has 3 coordinates, so there are 3N variables.

 

4.1      Continuous Variables - The Base Case

  This section describes results obtained when OQMS is applied to 128 of the problems in the Floudas test set with no discrete variables.  A few problems for which no GAMS NLP solver can find a feasible solution in 800 solver calls are omitted. Computations were performed on a DELL OptiPlex PC with a 1.2 GHz Pentium IV processor and 261 Mbytes of RAM, running under Windows 2000.

  The options and main algorithm parameters used are shown in Table 2 (see Section 3.4 for definitions). The filter parameter values (waitcycle, threshfactor, distfactor) correspond to fairly tight filters, and these must be loosened to solve some problems. The OptQuest "use linear constraints" option, which projects trial points onto the linear constraints, is not used because it is very time consuming for larger problems.  SNOPT, an SQP implementation, was used for the largest problems because many calls to the GRG solvers CONOPT and LSGRG2 terminate infeasible on these problems.  The 8_3_x problems include many "pooling" constraints, which have bilinear terms.  In almost all these terminations, the GRG solvers find a local minimum of the phase 1 objective.  SNOPT has no phase 1 and never terminates infeasible.

 

Table 2:  Base Case OQMS and OptQuest Parameters and Options Used

 

Total iterations = 1000

Stage 1 iterations = 200

Waitcycle = 20

Threshfactor = 0.2

Distfactor = 0.75

Use linear constraints = no

OptQuest search type = boundary

Boundary search parameter = 0.5

NLP solver = LSGRG2, except SNOPT for the 110-to-141-variable range

 

 

 


In the base case, no initial local solver call is performed at the start of stage 1.

  Table 3 shows outcomes and average effort statistics for 128 of the Floudas problems with continuous variables only, sorted into six groups by number of variables. Geometric rather than arithmetic means are used to reduce the effects of outliers: function calls, iterations, and times for the larger problem sets typically include a few problems with values much larger than all others. Computational effort is measured by OptQuest iterations, solver calls, function calls (each function call evaluates the objective and all constraint functions), and computation time.  The three "to best" columns show the effort required to find the best OQMS objective value.  Function calls are not available for the largest problems because the SNOPT interface does not yet make them available.

 

Table 3:  Results and Effort Statistics for 128 Continuous Variable Floudas Problems

Variable     Problems  Variables  Constraints  Iterations  Solver calls  Solver  Locals  Fcn calls  Fcn       Time     Total  Failed  First   Second
range                                          to best     to best       calls   found   to best    calls     to best  time           L call  L call
1 to 4       32        2.5        1.7          206.3       1.1           7.5     1.9     263.9      2158.0    0.1      0.5    1       27      3
4 to 7       31        5.5        5.7          214.4       1.3           5.8     1.9     381.5      4766.7    0.2      0.6    0       22      6
8 to 12      21        9.4        8.1          238.2       1.5           13.2    3.0     575.6      19698.0   0.1      0.8    3       10      4
13 to 20     18        15.9       11.7         303.5       2.4           7.4     2.6     968.2      5211.9    0.3      0.7    4       8       1
22 to 78     13        37.9       27.6         259.6       2.1           14.1    3.1     1562.4     23077.9   0.6      2.5    1       5       3
110 to 141   13        116.4      80.1         305.0       2.7           23.7    22.7    NA         NA        6.6      64.1   0       7       2
Total/avg    128                               251.3       1.8           10.5    3.5     537.9      7190.9    0.4      1.7    9       80      19

  Since all problems have known solutions, we define “failures” as problems with a solution gap of more than 1%.  This gap is the percentage difference between the best feasible OQMS objective value, foqms, and the best-known feasible objective value, fbest, defined for minimization problems as: 

                              gap = 100 * (foqms - fbest) / (1 + |fbest|)

and the negative of the above for maximization, so positive gaps indicate that the best known solution was not reached.  Nine of the 128 problems failed to achieve gaps smaller than 1%, with gaps ranging from 2.2% to 80%.  All these are solved with more iterations or by loosening the filters. Percentage gaps for almost all 119 “solved” problems are less than 1.e-4, and the largest gap among solved problems is 0.37%.
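This gap measure is straightforward to compute. The sketch below uses the normalized form 100*(foqms - fbest)/(1 + |fbest|), which is consistent with the surrounding text, though the paper's exact formula image was lost in this copy; the function name is ours:

```python
def percent_gap(f_oqms, f_best, maximize=False):
    """Percentage gap between the best feasible OQMS objective value and
    the best-known feasible value.  Uses the normalized form
    100*(f_oqms - f_best)/(1 + |f_best|) for minimization, negated for
    maximization; positive gaps mean the best-known solution was missed.
    The 1 in the denominator guards against f_best near zero."""
    gap = 100.0 * (f_oqms - f_best) / (1.0 + abs(f_best))
    return -gap if maximize else gap

# E.g., the camelback run that stops at -1.0310 against the known -1.0316:
print(round(percent_gap(-1.0310, -1.0316), 3))   # a small positive gap
```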

   Computational effort needed to achieve these results is quite low and increases slowly with problem size, except for the geometric mean solution time for the largest problems.  The best OQMS value is also found very early: on the first solver call in 80 of the 119 solved problems, and on the second call in 19 more.  This shows that, for these test problems, stage 1 of OQMS is very effective in finding a point in the basin of attraction of the global optimum.  The ratio of the "to best" effort to total effort is also small.  For iterations, since there are always 200 stage 1 iterations, we subtract 200 before computing the ratio, giving 51.3/800 = 0.06.  The solver call ratio is 0.17 and the time ratio 0.23.  This implies that, for these problems, a criterion that stops OQMS when the fractional change in the best feasible objective value found thus far is below a small tolerance for some (reasonably large) number of successive iterations would rarely terminate the algorithm before the best solution was found.  The ratio of total solver calls to locals found, a measure of filter efficiency, varies from 3 to 5, and is nearly one for the largest problems.

  Table 5 shows results obtained in solving the nine "failed" problems with looser filters and an OptQuest boundary search parameter of 1.  Seven of these nine, the 2_1_x series, are quadratic programs (QPs) with concave objectives (to be minimized), so each has an optimal extreme point solution, and every extreme point is a local solution.  The base case and new parameter values are shown in Table 4 below.


 

Table 4:  Base Case and Loosened Filter Parameter Values

Parameter                  Base Case Value  Looser Value
Waitcycle                  20               10
Threshfactor               0.2              1.0
Distfactor                 0.75             0.1
Boundary search parameter  0.5              1.0

 

 

The looser merit filter increases its threshold every 10 iterations, replacing the old value by old value + 1.0*(1 + abs(old value)).  The looser distance filter rejects a trial solution if its distance from any previously found local solution is less than 0.1*maxdist, where maxdist is the largest distance traveled to reach that solution.  A boundary search parameter of 1 causes more OptQuest trial points to lie on the boundary of the rectangle defined by the variable bounds, which helps solve the seven concave QPs.

  Eight of the nine "unsolved" problems are solved with these new parameters; the other, EX14_1_8, achieves a gap of 1.15%.  It is solved by using 1000 stage 1 iterations and 5000 total, with all other parameters as in the base case.

 

Table 5:  Solving 9 "Failed" Problems with Looser Filters and Boundary Parameter = 1

Problem      Variables  Constraints  Iterations  Solver calls  Total solver  Base case     Locals  Fcn calls  Total      Time     Total  Gap
name                                 to best     to best       calls         solver calls  found   to best    fcn calls  to best  time
EX14_1_8     3          4            338         21            122           17            2       38141      1218680    0.49     2.25   1.15
EX9_2_5      8          7            344         2             6             45            3       690        2832       0.21     0.6    0
EX2_1_6      10         5            203         3             107           7             17      389        138749     0.2      1.38   0
EX2_1_9      10         1            201         1             111           66            39      252        342345     0.16     1.45   0
EX2_1_7_1    20         10           279         8             27            2             19      5051       60043      0.39     1.25   0
EX2_1_7_3    20         10           269         4             65            22            36      1321       286213     0.3      1.92   0
EX2_1_7_4    20         10           253         6             29            4             9       3676       83989      0.39     1.41   0
EX2_1_7_5    20         10           254         5             29            3             15      2620       71004      0.33     1.34   0
EX2_1_8      24         10           226         3             120           8             25      981        1071730    0.29     2.87   0
Means(geom)                          258.6       4.1           48.7          10.5          12.8    1760.7     137876.8   0.3      1.5

 


Comparing the "total solver calls" and "base case solver calls" columns shows that the new parameters represent a substantial loosening of both filters.  The looser filters result in many more solver calls in all but problem 9_2_5, and the geometric mean of solver calls is 48.7 with the loose filters versus 10.5 with the tighter ones.  The behavior of 9_2_5 is surprising (6 solver calls with loose filters versus 45 with tighter ones), but the run with looser filters finds the global minimum at iteration 344, and after that its merit thresholds and set of local solutions differ from those of the base case run.

  Table 6 below shows the geometric performance means and totals obtained from solving the 14 concave QP problems with base case parameters, with and without the OptQuest "use linear constraints" option, which maps each trial point into a nearest point feasible for the linear constraints.  Since these are linearly constrained problems, invoking this option guarantees that all trial points are feasible.

 

Table 6:  Solving Concave QP Problems With and Without "Use Linear Constraints"

Case    Iterations  Solver calls  Total solver  Locals  Fcn calls  Total      Time     Total  Failed
        to best     to best       calls         found   to best    fcn calls  to best  time
no use  284.8       2.3           6.6           3.7     643.8      3875.1     0.3      0.6    7
use     247.1       2.1           12.1          3.1     437.7      3827.6     6.9      19.0   2

 

 


Clearly this option helps: there are roughly twice as many solver calls on average when using it, and only 2 failures, versus 7 in the base case.  The gaps for the 2 unsolved problems (2_1_7_5 and 2_1_9) are between 1% and 3.5% in both cases.  However, this option increases run times here by about a factor of 30, so it is currently off by default.

4.2      The Lennard-Jones and Morse Energy Minimization Problems

The Floudas set of test problems includes two GAMS models that choose the locations of a cluster of N particles to minimize the potential energy of the cluster, using two different potential energy functions, called Lennard-Jones and Morse.  The decision variables are the (x,y,z) coordinates of each particle.  Particle 1 is located at the origin, and three position components of particles 2 and 3 are fixed, so each family of problems has 3N-6 variables.  These problems have many local minima, and their number increases rapidly with problem size, so they constitute a good test for global optimization algorithms.

Results of applying OQMS to 14 of these problems, using 200 stage 1 and 1000 total iterations, are shown in Tables 7 and 8.  Each problem set was solved with both LSGRG2 and CONOPT.  The results shown use CONOPT for the Lennard-Jones problems and LSGRG2 for the Morse problems, because these combinations gave slightly better results, illustrating the value of being able to call several solvers.  Because of the many distinct local minima, the number of local minima found is equal to the number of solver calls for the 3 largest Lennard-Jones problems and for all the Morse problems.

 

Table 7:  Solving 6 Lennard-Jones Problems Using CONOPT and Loose Filters


Problem      Variables  Constraints  Solver calls  Total solver  Locals  Time     Total   Gap,%
name                                 to best       calls         found   to best  time
EX8_6_1_5    9          10           1             152           39      1.09     21.83   0.00
EX8_6_1_10   24         45           21            130           114     18.56    68.34   0.00
EX8_6_1_15   39         105          6             104           100     13.63    165.09  0.00
EX8_6_1_20   54         190          67            118           118     257.62   396.21  1.12
EX8_6_1_25   69         300          42            94            94      325.82   730.68  1.84
EX8_6_1_30   84         435          16            59            59      134.35   434.56  0.88

 

 


The Lennard-Jones problems are the more difficult of the two families. The 3 largest problems have gaps of roughly 1% to 2%, using the looser filter parameters in Table 4.  The default filter parameters led to positive gaps for the last 3 problems totaling 7.8%, while this sum in Table 7 is 3.8%.  The objective approaches infinity as the distance between any 2 particles approaches zero, so its unconstrained minimization for N=20 leads to about 50,000 domain violations (either divide by zero or integer power overflow), and this number grows rapidly with N.  Hence we added constraints bounding this distance below by 0.1 for all distinct pairs of particles, and the number of these constraints is shown in the table.  None are active at the best solution found.
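The structure of these instances is easy to reproduce in a sketch. The reduced pair potential below is the common Lennard-Jones form; the exact scaling used by Floudas et al. may differ, and the counting functions simply restate the problem dimensions given above:

```python
from itertools import combinations

def lj_pair(r):
    """Reduced Lennard-Jones pair potential, minimum of -1 at r = 2**(1/6).
    It blows up as r -> 0, which is why the runs add the constraint
    r >= 0.1 for every distinct pair of particles."""
    return 4.0 * (r**-12 - r**-6)

def n_pair_constraints(n_particles):
    """One distance constraint per distinct pair: N*(N-1)/2."""
    return sum(1 for _ in combinations(range(n_particles), 2))

def n_variables(n_particles):
    """Particle 1 fixed at the origin and three position components of
    particles 2 and 3 fixed, leaving 3N - 6 free coordinates."""
    return 3 * n_particles - 6

print(n_variables(20), n_pair_constraints(20))   # 54 190, matching Table 7
```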

Table 8 shows the Morse potential results using the LSGRG2 solver and the default OQMS parameters shown in Table 2.  The objective here has no singularities, so there are no difficulties with domain violations, and the only constraints are variable bounds.  All problems are solved to very small gaps except the largest (144 variables), which has a gap of 0.125%.  The number of solver calls is much smaller than for the Lennard-Jones problems, because the filters are much tighter.  Each call leads to a different local optimum. The largest problem is solved to a gap less than 1.e-4% with 5000 total and 1000 stage 1 iterations and the same filter parameters.  This run terminated because the 3000 second time limit was exceeded, took 4083 iterations, and found 210 distinct local optima in 210 solver calls, compared to only 25 in the base case.


 

Table 8:  Solving 8 Morse Problems Using LSGRG2 and Default Parameters

Problem      Variables  Solver calls  Total solver  Locals  Time     Total   Gap
name                    to best       calls         found   to best  time
EX8_6_2_5    9          1             5             5       0.23     0.61    0.0000
EX8_6_2_10   24         1             15            15      0.57     4.44    0.0000
EX8_6_2_15   39         1             6             6       1.41     6.43    0.0000
EX8_6_2_20   54         2             43            43      4.2      51.2    0.0000
EX8_6_2_25   69         4             20            20      13.44    58.38   0.0000
EX8_6_2_30   84         17            43            43      68.56    160.19  0.0000
EX8_6_2_40   114        7             33            33      66.29    273.91  0.0000
EX8_6_2_50   144        20            25            25      337.2    403.96  0.1251

 

 

 


 

4.3      Problems with Discrete Variables

OQMS can exploit OptQuest's efficient handling of discrete variables and thus solve MINLPs.  The integer variables can appear nonlinearly.  There are 11 MINLP problems in the Floudas test set, with the total number of variables ranging from 3 to 29 and the number of binary variables ranging from 1 to 8.  Two of these, EX12_2_3 and EX12_2_4, had been reformulated so that all binaries appeared linearly, and we restored them to their original state, in which the binaries appear nonlinearly.  OQMS allows such representations, while the other GAMS MINLP solvers do not.  The final test set contains 13 problems.  These are far too small to yield meaningful inferences about the power of OQMS on problems of practical size, but they allow preliminary testing of the two MINLP modes described in Section 3.5.  The geometric means of some measures of computational outcomes and effort for both modes are shown in Table 9, using the LSGRG2 NLP solver.

The problems are sorted first by number of discrete variables, then by number of all variables. 

 

Table 9:  Solution Statistics for 13 Problems with Discrete Variables

Discrete       Options and  Iterations  L calls  Total    Fcn calls  Total      Time     Total  Failures
var mode       parameters   to best     to best  L calls  to best    fcn calls  to best  time
discrete only  default      10.2        10.2     20.1     2065.8     5794.1     0.8      1.5    0
all            default      261.7       4.0      32.1     1065.7     25022.2    0.3      0.8    7
all            (1000,5000)  1551.9      12       89.1     10196.8    224025.7   0.8      3.7    1
all            default,use  272.4       3.6      115.8    1178.0     88983.0    9.9      26.1   0

 

 

The first row of Table 9 is for mode 2, where OptQuest manipulates only the discrete variables.  Each NLP was "cold started" from the same initial point in these runs, so the number of function calls could be reduced substantially by warm starts.  All runs are terminated by OptQuest after the small number of possible binary variable combinations has been completely enumerated.  The optimal solution is found on average about midway through the solution process, but we expect that this will occur earlier as the number of discrete variables increases.  The OptQuest logic requires that at least one population of binary solutions be evaluated before any learning can occur, and the average number of solver calls to find the best solution here is about equal to the population size of 10.

The last 3 rows of Table 9 show results for mode 1, where OptQuest manipulates both binary and continuous variables.  In rows 2 and 3, we do not require trial points to satisfy the linear constraints, while in row 4 we do.  Without using linear constraints, the default numbers of stage 1 and total iterations, (200, 1000), are not enough to find the best solution for 7 of the 13 problems.  This is because many OptQuest trial points have the same values for the binary variables but different values for the continuous variables, so complete enumeration takes far longer than in mode 2.  However, 1000 stage 1 and 5000 total iterations solve all but one problem (its gap is 9.4%), and the ratio of solver calls to best to total solver calls, about 1/9, is favorably small.  Row 4 shows that, if trial points are required to satisfy the linear constraints, all problems are solved in 1000 total iterations.  This is because these problems have mostly linear constraints (the geometric mean of linear constraints is 9.1 and of total constraints is 9.9), so the projected trial points tend to contain an optimal set of binary variable values earlier, after only 3.6 solver calls on average.  However, solving the MILPs which map trial points into nearest points satisfying the linear constraints increases total solution time by about a factor of 30 (compare the times in rows 2 and 4).

 

5.   Summary and future research

The results of Section 4 show that OQMS is a promising approach for smooth nonconvex NLPs with continuous variables.  It solves all 142 of the test problems with no discrete variables with very reasonable solution effort.  While there is no guarantee of optimality and no "gap" is available, OQMS can be combined with algorithms that provide this information, e.g., LGO or BARON.  The lower bounds provided by these procedures can be used to estimate the gap for the OQMS solution, and the solution produced by any algorithm can be used as a warm start for any other.

Future research includes enhancing the filter logic.  It is difficult to choose a good default value for the merit filter parameter threshfactor in the rule threshold ← threshold + threshfactor*(1 + abs(threshold)).  This factor could be calculated dynamically: each time a penalty value is above the threshold, calculate the value of threshfactor that would cause the new threshold to equal that penalty value.  If this happens for waitcycle consecutive iterations, set threshfactor to the smallest of these values, so the new threshold would have just accepted the lowest of the penalty values.  For the distance filter, the radius of each attraction basin is currently set to the maximum distance traveled to reach that local solution.  The true basin may have a shape far from spherical, so a sphere of this radius may overestimate its size.  The true basins can have no points in common, so we can impose this condition on the spheres.  If the spherical model basins for any two local solutions xi and xj have radii ri and rj, these must satisfy

                                         ri + rj <= d(i,j)

where d(i,j) is the Euclidean distance between xi and xj.  If this inequality is violated, the radii ri and rj can be reduced by the same scale factor so that it holds as equality.  We plan to test this option soon.
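The proposed adjustment is simple to state in code. This is a sketch of the planned rule only (the function name is ours, and the option is untested in the paper):

```python
import math

def shrink_radii(x_i, r_i, x_j, r_j):
    """Enforce r_i + r_j <= d(i, j) for the spherical basin models of two
    local solutions: if the model spheres overlap, scale both radii by the
    same factor so the inequality holds as an equality."""
    d = math.dist(x_i, x_j)
    if r_i + r_j > d:
        scale = d / (r_i + r_j)
        r_i, r_j = r_i * scale, r_j * scale
    return r_i, r_j

# Overlapping spheres of radius 3 around points 4 apart shrink to radius 2:
print(shrink_radii([0.0, 0.0], 3.0, [4.0, 0.0], 3.0))
```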

Another important aspect is the communication of locally optimal solutions back to OptQuest, to improve its search process.  These solutions usually have substantially lower penalty values than typical OptQuest trial points, so they are likely to ultimately be included in OptQuest’s population.  However, their penalty values often become the merit filter thresholds, causing most other trial points to be rejected.  Also, the local optima and their nearby “children” will be rejected by the distance filter.  We have seen these effects in preliminary tests.

NLP algorithms can fail to find a feasible point even when the problem instance is feasible.  With GRG algorithms, this usually happens when phase 1 terminates at a local optimum of the phase 1 objective.  OQMS can be applied to such problems if they are reformulated by dropping the true objective, adding deviation variables to all constraints, and minimizing the sum of these deviation variables.  This approach could greatly improve the ability of existing NLP solvers to diagnose infeasibility.  More generally, OQMS can improve NLP solver reliability by starting the solver from as many points as desired, while ensuring that these points balance diversity and quality.

The performance of OQMS in solving MINLPs is less clear, because the 13 MINLP test problems used here are so small.  More extensive testing is needed, which should clarify the relative merits of the two MINLP “modes” discussed in Section 4.3.  If OptQuest manipulates only the discrete variables, then all trial points generated from the current population may be produced at once, and the solver calls at these points may be done in any order.  The points can be sorted by increasing distance from their nearest neighbor, and each NLP call can be started from the previous optimum.  The NLPs can also be solved in parallel.
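The nearest-neighbor ordering above can be sketched as follows (an O(n^2) illustration with a hypothetical function name; a spatial index such as a k-d tree would scale better for large point sets):

```python
import math

def sort_by_nearest_neighbor(points):
    """Return the trial points ordered by increasing distance to their
    nearest neighbor, so that nearby points are visited consecutively
    and each NLP call can warm-start from the previous optimum."""
    n = len(points)
    # distance from each point to its closest other point
    nn_dist = [min(math.dist(points[i], points[j])
                   for j in range(n) if j != i)
               for i in range(n)]
    order = sorted(range(n), key=lambda i: nn_dist[i])
    return [points[i] for i in order]
```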

Solving additional test problems will certainly lead to new ideas for improving the algorithm.  Many lessons were learned in the course of solving the roughly 160 problems described here, and new test problems will undoubtedly bring new challenges and new ideas.

Finally, comparative studies of OQMS and other global and MINLP solvers are needed.  This testing is facilitated by the existing GAMS interfaces for BARON, LGO, DICOPT, and SBB.  The “MINLP World” and “Global World” websites developed by GAMS Development Company (see www.gamsworld.org/minlp) provide solver information and test problems with known solutions.


References

Biegler, L.T., I.E. Grossman, A.W. Westerberg.  1997.  Systematic Methods of Chemical Process Design.  Prentice-Hall, Englewood Cliffs, NJ.

Boender, C.G.E., H.E. Romeijn.  1995.  Stochastic Methods.  R. Horst, P.M. Pardalos, eds.  Handbook of Global Optimization.  Kluwer Academic Publishers.  829-869.

Brooke, A., et al.  1992.  GAMS: A User's Guide.  Boyd and Fraser, Danvers, MA.

Dixon, L., G.P. Szego.  1975.  Towards Global Optimization.  Proceedings of a Workshop at the University of Cagliari, Italy, North Holland.

Drud, A.  1994.  CONOPT—A Large-Scale GRG-Code.  ORSA Journal on Computing 6 2.

Edgar, T.F., D.M. Himmelblau, L.S. Lasdon.  2001.  Optimization of Chemical Processes.  McGraw-Hill Companies, Inc.

Floudas, C.  1995.  Nonlinear and Mixed-Integer Optimization.  Oxford University Press, New York.

Floudas, C.A., et al.  1999.  Handbook of Test Problems in Local and Global Optimization.  Kluwer Academic Publishers.

Glover, F.  1998.  A Template for Scatter Search and Path Relinking.  J.-K. Hao, E. Lutton, E. Ronald, M. Schoenauer, D. Snyers, eds.  Artificial Evolution, Lecture Notes in Computer Science 1363.  Springer Verlag.  13-54.

Horst, R., P.M. Pardalos.  1995.  Handbook of Global Optimization.  Kluwer Academic Publishers.

Horst, R., H. Tuy.  1993.  Global Optimization: Deterministic Approaches.  Springer-Verlag, Berlin, 2nd rev. edition.

Laguna, M., R. Marti.  2000.  Experimental Testing of Advanced Scatter Search Designs for Global Optimization of Multimodal Functions.  Working paper, Departament D'Estadistica i Investigacio Operativa, Universitat de Valencia, Burjassot 46100, Spain.

Laguna, M., R. Marti.  2001.  The OptQuest Callable Library.  Stefan Voss, D. Woodruff, eds.  Optimization Software Class Libraries.  Kluwer Academic Publishers, Boston.  To appear.

Locatelli, M., F. Schoen.  1999.  Random Linkage: a Family of Acceptance/Rejection Algorithms for Global Optimization.  Mathematical Programming 85 2 379-396.

Murtagh, B.A., M.A. Saunders.  1982.  A Projected Lagrangian Algorithm and Its Implementation for Sparse Nonlinear Constraints.  Mathematical Programming Study 16 84-117.

Nash, S.G., A. Sofer.  1996.  Linear and Nonlinear Programming.  McGraw-Hill Companies, Inc.

Nash, S.G.  1998.  Nonlinear Programming.  OR/MS Today 36-45.

Nocedal, J., S. J. Wright.  1999.  Numerical Optimization.  Springer Series in Operations Research.

Pardalos, P.M., H.E. Romeijn, H. Tuy.  2000.  Recent Developments and Trends in Global Optimization.  Journal of Computational and Applied Mathematics 124 1-2 209-228.

Pintér, J.D.  1996.  Global Optimization in Action.  Kluwer Academic Publishers.

Rinnooy Kan, A.H.G., G.T. Timmer.  1987.  Stochastic Global Optimization Methods; part I: Clustering Methods.  Mathematical Programming 37 27-56.

Rinnooy Kan, A.H.G., G.T. Timmer.  1987.  Stochastic Global Optimization Methods; part II: Multi Level Methods.  Mathematical Programming 37 57-78.

Smith, S., L. Lasdon.  1992.  Solving Large Sparse Nonlinear Programs Using GRG.  ORSA Journal on Computing 4 1 3-15.