4. User cuts and lazy constraints

User cuts and lazy constraints are two advanced mixed-integer programming techniques that can accelerate solver performance. User cuts are additional constraints, derived from the constraints already in the model, that tighten the linear relaxation and cut off fractional solutions without eliminating any integer-feasible ones, thus reducing the size of the branch-and-bound tree. Lazy constraints, on the other hand, are constraints that are part of the problem formulation but are omitted from the initial model to reduce its size; these constraints are added to the formulation only once the solver finds a solution that violates them. While both techniques can be very effective, significant computational effort may still be required to generate strong user cuts and to identify violated lazy constraints, which can reduce their benefit.

MIPLearn is able to predict, using machine learning, which user cuts and which lazy constraints to enforce at the beginning of the optimization process. In this tutorial, we will use the framework to predict subtour elimination constraints for the traveling salesman problem using gurobipy. We assume that MIPLearn has already been correctly installed.

Solver Compatibility

User cuts and lazy constraints are also supported in the Python/Pyomo and Julia/JuMP versions of the package. See the source code of build_tsp_model_pyomo and build_tsp_model_jump for more details. Note, however, the following limitations:

  • Python/Pyomo: Only gurobi_persistent is currently supported. PRs implementing callbacks for other persistent solvers are welcome.

  • Julia/JuMP: Only solvers supporting solver-independent callbacks are supported. As of JuMP 1.19, this includes Gurobi, CPLEX, XPRESS, SCIP and GLPK. Note that HiGHS and Cbc are not supported. As newer versions of JuMP implement further callback support, MIPLearn should become automatically compatible with these solvers.

4.1. Modeling the traveling salesman problem

Given a list of cities and the distances between them, the traveling salesman problem (TSP) asks for the shortest route starting at the first city, visiting each other city exactly once, then returning to the first city. This problem is a generalization of the Hamiltonian circuit problem, one of Karp’s 21 NP-complete problems, and has many practical applications, including routing delivery trucks and scheduling airline routes.
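For reference, the model built later in this section enforces the classical formulation: with a binary variable \(x_e\) for each edge \(e\), we minimize the total distance subject to degree and subtour elimination constraints,

\[
\begin{align*}
\min \quad & \sum_{e \in E} d_e x_e \\
\text{s.t.} \quad & \sum_{e \in \delta(\{i\})} x_e = 2, & \forall i \in V, \\
& \sum_{e \in \delta(S)} x_e \geq 2, & \forall S \subsetneq V,\ S \neq \emptyset, \\
& x_e \in \{0, 1\}, & \forall e \in E,
\end{align*}
\]

where \(\delta(S)\) denotes the set of edges with exactly one endpoint in \(S\). Since there are exponentially many subtour elimination constraints, they cannot all be added up front; this is precisely why they are enforced lazily below.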

To describe an instance of TSP, we need to specify the number of cities \(n\), and an \(n \times n\) matrix of distances. The class TravelingSalesmanData, in the miplearn.problems.tsp package, can hold this data:

from dataclasses import dataclass

import numpy as np


@dataclass
class TravelingSalesmanData:
    n_cities: int
    distances: np.ndarray
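
For example, a small instance with three cities could be constructed as follows (a minimal sketch; the distance values are made up for illustration, with a symmetric matrix and zeros on the diagonal):

import numpy as np
from miplearn.problems.tsp import TravelingSalesmanData

# Hypothetical 3-city instance; distances[i, j] is the distance
# between cities i and j.
data = TravelingSalesmanData(
    n_cities=3,
    distances=np.array(
        [
            [0.0, 10.0, 15.0],
            [10.0, 0.0, 20.0],
            [15.0, 20.0, 0.0],
        ]
    ),
)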

MIPLearn also provides TravelingSalesmanGenerator, a random generator for TSP instances, and build_tsp_model_gurobipy, a function which converts TravelingSalesmanData into an actual gurobipy optimization model, and which uses lazy constraints to enforce subtour elimination.

The example below is a simplified and annotated version of build_tsp_model_gurobipy, illustrating the usage of callbacks with MIPLearn. Compared to the previous tutorial examples, note that, in addition to defining the variables, objective function and constraints of our problem, we also define two callback functions, lazy_separate and lazy_enforce.

[1]:
import gurobipy as gp
from gurobipy import quicksum, GRB, tuplelist
from miplearn.solvers.gurobi import GurobiModel
import networkx as nx
import numpy as np
from miplearn.problems.tsp import (
    TravelingSalesmanData,
    TravelingSalesmanGenerator,
)
from scipy.stats import uniform, randint
from miplearn.io import write_pkl_gz, read_pkl_gz
from miplearn.collectors.basic import BasicCollector
from miplearn.solvers.learning import LearningSolver
from miplearn.components.lazy.mem import MemorizingLazyComponent
from miplearn.extractors.fields import H5FieldsExtractor
from sklearn.neighbors import KNeighborsClassifier

# Set up random seed to make example more reproducible
np.random.seed(42)

# Set up Python logging
import logging

logging.basicConfig(level=logging.WARNING)


def build_tsp_model_gurobipy_simplified(data):
    # Read data from file if a filename is provided
    if isinstance(data, str):
        data = read_pkl_gz(data)

    # Create empty gurobipy model
    model = gp.Model()

    # Create set of edges between every pair of cities, for convenience
    edges = tuplelist(
        (i, j) for i in range(data.n_cities) for j in range(i + 1, data.n_cities)
    )

    # Add binary variable x[e] for each edge e
    x = model.addVars(edges, vtype=GRB.BINARY, name="x")

    # Add objective function
    model.setObjective(quicksum(x[(i, j)] * data.distances[i, j] for (i, j) in edges))

    # Add constraint: must choose two edges adjacent to each city
    model.addConstrs(
        (
            quicksum(x[min(i, j), max(i, j)] for j in range(data.n_cities) if i != j)
            == 2
            for i in range(data.n_cities)
        ),
        name="eq_degree",
    )

    def lazy_separate(m: GurobiModel):
        """
        Callback function that finds subtours in the current solution.
        """
        # Query current value of the x variables
        x_val = m.inner.cbGetSolution(x)

        # Initialize empty set of violations
        violations = []

        # Build set of edges we have currently selected
        selected_edges = [e for e in edges if x_val[e] > 0.5]

        # Build a graph containing the selected edges, using networkx
        graph = nx.Graph()
        graph.add_edges_from(selected_edges)

        # For each component of the graph
        for component in list(nx.connected_components(graph)):

            # If the component is not the entire graph, we found a
            # subtour. Add the edge cut to the list of violations.
            if len(component) < data.n_cities:
                cut_edges = [
                    [e[0], e[1]]
                    for e in edges
                    if (e[0] in component and e[1] not in component)
                    or (e[0] not in component and e[1] in component)
                ]
                violations.append(cut_edges)

        # Return the list of violations
        return violations

    def lazy_enforce(m: GurobiModel, violations) -> None:
        """
        Callback function that, given a list of subtours, adds lazy
        constraints to remove them from the feasible region.
        """
        print(f"Enforcing {len(violations)} subtour elimination constraints")
        for violation in violations:
            m.add_constr(quicksum(x[e[0], e[1]] for e in violation) >= 2)

    return GurobiModel(
        model,
        lazy_separate=lazy_separate,
        lazy_enforce=lazy_enforce,
    )

The lazy_separate function starts by querying the value of the x variables in the current candidate solution through m.inner.cbGetSolution (recall that m.inner is a regular gurobipy model), then finds the set of violated lazy constraints. Unlike a regular lazy constraint solver callback, note that lazy_separate does not add the violated constraints to the model; it simply returns a list of objects that uniquely identify the lazy constraints that should be generated. Enforcing the constraints is the responsibility of the second callback function, lazy_enforce. This function takes as input the model and the list of violations found by lazy_separate, converts them into actual constraints, and adds them to the model through m.add_constr.

During training data generation, MIPLearn calls lazy_separate and lazy_enforce in sequence, inside a regular solver callback. However, once the machine learning models are trained, MIPLearn calls lazy_enforce directly, before the optimization process starts, with a list of predicted violations, as we will see in the example below.
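
Schematically, the two modes can be pictured as follows. This is a simplified sketch of the control flow, not the actual MIPLearn source; store_violations_in_h5 and predict_violations are hypothetical placeholders.

# Simplified sketch of MIPLearn's control flow (not the actual source code).
# store_violations_in_h5 and predict_violations are hypothetical placeholders.

# Training mode: both callbacks run inside a regular solver callback.
def solver_callback(m):
    violations = lazy_separate(m)       # find violated constraints
    store_violations_in_h5(violations)  # record them as training data
    lazy_enforce(m, violations)         # enforce them via lazy constraints

# Inference mode: lazy_enforce runs before optimization starts.
violations = predict_violations(trained_models, instance_features)
lazy_enforce(m, violations)             # enforce predicted constraints up front
m.optimize()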

Constraint Representation

How user cuts and lazy constraints are represented is up to the user; MIPLearn is representation-agnostic. Note, however, that the objects returned by lazy_separate are serialized as JSON and stored in the HDF5 training data files. It is therefore recommended to use only simple objects, such as lists, tuples and dictionaries.
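
In the TSP model above, for example, each violation is a plain list of [i, j] city pairs, which serializes cleanly to JSON. The values below are made up for illustration:

import json

# A hypothetical violation returned by lazy_separate: the edge cut
# separating the component {0, 1, 2} from the remaining cities.
violation = [[0, 3], [0, 4], [1, 3], [2, 4]]
print(json.dumps(violation))  # prints: [[0, 3], [0, 4], [1, 3], [2, 4]]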

4.2. Generating training data

To test the callbacks defined above, we generate a small set of TSP instances, using the provided random instance generator. As in the previous tutorial, we generate some training and some test instances, then solve the training instances using BasicCollector. Input problem data is stored in tsp/train/00000.pkl.gz, ..., whereas solver training data (including the list of required lazy constraints) is stored in tsp/train/00000.h5, ....

[2]:
# Configure generator to produce instances with 50 cities located
# in the 1000 x 1000 square, and with slightly perturbed distances.
gen = TravelingSalesmanGenerator(
    x=uniform(loc=0.0, scale=1000.0),
    y=uniform(loc=0.0, scale=1000.0),
    n=randint(low=50, high=51),
    gamma=uniform(loc=1.0, scale=0.25),
    fix_cities=True,
    round=True,
)

# Generate 500 instances and store input data file to .pkl.gz files
data = gen.generate(500)
train_data = write_pkl_gz(data[0:450], "tsp/train")
test_data = write_pkl_gz(data[450:500], "tsp/test")

# Solve the training instances in parallel, collecting the required lazy
# constraints, in addition to other information, such as the optimal solution.
bc = BasicCollector()
bc.collect(train_data, build_tsp_model_gurobipy_simplified, n_jobs=10)
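
To verify what was collected, the HDF5 files can be inspected directly. The sketch below assumes the h5py package is installed; the field static_var_obj_coeffs, used by the extractor in the next section, is among the stored fields.

import h5py

# Peek into one of the generated training files (sketch; assumes h5py).
with h5py.File("tsp/train/00000.h5", "r") as h5:
    print(list(h5.keys()))  # lists stored fields, e.g. static_var_obj_coeffs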

4.3. Training and solving new instances

After producing the training dataset, we can train the machine learning models to predict which lazy constraints are necessary. In this tutorial, we use the following ML strategy: given a new instance, find the 100 most similar instances in the training dataset and check how often each lazy constraint was required. If a lazy constraint was required for the majority of these 100 most-similar instances, enforce it ahead-of-time for the current instance. To measure instance similarity, we use only the objective function coefficients. This ML strategy can be implemented using MemorizingLazyComponent with H5FieldsExtractor and KNeighborsClassifier, as shown below.

[3]:
solver = LearningSolver(
    components=[
        MemorizingLazyComponent(
            extractor=H5FieldsExtractor(instance_fields=["static_var_obj_coeffs"]),
            clf=KNeighborsClassifier(n_neighbors=100),
        ),
    ],
)
solver.fit(train_data)

Next, we solve one of the test instances using the trained solver. In the run below, we can see that MIPLearn adds many lazy constraints ahead-of-time, before the optimization starts. During the optimization process itself, only a few additional lazy constraints are required.

[4]:
# Increase log verbosity, so that we can see what MIPLearn is doing
logging.getLogger("miplearn").setLevel(logging.INFO)

# Solve a new test instance
solver.optimize(test_data[0], build_tsp_model_gurobipy_simplified);
Set parameter Threads to value 1
Restricted license - for non-production use only - expires 2024-10-28
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)

CPU model: 13th Gen Intel(R) Core(TM) i7-13800H, instruction set [SSE2|AVX|AVX2]
Thread count: 10 physical cores, 20 logical processors, using up to 1 threads

Optimize a model with 50 rows, 1225 columns and 2450 nonzeros
Model fingerprint: 0x04d7bec1
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+01, 1e+03]
  Bounds range     [1e+00, 1e+00]
  RHS range        [2e+00, 2e+00]
Presolve time: 0.00s
Presolved: 50 rows, 1225 columns, 2450 nonzeros

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    4.0600000e+02   9.700000e+01   0.000000e+00      0s
      66    5.5880000e+03   0.000000e+00   0.000000e+00      0s

Solved in 66 iterations and 0.01 seconds (0.00 work units)
Optimal objective  5.588000000e+03

User-callback calls 107, time in user-callback 0.00 sec
INFO:miplearn.components.cuts.mem:Predicting violated lazy constraints...
INFO:miplearn.components.lazy.mem:Enforcing 19 constraints ahead-of-time...
Enforcing 19 subtour elimination constraints
Set parameter PreCrush to value 1
Set parameter LazyConstraints to value 1
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)

CPU model: 13th Gen Intel(R) Core(TM) i7-13800H, instruction set [SSE2|AVX|AVX2]
Thread count: 10 physical cores, 20 logical processors, using up to 1 threads

Optimize a model with 69 rows, 1225 columns and 6091 nonzeros
Model fingerprint: 0x09bd34d6
Variable types: 0 continuous, 1225 integer (1225 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+01, 1e+03]
  Bounds range     [1e+00, 1e+00]
  RHS range        [2e+00, 2e+00]
Found heuristic solution: objective 29853.000000
Presolve time: 0.00s
Presolved: 69 rows, 1225 columns, 6091 nonzeros
Variable types: 0 continuous, 1225 integer (1225 binary)

Root relaxation: objective 6.139000e+03, 93 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 6139.00000    0    6 29853.0000 6139.00000  79.4%     -    0s
H    0     0                    6390.0000000 6139.00000  3.93%     -    0s
     0     0 6165.50000    0   10 6390.00000 6165.50000  3.51%     -    0s
Enforcing 3 subtour elimination constraints
     0     0 6165.50000    0    6 6390.00000 6165.50000  3.51%     -    0s
     0     0 6198.50000    0   16 6390.00000 6198.50000  3.00%     -    0s
*    0     0               0    6219.0000000 6219.00000  0.00%     -    0s

Cutting planes:
  Gomory: 11
  MIR: 1
  Zero half: 4
  Lazy constraints: 3

Explored 1 nodes (222 simplex iterations) in 0.03 seconds (0.02 work units)
Thread count was 1 (of 20 available processors)

Solution count 3: 6219 6390 29853

Optimal solution found (tolerance 1.00e-04)
Best objective 6.219000000000e+03, best bound 6.219000000000e+03, gap 0.0000%

User-callback calls 141, time in user-callback 0.00 sec

Finally, we solve the same instance using a regular solver, without ML prediction. We can see that a much larger number of lazy constraints are added during the optimization process itself, and that the solver requires more iterations to find the optimal solution. Because of the small size of these instances, however, there is no significant difference in running time.

[5]:
solver = LearningSolver(components=[])  # empty set of ML components
solver.optimize(test_data[0], build_tsp_model_gurobipy_simplified);
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)

CPU model: 13th Gen Intel(R) Core(TM) i7-13800H, instruction set [SSE2|AVX|AVX2]
Thread count: 10 physical cores, 20 logical processors, using up to 1 threads

Optimize a model with 50 rows, 1225 columns and 2450 nonzeros
Model fingerprint: 0x04d7bec1
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+01, 1e+03]
  Bounds range     [1e+00, 1e+00]
  RHS range        [2e+00, 2e+00]
Presolve time: 0.00s
Presolved: 50 rows, 1225 columns, 2450 nonzeros

Iteration    Objective       Primal Inf.    Dual Inf.      Time
       0    4.0600000e+02   9.700000e+01   0.000000e+00      0s
      66    5.5880000e+03   0.000000e+00   0.000000e+00      0s

Solved in 66 iterations and 0.01 seconds (0.00 work units)
Optimal objective  5.588000000e+03

User-callback calls 107, time in user-callback 0.00 sec
Set parameter PreCrush to value 1
Set parameter LazyConstraints to value 1
Gurobi Optimizer version 10.0.3 build v10.0.3rc0 (linux64)

CPU model: 13th Gen Intel(R) Core(TM) i7-13800H, instruction set [SSE2|AVX|AVX2]
Thread count: 10 physical cores, 20 logical processors, using up to 1 threads

Optimize a model with 50 rows, 1225 columns and 2450 nonzeros
Model fingerprint: 0x77a94572
Variable types: 0 continuous, 1225 integer (1225 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+00]
  Objective range  [1e+01, 1e+03]
  Bounds range     [1e+00, 1e+00]
  RHS range        [2e+00, 2e+00]
Found heuristic solution: objective 29695.000000
Presolve time: 0.00s
Presolved: 50 rows, 1225 columns, 2450 nonzeros
Variable types: 0 continuous, 1225 integer (1225 binary)

Root relaxation: objective 5.588000e+03, 68 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 5588.00000    0   12 29695.0000 5588.00000  81.2%     -    0s
Enforcing 9 subtour elimination constraints
Enforcing 11 subtour elimination constraints
H    0     0                    27241.000000 5588.00000  79.5%     -    0s
     0     0 5898.00000    0    8 27241.0000 5898.00000  78.3%     -    0s
Enforcing 4 subtour elimination constraints
Enforcing 3 subtour elimination constraints
     0     0 6066.00000    0    - 27241.0000 6066.00000  77.7%     -    0s
Enforcing 2 subtour elimination constraints
     0     0 6128.00000    0    - 27241.0000 6128.00000  77.5%     -    0s
     0     0 6139.00000    0    6 27241.0000 6139.00000  77.5%     -    0s
H    0     0                    6368.0000000 6139.00000  3.60%     -    0s
     0     0 6154.75000    0   15 6368.00000 6154.75000  3.35%     -    0s
Enforcing 2 subtour elimination constraints
     0     0 6154.75000    0    6 6368.00000 6154.75000  3.35%     -    0s
     0     0 6165.75000    0   11 6368.00000 6165.75000  3.18%     -    0s
Enforcing 3 subtour elimination constraints
     0     0 6204.00000    0    6 6368.00000 6204.00000  2.58%     -    0s
*    0     0               0    6219.0000000 6219.00000  0.00%     -    0s

Cutting planes:
  Gomory: 5
  MIR: 1
  Zero half: 4
  Lazy constraints: 4

Explored 1 nodes (224 simplex iterations) in 0.10 seconds (0.03 work units)
Thread count was 1 (of 20 available processors)

Solution count 4: 6219 6368 27241 29695

Optimal solution found (tolerance 1.00e-04)
Best objective 6.219000000000e+03, best bound 6.219000000000e+03, gap 0.0000%

User-callback calls 170, time in user-callback 0.01 sec

4.4. Learning user cuts

The example above focused on lazy constraints. To enforce user cuts instead, the procedure is very similar (see the sketch after this list), with the following changes:

  • Instead of lazy_separate and lazy_enforce, use cuts_separate and cuts_enforce

  • Instead of m.inner.cbGetSolution, use m.inner.cbGetNodeRel
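
The sketch below illustrates what this could look like for the TSP model above. It is not part of MIPLearn; the functions would replace lazy_separate/lazy_enforce inside build_tsp_model_gurobipy_simplified and reuse x, edges and data from the enclosing scope. The separation shown is deliberately simple (it only detects disconnected fractional support graphs); practical TSP codes typically separate fractional subtours with a minimum-cut computation.

# Sketch only: user cut callbacks for the TSP model above.

def cuts_separate(m: GurobiModel):
    # Query the current *fractional* solution at the node relaxation
    x_val = m.inner.cbGetNodeRel(x)

    # Build the support graph of edges with positive fractional value
    graph = nx.Graph()
    graph.add_edges_from(e for e in edges if x_val[e] > 1e-6)

    # Each connected component that is not the entire graph yields a
    # violated subtour elimination cut
    violations = []
    for component in nx.connected_components(graph):
        if len(component) < data.n_cities:
            violations.append(
                [
                    [e[0], e[1]]
                    for e in edges
                    if (e[0] in component) != (e[1] in component)
                ]
            )
    return violations

def cuts_enforce(m: GurobiModel, violations) -> None:
    for violation in violations:
        m.add_constr(quicksum(x[e[0], e[1]] for e in violation) >= 2)

# The model would then be wrapped as:
#     GurobiModel(model, cuts_separate=cuts_separate, cuts_enforce=cuts_enforce)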

For a complete example, see build_stab_model_gurobipy, build_stab_model_pyomo and build_stab_model_jump, which solve the maximum-weight stable set problem using user cut callbacks.
