3. Getting started (JuMP)

3.1. Introduction

MIPLearn is an open source framework that uses machine learning (ML) to accelerate the performance of mixed-integer programming solvers (e.g. Gurobi, CPLEX, XPRESS). In this tutorial, we will:

  1. Install the Julia/JuMP version of MIPLearn

  2. Model a simple optimization problem using JuMP

  3. Generate training data and train the ML models

  4. Use the ML models, together with Gurobi, to solve new instances

Warning

MIPLearn is still at an early stage of development. If you run into any bugs or issues, please submit a bug report in our GitHub repository. Comments, suggestions and pull requests are also very welcome!

3.2. Installation

MIPLearn is available in two versions:

  • Python version, compatible with the Pyomo and Gurobipy modeling languages,

  • Julia version, compatible with the JuMP modeling language.

In this tutorial, we will demonstrate how to install and use the Julia/JuMP version of the package. The first step is to install Julia on your machine. See the official Julia website for instructions. After Julia is installed, launch the Julia REPL, type ] to enter package mode, then install MIPLearn:

pkg> add MIPLearn@0.3

In addition to MIPLearn itself, we will also install:

  • the JuMP modeling language

  • Gurobi, a state-of-the-art commercial MILP solver

  • Distributions, to generate random data

  • PyCall, to access ML models from Scikit-Learn

  • Suppressor, to make the output cleaner

pkg> add JuMP@1, Gurobi@1, Distributions@0.25, PyCall@1, Suppressor@0.2

Note

  • If you do not have a Gurobi license available, you can still follow the tutorial by installing an open-source solver, such as HiGHS, and replacing Gurobi.Optimizer with HiGHS.Optimizer in all the code examples.

  • In the code above, we pin specific versions of all packages to ensure that this tutorial keeps working in the future, even when newer (and possibly incompatible) versions of the packages are released. Pinning versions is generally a recommended practice for Julia projects; a Project.toml sketch is shown after this note.
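If you manage the tutorial environment as a Julia project, the same pins can also be recorded in the [compat] section of its Project.toml. A minimal sketch, matching the versions installed above:

[compat]
Distributions = "0.25"
Gurobi = "1"
JuMP = "1"
MIPLearn = "0.3"
PyCall = "1"
Suppressor = "0.2"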

3.3. Modeling a simple optimization problem

To illustrate how MIPLearn can be used, we will model and solve a small optimization problem related to power systems optimization. The problem we discuss below is a simplification of the unit commitment problem, a practical optimization problem solved daily by electric grid operators around the world.

Suppose that a utility company needs to decide which electrical generators should be online at each hour of the day, as well as how much power each generator should produce. More specifically, assume that the company owns \(n\) generators, denoted by \(g_1, \ldots, g_n\). Each generator can either be online or offline. An online generator \(g_i\) can produce between \(p^\text{min}_i\) and \(p^\text{max}_i\) megawatts of power, and it costs the company \(c^\text{fix}_i + c^\text{var}_i y_i\), where \(y_i\) is the amount of power produced. An offline generator produces nothing and costs nothing. The total amount of power produced needs to be exactly equal to the total demand \(d\) (in megawatts).

This simple problem can be modeled as a mixed-integer linear optimization problem as follows. For each generator \(g_i\), let \(x_i \in \{0,1\}\) be a decision variable indicating whether \(g_i\) is online, and let \(y_i \geq 0\) be a decision variable indicating how much power \(g_i\) produces. The problem is then given by:

\[\begin{split}\begin{align} \text{minimize } \quad & \sum_{i=1}^n \left( c^\text{fix}_i x_i + c^\text{var}_i y_i \right) \\ \text{subject to } \quad & y_i \leq p^\text{max}_i x_i & i=1,\ldots,n \\ & y_i \geq p^\text{min}_i x_i & i=1,\ldots,n \\ & \sum_{i=1}^n y_i = d \\ & x_i \in \{0,1\} & i=1,\ldots,n \\ & y_i \geq 0 & i=1,\ldots,n \end{align}\end{split}\]

Note

We use a simplified version of the unit commitment problem in this tutorial just to make it easier to follow. MIPLearn can also handle realistic, large-scale versions of this problem.

Next, let us convert this abstract mathematical formulation into a concrete optimization model, using Julia and JuMP. We start by defining a struct, UnitCommitmentData, which holds all the input data.

[1]:
struct UnitCommitmentData
    demand::Float64
    pmin::Vector{Float64}
    pmax::Vector{Float64}
    cfix::Vector{Float64}
    cvar::Vector{Float64}
end;

Next, we write a build_uc_model function, which converts the input data into a concrete JuMP model. The function accepts UnitCommitmentData, the data structure we previously defined, or the path to a JLD2 file containing this data.

[2]:
using MIPLearn
using JuMP
using Gurobi

function build_uc_model(data)
    if data isa String
        data = read_jld2(data)
    end
    model = Model(Gurobi.Optimizer)
    G = 1:length(data.pmin)
    @variable(model, x[G], Bin)
    @variable(model, y[G] >= 0)
    @objective(model, Min, sum(data.cfix[g] * x[g] + data.cvar[g] * y[g] for g in G))
    @constraint(model, eq_max_power[g in G], y[g] <= data.pmax[g] * x[g])
    @constraint(model, eq_min_power[g in G], y[g] >= data.pmin[g] * x[g])
    @constraint(model, eq_demand, sum(y[g] for g in G) == data.demand)
    return JumpModel(model)
end;

At this point, we can already use Gurobi to find optimal solutions to any instance of this problem. To illustrate this, let us solve a small instance with three generators:

[3]:
model = build_uc_model(
    UnitCommitmentData(
        100.0,  # demand
        [10, 20, 30],  # pmin
        [50, 60, 70],  # pmax
        [700, 600, 500],  # cfix
        [1.5, 2.0, 2.5],  # cvar
    )
)
model.optimize()
@show objective_value(model.inner)
@show Vector(value.(model.inner[:x]))
@show Vector(value.(model.inner[:y]));
Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)

CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads

Optimize a model with 7 rows, 6 columns and 15 nonzeros
Model fingerprint: 0x55e33a07
Variable types: 3 continuous, 3 integer (3 binary)
Coefficient statistics:
  Matrix range     [1e+00, 7e+01]
  Objective range  [2e+00, 7e+02]
  Bounds range     [0e+00, 0e+00]
  RHS range        [1e+02, 1e+02]
Presolve removed 2 rows and 1 columns
Presolve time: 0.00s
Presolved: 5 rows, 5 columns, 13 nonzeros
Variable types: 0 continuous, 5 integer (3 binary)
Found heuristic solution: objective 1400.0000000

Root relaxation: objective 1.035000e+03, 3 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 1035.00000    0    1 1400.00000 1035.00000  26.1%     -    0s
     0     0 1105.71429    0    1 1400.00000 1105.71429  21.0%     -    0s
*    0     0               0    1320.0000000 1320.00000  0.00%     -    0s

Explored 1 nodes (5 simplex iterations) in 0.00 seconds (0.00 work units)
Thread count was 32 (of 32 available processors)

Solution count 2: 1320 1400

Optimal solution found (tolerance 1.00e-04)
Best objective 1.320000000000e+03, best bound 1.320000000000e+03, gap 0.0000%

User-callback calls 371, time in user-callback 0.00 sec
objective_value(model.inner) = 1320.0
Vector(value.(model.inner[:x])) = [-0.0, 1.0, 1.0]
Vector(value.(model.inner[:y])) = [0.0, 60.0, 40.0]

Running the code above, we found that the optimal solution for our small problem instance costs $1320. It is achieved by keeping generators 2 and 3 online and producing, respectively, 60 MW and 40 MW of power.

Note

  • In the example above, JumpModel is just a thin wrapper around a standard JuMP model. This wrapper allows MIPLearn to be solver- and modeling-language-agnostic. The wrapper provides only a few basic methods, such as optimize. For more control, and to query the solution, the original JuMP model can be accessed through model.inner, as illustrated above.
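Because model.inner is a regular JuMP model, all standard JuMP functions work on it. A minimal sketch, assuming the model from cell [3] is still in scope:

# model.inner is a plain JuMP model; standard JuMP functions apply
set_time_limit_sec(model.inner, 60.0)    # impose a 60-second solver time limit
@show termination_status(model.inner)    # query the JuMP termination status
@show Vector(value.(model.inner[:y]))    # production levels, as before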

3.4. Generating training data

Although Gurobi could solve the small example above in a fraction of a second, it gets slower on larger and more complex versions of the problem. If this is a problem that needs to be solved frequently, as is often the case in practice, it could make sense to spend some time upfront generating a trained solver, which can optimize new instances (similar to the ones it was trained on) faster.

In the following, we will use MIPLearn to train machine learning models that are able to predict the optimal solution for instances that follow a given probability distribution; the predicted solution is then provided to Gurobi as a warm start. Before we can train the models, we need to collect training data by solving a large number of instances. In real-world situations, we may construct these training instances based on historical data. In this tutorial, we will construct them using a random instance generator:

[4]:
using Distributions
using Random

function random_uc_data(; samples::Int, n::Int, seed::Int=42)::Vector
    Random.seed!(seed)
    pmin = rand(Uniform(100_000, 500_000), n)
    pmax = pmin .* rand(Uniform(2, 2.5), n)
    cfix = pmin .* rand(Uniform(100, 125), n)
    cvar = rand(Uniform(1.25, 1.50), n)
    return [
        UnitCommitmentData(
            sum(pmax) * rand(Uniform(0.5, 0.75)),
            pmin,
            pmax,
            cfix,
            cvar,
        )
        for _ in 1:samples
    ]
end;

In this example, for simplicity, only the demands change from one instance to the next. We could also have randomized the costs, production limits or even the number of units. The more randomization we have in the training data, however, the more challenging it is for the machine learning models to learn solution patterns.
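For example, a variant of the generator that also perturbs the cost data from one instance to the next might look like the sketch below; the function name and the 10% perturbation range are arbitrary choices for illustration:

function random_uc_data_perturbed(; samples::Int, n::Int, seed::Int=42)::Vector
    Random.seed!(seed)
    pmin = rand(Uniform(100_000, 500_000), n)
    pmax = pmin .* rand(Uniform(2, 2.5), n)
    cfix = pmin .* rand(Uniform(100, 125), n)
    return [
        UnitCommitmentData(
            sum(pmax) * rand(Uniform(0.5, 0.75)),
            pmin,
            pmax,
            # Perturb the fixed costs by up to 10% in each instance
            cfix .* rand(Uniform(0.9, 1.1), n),
            # Redraw the variable costs for each instance
            rand(Uniform(1.25, 1.50), n),
        )
        for _ in 1:samples
    ]
end;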

Now we generate 500 instances of this problem, each one with 500 generators, and we use 450 of these instances for training. After generating the instances, we write them to individual files. MIPLearn uses files during the training process because, for large-scale optimization problems, it is often impractical to hold in memory the entire training data, as well as the concrete JuMP models. Files also make it much easier to solve multiple instances simultaneously, potentially on multiple machines. The code below generates the files uc/train/00001.jld2, uc/train/00002.jld2, etc., which contain the input data in JLD2 format.

[5]:
data = random_uc_data(samples=500, n=500)
train_data = write_jld2(data[1:450], "uc/train")
test_data = write_jld2(data[451:500], "uc/test");

Finally, we use BasicCollector to collect the optimal solutions and other useful training data for all training instances. The data is stored in HDF5 files uc/train/00001.h5, uc/train/00002.h5, etc. The optimization models are also exported to compressed MPS files uc/train/00001.mps.gz, uc/train/00002.mps.gz, etc.

[6]:
using Suppressor
@suppress_out begin
    bc = BasicCollector()
    bc.collect(train_data, build_uc_model)
end
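The collected HDF5 files can also be inspected directly, which is useful for debugging. A minimal sketch using the HDF5.jl package (an extra dependency, not installed above), assuming the constraint right-hand sides are stored under the static_constr_rhs field referenced later in this tutorial:

using HDF5
h5open("uc/train/00001.h5", "r") do h5
    rhs = read(h5["static_constr_rhs"])  # constraint right-hand sides
    @show length(rhs)
end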

3.5. Training and solving test instances

With training data in hand, we can now design and train a machine learning model to accelerate solver performance. In this tutorial, for illustration purposes, we will use \(k\)-nearest neighbors to generate a good warm start. More specifically, the strategy is to:

  1. Memorize the optimal solutions of all training instances;

  2. Given a test instance, find the 25 most similar training instances, based on constraint right-hand sides;

  3. Merge their optimal solutions into a single partial solution; specifically, only assign values to the binary variables that agree unanimously (see the toy sketch after this list);

  4. Provide this partial solution to the solver as a warm start.
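To make step 3 concrete, the toy sketch below (with made-up solutions, not the actual MergeTopSolutions implementation) merges three memorized solutions by unanimity:

# Three memorized binary solutions for the same three variables (toy data)
solutions = [[1.0, 0.0, 1.0], [1.0, 1.0, 1.0], [1.0, 0.0, 1.0]]

# Keep only the entries on which all solutions agree; leave the rest undecided
merged = [
    all(s[i] == solutions[1][i] for s in solutions) ? solutions[1][i] : missing
    for i in eachindex(solutions[1])
]
# merged == [1.0, missing, 1.0]; `missing` entries are left for the solver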

This simple strategy can be implemented as shown below, using MemorizingPrimalComponent. For more advanced strategies, and for the usage of more advanced classifiers, see the user guide.

[7]:
# Load kNN classifier from Scikit-Learn
using PyCall
KNeighborsClassifier = pyimport("sklearn.neighbors").KNeighborsClassifier

# Build the MIPLearn component
comp = MemorizingPrimalComponent(
    clf=KNeighborsClassifier(n_neighbors=25),
    extractor=H5FieldsExtractor(
        instance_fields=["static_constr_rhs"],
    ),
    constructor=MergeTopSolutions(25, [0.0, 1.0]),
    action=SetWarmStart(),
);

Having defined the ML strategy, we next construct LearningSolver, train the ML component and optimize one of the test instances.

[8]:
solver_ml = LearningSolver(components=[comp])
solver_ml.fit(train_data)
solver_ml.optimize(test_data[1], build_uc_model);
Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)

CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads

Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros
Model fingerprint: 0xd2378195
Variable types: 500 continuous, 500 integer (500 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+06]
  Objective range  [1e+00, 6e+07]
  Bounds range     [0e+00, 0e+00]
  RHS range        [2e+08, 2e+08]

User MIP start produced solution with objective 1.02165e+10 (0.00s)
Loaded user MIP start with objective 1.02165e+10

Presolve time: 0.00s
Presolved: 1001 rows, 1000 columns, 2500 nonzeros
Variable types: 500 continuous, 500 integer (500 binary)

Root relaxation: objective 1.021568e+10, 510 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 1.0216e+10    0    1 1.0217e+10 1.0216e+10  0.01%     -    0s

Explored 1 nodes (510 simplex iterations) in 0.01 seconds (0.00 work units)
Thread count was 32 (of 32 available processors)

Solution count 1: 1.02165e+10

Optimal solution found (tolerance 1.00e-04)
Best objective 1.021651058978e+10, best bound 1.021567971257e+10, gap 0.0081%

User-callback calls 169, time in user-callback 0.00 sec

By examining the solve log above, specifically the line Loaded user MIP start with objective..., we can see that MIPLearn was able to construct an initial solution which turned out to be very close to the optimal solution to the problem. Now let us repeat the experiment, this time using a solver that does not apply any ML strategies; note that our previously defined component is not provided.

[9]:
solver_baseline = LearningSolver(components=[])
solver_baseline.fit(train_data)
solver_baseline.optimize(test_data[1], build_uc_model);
Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)

CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads

Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros
Model fingerprint: 0xb45c0594
Variable types: 500 continuous, 500 integer (500 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+06]
  Objective range  [1e+00, 6e+07]
  Bounds range     [0e+00, 0e+00]
  RHS range        [2e+08, 2e+08]
Presolve time: 0.00s
Presolved: 1001 rows, 1000 columns, 2500 nonzeros
Variable types: 500 continuous, 500 integer (500 binary)
Found heuristic solution: objective 1.071463e+10

Root relaxation: objective 1.021568e+10, 510 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 1.0216e+10    0    1 1.0715e+10 1.0216e+10  4.66%     -    0s
H    0     0                    1.025162e+10 1.0216e+10  0.35%     -    0s
     0     0 1.0216e+10    0    1 1.0252e+10 1.0216e+10  0.35%     -    0s
H    0     0                    1.023090e+10 1.0216e+10  0.15%     -    0s
H    0     0                    1.022335e+10 1.0216e+10  0.07%     -    0s
H    0     0                    1.022281e+10 1.0216e+10  0.07%     -    0s
H    0     0                    1.021753e+10 1.0216e+10  0.02%     -    0s
H    0     0                    1.021752e+10 1.0216e+10  0.02%     -    0s
     0     0 1.0216e+10    0    3 1.0218e+10 1.0216e+10  0.02%     -    0s
     0     0 1.0216e+10    0    1 1.0218e+10 1.0216e+10  0.02%     -    0s
H    0     0                    1.021651e+10 1.0216e+10  0.01%     -    0s

Explored 1 nodes (764 simplex iterations) in 0.03 seconds (0.02 work units)
Thread count was 32 (of 32 available processors)

Solution count 7: 1.02165e+10 1.02175e+10 1.02228e+10 ... 1.07146e+10

Optimal solution found (tolerance 1.00e-04)
Best objective 1.021651058978e+10, best bound 1.021573363741e+10, gap 0.0076%

User-callback calls 204, time in user-callback 0.00 sec

In the log above, the MIP start line is missing, and Gurobi had to start with a significantly inferior initial solution. The solver was still able to find the optimal solution in the end, but it had to rely on its own internal heuristic procedures. In this example, because the optimization problems are very small, there was almost no difference in running time, but the difference can be significant for larger problems.
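For larger instances, the running times of the two solvers can be compared directly; a minimal sketch, using Julia's built-in @elapsed macro:

t_ml = @elapsed solver_ml.optimize(test_data[1], build_uc_model)
t_baseline = @elapsed solver_baseline.optimize(test_data[1], build_uc_model)
@show t_ml t_baseline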

3.6. Accessing the solution

In the example above, we used LearningSolver.optimize together with data files to solve both the training and the test instances. The optimal solutions were saved to HDF5 files in the train/test folders and could be retrieved by reading these files, but that is not very convenient. In the following example, we show how to build and solve a JuMP model entirely in memory, using our trained solver.

[10]:
data = random_uc_data(samples=1, n=500)[1]
model = build_uc_model(data)
solver_ml.optimize(model)
@show objective_value(model.inner);
Gurobi Optimizer version 10.0.1 build v10.0.1rc0 (linux64)

CPU model: AMD Ryzen 9 7950X 16-Core Processor, instruction set [SSE2|AVX|AVX2|AVX512]
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads

Optimize a model with 1001 rows, 1000 columns and 2500 nonzeros
Model fingerprint: 0x974a7fba
Variable types: 500 continuous, 500 integer (500 binary)
Coefficient statistics:
  Matrix range     [1e+00, 1e+06]
  Objective range  [1e+00, 6e+07]
  Bounds range     [0e+00, 0e+00]
  RHS range        [2e+08, 2e+08]

User MIP start produced solution with objective 9.86729e+09 (0.00s)
User MIP start produced solution with objective 9.86675e+09 (0.00s)
User MIP start produced solution with objective 9.86654e+09 (0.01s)
User MIP start produced solution with objective 9.8661e+09 (0.01s)
Loaded user MIP start with objective 9.8661e+09

Presolve time: 0.00s
Presolved: 1001 rows, 1000 columns, 2500 nonzeros
Variable types: 500 continuous, 500 integer (500 binary)

Root relaxation: objective 9.865344e+09, 510 iterations, 0.00 seconds (0.00 work units)

    Nodes    |    Current Node    |     Objective Bounds      |     Work
 Expl Unexpl |  Obj  Depth IntInf | Incumbent    BestBd   Gap | It/Node Time

     0     0 9.8653e+09    0    1 9.8661e+09 9.8653e+09  0.01%     -    0s

Explored 1 nodes (510 simplex iterations) in 0.02 seconds (0.01 work units)
Thread count was 32 (of 32 available processors)

Solution count 4: 9.8661e+09 9.86654e+09 9.86675e+09 9.86729e+09

Optimal solution found (tolerance 1.00e-04)
Best objective 9.866096485614e+09, best bound 9.865343669936e+09, gap 0.0076%

User-callback calls 182, time in user-callback 0.00 sec
objective_value(model.inner) = 9.866096485613789e9
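As in cell [3], the optimal values of the decision variables can be queried directly from the inner JuMP model:

x_opt = Vector(value.(model.inner[:x]))  # commitment decisions (0/1)
y_opt = Vector(value.(model.inner[:y]))  # production levels (MW)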