PyTorch

PyTorch is a library for machine learning in Python.

The upstream documentation is available at https://pytorch.org/docs/stable/.

Supported layers

MathOptAI supports embedding a PyTorch model into JuMP if it is an nn.Sequential composed of:

File format

Use torch.save to save a trained PyTorch model to a .pt file:

#!/usr/bin/python3
import torch
model = torch.nn.Sequential(
    torch.nn.Linear(1, 2),
    torch.nn.ReLU(),
    torch.nn.Linear(2, 1),
)
torch.save(model, "saved_pytorch_model.pt")

Python integration

MathOptAI uses PythonCall.jl to call from Julia into Python.

To use PytorchModel, your code must load the PythonCall package:

import PythonCall

PythonCall uses CondaPkg.jl to manage Python dependencies. See CondaPkg.jl for more control over how to link Julia to an existing Python environment. For example, if you have an existing Python installation (with PyTorch installed), and it is available in the current Conda environment, do:

ENV["JULIA_CONDAPKG_BACKEND"] = "Current"
import PythonCall

If the Python installation can be found on the path and it is not in a Conda environment, do:

ENV["JULIA_CONDAPKG_BACKEND"] = "Null"
import PythonCall

If python is not on your path, you may also need to set JULIA_PYTHONCALL_EXE. For example:

ENV["JULIA_PYTHONCALL_EXE"] = "python3"
ENV["JULIA_CONDAPKG_BACKEND"] = "Null"
import PythonCall
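To choose a value for JULIA_PYTHONCALL_EXE, it can help to ask the intended Python interpreter for its own location and whether it can see PyTorch. The following is a minimal sketch using only the Python standard library; describe_interpreter is a hypothetical helper name and is not part of MathOptAI, PythonCall, or CondaPkg.

```python
import importlib.util
import sys


def describe_interpreter(module_name="torch"):
    """Return (path of the running interpreter, whether module_name is importable)."""
    # find_spec locates a top-level module without actually importing it.
    found = importlib.util.find_spec(module_name) is not None
    return sys.executable, found


exe, has_torch = describe_interpreter()
print(f"interpreter: {exe}")
print(f"torch importable: {has_torch}")
```

Run this with the Python you want Julia to use. If torch is reported as importable, the printed interpreter path is a candidate value for ENV["JULIA_PYTHONCALL_EXE"].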

Basic example

Use MathOptAI.add_predictor to embed a PyTorch model into a JuMP model:

julia> using JuMP, MathOptAI, PythonCall
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> predictor = MathOptAI.PytorchModel("saved_pytorch_model.pt");
julia> y, formulation = MathOptAI.add_predictor(model, predictor, x);
julia> y
1-element Vector{JuMP.VariableRef}:
 moai_Affine[1]
julia> formulation
Affine(A, b) [input: 1, output: 2]
├ variables [2]
│ ├ moai_Affine[1]
│ └ moai_Affine[2]
└ constraints [2]
  ├ 0.7473132610321045 x[1] - moai_Affine[1] = 0.27370238304138184
  └ -0.8237636089324951 x[1] - moai_Affine[2] = 0.08490216732025146
MathOptAI.ReLU()
├ variables [2]
│ ├ moai_ReLU[1]
│ └ moai_ReLU[2]
└ constraints [4]
  ├ moai_ReLU[1] ≥ 0
  ├ moai_ReLU[1] - max(0, moai_Affine[1]) = 0
  ├ moai_ReLU[2] ≥ 0
  └ moai_ReLU[2] - max(0, moai_Affine[2]) = 0
Affine(A, b) [input: 2, output: 1]
├ variables [1]
│ └ moai_Affine[1]
└ constraints [1]
  └ -0.4644489884376526 moai_ReLU[1] + 0.5382549166679382 moai_ReLU[2] - moai_Affine[1] = 0.5698285102844238
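To see how the printed constraints relate to the network, the sketch below evaluates the same three layers in plain Python and checks that the resulting values satisfy every equality constraint. The coefficients are copied from the output above; the function names full_space_point and max_violation are hypothetical and exist only for this illustration. It is not how MathOptAI builds the formulation.

```python
# Coefficients copied from the printed constraints. Each bias is the negated
# right-hand side, because "w*x - a = c" rearranges to "a = w*x - c".
W1 = [0.7473132610321045, -0.8237636089324951]
B1 = [-0.27370238304138184, -0.08490216732025146]
W2 = [-0.4644489884376526, 0.5382549166679382]
B2 = -0.5698285102844238


def full_space_point(x):
    """Return (affine1, relu, affine2): one value per intermediate variable."""
    affine1 = [w * x + b for w, b in zip(W1, B1)]
    relu = [max(0.0, a) for a in affine1]
    affine2 = sum(w * r for w, r in zip(W2, relu)) + B2
    return affine1, relu, affine2


def max_violation(x, affine1, relu, affine2):
    """Largest absolute violation over the printed equality constraints."""
    # First Affine layer: W1[i]*x - affine1[i] = -B1[i]
    residuals = [W1[i] * x - affine1[i] + B1[i] for i in range(2)]
    # ReLU layer: relu[i] - max(0, affine1[i]) = 0
    residuals += [relu[i] - max(0.0, affine1[i]) for i in range(2)]
    # Second Affine layer: W2 . relu - affine2 = -B2
    residuals.append(sum(w * r for w, r in zip(W2, relu)) - affine2 + B2)
    return max(abs(r) for r in residuals)


affine1, relu, affine2 = full_space_point(1.0)
print(relu, affine2, max_violation(1.0, affine1, relu, affine2))
```

In the full-space formulation, each of these intermediate values is a decision variable (moai_Affine, moai_ReLU), and the solver enforces the relationships as constraints instead of computing them in sequence.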

Reduced-space

Use the reduced_space = true keyword to formulate a reduced-space model:

julia> using JuMP, MathOptAI, PythonCall
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> predictor = MathOptAI.PytorchModel("saved_pytorch_model.pt");
julia> y, formulation =
           MathOptAI.add_predictor(model, predictor, x; reduced_space = true);
julia> y
1-element Vector{JuMP.NonlinearExpr}:
 ((+(0) + (-0.4644489884376526 * max(0, 0.7473132610321045 x[1] - 0.27370238304138184))) + (0.5382549166679382 * max(0, -0.8237636089324951 x[1] - 0.08490216732025146))) + -0.5698285102844238
julia> formulation
ReducedSpace(Affine(A, b) [input: 1, output: 2])
├ variables [0]
└ constraints [0]
ReducedSpace(MathOptAI.ReLU())
├ variables [0]
└ constraints [0]
ReducedSpace(Affine(A, b) [input: 2, output: 1])
├ variables [0]
└ constraints [0]
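The reduced-space output is a single nested expression in x, with no intermediate variables or constraints. As an illustration, the sketch below writes the printed expression as one plain Python function. The coefficients are copied from the output above; reduced_space_output is a hypothetical name used only here.

```python
def reduced_space_output(x):
    """Evaluate the printed reduced-space expression for a scalar input x."""
    # Each hidden unit is substituted directly into the output expression.
    hidden1 = max(0.0, 0.7473132610321045 * x - 0.27370238304138184)
    hidden2 = max(0.0, -0.8237636089324951 * x - 0.08490216732025146)
    return (
        -0.4644489884376526 * hidden1
        + 0.5382549166679382 * hidden2
        - 0.5698285102844238
    )


# At x = 0 both ReLU terms are inactive, so only the final bias remains.
print(reduced_space_output(0.0))  # -0.5698285102844238
```

The trade-off is that the model has fewer variables and constraints, but the expression for y becomes more deeply nested as the network grows.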

Gray-box

Use the gray_box = true keyword to embed the network as a vector nonlinear operator:

julia> using JuMP, MathOptAI, PythonCall
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> predictor = MathOptAI.PytorchModel("saved_pytorch_model.pt");
julia> y, formulation =
           MathOptAI.add_predictor(model, predictor, x; gray_box = true);
julia> y
1-element Vector{JuMP.VariableRef}:
 moai_Pytorch[1]
julia> formulation
MathOptAI.GrayBox{MathOptAI.PytorchModel}(MathOptAI.PytorchModel("saved_pytorch_model.pt"), "cpu", true)
├ variables [1]
│ └ moai_Pytorch[1]
└ constraints [1]
  └ [x[1], moai_Pytorch[1]] ∈ VectorNonlinearOracle{Float64}(;
    dimension = 2,
    l = [0.0],
    u = [0.0],
    ...,
)
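One way to read the printed oracle is as a single residual in the two variables [x[1], moai_Pytorch[1]] that is bounded below and above by zero. The sketch below illustrates that reading in plain Python, under the assumption that the residual is f(x) - y; the sign convention and all function names are assumptions made for this example, not MathOptAI's documented internals. The network coefficients are those printed in the basic example, and the derivative is written by hand because the network is tiny.

```python
# Coefficients of the saved network, as printed in the basic example.
W1 = [0.7473132610321045, -0.8237636089324951]
B1 = [-0.27370238304138184, -0.08490216732025146]
W2 = [-0.4644489884376526, 0.5382549166679382]
B2 = -0.5698285102844238


def network(x):
    """Evaluate the network as a black box: Linear -> ReLU -> Linear."""
    hidden = [max(0.0, w * x + b) for w, b in zip(W1, B1)]
    return sum(w * h for w, h in zip(W2, hidden)) + B2


def network_derivative(x):
    """Derivative of network(x); a ReLU passes slope only where its input is positive."""
    return sum(
        (w2 * w1 for w1, b1, w2 in zip(W1, B1, W2) if w1 * x + b1 > 0.0),
        0.0,
    )


def oracle_residual(x, y):
    """Assumed residual: zero exactly when y equals the network output."""
    return network(x) - y


def oracle_jacobian(x):
    """Partial derivatives of the residual with respect to [x, y]."""
    return [network_derivative(x), -1.0]


def is_feasible(x, y, lower=0.0, upper=0.0, tol=1e-9):
    """Check lower <= residual <= upper, mirroring l = [0.0] and u = [0.0]."""
    return lower - tol <= oracle_residual(x, y) <= upper + tol


print(is_feasible(1.5, network(1.5)), oracle_jacobian(1.5))
```

In a gray-box formulation the solver sees only function and derivative evaluations like these, not the layer-by-layer structure of the network.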

Change how layers are formulated

Pass a dictionary to the config keyword that maps the Symbol name of each PyTorch layer to a MathOptAI predictor:

julia> using JuMP, MathOptAI, PythonCall
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> predictor = MathOptAI.PytorchModel("saved_pytorch_model.pt");
julia> y, formulation = MathOptAI.add_predictor(
           model,
           predictor,
           x;
           config = Dict(:ReLU => MathOptAI.ReLUSOS1),
       );
julia> y
1-element Vector{JuMP.VariableRef}:
 moai_Affine[1]
julia> formulation
Affine(A, b) [input: 1, output: 2]
├ variables [2]
│ ├ moai_Affine[1]
│ └ moai_Affine[2]
└ constraints [2]
  ├ 0.7473132610321045 x[1] - moai_Affine[1] = 0.27370238304138184
  └ -0.8237636089324951 x[1] - moai_Affine[2] = 0.08490216732025146
MathOptAI.ReLUSOS1()
├ variables [4]
│ ├ moai_ReLU[1]
│ ├ moai_ReLU[2]
│ ├ moai_z[1]
│ └ moai_z[2]
└ constraints [8]
  ├ moai_ReLU[1] ≥ 0
  ├ moai_z[1] ≥ 0
  ├ moai_Affine[1] - moai_ReLU[1] + moai_z[1] = 0
  ├ [moai_ReLU[1], moai_z[1]] ∈ MathOptInterface.SOS1{Float64}([1.0, 2.0])
  ├ moai_ReLU[2] ≥ 0
  ├ moai_z[2] ≥ 0
  ├ moai_Affine[2] - moai_ReLU[2] + moai_z[2] = 0
  └ [moai_ReLU[2], moai_z[2]] ∈ MathOptInterface.SOS1{Float64}([1.0, 2.0])
Affine(A, b) [input: 2, output: 1]
├ variables [1]
│ └ moai_Affine[1]
└ constraints [1]
  └ -0.4644489884376526 moai_ReLU[1] + 0.5382549166679382 moai_ReLU[2] - moai_Affine[1] = 0.5698285102844238
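The four ReLUSOS1 constraints per neuron encode max(0, a) without a nonlinear max term: the ReLU output and the slack z are both nonnegative, they differ by the affine value a, and the SOS1 set allows at most one of them to be nonzero. The sketch below checks this reading in plain Python; the function names are hypothetical and the code only illustrates the printed constraints, not MathOptAI's implementation.

```python
def relu_sos1_point(a):
    """Return (relu, z) satisfying the printed constraints for affine value a."""
    relu = max(0.0, a)   # positive part of a
    z = max(0.0, -a)     # negative part of a, stored as a nonnegative slack
    return relu, z


def satisfies_relu_sos1(a, relu, z, tol=1e-12):
    """Check the four printed constraints for one neuron."""
    nonnegative = relu >= 0.0 and z >= 0.0
    balance = abs(a - relu + z) <= tol      # a - relu + z = 0
    sos1 = relu == 0.0 or z == 0.0          # at most one entry is nonzero
    return nonnegative and balance and sos1


for a in (-2.0, 0.0, 3.5):
    relu, z = relu_sos1_point(a)
    print(a, relu, z, satisfies_relu_sos1(a, relu, z))

# Without the SOS1 condition, relu = 2, z = 1 would also balance a = 1.
print(satisfies_relu_sos1(1.0, 2.0, 1.0))  # False
```

The last line shows why the SOS1 set matters: the linear constraints alone admit points where the output exceeds max(0, a).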

Custom layers

If your PyTorch model contains a custom layer, define a new AbstractPredictor and pass a config dictionary that maps the layer's Python class object to a callback that builds the new predictor.

The callback must have the signature (layer::PythonCall.Py; kwargs...). Valid keyword arguments are currently:

input_size: the input size of the layer
config: the config dictionary, if needed to convert layers inside the custom layer
nn: a reference to torch.nn

You must always have kwargs... so that future versions of MathOptAI can add new keywords in a non-breaking way.

julia> using JuMP, PythonCall, MathOptAI
julia> dir = mktempdir()
"/tmp/jl_l9ekTW"
julia> write(
           joinpath(dir, "custom_model.py"),
           """
           import torch
           class Skip(torch.nn.Module):
               def __init__(self, inner):
                   super().__init__()
                   self.inner = inner
               def forward(self, x):
                   return self.inner(x) + x
           """,
       )
186
julia> filename = joinpath(dir, "custom_model.pt")
"/tmp/jl_l9ekTW/custom_model.pt"
julia> PythonCall.@pyexec(
           (dir, filename) =>
               """
               import sys
               sys.path.insert(0, dir)
               import torch
               from custom_model import Skip
               inner = torch.nn.Sequential(torch.nn.Linear(3, 3), torch.nn.ReLU())
               model = Skip(inner)
               torch.save(model, filename)
               """ => Skip,
       )
Python: <class 'custom_model.Skip'>
julia> struct CustomPredictor <: MathOptAI.AbstractPredictor
           p::MathOptAI.Pipeline
       end
julia> function MathOptAI.add_predictor(
           model::JuMP.AbstractModel,
           predictor::CustomPredictor,
           x::Vector;
           kwargs...,
       )
           y, formulation = MathOptAI.add_predictor(model, predictor.p, x; kwargs...)
           @assert length(x) == length(y)
           return y .+ x, formulation
       end
julia> model = Model();
julia> @variable(model, x[i in 1:3]);
julia> predictor = MathOptAI.PytorchModel(filename)
MathOptAI.PytorchModel("/tmp/jl_l9ekTW/custom_model.pt")
julia> function skip_callback(layer::PythonCall.Py; input_size, kwargs...)
           return CustomPredictor(MathOptAI.build_predictor(layer.inner))
       end
skip_callback (generic function with 1 method)
julia> config = Dict(Skip => skip_callback)
Dict{PythonCall.Py, typeof(Main.skip_callback)} with 1 entry:
  <class 'custom_model.Skip'> => skip_callback
julia> y, formulation = MathOptAI.add_predictor(model, predictor, x; config);
julia> y
3-element Vector{JuMP.AffExpr}:
 moai_ReLU[1] + x[1]
 moai_ReLU[2] + x[2]
 moai_ReLU[3] + x[3]
julia> formulation
Affine(A, b) [input: 3, output: 3]
├ variables [3]
│ ├ moai_Affine[1]
│ ├ moai_Affine[2]
│ └ moai_Affine[3]
└ constraints [3]
  ├ -0.1263355016708374 x[1] - 0.07469013333320618 x[2] - 0.1723976582288742 x[3] - moai_Affine[1] = -0.3124357759952545
  ├ -0.46936970949172974 x[1] + 0.017206240445375443 x[2] - 0.5485977530479431 x[3] - moai_Affine[2] = 0.20509101450443268
  └ -0.5314902067184448 x[1] - 0.33861589431762695 x[2] + 0.3549228608608246 x[3] - moai_Affine[3] = 0.14326581358909607
MathOptAI.ReLU()
├ variables [3]
│ ├ moai_ReLU[1]
│ ├ moai_ReLU[2]
│ └ moai_ReLU[3]
└ constraints [6]
  ├ moai_ReLU[1] ≥ 0
  ├ moai_ReLU[1] - max(0, moai_Affine[1]) = 0
  ├ moai_ReLU[2] ≥ 0
  ├ moai_ReLU[2] - max(0, moai_Affine[2]) = 0
  ├ moai_ReLU[3] ≥ 0
  └ moai_ReLU[3] - max(0, moai_Affine[3]) = 0
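As a cross-check of the output y = moai_ReLU + x, the sketch below evaluates the same skip connection numerically in plain Python. The weights and biases are copied from the constraints printed above (they come from this particular run's randomly initialized model); the names inner and skip are chosen for this illustration only.

```python
# Weights copied from the printed Affine constraints, one row per output.
W = [
    [-0.1263355016708374, -0.07469013333320618, -0.1723976582288742],
    [-0.46936970949172974, 0.017206240445375443, -0.5485977530479431],
    [-0.5314902067184448, -0.33861589431762695, 0.3549228608608246],
]
# "W x - a = c" rearranges to "a = W x - c", so each bias is the negated
# right-hand side of the corresponding constraint.
B = [0.3124357759952545, -0.20509101450443268, -0.14326581358909607]


def inner(x):
    """Linear(3, 3) followed by ReLU, matching the inner Sequential."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W, B)
    ]


def skip(x):
    """Skip connection: inner(x) + x, element by element."""
    return [r + xi for r, xi in zip(inner(x), x)]


print(skip([0.0, 0.0, 0.0]))  # [0.3124357759952545, 0.0, 0.0]
```

This mirrors what CustomPredictor does in the JuMP model: it formulates the inner pipeline and then returns y .+ x, which is why y is a vector of affine expressions and no extra variables appear.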