Flux.jl

Flux.jl is a library for machine learning in Julia.

The upstream documentation is available at https://fluxml.ai/Flux.jl/stable/.

Supported layers

MathOptAI supports embedding a Flux model into JuMP if it is a Flux.Chain composed of supported layers and activation functions; the examples below use Flux.Dense layers with the Flux.relu activation.

Basic example

Use MathOptAI.add_predictor to embed a Flux.Chain into a JuMP model:

julia> using JuMP, Flux, MathOptAI
julia> predictor = Flux.Chain(Flux.Dense(1 => 2, Flux.relu), Flux.Dense(2 => 1));
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> y, formulation = MathOptAI.add_predictor(model, predictor, x);
julia> y
1-element Vector{JuMP.VariableRef}:
 moai_Affine[1]

julia> formulation
Affine(A, b) [input: 1, output: 2]
├ variables [2]
│ ├ moai_Affine[1]
│ └ moai_Affine[2]
└ constraints [2]
  ├ 0.8960601687431335 x[1] - moai_Affine[1] = 0
  └ -0.284283846616745 x[1] - moai_Affine[2] = 0
MathOptAI.ReLU()
├ variables [2]
│ ├ moai_ReLU[1]
│ └ moai_ReLU[2]
└ constraints [4]
  ├ moai_ReLU[1] ≥ 0
  ├ moai_ReLU[1] - max(0, moai_Affine[1]) = 0
  ├ moai_ReLU[2] ≥ 0
  └ moai_ReLU[2] - max(0, moai_Affine[2]) = 0
Affine(A, b) [input: 2, output: 1]
├ variables [1]
│ └ moai_Affine[1]
└ constraints [1]
  └ 0.6243241429328918 moai_ReLU[1] - 0.330809086561203 moai_ReLU[2] - moai_Affine[1] = 0
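
The returned y can be used like any other JuMP variables, for example in an objective or in further constraints. The following is a minimal sketch, not part of the upstream example, that assumes the Ipopt solver is installed; the input value of 0.5 is arbitrary:

julia> import Ipopt
julia> set_optimizer(model, Ipopt.Optimizer)
julia> set_silent(model)
julia> fix.(x, 0.5);  # pin the network input to a constant
julia> @objective(model, Min, only(y));  # minimize the network output
julia> optimize!(model)
julia> value.(y);  # query the network output at the solution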

Reduced-space

Use the reduced_space = true keyword to formulate a reduced-space model:

julia> using JuMP, Flux, MathOptAI
julia> predictor = Flux.Chain(Flux.Dense(1 => 2, Flux.relu), Flux.Dense(2 => 1));
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> y, formulation = MathOptAI.add_predictor(model, predictor, x; reduced_space = true);
julia> y
1-element Vector{JuMP.NonlinearExpr}:
 ((+(0) + (-0.9877400398254395 * max(0, -0.23708461225032806 x[1]))) + (0.8143993020057678 * max(0, 0.18948295712471008 x[1]))) + 0

julia> formulation
ReducedSpace(Affine(A, b) [input: 1, output: 2])
├ variables [0]
└ constraints [0]
ReducedSpace(MathOptAI.ReLU())
├ variables [0]
└ constraints [0]
ReducedSpace(Affine(A, b) [input: 2, output: 1])
├ variables [0]
└ constraints [0]
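
Compared with the full-space formulation above, the reduced-space formulation substitutes each layer into a single nonlinear expression instead of introducing intermediate variables and constraints. As a quick check, here is a sketch using standard JuMP queries (not part of the upstream example):

julia> num_variables(model)  # 1: only x[1]; no intermediate variables were added
julia> num_constraints(model; count_variable_in_set_constraints = true)  # 0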

Gray-box

Use the gray_box = true keyword to embed the network as a vector nonlinear operator:

julia> using JuMP, Flux, MathOptAI
julia> predictor = Flux.Chain(Flux.Dense(1 => 2, Flux.relu), Flux.Dense(2 => 1));
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> y, formulation = MathOptAI.add_predictor(model, predictor, x; gray_box = true);
julia> y
1-element Vector{JuMP.VariableRef}:
 moai_Flux[1]

julia> formulation
MathOptAI.GrayBox{Flux.Chain{Tuple{Flux.Dense{typeof(NNlib.relu), Matrix{Float32}, Vector{Float32}}, Flux.Dense{typeof(identity), Matrix{Float32}, Vector{Float32}}}}}(Chain(Dense(1 => 2, relu), Dense(2 => 1)), "cpu", true)
├ variables [1]
│ └ moai_Flux[1]
└ constraints [1]
  └ [x[1], moai_Flux[1]] ∈ VectorNonlinearOracle{Float64}(; dimension = 2, l = [0.0], u = [0.0], ..., )
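
In the gray-box formulation the network is not unrolled into individual constraints; the chain is evaluated as an opaque vector nonlinear operator, so the solver must support the VectorNonlinearOracle constraint shown above. The following is a minimal solve sketch, assuming a recent version of Ipopt that supports this constraint type; the bounds and objective are illustrative and not part of the upstream example:

julia> import Ipopt
julia> set_optimizer(model, Ipopt.Optimizer)
julia> set_silent(model)
julia> set_lower_bound(x[1], 0.0);
julia> set_upper_bound(x[1], 1.0);
julia> @objective(model, Max, only(y));
julia> optimize!(model)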

Change how layers are formulated

Pass a dictionary to the config keyword that maps Flux activation functions to MathOptAI predictors:

julia> using JuMP, Flux, MathOptAI
julia> predictor = Flux.Chain(Flux.Dense(1 => 2, Flux.relu), Flux.Dense(2 => 1));
julia> model = Model();
julia> @variable(model, x[1:1]);
julia> y, formulation = MathOptAI.add_predictor(
           model,
           predictor,
           x;
           config = Dict(Flux.relu => MathOptAI.ReLUSOS1()),
       );
julia> y
1-element Vector{JuMP.VariableRef}:
 moai_Affine[1]

julia> formulation
Affine(A, b) [input: 1, output: 2]
├ variables [2]
│ ├ moai_Affine[1]
│ └ moai_Affine[2]
└ constraints [2]
  ├ -1.3564362525939941 x[1] - moai_Affine[1] = 0
  └ -0.7396637201309204 x[1] - moai_Affine[2] = 0
MathOptAI.ReLUSOS1()
├ variables [4]
│ ├ moai_ReLU[1]
│ ├ moai_ReLU[2]
│ ├ moai_z[1]
│ └ moai_z[2]
└ constraints [8]
  ├ moai_ReLU[1] ≥ 0
  ├ moai_z[1] ≥ 0
  ├ moai_Affine[1] - moai_ReLU[1] + moai_z[1] = 0
  ├ [moai_ReLU[1], moai_z[1]] ∈ MathOptInterface.SOS1{Float64}([1.0, 2.0])
  ├ moai_ReLU[2] ≥ 0
  ├ moai_z[2] ≥ 0
  ├ moai_Affine[2] - moai_ReLU[2] + moai_z[2] = 0
  └ [moai_ReLU[2], moai_z[2]] ∈ MathOptInterface.SOS1{Float64}([1.0, 2.0])
Affine(A, b) [input: 2, output: 1]
├ variables [1]
│ └ moai_Affine[1]
└ constraints [2]
  ├ moai_Affine[1] ≥ 0
  └ 0.4570336639881134 moai_ReLU[1] + 1.143480896949768 moai_ReLU[2] - moai_Affine[1] = 0
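
Because MathOptAI.ReLUSOS1 encodes each ReLU with a MathOptInterface.SOS1 constraint, the model must be passed to a solver that supports SOS1 constraints, typically a mixed-integer solver. The following is a minimal sketch, assuming a licensed copy of Gurobi is available; any SOS1-capable solver can be substituted, and the bounds are illustrative:

julia> import Gurobi
julia> set_optimizer(model, Gurobi.Optimizer)
julia> set_silent(model)
julia> set_lower_bound(x[1], -1.0);  # illustrative input bounds
julia> set_upper_bound(x[1], 1.0);
julia> @objective(model, Min, only(y));
julia> optimize!(model)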

Custom layers

If your Flux model contains a custom layer, define new methods for build_predictor and add_predictor:

julia> using JuMP, Flux, MathOptAI
julia> struct CustomLayer{T<:Flux.Chain}
           chain::T
       end
julia> (model::CustomLayer)(x) = model.chain(x) + x
julia> struct CustomPredictor <: MathOptAI.AbstractPredictor
           p::MathOptAI.Pipeline
       end
julia> function MathOptAI.build_predictor(model::CustomLayer)
           predictor = MathOptAI.build_predictor(model.chain)
           return CustomPredictor(predictor)
       end
julia> function MathOptAI.add_predictor(
           model::JuMP.AbstractModel,
           predictor::CustomPredictor,
           x::Vector;
           kwargs...,
       )
           y, formulation = MathOptAI.add_predictor(model, predictor.p, x; kwargs...)
           @assert length(x) == length(y)
           return y .+ x, formulation
       end
julia> model = Model();
julia> @variable(model, x[i in 1:3]);
julia> predictor = Flux.Chain(CustomLayer(Flux.Chain(Flux.Dense(3 => 3, Flux.relu))))
Chain(
  CustomLayer(
    Chain(
      Dense(3 => 3, relu),              # 12 parameters
    ),
  ),
)
julia> y, formulation = MathOptAI.add_predictor(model, predictor, x);
julia> y
3-element Vector{JuMP.AffExpr}:
 moai_ReLU[1] + x[1]
 moai_ReLU[2] + x[2]
 moai_ReLU[3] + x[3]

julia> formulation
Affine(A, b) [input: 3, output: 3]
├ variables [3]
│ ├ moai_Affine[1]
│ ├ moai_Affine[2]
│ └ moai_Affine[3]
└ constraints [3]
  ├ -0.062187910079956055 x[1] + 0.3033400774002075 x[2] - 0.006968379020690918 x[3] - moai_Affine[1] = 0
  ├ 0.1512216329574585 x[1] + 0.8705322742462158 x[2] - 0.5318925380706787 x[3] - moai_Affine[2] = 0
  └ 0.07990121841430664 x[1] + 0.7459365129470825 x[2] - 0.571891188621521 x[3] - moai_Affine[3] = 0
MathOptAI.ReLU()
├ variables [3]
│ ├ moai_ReLU[1]
│ ├ moai_ReLU[2]
│ └ moai_ReLU[3]
└ constraints [6]
  ├ moai_ReLU[1] ≥ 0
  ├ moai_ReLU[1] - max(0, moai_Affine[1]) = 0
  ├ moai_ReLU[2] ≥ 0
  ├ moai_ReLU[2] - max(0, moai_Affine[2]) = 0
  ├ moai_ReLU[3] ≥ 0
  └ moai_ReLU[3] - max(0, moai_Affine[3]) = 0
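
One way to sanity-check a custom layer is to fix the inputs, solve the resulting feasibility problem, and compare the solution against evaluating the Flux model directly. The following is a rough sketch, assuming Ipopt is installed; the input values are arbitrary, and the full-space ReLU constraints use max(0, ·), which is non-smooth at zero:

julia> import Ipopt
julia> set_optimizer(model, Ipopt.Optimizer)
julia> set_silent(model)
julia> x_val = [0.1, 0.2, 0.3];
julia> fix.(x, x_val);
julia> optimize!(model)
julia> value.(y);  # should approximately match predictor(Float32.(x_val))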