Speeding up gradients for large models (adjoint sensitivities)
For large models, adjoint sensitivity analysis is typically the most efficient way to compute gradients [24, 25] (a mathematical overview is available in [20]). PEtab.jl supports adjoint methods from SciMLSensitivity.jl.
In practice, the performance of adjoint sensitivity analysis is driven by three factors: the adjoint algorithm (quadrature), the Vector–Jacobian product (VJP) backend, and the ODE solver. This page discusses these options and assumes familiarity with PEtab's derivative methods (see Derivative methods (gradients and Hessians)). As a working example, we use the published Bachmann [28] model, which is available in the PEtab standard format (see Importing PEtab problems). The PEtab files can be downloaded from here, and given the problem YAML file, the model can be imported as:
using PEtab
# path_yaml depends on where the problem is stored
path_yaml = joinpath(@__DIR__, "bachmann", "Bachmann_MSB2011.yaml")
model = PEtabModel(path_yaml)
Tuning adjoint gradients
The Bachmann model has 25 ODEs and 113 estimated parameters. Even though gradient_method = :ForwardDiff performs best for this model (see below), it is a useful example for illustrating adjoint tuning. For adjoint gradients, performance is mainly driven by the following PEtabODEProblem options:
- odesolver_gradient: ODE solver and tolerances (abstol_adj, reltol_adj) used to solve the adjoint problem. In many cases CVODE_BDF() performs well.
- sensealg: Adjoint algorithm and its Vector–Jacobian product (VJP) backend. PEtab.jl supports InterpolatingAdjoint, QuadratureAdjoint, and GaussAdjoint from SciMLSensitivity. A key choice for these is the VJP backend; in practice ReverseDiffVJP(true) is currently the most stable option (a sketch of swapping the backend is shown after this list).
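The VJP backend is selected via the autojacvec keyword of the adjoint algorithm. As a minimal sketch (EnzymeVJP and ZygoteVJP are other backends provided by SciMLSensitivity; whether they work well for a given model has to be tested), the backend can be swapped as:
using SciMLSensitivity

# ReverseDiffVJP(true) compiles the reverse tape and is in practice the most stable backend
sensealg_reversediff = InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true))

# Alternative VJP backends provided by SciMLSensitivity; whether they work and how they
# perform depends on the model, so treat them as options to benchmark, not defaults
sensealg_enzyme = GaussAdjoint(autojacvec = EnzymeVJP())
sensealg_zygote = InterpolatingAdjoint(autojacvec = ZygoteVJP())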
Since QuadratureAdjoint is often less robust, this example compares InterpolatingAdjoint and GaussAdjoint:
using SciMLSensitivity, Sundials
# Relaxed adjoint tolerances (defaults are 1e-8) for a more robust adjoint solve
solver = ODESolver(CVODE_BDF(); abstol_adj = 1e-3, reltol_adj = 1e-6)
petab_prob1 = PEtabODEProblem(
model;
gradient_method = :Adjoint,
odesolver = solver,
odesolver_gradient = solver,
sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true)),
)
petab_prob2 = PEtabODEProblem(
model;
gradient_method = :Adjoint,
odesolver = solver,
odesolver_gradient = solver,
sensealg = GaussAdjoint(autojacvec = ReverseDiffVJP(true)),
)
Note that SciMLSensitivity must be loaded to use adjoints. The adjoint ODE-solver tolerances are set via abstol_adj/reltol_adj in ODESolver and apply only to the adjoint solve. Relaxing the adjoint tolerances (relative to the default 1e-8) can improve robustness (fewer failed gradient evaluations). Runtime can then be compared as:
using Printf
x = get_x(petab_prob1)
g1, g2 = similar(x), similar(x)
# First calls include compilation; time a subsequent call for each problem
petab_prob1.grad!(g1, x)
petab_prob2.grad!(g2, x)
b1 = @elapsed petab_prob1.grad!(g1, x)
b2 = @elapsed petab_prob2.grad!(g2, x)
@printf("Runtime InterpolatingAdjoint: %.1fs\n", b1)
@printf("Runtime GaussAdjoint: %.1fs\n", b2)
Runtime InterpolatingAdjoint: 0.4s
Runtime GaussAdjoint: 0.7s
In this setup InterpolatingAdjoint is faster (this can vary across machines).
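The timings above come from single calls and are therefore somewhat noisy. For more stable estimates the gradient calls can be benchmarked with BenchmarkTools.jl (assuming the package is installed), for example:
using BenchmarkTools
# @belapsed returns the minimum runtime in seconds over repeated evaluations
b1 = @belapsed petab_prob1.grad!($g1, $x)
b2 = @belapsed petab_prob2.grad!($g2, $x)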
Finally, even when gradient_method = :Adjoint is fastest, :ForwardDiff is preferable if it is not substantially slower. Adjoint gradients are more likely to fail (they require an additional difficult backward adjoint solve), and forward-mode methods often produce more accurate gradients [1].
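As a sketch of such a comparison (reusing x, g1, and the @elapsed-based timing from above; the ODE solver for the :ForwardDiff problem is left at its default here), one can build a forward-mode problem and compare both runtime and gradient values:
petab_prob_fd = PEtabODEProblem(model; gradient_method = :ForwardDiff)
g_fd = similar(x)
petab_prob_fd.grad!(g_fd, x)  # first call includes compilation
b_fd = @elapsed petab_prob_fd.grad!(g_fd, x)
@printf("Runtime ForwardDiff: %.1fs\n", b_fd)
# Forward-mode gradients are typically accurate, so they also serve as a reference
# for checking the adjoint gradient
@printf("Max abs. difference to InterpolatingAdjoint: %.2e\n", maximum(abs.(g_fd .- g1)))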
References
S. Persson, F. Fröhlich, S. Grein, T. Loman, D. Ognissanti, V. Hasselgren, J. Hasenauer and M. Cvijovic. PEtab.jl: advancing the efficiency and utility of dynamic modelling. Bioinformatics 41, btaf497 (2025).
F. Sapienza, J. Bolibar, F. Schäfer, B. Groenke, A. Pal, V. Boussange, P. Heimbach, G. Hooker, F. Pérez, P.-O. Persson and others. Differentiable Programming for Differential Equations: A Review, arXiv preprint arXiv:2406.09699 (2024).
F. Fröhlich, B. Kaltenbacher, F. J. Theis and J. Hasenauer. Scalable parameter estimation for genome-scale biochemical reaction networks. PLoS computational biology 13, e1005331 (2017).
Y. Ma, V. Dixit, M. J. Innes, X. Guo and C. Rackauckas. A comparison of automatic differentiation and continuous sensitivity analysis for derivatives of differential equation solutions. In: 2021 IEEE High Performance Extreme Computing Conference (HPEC) (IEEE, 2021); pp. 1–9.
J. Bachmann, A. Raue, M. Schilling, M. E. Böhm, C. Kreutz, D. Kaschek, H. Busch, N. Gretz, W. D. Lehmann, J. Timmer and others. Division of labor by dual feedback regulators controls JAK2/STAT5 signaling over broad ligand range. Molecular systems biology 7, 516 (2011).