Parameter fitting with automatic differentiation
Source:vignettes/parameter-fitting.Rmd
parameter-fitting.RmdA distinctive feature of odelia is that ODE systems are
templated on their scalar type, so the solver can run with an
automatic differentiation (AD) number type as well as ordinary
double. This lets the solver return the exact gradient of a
loss function — the mismatch between the solution and a set of target
observations — with respect to the system’s parameters. Those gradients
can then be handed to a gradient-based optimiser to calibrate the
model.
This vignette demonstrates the workflow by recovering known Lorenz parameters from a reference trajectory.
Generate a reference trajectory
First we solve the system with known “true” parameters to produce the data we will later try to recover.
true_pars <- c(sigma = 10.0, R = 28.0, b = 8.0 / 3.0)
lz <- LorenzSystem$new(sigma = true_pars[1], R = true_pars[2], b = true_pars[3])
lz$set_initial_state(c(1, 1, 1), t0 = 0)
ctrl <- OdeControl$new()
runner <- Lorenz_Solver$new(lz$ptr, ctrl$ptr)
runner$advance_adaptive(c(0, 1))
times <- runner$times()
hist <- runner$history()
# The internal solver times that coincide with stored output rows
obs_index <- which(times %in% hist$time)
target_vals <- as.matrix(hist[, c("x", "y", "z")])Set up an AD-enabled solver
We move the system parameters away from the truth (this is our
starting guess), then build a solver with active = TRUE to
enable AD, and register the calibration target.
initial_guess <- c(sigma = 12.0, R = 30.0, b = 3.0)
lz$set_params(initial_guess)
ad_runner <- Lorenz_Solver$new(lz$ptr, ctrl$ptr, active = TRUE)
ad_runner$set_target(times, target_vals, obs_index)The $fit() method solves the system for a given
parameter vector and returns both the loss and its exact gradient:
res0 <- ad_runner$fit(params = initial_guess)
res0$loss
#> [1] 3.480593
res0$gradient
#> [1] 1.0865743 0.4861987 1.9423226Optimise
Because each $fit() call returns both loss and gradient,
we cache the most recent result so the optimiser does not have to re-run
the solver to obtain the gradient separately.
memoise_fit <- function(runner) {
cache_p <- NULL
cache_res <- NULL
fn <- function(p) {
cache_res <<- runner$fit(params = p)
cache_p <<- p
cache_res$loss
}
gr <- function(p) {
if (!identical(p, cache_p)) fn(p) # populate cache if stale
cache_res$gradient
}
list(obj = fn, grad = gr)
}
helper <- memoise_fit(ad_runner)
res <- optim(
par = initial_guess,
fn = helper$obj,
gr = helper$grad,
method = "L-BFGS-B",
control = list(maxit = 100)
)
res$convergence # 0 indicates successful convergence
#> [1] 0
res$value # final loss, should be ~0
#> [1] 2.697385e-13Check the recovered parameters
rbind(true = true_pars, recovered = res$par)
#> sigma R b
#> true 10 28 2.666667
#> recovered 10 28 2.666667The optimiser recovers the true parameters to high precision, confirming that the gradients produced by AD are correct and usable for calibration.