As I have posted before, I never learned any statistics during my education as a theoretical physicist/applied mathematician. However, it became fairly apparent after I entered biology (although I managed to avoid it for a few years) that fitting models to data and estimating parameters is unavoidable. As I have opined multiple times previously, Bayesian inference and the Markov Chain Monte Carlo (MCMC) method is the best way to do this. I thought I would provide the details on how to implement the algorithm since most of the treatments on the subject are often couched in statistical language that may be hard to decipher for someone with a physics or dynamical systems background. I’ll do this in a series of posts and in a nonstandard order. Instead of building up the entire Bayesian formalism, I thought I would show how the MCMC method can be used to do the job and then later show how it fits into the grand scheme of things to do even more.
Suppose you have data , which consists of some measurements (either a scalar or a vector) at discrete time points
and a proposed model, which produces a time series
, where
represents a set of free parameters that changes the function. This model could be a system of differential equations or just some proposed function like a polynomial. The goal is to find the set of parameters that best fits the data and to evaluate how good the model is. Now, the correct way to do this is to use Bayesian inference and model comparison, which can be computed using the MCMC. However, the MCMC can also be used just to get the parameters in the sense of finding the best fit according to some criterion.