3 Statistical Modelling and Inference at the LHC
Life is complicated, but not uninteresting.
– Jerzy Neyman
In this chapter we consider the problem of extracting quantitative information about the validity or properties of the different theoretical models (see Chapter 1) from experimental data acquired in a controlled setting (see Chapter 2). We begin by formally defining the properties and structure of the statistical models used to link the parameters of interest with the experimental data, and then describe the inference problems that arise in experimental high-energy physics and how they can be tackled with statistical techniques. Some relevant particularities of the inference problems typically of interest to the LHC experiments are also discussed, mainly the generative-only nature of the simulation models and the high dimensionality of the data. As we will see, these issues are intimately related: because the simulation models can only generate samples, the likelihood cannot be evaluated directly, which requires likelihood-free inference techniques such as constructing non-parametric sample likelihoods; these in turn demand lower-dimensional summary statistics.