Parton distributions need representative sampling


In global QCD fits of parton distribution functions (PDFs), a large part of the estimated uncertainty on the PDFs originates from the choices of parametric functional forms and fitting methodology. We argue that these types of uncertainties can be underestimated with common PDF ensembles in high-stake measurements at the Large Hadron Collider and Tevatron. A fruitful approach to quantify these uncertainties is to view them as arising from sampling of allowed PDF solutions in a multidimensional parametric space. This approach applies powerful insights gained in recent statistical studies of large-scale population surveys and quasi-Monte Carlo integration methods. In particular, PDF fits may be affected by the big data paradox, which stipulates that more experimental data do not automatically raise the accuracy of PDFs – close attention to the data quality and sampling of possible PDF solutions is as essential. To test if the sampling of the PDF uncertainty of an experimental observable is truly representative of all acceptable solutions, we introduce a technique (“a hopscotch scan”) based on a combination of parameter scans and stochastic sampling. With this technique, we show that the PDF uncertainty on key LHC cross sections at 13 TeV obtained with the public NNPDF4.0 fitting code is larger than the nominal uncertainty obtained with the published NNPDF4.0 Monte-Carlo replica sets. For example, the uncertainties on the charm distribution at a large momentum fraction xx and gluon PDF at small xx are enlarged. In PDF ensembles obtained in the analytic minimization (Hessian) formalism, the tolerance on the PDF uncertainty must be based on sufficiently complete sampling of PDF functional forms and choices of the experiments.