To-select-or-not. [WP]

with A. Riha and Aki Vehtari. tbd, 2025.

Abstract. Model selection is a commonly suggested approach for mitigating the risk of poor generalisation or overfitting in a range of modelling scenarios, especially when models of increasing complexity are investigated. Bayesian modelling workflows often require the consideration of different candidate models, and approaches for model selection in the Bayesian framework aim to support the modeller in navigating potential trade-offs between model complexity and generalisability of the results to as yet unobserved data. In this work, we propose a change of perspective towards choosing generative priors, instead of relying on model selection after the fact. We revisit the issue of overfitting and clarify why model selection is not necessarily needed and can even be harmful in some modelling scenarios with finite data. When integrating over the posterior and using generatively consistent priors, even if those priors can be considered weakly informative, we can safely use flexible models with a large number of parameters. We illustrate the relevance of appropriate prior choices, as well as the limitations of and alternatives to model selection, in different modelling tasks using simulated and real-data examples.