Struggles with getting the (Bayesian) language right

Maybe it is a function of being originally trained in the Frequentist paradigm, but I am finding it hard to get my Bayesian language right.

At some point when you are learning about Bayesian things you will come across a statement that says something along these lines:

Frequentists believe that the data is random and that the parameter of interest is unknown and fixed. Bayesians, on the other hand, believe that the data is fixed, and that the parameter of interest is unknown and random.

At a certain level I am reasonably happy with this. The Bayesian statement leaves out something about the data generating mechanism being random, but on the whole – well, fine. However, I feel this does not mesh well with the statements we make when we report results. For example, let’s say we are doing a rather dull, but simple univariate analysis with inference about the mean. We have a set of observations, and we might propose a normal prior, a normal likelihood and consequently arrive at a normal posterior. How do we choose to summarise this information? Well, we might give a statement about the posterior density:

\mu|\mathbf{x}\sim N(\mu^\prime, (\sigma^{\prime})^2)

Or, we might give a (95%) credible interval:

\Pr(q_{0.025}^\prime\lt\mu\lt q_{0.975}^\prime\mid\mathbf{x})=0.95

Or we might just report the posterior mean and standard deviation. I think it is the last two where I question myself. Let’s take the credible interval. If we express it in words, then we say something along the lines of “I am 95% sure the mean lies between a and b” (where a and b are the respective lower and upper bounds of the credible interval). It seems implicit in this statement, at least to me (and feel free to disagree), that there is an underlying belief that there is a single true value for the mean. Similarly, if we report the posterior mean, there is a feeling (again, to me) that we are reporting our best estimate of the one true mean. Clearly this flies in the face of what we actually think as Bayesians.
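For concreteness, here is a minimal Python sketch of the three summaries discussed above for the simple normal-normal case. It assumes the standard deviation of the data is known, which the discussion above does not state, and the observations, the prior values and the function name `normal_posterior` are all made up purely for illustration.

```python
from statistics import NormalDist, fmean


def normal_posterior(x, prior_mean, prior_sd, sigma):
    """Conjugate update for a normal mean.

    Assumes (for illustration only) that the data standard deviation
    `sigma` is known, so the posterior for the mean is again normal.
    """
    n = len(x)
    prior_precision = 1.0 / prior_sd ** 2   # precision = 1 / variance
    data_precision = n / sigma ** 2
    post_precision = prior_precision + data_precision
    # Posterior mean is a precision-weighted average of prior mean and sample mean.
    post_mean = (prior_precision * prior_mean + data_precision * fmean(x)) / post_precision
    post_sd = post_precision ** -0.5
    return NormalDist(post_mean, post_sd)


# Hypothetical observations and prior, not taken from any real analysis.
x = [4.8, 5.1, 5.3, 4.9, 5.4]
posterior = normal_posterior(x, prior_mean=0.0, prior_sd=10.0, sigma=1.0)

# Summary 1: the posterior density itself, mu | x ~ N(mu', sigma'^2).
print(f"posterior: N({posterior.mean:.3f}, {posterior.stdev:.3f}^2)")

# Summary 2: a 95% equal-tailed credible interval from the posterior quantiles.
lower = posterior.inv_cdf(0.025)
upper = posterior.inv_cdf(0.975)
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")

# Summary 3: just the posterior mean and standard deviation.
print(f"posterior mean = {posterior.mean:.3f}, sd = {posterior.stdev:.3f}")
```

Nothing in this arithmetic cares whether we think of the mean as one true value or as a random quantity; that distinction lives entirely in the words we wrap around `lower` and `upper`.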

What brought this to the fore was working on the book my good friend David Lucy left unfinished.

David, like me, came to Bayesian thinking later in his academic career. He started out his academic life as an archaeologist, working on age estimation. If you are unfamiliar with this problem, it is essentially a classical calibration problem. You have a set of remains–perhaps teeth–from individuals of known age, and for whom you can measure some characteristic, say Y, with variability. The question is: given a new value of Y, can you estimate the age of the individual? It parallels the calibration problem because (in theory) the age is measured without error, whereas Y contains all the usual sources of variation. David became interested in Bayesian solutions to this problem, and based on his thesis work, was firmly convinced that it was the only way to approach it.

David liked ontology and epistemology, and as a consequence he introduces the idea of a Platonic view of statistics. I don’t know if this line will capture his thinking, but it suffices. He writes:

Statistical modelling is very Platonic in that it is the parameters of a model which are the ontologically real matter which control the way in which the world works.

Implicit in this statement, again perhaps only to me, is the idea that there is one true value. In fact, if David were around to argue the point, I would say, based on the little I know about Plato, that his ideas were essentially reductionist. That is perhaps not a flaw if the building blocks are probability distributions, and maybe this is what David was arguing.

Anyway – there is no natural conclusion to these thoughts, except perhaps to ask “Are we actually conveying our intent accurately with our language when it comes to reporting Bayesian results?”
