AGCI Session II: Characterizing and Communicating Scientific Uncertainty
Session Chairs: Dr. Richard H. Moss and Dr. Stephen H. Schneider
July 31 to August 8, 1996
Bayesian Approaches to Characterizing Uncertainty
Richard Berk
University of California, Los Angeles
Los Angeles, California
Berk began by noting certain limitations to what statisticians have to offer scientists grappling with issues of uncertainty. To begin, there are many "poorly posed" scientific questions that cannot be constructively addressed using statistical techniques. An instance of such a question is "is this model right?" This does not mean that the question is not important but simply that there is a lack of coherence between the tools statisticians bring to bear and what scientists want to know. There are also constraints on when available statistical tools apply. Too often, the scientific data are in a form inconsistent with what the statistical procedures require. One common case is "convenience" samples produced by unknown data generation mechanisms. And there are sometimes questions of scientific uncertainty for which no useful statistical tools exist at all. Finally, even when the scientific question is well posed, and existing statistical tools can be usefully applied, there may be practical constraints, such as insufficient computing power.
Frequentist Approach
Berk then addressed two statistical perspectives on uncertainty: frequentist and Bayesian. The frequentist approach constitutes the dominant paradigm among the field's elders and is the one scientists most often use, though among statisticians under the age of 40, Bayesian approaches are more common. For the frequentist, probability is defined as the proportion of times something happens in a limitless number of independent, identical trials. Simply put, it is the long-run relative frequency. This definition is clearly an abstraction, reflecting a "thought experiment" that could never be implemented in practice. Even so, the frequentist approach is arguably often scientifically useful and an instructive way to think about uncertainty.
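The long-run relative frequency idea can be illustrated with a small simulation. This is only a sketch and not part of Berk's presentation: the function name relative_frequency, the probability 0.3, and the trial counts are assumptions chosen for illustration. It shows the observed proportion settling toward the underlying probability as the number of independent, identical trials grows.

```python
import random

def relative_frequency(p_true, n_trials, seed=0):
    """Proportion of successes in n_trials independent, identical trials."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    successes = sum(rng.random() < p_true for _ in range(n_trials))
    return successes / n_trials

# Hypothetical event with probability 0.3; more trials should give a
# proportion closer to 0.3, though never exactly "limitless" trials.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency(0.3, n))
```

A computer can run many more trials than a field study can, but the simulation still only approximates the thought experiment; it does not show that real data satisfy it.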
Implications of the frequentist approach are:
1. The values to be estimated (i.e., the parameters) are fixed; they are knowable in principle and do not change. Uncertainty comes from the data.
2. We need a very friendly world for this definition to be useful; the world really is not a place where a very large number of trials can generally be expected to be even approximately independent and identical. So ...
   a. We can make the thought experiment more credible by using randomized trials or random samples when the data are collected.
   b. We can do the science to show that the thought experiment is sensible, e.g., applying the Poisson distribution to radioactive decay.
   c. We can become "science fiction writers," treating the data as "random realizations" from some hypothetical population. We really don't have to think; we simply pretend it is true (since it is unfalsifiable). In other words, we just assume the thought experiment applies. This is an unscientific but common approach.
And as usual there are lots of caveats:
   a. We should not confuse the statistical technique with the model being applied. For example, the bootstrap technique can simulate the thought experiment whether or not the thought experiment really applies to the problem at hand.
   b. Be careful about how confidence intervals (CIs) are interpreted. For example, we might take a sample and compute the mean, the standard deviation, and the CI, and then ask if the parameter (e.g., average temperature) is in that confidence interval. But we can't know that. The interval does not tell us whether the parameter is covered by the band computed in this study. It tells us only that the parameter would be covered by 95 percent of the bands we would construct if we did the study over and over using independent random samples (as in the thought experiment).
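The confidence interval caveat can be made concrete by actually running the thought experiment on a computer. The sketch below is illustrative only: the "true" average temperature of 15.0, the spread, the sample size, and the function name coverage are all assumptions, and the interval uses the normal approximation (1.96 standard errors) rather than any method named in the presentation. It repeats the study many times with independent random samples and counts how often the computed band covers the fixed parameter.

```python
import random
import statistics

def coverage(true_mean=15.0, sigma=2.0, n=50, n_studies=2000, seed=1):
    """Fraction of repeated studies whose 95 percent CI covers true_mean."""
    rng = random.Random(seed)
    z = 1.96  # normal approximation for a 95 percent interval
    hits = 0
    for _ in range(n_studies):
        # One "study": an independent random sample of size n.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        half_width = z * statistics.stdev(sample) / n ** 0.5
        # Count the study if its band covers the fixed parameter.
        hits += (mean - half_width <= true_mean <= mean + half_width)
    return hits / n_studies

print("share of bands covering the parameter:", coverage())
```

The share should come out near 95 percent across the repeated studies, yet any single band either covers the parameter or does not, and in a real study we cannot tell which.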
Bayesian Statistics
Bayes' theorem is not controversial in and of itself. To begin, the probability of two events equals the probability of one event times the conditional probability of the other event, given the first. After some simple manipulations we get to Bayes' theorem. There is no dispute about the mathematics. But when we use Bayes' theorem as a means to "learn," serious controversies follow. In outline form, we begin with a belief about some state of the world, consider that belief in light of the data, and then revise that belief accordingly. And a key point: those beliefs are represented in probabilistic terms so that probability reflects a state of mind, not long-run relative frequency. Expression in probability language becomes a means to convey what a person believes, and resides, therefore, "in here" and not "out there." Probability reflects what a person believes about the world and not directly the condition of the world itself.
Bayesian Inference
We begin with a prior probability density function representing
our beliefs about the parameter of interest before looking at the data. (By parameter, we mean some quantitative feature of whatever it is that we are studying.) If there is a "tight prior" (the density has a small spread), it means that we have relatively clear beliefs about the likely value of the parameter. If there is a relatively "flat prior" (the density has a larger spread), it means that our beliefs about the likely value of the parameter are unclear.
Then we introduce the likelihood function, which represents how probable the observed data are for each possible value of the parameter. The prior is then revised, based on this information, and the posterior distribution that results represents what we now believe about the parameter in light of the data. This updating process can be repeated again and again; the posterior from one analysis becomes the prior for the next. But in any case, the posterior distribution contains all of the information we have about the parameter and is used to draw conclusions about it.
In order to interpret and summarize results based on the posterior distribution, we might choose the mode of the posterior, the mean of the posterior, or a 95 percent interval of the posterior, usually called a credible interval (in which we can say we are 95 percent certain the value falls). This is consistent with the language scientists speak.
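The prior-to-posterior updating described above can be sketched for one simple case. The example assumes a normal prior on a mean and normally distributed data with a known standard deviation, a textbook setting in which the posterior is again normal; this choice, the function names update_normal and interval_95, and the temperature readings are illustrative assumptions, not material from the presentation.

```python
import math

def update_normal(prior_mean, prior_sd, data, data_sd):
    """Posterior (mean, sd) of a normal mean, assuming known data_sd."""
    n = len(data)
    prior_prec = 1.0 / prior_sd ** 2   # precision = 1 / variance
    data_prec = n / data_sd ** 2       # precision contributed by the data
    post_prec = prior_prec + data_prec
    ybar = sum(data) / n
    # Posterior mean is a precision-weighted average of prior and data.
    post_mean = (prior_prec * prior_mean + data_prec * ybar) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

def interval_95(mean, sd):
    """Central 95 percent interval of a normal posterior."""
    return mean - 1.96 * sd, mean + 1.96 * sd

data = [15.8, 16.4, 15.1, 16.9, 16.0, 15.6]  # hypothetical readings

tight = update_normal(14.0, 0.5, data, 1.0)   # tight prior: clear beliefs
flat = update_normal(14.0, 10.0, data, 1.0)   # flat prior: unclear beliefs
print("tight prior:", tight, interval_95(*tight))
print("flat prior: ", flat, interval_95(*flat))

# Repeated updating: the posterior from the first batch of data
# serves as the prior for the second batch.
step1 = update_normal(14.0, 0.5, data[:3], 1.0)
step2 = update_normal(step1[0], step1[1], data[3:], 1.0)
print("two-step update:", step2)
```

With the tight prior the posterior mean stays closer to the prior value of 14.0, while with the flat prior it sits close to the sample mean. For a normal posterior the mode and the mean coincide, so either summary gives the same value here. Updating in two steps gives the same posterior as using all the data at once.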
Still, one can play this Bayesian game well and still come upon various complications. For example, what if the resulting distribution has two peaks separated by a valley? This makes the posterior very difficult to summarize. Note also that there is a big difference between the distribution of data and the distribution of the parameter of interest. The focus here is on what individuals believe about the parameter.
Extensions of this approach include multiparameter problems (which require joint probability distribution functions, and are thus much harder), non-normal probability distribution functions, and model comparisons.
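The difficulty of summarizing a two-peaked posterior, mentioned above, can be seen in a small numerical sketch. The posterior used here is hypothetical: an equal mixture of two normal densities centered at -2 and 2, chosen only to produce two peaks separated by a valley. The grid spacing and all names are likewise assumptions for illustration.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def posterior(x):
    # Hypothetical two-peaked posterior: equal mixture of two normals.
    return 0.5 * normal_pdf(x, -2.0, 0.5) + 0.5 * normal_pdf(x, 2.0, 0.5)

# Evaluate the posterior on a grid and summarize it numerically.
step = 0.01
grid = [-6.0 + i * step for i in range(1201)]
weights = [posterior(x) * step for x in grid]
total = sum(weights)
post_mean = sum(x * w for x, w in zip(grid, weights)) / total
mode = max(grid, key=posterior)  # one of the two peaks

print("posterior mean:", post_mean)
print("a posterior mode:", mode)
print("density at the mean:", posterior(post_mean))
print("density at the mode:", posterior(mode))
```

In this constructed case the posterior mean lands in the valley, where the density is far lower than at either peak, so the mean describes a value that is hardly believed at all. Reporting a single mode would ignore the other peak.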