
AGCI Session II: Characterizing and Communicating Scientific Uncertainty
Session Chairs: Dr. Richard H. Moss and Dr. Stephen H. Schneider
July 31 to August 8, 1996
Statistics Refresher
1 The mean is the arithmetic average of a variable.
2 The median is the value beneath which half of the observations for a variable fall.
3 The mode is the most common value for a variable.
4 The mean, median and mode are called measures of location or measures of central tendency.
5 The standard deviation is a measure of how heterogeneous (spread out) the values of a variable are. It is the square root of the average of the squared deviations around the mean: the sum of the squared deviations is divided by the number of observations (or by one less than that number when estimating from a sample), and the square root of the result is taken. A "deviation" is the arithmetic difference between each value and the mean of the variable. One can think of the standard deviation as roughly (not literally) the average disparity between the mean and each value.
6 The range is another measure of heterogeneity and is the difference between the maximum and minimum value.
7 The interquartile range is another measure of heterogeneity and is the difference between the 75th percentile and the 25th percentile. It is the interval in which the middle 50 percent of the values fall.
8 The standard deviation, range, and interquartile range are called measures of variability or spread.
9 A probability distribution function is the set of probabilities, each associated with a discrete value of the variable in question.
10 A probability density function contains the same information as a probability distribution function but for variables whose values are continuous, not discrete.
11 Commonly used probability distribution functions are the binomial, the multinomial, and the Poisson. Commonly used probability density functions are the normal, t, chi-squared, and F.
12 A joint probability is the probability of two or more events occurring together. A joint probability distribution is the set of such probabilities for every combination of the events. For example, suppose in the future the climate will either have more precipitation or less and be either warmer or colder. There are four combinations, then, with a probability attached to each. Those four probabilities, one attached to each combination, constitute a joint probability distribution.
13 A cumulative probability is the sum (or integral) of the probabilities (or densities) less than or equal to some value. For example, one might compute the probability that the temperature increase due to global warming is less than or equal to 3°C.
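The definitions above can be sketched in a few lines of Python using only the standard library's statistics module. This is an illustration, not part of the original refresher: the temperature values, the joint probabilities, and the normal distribution assumed for the warming example are all hypothetical, chosen so the arithmetic is easy to check by hand. The standard deviation is computed in its population form (dividing by the number of observations).

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical daily high temperatures (°F); invented for illustration.
temps = [62, 66, 70, 70, 74, 78, 80, 84, 91]

# Measures of location.
mean = statistics.mean(temps)      # arithmetic average
median = statistics.median(temps)  # middle value of the sorted data
mode = statistics.mode(temps)      # most common value

# Measures of spread.
# Population standard deviation: square root of the mean squared deviation.
sd = statistics.pstdev(temps)
value_range = max(temps) - min(temps)
# Quartile cut points; "inclusive" treats the data as the whole population.
q1, q2, q3 = statistics.quantiles(temps, n=4, method="inclusive")
iqr = q3 - q1

# Joint distribution for the four climate combinations (made-up probabilities).
joint = {
    ("warmer", "wetter"): 0.40,
    ("warmer", "drier"): 0.30,
    ("colder", "wetter"): 0.10,
    ("colder", "drier"): 0.20,
}
# Summing over precipitation outcomes gives the probability of "warmer".
p_warmer = sum(p for (temp, _), p in joint.items() if temp == "warmer")

# Cumulative probability that warming is <= 3°C, ASSUMING (hypothetically)
# that warming is normally distributed with mean 2.5°C and sd 1.0°C.
p_le_3 = NormalDist(mu=2.5, sigma=1.0).cdf(3.0)

print(mean, median, mode, round(sd, 2), value_range, iqr)
print(round(p_warmer, 2), round(p_le_3, 3))
```

With these made-up inputs the mean is 75, the median 74, the mode 70, the range 29, and the interquartile range 10. The four joint probabilities must sum to 1, as any probability distribution does.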
When comparing values from two different variables or the same variable measured on two different objects, we often cannot compare the values directly. For example, how could one compare a temperature of 30°C to rainfall of half a meter? Or how can one compare a hot day in Los Angeles to a hot day in Seattle? So, we standardize. One common way is with percentiles. We might say that a high temperature of 90°F is at the 75th percentile for Los Angeles summer days while a high temperature of 90°F is at the 95th percentile for Seattle summer days. We can then compare percentiles.
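As a rough sketch of standardizing with percentiles, the Python below computes where a given temperature falls within each city's own record. The helper name percentile_rank and both temperature samples are hypothetical, invented for this illustration. They are not real climate data and are not meant to reproduce the percentiles quoted above.

```python
def percentile_rank(values, x):
    """Percent of observations in `values` that are less than or equal to x."""
    at_or_below = sum(1 for v in values if v <= x)
    return 100.0 * at_or_below / len(values)

# Hypothetical summer daily highs (°F); both samples are invented.
los_angeles = [78, 80, 82, 84, 85, 86, 88, 90, 92, 95]
seattle = [64, 66, 68, 70, 72, 74, 76, 79, 90, 93]

# The same 90°F reading sits at a different percentile in each city.
la_pct = percentile_rank(los_angeles, 90)
sea_pct = percentile_rank(seattle, 90)

print(la_pct, sea_pct)
```

Because each percentile is computed against the city's own values, the two results are directly comparable even though the underlying temperature distributions differ.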
Another way is to compare z-scores, which are in standard deviation units. Thus, a 90°F temperature in Los Angeles might have a z-score of around 1.5, which means it is 1.5 standard deviations above the mean. Since a standard deviation is roughly the average distance from the mean, a Los Angeles temperature of 90°F is a bit more than an average departure from the mean. A 90°F temperature for Seattle might have a z-score of around 2.0, which means that it is about 2 standard deviations above the mean, or about twice the average distance from the mean. So z-scores and percentiles get at much the same thing, and in fact they are very closely related. For example, when one can work with the normal density function for the problem at hand, one can move easily between z-scores and percentiles: z-scores are monotonically related to percentiles.
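The move between z-scores and percentiles can be sketched with Python's standard-library NormalDist class. The city means and standard deviations below are hypothetical, picked only so that 90°F yields z-scores of 1.5 and 2.0. The conversion to percentiles is valid only under the assumption that the normal density applies.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Distance of x from the mean, in standard deviation units."""
    return (x - mean) / sd

# Hypothetical summer means and standard deviations (°F), not real data.
la_z = z_score(90, mean=81.0, sd=6.0)
sea_z = z_score(90, mean=74.0, sd=8.0)

# Under a normal assumption, the cumulative probability at z is the percentile.
std_normal = NormalDist()  # mean 0, standard deviation 1
la_pct = 100 * std_normal.cdf(la_z)
sea_pct = 100 * std_normal.cdf(sea_z)

# The relationship is monotonic, so it can be inverted: percentile -> z-score.
z_back = std_normal.inv_cdf(la_pct / 100)

print(la_z, sea_z, round(la_pct, 1), round(sea_pct, 1), round(z_back, 3))
```

Under the normal assumption a z-score of 1.5 corresponds to roughly the 93rd percentile and a z-score of 2.0 to roughly the 98th. Percentiles computed directly from observed data need not match these, since real temperatures are not exactly normal.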