The definitions of the basic statistical terms given in this annex are taken from International Standard ISO 3534‑1:1993* [7]. That standard should be the first source consulted for the definitions of terms not included here. To further facilitate the use of this Guide, some of these terms and their underlying concepts are elaborated upon in C.3, following the presentation of their formal definitions in C.2. However, C.3, which also includes the definitions of some related terms, is not based directly on ISO 3534‑1:1993.
As in Clause 2 and Annex B, the use of parentheses around certain words of some terms means that the words may be omitted if this is unlikely to cause confusion.
Terms C.2.1 to C.2.14 are defined in terms of the properties of populations. The definitions of terms C.2.15 to C.2.31 are related to a set of observations (see Reference [7]).
NOTE It can be related to a long‑run relative frequency of occurrence or to a degree of belief that an event will occur. For a high degree of belief, the probability is near 1.
[ISO 3534‑1:1993, definition 1.1]
NOTE 1 A random variable that may take only isolated values is said to be “discrete”. A random variable which may take any value within a finite or infinite interval is said to be “continuous”.
NOTE 2 The probability of an event A is denoted by Pr(A) or P(A).
[ISO 3534‑1:1993, definition 1.2]
Guide Comment: The symbol Pr(A) is used in this Guide in place of the symbol $P_r(A)$ (with r as a subscript) used in ISO 3534‑1:1993.
NOTE The probability on the whole set of values of the random variable equals 1.
[ISO 3534‑1:1993, definition 1.3]
NOTE f(x) dx is the “probability element”:
$$f(x)\,\mathrm{d}x = \Pr(x < X < x + \mathrm{d}x)$$
NOTE Most statistical measures of correlation measure only the degree of linear relationship.
[ISO 3534‑1:1993, definition 1.13]
NOTE If the random variable X has an expectation equal to μ, the corresponding centred random variable is (X − μ).
[ISO 3534‑1:1993, definition 1.21]
NOTE The central moment of order 2 is the variance [ISO 3534‑1:1993, definition 1.22 (C.2.11)] of the random variable X.
[ISO 3534‑1:1993, definition 1.28]
NOTE μ is the expectation and σ is the standard deviation of the normal distribution.
[ISO 3534‑1:1993, definition 1.37]
NOTE The characteristic may be either quantitative (by variables) or qualitative (by attributes).
[ISO 3534‑1:1993, definition 2.2]
NOTE In the case of a random variable, the probability distribution [ISO 3534‑1:1993, definition 1.3 (C.2.3)] is considered to define the population of that variable.
[ISO 3534‑1:1993, definition 2.3]
NOTE The distribution may be graphically presented as a histogram (ISO 3534‑1:1993, definition 2.17), bar chart (ISO 3534‑1:1993, definition 2.18), cumulative frequency polygon (ISO 3534‑1:1993, definition 2.19), or as a two‑way table (ISO 3534‑1:1993, definition 2.22).
[ISO 3534‑1:1993, definition 2.15]
NOTE 1 The term “mean” is used generally when referring to a population parameter and the term “average” when referring to the result of a calculation on the data obtained in a sample.
NOTE 2 The average of a simple random sample taken from a population is an unbiased estimator of the mean of this population. However, other estimators, such as the geometric or harmonic mean, or the median or mode, are sometimes used.
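[ISO 3534‑1:1993, definition 2.26]
EXAMPLE For n observations $x_1, x_2, \ldots, x_n$ with average
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
the variance is
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$$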
NOTE 1 The sample variance is an unbiased estimator of the population variance.
NOTE 2 The variance is n∕(n − 1) times the central moment of order 2 (see note to ISO 3534‑1:1993, definition 2.39).
[ISO 3534‑1:1993, definition 2.33]
Guide Comment: The variance defined here is more appropriately designated the “sample estimate of the population variance”. The variance of a sample is usually defined to be the central moment of order 2 of the sample (see C.2.13 and C.2.22).
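The distinction drawn in the Guide Comment can be illustrated with a short Python sketch that computes both the variance with divisor n − 1 and the central moment of order 2 with divisor n. The function names and the observations are assumptions made up for this illustration; they do not come from ISO 3534‑1:1993.

```python
def average(xs):
    """Arithmetic mean (average) of the observations."""
    return sum(xs) / len(xs)


def sample_variance(xs):
    """Variance with divisor n - 1 (sample estimate of the population variance)."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")
    xbar = average(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)


def central_moment_order_2(xs):
    """Central moment of order 2 of the sample (divisor n)."""
    xbar = average(xs)
    return sum((x - xbar) ** 2 for x in xs) / len(xs)


# Hypothetical observations, chosen only for illustration.
observations = [10.1, 9.9, 10.0, 10.2, 9.8]
n = len(observations)
s2 = sample_variance(observations)
m2 = central_moment_order_2(observations)

# The two quantities differ by the factor n / (n - 1).
print(s2, m2, n / (n - 1) * m2)
```

For small n the two values differ noticeably, which is why it matters which divisor a reported "variance" uses.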
NOTE The sample standard deviation is a biased estimator of the population standard deviation.
[ISO 3534‑1:1993, definition 2.34]
NOTE The central moment of order 1 is equal to zero.
[ISO 3534‑1:1993, definition 2.37]
NOTE A statistic, as a function of random variables, is also a random variable and as such it assumes different values from sample to sample. The value of the statistic obtained by using the observed values in this function may be used in a statistical test or as an estimate of a population parameter, such as a mean or a standard deviation.
[ISO 3534‑1:1993, definition 2.45]
NOTE A result of this operation may be expressed as a single value [point estimate; see ISO 3534‑1:1993, definition 2.51 (C.2.26)] or as an interval estimate [see ISO 3534‑1:1993, definitions 2.57 (C.2.27) and 2.58 (C.2.28)].
[ISO 3534‑1:1993, definition 2.49]
NOTE 1 The limits T1 and T2 of the confidence interval are statistics [ISO 3534‑1:1993, definition 2.45 (C.2.23)] and as such will generally assume different values from sample to sample.
NOTE 2 In a long series of samples, the relative frequency of cases where the true value of the population parameter θ is covered by the confidence interval is greater than or equal to (1 − α).
[ISO 3534‑1:1993, definition 2.57]
NOTE 1 The limit T of the confidence interval is a statistic [ISO 3534‑1:1993, definition 2.45 (C.2.23)] and as such will generally assume different values from sample to sample.
NOTE 2 See Note 2 of ISO 3534‑1:1993, definition 2.57 (C.2.27).
[ISO 3534‑1:1993, definition 2.58]
NOTE (1 − α) is often expressed as a percentage.
[ISO 3534‑1:1993, definition 2.59]
NOTE 1 When both limits are defined by statistics, the interval is two‑sided. When one of the two limits is not finite or consists of the boundary of the variable, the interval is one‑sided.
NOTE 2 Also called “statistical tolerance interval”. This term should not be used because it may cause confusion with “tolerance interval” which is defined in ISO 3534‑2:1993.
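[ISO 3534‑1:1993, definition 2.61]
The expectation of a function g(z) over a probability density function p(z) of the random variable z is defined by
$$E[g(z)] = \int g(z)\,p(z)\,\mathrm{d}z$$
where, from the definition of p(z), $\int p(z)\,\mathrm{d}z = 1$. The expectation of the random variable z, denoted by $\mu_z$ and also termed the expected value or the mean of z, is given by
$$\mu_z \equiv E(z) = \int z\,p(z)\,\mathrm{d}z$$
It is estimated statistically by $\bar{z}$, the arithmetic mean or average of n independent observations $z_i$ of the random variable z, the probability density function of which is p(z):
$$\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i$$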
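The variance of a random variable is the expectation of its quadratic deviation about its expectation. Thus the variance of random variable z with probability density function p(z) is given by
$$\sigma^2(z) = \int (z - \mu_z)^2\,p(z)\,\mathrm{d}z$$
where $\mu_z$ is the expectation of z. The variance $\sigma^2(z)$ may be estimated by
$$s^2(z_i) = \frac{1}{n-1}\sum_{j=1}^{n} (z_j - \bar{z})^2$$
where $\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i$ and the $z_i$ are n independent observations of z.
NOTE 1 The factor n − 1 in the expression for $s^2(z_i)$ arises from the correlation between $z_i$ and $\bar{z}$ and reflects the fact that there are only n − 1 independent items in the set $\{z_i - \bar{z}\}$.
NOTE 2 If the expectation $\mu_z$ of z is known, the variance may be estimated by
$$s^2(z_i) = \frac{1}{n}\sum_{i=1}^{n} (z_i - \mu_z)^2$$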
The variance of the arithmetic mean or average of the observations, rather than the variance of the individual observations, is the proper measure of the uncertainty of a measurement result. The variance of a variable z should be carefully distinguished from the variance of the mean $\bar{z}$. The variance of the arithmetic mean of a series of n independent observations $z_i$ of z is given by $\sigma^2(\bar{z}) = \sigma^2(z_i)/n$ and is estimated by the experimental variance of the mean
$$s^2(\bar{z}) = \frac{s^2(z_i)}{n} = \frac{1}{n(n-1)}\sum_{i=1}^{n} (z_i - \bar{z})^2$$
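The difference between the variance of the individual observations and the variance of their mean can be made concrete with a minimal Python sketch. It assumes the usual estimators, with divisor n − 1 for the observations and a further division by n for the mean; the function names and the data are hypothetical.

```python
import math


def experimental_variance(z):
    """Experimental variance of the observations (divisor n - 1)."""
    n = len(z)
    if n < 2:
        raise ValueError("at least two observations are required")
    z_bar = sum(z) / n
    return sum((zi - z_bar) ** 2 for zi in z) / (n - 1)


def experimental_variance_of_mean(z):
    """Experimental variance of the mean: variance of the observations divided by n."""
    return experimental_variance(z) / len(z)


# Hypothetical repeated observations of the same quantity.
z = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

z_bar = sum(z) / len(z)                    # arithmetic mean
s2_zi = experimental_variance(z)           # spread of individual observations
s2_zbar = experimental_variance_of_mean(z)  # spread attributed to the mean
s_zbar = math.sqrt(s2_zbar)                # experimental standard deviation of the mean

print(z_bar, s2_zi, s2_zbar, s_zbar)
```

The variance of the mean shrinks as 1/n when further independent observations are added, whereas the variance of the individual observations does not.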
The standard deviation is the positive square root of the variance. Whereas a Type A standard uncertainty is obtained by taking the square root of the statistically evaluated variance, it is often more convenient when determining a Type B standard uncertainty to evaluate a nonstatistical equivalent standard deviation first and then to obtain the equivalent variance by squaring the standard deviation.
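The covariance of two random variables is a measure of their mutual dependence. The covariance of random variables y and z is defined by
$$\operatorname{cov}(y, z) = \operatorname{cov}(z, y) = E\{[y - E(y)][z - E(z)]\}$$
which leads to
$$\operatorname{cov}(y, z) = \iint (y - \mu_y)(z - \mu_z)\,p(y, z)\,\mathrm{d}y\,\mathrm{d}z = \iint y z\,p(y, z)\,\mathrm{d}y\,\mathrm{d}z - \mu_y \mu_z$$
where p(y, z) is the joint probability density function of the two variables y and z. The covariance cov(y, z) [also denoted by υ(y, z)] may be estimated by $s(y_i, z_i)$, obtained from n independent pairs of simultaneous observations $y_i$ and $z_i$ of y and z:
$$s(y_i, z_i) = \frac{1}{n-1}\sum_{j=1}^{n} (y_j - \bar{y})(z_j - \bar{z})$$
where $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $\bar{z} = \frac{1}{n}\sum_{i=1}^{n} z_i$.
NOTE The estimated covariance of the two means $\bar{y}$ and $\bar{z}$ is given by $s(\bar{y}, \bar{z}) = s(y_i, z_i)/n$.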
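A minimal Python sketch of these estimates follows. It assumes the covariance is estimated from n paired observations with divisor n − 1 and that the covariance of the two means is obtained by a further division by n, as stated in the note. The function name and the data are hypothetical.

```python
def estimated_covariance(y, z):
    """Estimated covariance of paired observations (divisor n - 1)."""
    if len(y) != len(z):
        raise ValueError("y and z must contain the same number of observations")
    n = len(y)
    if n < 2:
        raise ValueError("at least two pairs are required")
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    return sum((yj - y_bar) * (zj - z_bar) for yj, zj in zip(y, z)) / (n - 1)


# Hypothetical pairs of simultaneous observations.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [2.0, 4.0, 5.0, 4.0, 5.0]

s_yz = estimated_covariance(y, z)  # covariance of the observations
s_means = s_yz / len(y)            # covariance of the two means

print(s_yz, s_means)
```

Passing the same list twice, as in `estimated_covariance(y, y)`, returns the experimental variance of y, which is how the diagonal elements of a covariance matrix arise.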
For a multivariate probability distribution, the matrix V with elements equal to the variances and covariances of the variables is termed the covariance matrix. The diagonal elements, $\upsilon(z, z) \equiv \sigma^2(z)$ or $s(z_i, z_i) \equiv s^2(z_i)$, are the variances, while the off‑diagonal elements, $\upsilon(y, z)$ or $s(y_i, z_i)$, are the covariances.
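The correlation coefficient is a measure of the relative mutual dependence of two variables, equal to the ratio of their covariance to the positive square root of the product of their variances. Thus
$$\rho(y, z) = \rho(z, y) = \frac{\upsilon(y, z)}{\sqrt{\upsilon(y, y)\,\upsilon(z, z)}} = \frac{\upsilon(y, z)}{\sigma(y)\,\sigma(z)}$$
with estimates
$$r(y_i, z_i) = r(z_i, y_i) = \frac{s(y_i, z_i)}{\sqrt{s(y_i, y_i)\,s(z_i, z_i)}} = \frac{s(y_i, z_i)}{s(y_i)\,s(z_i)}$$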
NOTE 1 Because ρ and r are pure numbers in the range −1 to +1 inclusive, while covariances are usually quantities with inconvenient physical dimensions and magnitudes, correlation coefficients are generally more useful than covariances.
NOTE 2 For multivariate probability distributions, the correlation coefficient matrix is usually given in place of the covariance matrix. Since ρ(y, y) = 1 and r(yi, yi) = 1, the diagonal elements of this matrix are unity.
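NOTE 3 If the input estimates $x_i$ and $x_j$ are correlated (see 5.2.2) and if a change $\delta_i$ in $x_i$ produces a change $\delta_j$ in $x_j$, then the correlation coefficient associated with $x_i$ and $x_j$ is estimated approximately by
$$r(x_i, x_j) \approx \frac{u(x_i)\,\delta_j}{u(x_j)\,\delta_i}$$
where $u(x_i)$ and $u(x_j)$ are the standard uncertainties associated with $x_i$ and $x_j$.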
This relation can serve as a basis for estimating correlation coefficients experimentally. It can also be used to calculate the approximate change in one input estimate due to a change in another if their correlation coefficient is known.
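The estimated correlation coefficient of two sets of paired observations can be sketched in Python as the estimated covariance divided by the product of the two experimental standard deviations. The helper names and the data below are assumptions made for illustration only.

```python
import math


def estimated_covariance(y, z):
    """Estimated covariance of paired observations (divisor n - 1)."""
    if len(y) != len(z) or len(y) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    n = len(y)
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    return sum((yj - y_bar) * (zj - z_bar) for yj, zj in zip(y, z)) / (n - 1)


def estimated_correlation(y, z):
    """Estimated correlation coefficient: covariance over the root of the variance product."""
    s_yy = estimated_covariance(y, y)  # experimental variance of y
    s_zz = estimated_covariance(z, z)  # experimental variance of z
    return estimated_covariance(y, z) / math.sqrt(s_yy * s_zz)


# Hypothetical pairs of simultaneous observations.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
z = [2.0, 4.0, 5.0, 4.0, 5.0]

r_yz = estimated_correlation(y, z)
r_yy = estimated_correlation(y, y)  # a variable is fully correlated with itself

# Correlation coefficient matrix for the two variables; the diagonal is unity.
correlation_matrix = [[r_yy, r_yz],
                      [r_yz, estimated_correlation(z, z)]]
print(correlation_matrix)
```

Because the result is dimensionless, it stays the same if y or z is rescaled by a positive factor, for example by a change of unit, which is not true of the covariance itself.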
Two random variables are statistically independent if their joint probability distribution is the product of their individual probability distributions.
NOTE If two random variables are independent, their covariance and correlation coefficient are zero, but the converse is not necessarily true.
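A small Python example, with data invented for the purpose, shows why the converse fails. Here y is completely determined by z through y = z², so the two are certainly not independent, yet their estimated covariance is zero because the values of z are symmetric about zero.

```python
def estimated_covariance(y, z):
    """Estimated covariance of paired observations (divisor n - 1)."""
    if len(y) != len(z) or len(y) < 2:
        raise ValueError("need two equally long series with at least two pairs")
    n = len(y)
    y_bar = sum(y) / n
    z_bar = sum(z) / n
    return sum((yj - y_bar) * (zj - z_bar) for yj, zj in zip(y, z)) / (n - 1)


# Hypothetical values of z, symmetric about zero.
z = [-2.0, -1.0, 0.0, 1.0, 2.0]
# y depends on z exactly, but not linearly.
y = [zi ** 2 for zi in z]

cov_yz = estimated_covariance(y, z)
print(cov_yz)
```

A zero covariance or correlation coefficient therefore rules out only a linear relationship, not dependence in general.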
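The t‑distribution or Student's distribution is the probability distribution of a continuous random variable t whose probability density function is
$$p(t, \nu) = \frac{1}{\sqrt{\pi\nu}}\,\frac{\Gamma[(\nu + 1)/2]}{\Gamma(\nu/2)}\left(1 + \frac{t^2}{\nu}\right)^{-(\nu + 1)/2}, \qquad -\infty < t < +\infty$$
where Γ is the gamma function and ν > 0. The expectation of the t‑distribution is zero and its variance is ν∕(ν − 2) for ν > 2.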
The probability distribution of the variable $(\bar{z} - \mu_z)/s(\bar{z})$ is the t‑distribution if the random variable z is normally distributed with expectation $\mu_z$, where $\bar{z}$ is the arithmetic mean of n independent observations $z_i$ of z, $s(z_i)$ is the experimental standard deviation of the n observations, and $s(\bar{z}) = s(z_i)/\sqrt{n}$ is the experimental standard deviation of the mean $\bar{z}$ with ν = n − 1 degrees of freedom.
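As a sketch under stated assumptions, the following Python code evaluates the standard form of the Student's t probability density function using the gamma function, and forms the variable described above from a set of observations. The function names, the data and the assumed expectation are hypothetical; `math.lgamma` is used instead of `math.gamma` only to avoid overflow for large degrees of freedom.

```python
import math


def t_pdf(t, nu):
    """Probability density of Student's t-distribution with nu > 0 degrees of freedom."""
    if nu <= 0:
        raise ValueError("nu must be positive")
    # Logarithm of the normalising constant, Gamma((nu+1)/2) / (sqrt(pi*nu) * Gamma(nu/2)).
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(math.pi * nu))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(t * t / nu))


def t_variable(z, mu_z):
    """Return ((z_bar - mu_z) / s(z_bar), nu) for observations z and assumed expectation mu_z."""
    n = len(z)
    if n < 2:
        raise ValueError("at least two observations are required")
    z_bar = sum(z) / n
    s_zi = math.sqrt(sum((zi - z_bar) ** 2 for zi in z) / (n - 1))
    s_zbar = s_zi / math.sqrt(n)  # experimental standard deviation of the mean
    return (z_bar - mu_z) / s_zbar, n - 1


# Hypothetical observations and a hypothetical expectation of 4.5.
z = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
t_value, nu = t_variable(z, mu_z=4.5)
density = t_pdf(t_value, nu)
print(t_value, nu, density)
```

The variable follows the t‑distribution only if z is normally distributed, so the density returned here is meaningful only under that assumption.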
2) If, in the definition of the moments, the quantities X, X − a, Y, Y − b, etc. are replaced by their absolute values, i.e. │X│, │X − a│, │Y│, │Y − b│, etc., other moments called “absolute moments” are defined.
* Footnote to the 2008 version: ISO 3534‑1:1993 has been cancelled and replaced by ISO 3534‑1:2006. Note that some of the terms and definitions have been revised. For further information, see the latest edition.