This annex gives six examples, H.1 to H.6, which are worked out in considerable detail in order to illustrate the basic principles presented in this Guide for evaluating and expressing uncertainty in measurement. Together with the examples included in the main text and in some of the other annexes, they should enable the users of this Guide to put these principles into practice in their own work.
Because the examples are for illustrative purposes, they have by necessity been simplified. Moreover, because they and the numerical data used in them have been chosen mainly to demonstrate the principles of this Guide, neither they nor the data should necessarily be interpreted as describing real measurements. While the data are used as given, in order to prevent rounding errors, more digits are retained in intermediate calculations than are usually shown. Thus the stated result of a calculation involving several quantities may differ slightly from the result implied by the numerical values given in the text for these quantities.
It is pointed out in earlier portions of this Guide that classifying the methods used to evaluate components of uncertainty as Type A or Type B is for convenience only; it is not required for the determination of the combined standard uncertainty or expanded uncertainty of a measurement result because all uncertainty components, however they are evaluated, are treated in the same way (see 3.3.4, 5.1.2, and E.3.7). Thus, in the examples, the method used to evaluate a particular component of uncertainty is not specifically identified as to its type. However, it will be clear from the discussion whether a component is obtained from a Type A or a Type B evaluation.
This example demonstrates that even an apparently simple measurement may involve subtle aspects of uncertainty evaluation.
The length of a nominally 50 mm end gauge is determined by comparing it with a known standard of the same nominal length. The direct output of the comparison of the two end gauges is the difference d in their lengths:
$$d = l(1 + \alpha\theta) - l_S(1 + \alpha_S\theta_S) \tag{H.1}$$
where
l is the measurand, that is, the length at 20 °C of the end gauge being calibrated;
lS is the length of the standard at 20 °C as given in its calibration certificate;
α and αS are the coefficients of thermal expansion, respectively, of the gauge being calibrated and the standard;
θ and θS are the deviations in temperature from the 20 °C reference temperature, respectively, of the gauge and the standard.
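From Equation (H.1), the measurand is given by
$$l = \frac{l_S(1 + \alpha_S\theta_S) + d}{1 + \alpha\theta} = l_S + d + l_S(\alpha_S\theta_S - \alpha\theta) + \dots \tag{H.2}$$
If the difference in temperature between the end gauge being calibrated and the standard is written as δθ = θ − θS, and the difference in their thermal expansion coefficients as δα = α − αS, Equation (H.2) becomes
$$l = f(l_S, d, \alpha_S, \theta, \delta\alpha, \delta\theta) = l_S + d - l_S[\delta\alpha\cdot\theta + \alpha_S\cdot\delta\theta] \tag{H.3}$$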
The differences δθ and δα, but not their uncertainties, are estimated to be zero; and δα, αS, δθ, and θ are assumed to be uncorrelated. (If the measurand were expressed in terms of the variables θ, θS, α, and αS, it would be necessary to include the correlation between θ and θS, and between α and αS.)
It thus follows from Equation (H.3) that the estimate of the value of the measurand l may be obtained from the simple expression lS + d̄, where lS is the length of the standard at 20 °C as given in its calibration certificate and d is estimated by d̄, the arithmetic mean of n = 5 independent repeated observations. The combined standard uncertainty uc(l) of l is obtained by applying Equation (10) in 5.1.2 to Equation (H.3), as discussed below.
NOTE In this and the other examples, for simplicity of notation, the same symbol is used for a quantity and its estimate.
The pertinent aspects of this example as discussed in this and the following subclauses are summarized in Table H.1.
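Since it is assumed that δα = 0 and δθ = 0, the application of Equation (10) in 5.1.2 to Equation (H.3) yields
$$u_c^2(l) = c_S^2u^2(l_S) + c_d^2u^2(d) + c_{\alpha_S}^2u^2(\alpha_S) + c_\theta^2u^2(\theta) + c_{\delta\alpha}^2u^2(\delta\alpha) + c_{\delta\theta}^2u^2(\delta\theta) \tag{H.4}$$
with the sensitivity coefficients
cS = ∂f∕∂lS = 1 − (δα·θ + αS·δθ) = 1
cd = ∂f∕∂d = 1
cαS = ∂f∕∂αS = −lSδθ = 0
cθ = ∂f∕∂θ = −lSδα = 0
cδα = ∂f∕∂δα = −lSθ
cδθ = ∂f∕∂δθ = −lSαS
and thus
$$u_c^2(l) = u^2(l_S) + u^2(d) + l_S^2\theta^2u^2(\delta\alpha) + l_S^2\alpha_S^2u^2(\delta\theta) \tag{H.5}$$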
The calibration certificate gives as the expanded uncertainty of the standard U = 0,075 µm and states that it was obtained using a coverage factor of k = 3. The standard uncertainty is then
$$u(l_S) = (0{,}075\ \mu\mathrm{m})/3 = 25\ \mathrm{nm}$$
The pooled experimental standard deviation characterizing the comparison of l and lS was determined from the variability of 25 independent repeated observations of the difference in lengths of two standard end gauges and was found to be 13 nm. In the comparison of this example, five repeated observations were taken. The standard uncertainty associated with the arithmetic mean of these readings is then (see 4.2.4)
$$u(\bar d) = s(\bar d) = (13\ \mathrm{nm})/\sqrt{5} = 5{,}8\ \mathrm{nm}$$
According to the calibration certificate of the comparator used to compare l with lS, its uncertainty “due to random errors” is ±0,01 µm at a level of confidence of 95 percent and is based on 6 replicate measurements; thus the standard uncertainty, using the t-factor t95(5) = 2,57 for ν = 6 − 1 = 5 degrees of freedom (see Annex G, Table G.2), is
$$u(d_1) = (0{,}01\ \mu\mathrm{m})/2{,}57 = 3{,}9\ \mathrm{nm}$$
The uncertainty of the comparator “due to systematic errors” is given in the certificate as 0,02 µm at the “three sigma level”. The standard uncertainty from this cause may therefore be taken to be
$$u(d_2) = (0{,}02\ \mu\mathrm{m})/3 = 6{,}7\ \mathrm{nm}$$
The total contribution is obtained from the sum of the estimated variances:
$$u^2(d) = u^2(\bar d) + u^2(d_1) + u^2(d_2) = 93\ \mathrm{nm}^2$$
or u(d) = 9,7 nm.
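The arithmetic behind the three contributions to u(d) can be checked with a short script. This is a minimal sketch that only re-uses the numbers quoted above; the variable names are illustrative and not part of this Guide.

```python
import math

# All values in nanometres, taken from the text above.
s_pooled = 13.0   # pooled experimental standard deviation of one comparison
n_obs = 5         # number of repeated observations in this comparison

u_dbar = s_pooled / math.sqrt(n_obs)  # standard uncertainty of the mean
u_d1 = 10.0 / 2.57                    # random effects: 0,01 um at 95 %, t95(5) = 2,57
u_d2 = 20.0 / 3.0                     # systematic effects: 0,02 um at "three sigma"

# Sum the estimated variances, then take the square root.
var_d = u_dbar**2 + u_d1**2 + u_d2**2
u_d = math.sqrt(var_d)

print(f"u(d_bar) = {u_dbar:.1f} nm, u(d1) = {u_d1:.1f} nm, u(d2) = {u_d2:.1f} nm")
print(f"u^2(d) = {var_d:.0f} nm^2, u(d) = {u_d:.1f} nm")
```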
The coefficient of thermal expansion of the standard end gauge is given as αS = 11,5 × 10⁻⁶ °C⁻¹ with an uncertainty represented by a rectangular distribution with bounds ±2 × 10⁻⁶ °C⁻¹. The standard uncertainty is then [see Equation (7) in 4.3.7]
$$u(\alpha_S) = (2 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1})/\sqrt{3} = 1{,}2 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1}$$
Since cαS = ∂f∕∂αS = −lSδθ = 0 as indicated in H.1.3, this uncertainty contributes nothing to the uncertainty of l in first order. It does, however, have a second‑order contribution that is discussed in H.1.7.
Table H.1. Summary of standard uncertainty components
|Standard uncertainty component u(xi)||Source of uncertainty||Value of standard uncertainty u(xi)||ci ≡ ∂f∕∂xi||ui(l) ≡ ∣ci∣u(xi) (nm)||Degrees of freedom|
|u(lS)||Calibration of standard end gauge||25 nm||1||25||18|
|u(d)||Measured difference between end gauges||9,7 nm||1||9,7||25,6|
|u(d̄)||repeated observations||5,8 nm|| || ||24|
|u(d1)||random effects of comparator||3,9 nm|| || ||5|
|u(d2)||systematic effects of comparator||6,7 nm|| || ||8|
|u(αS)||Thermal expansion coefficient of standard end gauge||1,2 × 10⁻⁶ °C⁻¹||0||0|| |
|u(θ)||Temperature of test bed||0,41 °C||0||0|| |
|u(θ̄)||mean temperature of bed||0,2 °C|| || || |
|u(Δ)||cyclic variation of temperature of room||0,35 °C|| || || |
|u(δα)||Difference in expansion coefficients of end gauges||0,58 × 10⁻⁶ °C⁻¹||−lSθ||2,9||50|
|u(δθ)||Difference in temperatures of end gauges||0,029 °C||−lSαS||16,6||2|
uc²(l) = ∑ui²(l) = 1 002 nm²
uc(l) = 32 nm
νeff(l) = 16
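The temperature of the test bed is reported as (19,9 ± 0,5) °C; the temperature at the time of the individual observations was not recorded. The stated maximum offset, Δ = 0,5 °C, is said to represent the amplitude of an approximately cyclical variation of the temperature under a thermostatic system, not the uncertainty of the mean temperature. The value of the mean temperature deviation
$$\bar\theta = 19{,}9\ {}^{\circ}\mathrm{C} - 20\ {}^{\circ}\mathrm{C} = -0{,}1\ {}^{\circ}\mathrm{C}$$
is reported as having a standard uncertainty, due to the uncertainty in the mean temperature of the test bed, of
$$u(\bar\theta) = 0{,}2\ {}^{\circ}\mathrm{C}$$
while the cyclic variation in time produces a U-shaped (arcsine) distribution of temperatures, resulting in a standard uncertainty of
$$u(\Delta) = (0{,}5\ {}^{\circ}\mathrm{C})/\sqrt{2} = 0{,}35\ {}^{\circ}\mathrm{C}$$
The temperature deviation θ may be taken equal to θ̄, and the standard uncertainty of θ is obtained from
$$u^2(\theta) = u^2(\bar\theta) + u^2(\Delta) = 0{,}165\ {}^{\circ}\mathrm{C}^2$$
which gives u(θ) = 0,41 °C.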
Since cθ = ∂f∕∂θ = −lSδα = 0 as indicated in H.1.3, this uncertainty also contributes nothing to the uncertainty of l in first order; but it does have a second‑order contribution that is discussed in H.1.7.
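The estimated bounds on the variability of δα are ±1 × 10⁻⁶ °C⁻¹, with an equal probability of δα having any value within those bounds. The standard uncertainty is
$$u(\delta\alpha) = (1 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1})/\sqrt{3} = 0{,}58 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1}$$
The standard and the test gauge are expected to be at the same temperature, but the temperature difference could lie with equal probability anywhere in the estimated interval −0,05 °C to +0,05 °C. The standard uncertainty is
$$u(\delta\theta) = (0{,}05\ {}^{\circ}\mathrm{C})/\sqrt{3} = 0{,}029\ {}^{\circ}\mathrm{C}$$
The combined standard uncertainty uc(l) is calculated from Equation (H.5). The individual terms are collected and substituted into this expression to obtain
$$u_c^2(l) = (25\ \mathrm{nm})^2 + (9{,}7\ \mathrm{nm})^2 + (0{,}05\ \mathrm{m})^2(-0{,}1\ {}^{\circ}\mathrm{C})^2(0{,}58 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1})^2 + (0{,}05\ \mathrm{m})^2(11{,}5 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1})^2(0{,}029\ {}^{\circ}\mathrm{C})^2 \tag{H.6a}$$
$$= (25\ \mathrm{nm})^2 + (9{,}7\ \mathrm{nm})^2 + (2{,}9\ \mathrm{nm})^2 + (16{,}6\ \mathrm{nm})^2 = 1\,002\ \mathrm{nm}^2 \tag{H.6b}$$
$$u_c(l) = 32\ \mathrm{nm} \tag{H.6c}$$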
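The first-order uncertainty budget can be reproduced numerically. The sketch below assumes the sensitivity coefficients listed in Table H.1 (1, 1, −lSθ, −lSαS) and takes θ = −0,1 °C from the reported bed temperature of 19,9 °C; the dictionary keys and variable names are illustrative only.

```python
import math

l_s = 50e6          # nominal length of the standard in nm (50 mm)
alpha_s = 11.5e-6   # expansion coefficient of the standard, 1/degC
theta = -0.1        # mean temperature deviation from 20 degC (19,9 - 20)

u_ls = 75.0 / 3                  # certificate: U = 0,075 um with k = 3, in nm
u_d = 9.7                        # measured difference, nm (Table H.1)
u_dalpha = 1e-6 / math.sqrt(3)   # rectangular, bounds +/- 1e-6 1/degC
u_dtheta = 0.05 / math.sqrt(3)   # rectangular, bounds +/- 0,05 degC

# Contribution of each input: |c_i| * u(x_i), all in nm.
contributions = {
    "u(lS)": 1.0 * u_ls,
    "u(d)": 1.0 * u_d,
    "u(delta_alpha)": abs(-l_s * theta) * u_dalpha,
    "u(delta_theta)": abs(-l_s * alpha_s) * u_dtheta,
}

# Root-sum-of-squares combination (inputs assumed uncorrelated).
u_c = math.sqrt(sum(v**2 for v in contributions.values()))

for name, value in contributions.items():
    print(f"{name:>16}: {value:5.1f} nm")
print(f"u_c(l) = {u_c:.0f} nm")
```

Because the script keeps more digits than the rounded table entries, the sum of squares differs from 1 002 nm² in the last digit, which is the rounding effect mentioned at the start of this annex.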
The dominant component of uncertainty is obviously that of the standard, u(lS) = 25 nm.
The calibration certificate for the standard end gauge gives lS = 50,000 623 mm as its length at 20 °C. The arithmetic mean d̄ of the five repeated observations of the difference in lengths between the unknown end gauge and the standard gauge is 215 nm. Thus, since l = lS + d̄ (see H.1.2), the length l of the unknown end gauge at 20 °C is 50,000 838 mm. Following 7.2.2, the final result of the measurement may be stated as:
l = 50,000 838 mm with a combined standard uncertainty uc = 32 nm. The corresponding relative combined standard uncertainty is uc⁄l = 6,4 × 10−7.
Suppose that one is required to obtain an expanded uncertainty U99 = k99uc(l) that provides an interval having a level of confidence of approximately 99 percent. The procedure to use is that summarized in G.6.4, and the required degrees of freedom are indicated in Table H.1. These were obtained as follows:
The calculation of νeff(l) from Equation (G.2b) in G.4.1 proceeds in exactly the same way as for the calculation of νeff(d) in 2. above. Thus from Equations (H.6b) and (H.6c) and the values for ν given in 1. through 4.,
$$\nu_{\mathrm{eff}}(l) = \frac{(32\ \mathrm{nm})^4}{\dfrac{(25\ \mathrm{nm})^4}{18} + \dfrac{(9{,}7\ \mathrm{nm})^4}{25{,}6} + \dfrac{(2{,}9\ \mathrm{nm})^4}{50} + \dfrac{(16{,}6\ \mathrm{nm})^4}{2}} = 16{,}7$$
To obtain the required expanded uncertainty, this value is first truncated to the next lower integer, νeff(l) = 16. It then follows from Table G.2 in Annex G that t99(16) = 2,92, and hence U99 = t99(16)uc(l) = 2,92 × (32 nm) = 93 nm. Following 7.2.4, the final result of the measurement may be stated as:
l = (50,000 838 ± 0,000 093) mm, where the number following the symbol ± is the numerical value of an expanded uncertainty U = kuc, with U determined from a combined standard uncertainty uc = 32 nm and a coverage factor k = 2,92 based on the t-distribution for ν = 16 degrees of freedom, and defines an interval estimated to have a level of confidence of 99 percent. The corresponding relative expanded uncertainty is U⁄l = 1,9 × 10⁻⁶.
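The effective degrees of freedom and the expanded uncertainty can be checked as follows. This is a sketch under the assumption that the Welch-Satterthwaite formula is applied to the four non-zero contributions and degrees of freedom listed in Table H.1; the value t99(16) = 2,92 is taken from the text rather than computed.

```python
import math

# (contribution u_i(l) in nm, degrees of freedom) from Table H.1
budget = [(25.0, 18), (9.7, 25.6), (2.9, 50), (16.6, 2)]

# Welch-Satterthwaite: nu_eff = u_c^4 / sum(u_i^4 / nu_i)
u_c_squared = sum(u**2 for u, _ in budget)
nu_eff = u_c_squared**2 / sum(u**4 / nu for u, nu in budget)
nu_truncated = math.floor(nu_eff)   # truncate to the next lower integer

t99 = 2.92    # t99(16) as quoted in the text (Table G.2)
u_c = 32.0    # combined standard uncertainty as reported, nm
U99 = t99 * u_c

print(f"nu_eff = {nu_eff:.1f}, truncated to {nu_truncated}")
print(f"U99 = {U99:.0f} nm")
```

The coverage factor is only valid if the truncated value is indeed 16; a different budget would require looking up a different t-factor.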
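The note to 5.1.2 points out that Equation (10), which is used in this example to obtain the combined standard uncertainty uc(l), must be augmented when the nonlinearity of the function Y = f(X1, X2, ..., XN) is so significant that the higher-order terms in the Taylor series expansion cannot be neglected. Such is the case in this example, and therefore the evaluation of uc(l) as presented up to this point is not complete. Application to Equation (H.3) of the expression given in the note to 5.1.2 yields in fact two distinct non-negligible second-order terms to be added to Equation (H.5). These terms, which arise from the quadratic term in the expression of the note, are
$$l_S^2u^2(\delta\alpha)u^2(\theta) + l_S^2u^2(\alpha_S)u^2(\delta\theta)$$
but only the first of these terms contributes significantly to the uncertainty of l:
$$l_Su(\delta\alpha)u(\theta) = (0{,}05\ \mathrm{m})(0{,}58 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1})(0{,}41\ {}^{\circ}\mathrm{C}) = 11{,}7\ \mathrm{nm}$$
$$l_Su(\alpha_S)u(\delta\theta) = (0{,}05\ \mathrm{m})(1{,}2 \times 10^{-6}\ {}^{\circ}\mathrm{C}^{-1})(0{,}029\ {}^{\circ}\mathrm{C}) = 1{,}7\ \mathrm{nm}$$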
The second‑order terms increase uc(l) from 32 nm to 34 nm.
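The size of the second-order correction can be illustrated numerically. The sketch below assumes that the two second-order terms are lS²u²(δα)u²(θ) and lS²u²(αS)u²(δθ), which is what the cross-derivative part of the expression in the note to 5.1.2 gives for the product terms of Equation (H.3). The input values are the rounded ones from Table H.1, so intermediate results may differ in the last digit from values computed with more digits.

```python
import math

l_s = 50e6            # nominal length of the standard, nm (50 mm)
u_alpha_s = 1.2e-6    # 1/degC (Table H.1)
u_theta = 0.41        # degC   (Table H.1)
u_dalpha = 0.58e-6    # 1/degC (Table H.1)
u_dtheta = 0.029      # degC   (Table H.1)

first_order_var = 1002.0   # nm^2, first-order result from Table H.1

# Assumed second-order terms, expressed as equivalent standard uncertainties in nm.
term_1 = l_s * u_dalpha * u_theta
term_2 = l_s * u_alpha_s * u_dtheta

u_c_first = math.sqrt(first_order_var)
u_c_second = math.sqrt(first_order_var + term_1**2 + term_2**2)

print(f"second-order terms: {term_1:.1f} nm and {term_2:.1f} nm")
print(f"u_c(l): {u_c_first:.0f} nm -> {u_c_second:.0f} nm")
```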
This example demonstrates the treatment of multiple measurands or output quantities determined simultaneously in the same measurement and the correlation of their estimates. It considers only the random variations of the observations; in actual practice, the uncertainties of corrections for systematic effects would also contribute to the uncertainty of the measurement results. The data are analysed in two different ways with each yielding essentially the same numerical values.
The resistance R and the reactance X of a circuit element are determined by measuring the amplitude V of a sinusoidally-alternating potential difference across its terminals, the amplitude I of the alternating current passing through it, and the phase-shift angle φ of the alternating potential difference relative to the alternating current. Thus the three input quantities are V, I, and φ and the three output quantities (the measurands) are the three impedance components R, X, and Z. Since Z² = R² + X², there are only two independent output quantities.
The measurands are related to the input quantities by Ohm's law:
$$R = \frac{V}{I}\cos\varphi; \qquad X = \frac{V}{I}\sin\varphi; \qquad Z = \frac{V}{I} \tag{H.7}$$
Consider that five independent sets of simultaneous observations of the three input quantities V, I, and φ are obtained under similar conditions (see B.2.15), resulting in the data given in Table H.2. The arithmetic means of the observations and the experimental standard deviations of those means calculated from Equations (3) and (5) in 4.2 are also given. The means are taken as the best estimates of the expected values of the input quantities, and the experimental standard deviations are the standard uncertainties of those means.
Because the means V̄, Ī, and φ̄ are obtained from simultaneous observations, they are correlated and the correlations must be taken into account in the evaluation of the standard uncertainties of the measurands R, X, and Z. The required correlation coefficients are readily obtained from Equation (14) in 5.2.2 using values of s(V̄, Ī), s(V̄, φ̄), and s(Ī, φ̄) calculated from Equation (17) in 5.2.3. The results are included in Table H.2, where it should be recalled that r(xi, xj) = r(xj, xi) and r(xi, xi) = 1.
|Set number||Input quantities|
|Arithmetic mean||V̄ = 4,9990||Ī = 19,6610||φ̄ = 1,044 46|
|Experimental standard deviation of mean||s(V̄) = 0,0032||s(Ī) = 0,0095||s(φ̄) = 0,000 75|
|Correlation coefficients|
|r(V̄, Ī) = −0,36|
|r(V̄, φ̄) = 0,86|
|r(Ī, φ̄) = −0,65|
Approach 1 is summarized in Table H.3.
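The values of the three measurands R, X, and Z are obtained from the relations given in Equation (H.7) using the mean values V̄, Ī, and φ̄ of Table H.2 for V, I, and φ. The standard uncertainties of R, X, and Z are obtained from Equation (16) in 5.2.2 since, as pointed out above, the input quantities V̄, Ī, and φ̄ are correlated. As an example, consider Z = V̄⁄Ī. Identifying V̄ with x1, Ī with x2, and f with Z = V̄⁄Ī, Equation (16) in 5.2.2 yields for the combined standard uncertainty of Z
$$u_c^2(Z) = \left[\frac{1}{\bar I}\right]^2u^2(\bar V) + \left[\frac{\bar V}{\bar I^2}\right]^2u^2(\bar I) + 2\left[\frac{1}{\bar I}\right]\left[-\frac{\bar V}{\bar I^2}\right]u(\bar V)u(\bar I)r(\bar V, \bar I) \tag{H.8a}$$
$$= Z^2\left[\frac{u(\bar V)}{\bar V}\right]^2 + Z^2\left[\frac{u(\bar I)}{\bar I}\right]^2 - 2Z^2\left[\frac{u(\bar V)}{\bar V}\right]\left[\frac{u(\bar I)}{\bar I}\right]r(\bar V, \bar I) \tag{H.8b}$$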
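Because the three measurands or output quantities depend on the same input quantities, they too are correlated. The elements of the covariance matrix that describes this correlation may be written in general as
$$u(y_l, y_m) = \sum_{i=1}^{N}\sum_{j=1}^{N}\frac{\partial y_l}{\partial x_i}\frac{\partial y_m}{\partial x_j}u(x_i)u(x_j)r(x_i, x_j) \tag{H.9}$$
where yl = fl(x1, x2, ..., xN) and ym = fm(x1, x2, ..., xN). The diagonal elements u(yl, yl) = u²(yl) are the estimated variances of the output quantities, and the estimated correlation coefficients of the output quantities are r(yl, ym) = u(yl, ym)⁄u(yl)u(ym).
To apply Equation (H.9) to this example, the following identifications are made:
y1 = R, x1 = V̄, u(xi) = s(xi);
y2 = X, x2 = Ī, N = 3;
y3 = Z, x3 = φ̄.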
The results of the calculations of R, X, and Z and of their estimated variances and correlation coefficients are given in Table H.3.
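Approach 1 can be sketched in a few lines: evaluate the measurands at the means, form the sensitivity coefficients, and propagate the input uncertainties together with their correlation coefficients. The units are an assumption here (V in volts, I in milliamperes, φ in radians), since they are not shown with the summary values of Table H.2, and the function names are illustrative. Because only the rounded summary values are used, the results agree with Table H.3 to within rounding rather than digit for digit.

```python
import math

# Means, standard uncertainties and correlation coefficients from Table H.2.
# Assumed units: V in volts, I in mA (converted to A), phi in radians.
V, I, PHI = 4.9990, 19.6610e-3, 1.04446
U_IN = [0.0032, 0.0095e-3, 0.00075]
R_IN = [
    [1.00, -0.36, 0.86],
    [-0.36, 1.00, -0.65],
    [0.86, -0.65, 1.00],
]


def impedance(v, i, phi):
    """Return [R, X, Z] = [(V/I) cos(phi), (V/I) sin(phi), V/I]."""
    z = v / i
    return [z * math.cos(phi), z * math.sin(phi), z]


def sensitivities(v, i, phi):
    """Partial derivatives of R, X, Z with respect to V, I, phi."""
    z = v / i
    c, s = math.cos(phi), math.sin(phi)
    return [
        [c / i, -z * c / i, -z * s],   # row for R
        [s / i, -z * s / i, z * c],    # row for X
        [1 / i, -z / i, 0.0],          # row for Z
    ]


def output_covariance(jac, u_in, r_in):
    """Double sum over inputs of c_li * c_mj * u_i * u_j * r_ij."""
    n_out, n_in = len(jac), len(u_in)
    return [
        [
            sum(
                jac[l][i] * jac[m][j] * u_in[i] * u_in[j] * r_in[i][j]
                for i in range(n_in)
                for j in range(n_in)
            )
            for m in range(n_out)
        ]
        for l in range(n_out)
    ]


R, X, Z = impedance(V, I, PHI)
cov = output_covariance(sensitivities(V, I, PHI), U_IN, R_IN)
u_R, u_X, u_Z = (math.sqrt(cov[k][k]) for k in range(3))
r_RX = cov[0][1] / (u_R * u_X)
r_RZ = cov[0][2] / (u_R * u_Z)
r_XZ = cov[1][2] / (u_X * u_Z)

print(f"R = {R:.3f} ohm, u_c(R) = {u_R:.3f} ohm")
print(f"X = {X:.3f} ohm, u_c(X) = {u_X:.3f} ohm")
print(f"Z = {Z:.3f} ohm, u_c(Z) = {u_Z:.3f} ohm")
print(f"r(R,X) = {r_RX:.2f}, r(R,Z) = {r_RZ:.2f}, r(X,Z) = {r_XZ:.3f}")
```

The uncertainty of R is the result of strong cancellation between the correlated terms, so it is the value most sensitive to the rounding of the inputs.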
|Measurand index l||Relationship between estimate of measurand yl and input estimates xi||Value of estimate yl, which is the result of measurement||Combined standard uncertainty uc(yl) of result of measurement||Relative combined standard uncertainty uc(yl)⁄yl|
|1||y1 = R = (V̄⁄Ī) cos φ̄||y1 = R = 127,732 Ω||uc(R) = 0,071 Ω||uc(R)⁄R = 0,06 × 10⁻²|
|2||y2 = X = (V̄⁄Ī) sin φ̄||y2 = X = 219,847 Ω||uc(X) = 0,295 Ω||uc(X)⁄X = 0,13 × 10⁻²|
|3||y3 = Z = V̄⁄Ī||y3 = Z = 254,260 Ω||uc(Z) = 0,236 Ω||uc(Z)⁄Z = 0,09 × 10⁻²|
|Correlation coefficients r(yl, ym)|
|r(y1, y2) = r(R, X) = −0,588|
|r(y1, y3) = r(R, Z) = −0,485|
|r(y2, y3) = r(X, Z) = 0,993|
Approach 2 is summarized in Table H.4.
Since the data have been obtained as five sets of observations of the three input quantities V, I, and φ, it is possible to compute a value for R, X, and Z from each set of input data, and then take the arithmetic mean of the five individual values to obtain the best estimates of R, X, and Z. The experimental standard deviation of each mean (which is its combined standard uncertainty) is then calculated from the five individual values in the usual way [Equation (5) in 4.2.3]; and the estimated covariances of the three means are calculated by applying Equation (17) in 5.2.3 directly to the five individual values from which each mean is obtained. There are no differences in the output values, standard uncertainties, and estimated covariances provided by the two approaches except for second-order effects associated with replacing terms such as V̄⁄Ī and cos φ̄ by the arithmetic means of the individual values of V⁄I and cos φ.
To demonstrate this approach, Table H.4 gives the values of R, X and Z calculated from each of the five sets of observations. The arithmetic means, standard uncertainties, and estimated correlation coefficients are then directly computed from these individual values. The numerical results obtained in this way are negligibly different from the results given in Table H.3.
|Set number k||Individual values of measurands|
| ||R = (V⁄I) cos φ||X = (V⁄I) sin φ||Z = V⁄I|
|Arithmetic mean||y1 = R̄ = 127,732||y2 = X̄ = 219,847||y3 = Z̄ = 254,260|
|Experimental standard deviation of mean||s(R̄) = 0,071||s(X̄) = 0,295||s(Z̄) = 0,236|
|Correlation coefficients r(yl, ym)|
|r(y1, y2) = r(R̄, X̄) = −0,588|
|r(y1, y3) = r(R̄, Z̄) = −0,485|
|r(y2, y3) = r(X̄, Z̄) = 0,993|
In the terminology of the Note to 4.1.4, approach 2 is an example of obtaining the estimate y from Ȳ = (∑Yk)⁄n, with the sum running over k = 1 to n, while approach 1 is an example of obtaining y from y = f(X̄1, X̄2, ..., X̄N). As pointed out in that note, in general, the two approaches will give identical results if f is a linear function of its input quantities (provided that the experimentally observed correlation coefficients are taken into account when implementing approach 1). If f is not a linear function, then the results of approach 1 will differ from those of approach 2 depending on the degree of nonlinearity and the estimated variances and covariances of the Xi. This may be seen from the expression
On the other hand, approach 2 would be inappropriate if the data of Table H.2 represented n1 = 5 observations of the potential difference V, followed by n2 = 5 observations of the current I, and then followed by n3 = 5 observations of the phase φ, and would be impossible if n1 ≠ n2 ≠ n3. (It is in fact poor measurement procedure to carry out the measurements in this way since the potential difference across a fixed impedance and the current through it are directly related.)
If the data of Table H.2 are reinterpreted in this manner so that approach 2 is inappropriate, and if correlations among the quantities V, I, and φ are assumed to be absent, then the observed correlation coefficients have no significance and should be set equal to zero. If this is done in Table H.2, Equation (H.9) reduces to the equivalent of Equation (F.2) in F.1.2.3, namely,
$$u(y_l, y_m) = \sum_{i=1}^{N}\frac{\partial y_l}{\partial x_i}\frac{\partial y_m}{\partial x_i}u^2(x_i) \tag{H.11}$$
The resulting changes to Table H.3 are shown in Table H.5.
|Combined standard uncertainty uc(yl) of result of measurement||Relative combined standard uncertainty uc(yl)⁄yl|
|uc(R) = 0,195 Ω||uc(R)⁄R = 0,15 × 10⁻²|
|uc(X) = 0,201 Ω||uc(X)⁄X = 0,09 × 10⁻²|
|uc(Z) = 0,204 Ω||uc(Z)⁄Z = 0,08 × 10⁻²|
|Correlation coefficients r(yl, ym)|
|r(y1, y2) = r(R, X) = 0,056|
|r(y1, y3) = r(R, Z) = 0,527|
|r(y2, y3) = r(X, Z) = 0,878|
This example illustrates the use of the method of least squares to obtain a linear calibration curve and how the parameters of the fit, the intercept and slope, and their estimated variances and covariance, are used to obtain from the curve the value and standard uncertainty of a predicted correction.
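A thermometer is calibrated by comparing n = 11 temperature readings tk of the thermometer, each having negligible uncertainty, with corresponding known reference temperatures tR, k in the temperature range 21 °C to 27 °C to obtain the corrections bk = tR, k − tk to the readings. The measured corrections bk and measured temperatures tk are the input quantities of the evaluation. A linear calibration curve
$$b(t) = y_1 + y_2(t - t_0) \tag{H.12}$$
is fitted to the measured corrections and temperatures by the method of least squares. The parameters y1 and y2, which are respectively the intercept and the slope of the calibration curve, are the two measurands or output quantities to be determined. The temperature t0 is a conveniently chosen exact reference temperature; it is not an independent parameter to be determined by the least-squares fit.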
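Based on the method of least squares and under the assumptions made in H.3.1 above, the output quantities y1 and y2 and their estimated variances and covariance are obtained by minimizing the sum
$$S = \sum_{k=1}^{n}[b_k - b(t_k)]^2 = \sum_{k=1}^{n}[b_k - y_1 - y_2(t_k - t_0)]^2$$
This leads to the following equations for y1 and y2, their experimental variances s²(y1) and s²(y2), and their estimated correlation coefficient r(y1, y2) = s(y1, y2)⁄s(y1)s(y2), where s(y1, y2) is their estimated covariance:
$$y_1 = \frac{\left(\sum b_k\right)\left(\sum\theta_k^2\right) - \left(\sum b_k\theta_k\right)\left(\sum\theta_k\right)}{D} \tag{H.13a}$$
$$y_2 = \frac{n\sum b_k\theta_k - \left(\sum b_k\right)\left(\sum\theta_k\right)}{D} \tag{H.13b}$$
$$s^2(y_1) = \frac{s^2\sum\theta_k^2}{D} \tag{H.13c}$$
$$s^2(y_2) = n\frac{s^2}{D} \tag{H.13d}$$
$$r(y_1, y_2) = -\frac{\sum\theta_k}{\sqrt{n\sum\theta_k^2}} \tag{H.13e}$$
$$s^2 = \frac{\sum[b_k - b(t_k)]^2}{n - 2} \tag{H.13f}$$
$$D = n\sum\theta_k^2 - \left(\sum\theta_k\right)^2 = n\sum(\theta_k - \bar\theta)^2 = n\sum(t_k - \bar t)^2 \tag{H.13g}$$
where all sums are from k = 1 to n, θk = tk − t0, θ̄ = (∑θk)⁄n, and t̄ = (∑tk)⁄n; [bk − b(tk)] is the difference between the measured or observed correction bk at the temperature tk and the correction b(tk) predicted by the fitted curve b(t) = y1 + y2(t − t0) at tk. The variance s² is a measure of the overall uncertainty of the fit, where the factor n − 2 reflects the fact that because two parameters, y1 and y2, are determined by the n observations, the degrees of freedom of s² is ν = n − 2.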
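The data to be fitted are given in the second and third columns of Table H.6. Taking t0 = 20 °C as the reference temperature, application of Equations (H.13a) to (H.13g) yields
y1 = −0,171 2 °C, s(y1) = 0,002 9 °C
y2 = 0,002 18, s(y2) = 0,000 67
r(y1, y2) = −0,930, s = 0,003 5 °C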
|Reading number||Thermometer reading||Observed correction||Predicted correction||Difference between observed and predicted correction|
|k||tk||bk = tR, k − tk||b(tk)||bk − b(tk)|
|1||21,521||−0,171||−0,167 9||−0,003 1|
|2||22,012||−0,169||−0,166 8||−0,002 2|
|3||22,512||−0,166||−0,165 7||−0,000 3|
|4||23,003||−0,159||−0,164 6||+0,005 6|
|5||23,507||−0,164||−0,163 5||−0,000 5|
|6||23,999||−0,165||−0,162 5||−0,002 5|
|7||24,513||−0,156||−0,161 4||+0,005 4|
|8||25,002||−0,157||−0,160 3||+0,003 3|
|9||25,503||−0,159||−0,159 2||+0,000 2|
|10||26,010||−0,161||−0,158 1||−0,002 9|
|11||26,511||−0,160||−0,157 0||−0,003 0|
The fact that the slope y2 is more than three times larger than its standard uncertainty provides some indication that a calibration curve and not a fixed average correction is required.
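The calibration curve may then be written as
$$b(t) = -0{,}171\,2(29)\ {}^{\circ}\mathrm{C} + 0{,}002\,18(67)(t - 20\ {}^{\circ}\mathrm{C}) \tag{H.14}$$
where the numbers in parentheses are the standard uncertainties referred to the corresponding last digits of the quoted values of the intercept and slope.
The expression for the combined standard uncertainty of the predicted value of a correction can be readily obtained by applying the law of propagation of uncertainty, Equation (16) in 5.2.2, to Equation (H.12). Noting that b(t) = f(y1, y2) and writing u(y1) = s(y1) and u(y2) = s(y2), one obtains
$$u_c^2[b(t)] = u^2(y_1) + (t - t_0)^2u^2(y_2) + 2(t - t_0)u(y_1)u(y_2)r(y_1, y_2) \tag{H.15}$$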
The estimated variance u2c[b(t)] is a minimum at tmin = t0 − u(y1)r(y1, y2)⁄u(y2), which in the present case is tmin = 24,008 5 °C.
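As an example of the use of Equation (H.15), consider that one requires the thermometer correction and its uncertainty at t = 30 °C, which is outside the temperature range in which the thermometer was actually calibrated. Substituting t = 30 °C in Equation (H.14) gives
$$b(30\ {}^{\circ}\mathrm{C}) = -0{,}149\,4\ {}^{\circ}\mathrm{C}$$
while Equation (H.15) becomes
$$u_c^2[b(30\ {}^{\circ}\mathrm{C})] = (0{,}002\,9\ {}^{\circ}\mathrm{C})^2 + (10\ {}^{\circ}\mathrm{C})^2(0{,}000\,67)^2 + 2(10\ {}^{\circ}\mathrm{C})(0{,}002\,9\ {}^{\circ}\mathrm{C})(0{,}000\,67)(-0{,}930) = 17{,}1 \times 10^{-6}\ {}^{\circ}\mathrm{C}^2$$
or uc[b(30 °C)] = 0,004 1 °C.
Thus the correction at 30 °C is −0,149 4 °C, with a combined standard uncertainty of uc = 0,004 1 °C, and with uc having ν = n − 2 = 9 degrees of freedom.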
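The fit and the extrapolated correction can be reproduced from the readings and observed corrections in Table H.6. The sketch below uses the ordinary closed-form least-squares expressions for a straight line with equally weighted points, written in terms of θk = tk − t0; variable and function names are illustrative.

```python
import math

# Thermometer readings t_k and observed corrections b_k (degC) from Table H.6.
t = [21.521, 22.012, 22.512, 23.003, 23.507, 23.999,
     24.513, 25.002, 25.503, 26.010, 26.511]
b = [-0.171, -0.169, -0.166, -0.159, -0.164, -0.165,
     -0.156, -0.157, -0.159, -0.161, -0.160]
t0 = 20.0   # reference temperature chosen in the text

n = len(t)
theta = [tk - t0 for tk in t]
S_t = sum(theta)
S_tt = sum(x * x for x in theta)
S_b = sum(b)
S_bt = sum(bk * x for bk, x in zip(b, theta))
D = n * S_tt - S_t**2

y1 = (S_b * S_tt - S_bt * S_t) / D      # intercept at t0
y2 = (n * S_bt - S_b * S_t) / D         # slope

# Residual variance with n - 2 degrees of freedom.
residuals = [bk - (y1 + y2 * x) for bk, x in zip(b, theta)]
s2 = sum(e * e for e in residuals) / (n - 2)

u_y1 = math.sqrt(s2 * S_tt / D)
u_y2 = math.sqrt(n * s2 / D)
r12 = -S_t / math.sqrt(n * S_tt)        # correlation of intercept and slope


def predict(t_new):
    """Predicted correction and its combined standard uncertainty at t_new."""
    dt = t_new - t0
    variance = u_y1**2 + dt**2 * u_y2**2 + 2 * dt * u_y1 * u_y2 * r12
    return y1 + y2 * dt, math.sqrt(variance)


b30, u30 = predict(30.0)
print(f"y1 = {y1:.4f} degC, s(y1) = {u_y1:.4f} degC")
print(f"y2 = {y2:.5f}, s(y2) = {u_y2:.5f}, r(y1, y2) = {r12:.3f}")
print(f"b(30 degC) = {b30:.4f} degC, u_c = {u30:.4f} degC")
```

The strong negative correlation between intercept and slope is what keeps the extrapolated uncertainty from being much larger; dropping the covariance term would overstate it considerably.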
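Equation (H.13e) for the correlation coefficient r(y1, y2) implies that if t0 is so chosen that ∑θk = ∑(tk − t0) = 0, with the sums running over k = 1 to n, then r(y1, y2) = 0 and y1 and y2 will be uncorrelated, thereby simplifying the computation of the standard uncertainty of a predicted correction. Since ∑θk = 0 when t0 = t̄ = (∑tk)⁄n, and t̄ = 24,008 5 °C in the present case, repeating the least-squares fit with t0 = t̄ = 24,008 5 °C would lead to values of y1 and y2 that are uncorrelated. (The temperature t̄ is also the temperature at which u²[b(t)] is a minimum; see H.3.4.) However, repeating the fit is unnecessary because it can be shown that
$$b(t) = y_1' + y_2(t - \bar t) \tag{H.16a}$$
$$u_c^2[b(t)] = u^2(y_1') + (t - \bar t)^2u^2(y_2) \tag{H.16b}$$
$$r(y_1', y_2) = 0 \tag{H.16c}$$
where
$$y_1' = y_1 + y_2(\bar t - t_0)$$
$$\bar t = t_0 - s(y_1)r(y_1, y_2)/s(y_2)$$
$$s^2(y_1') = s^2(y_1)[1 - r^2(y_1, y_2)]$$
and in writing Equation (H.16b), the substitutions u(y1′) = s(y1′) and u(y2) = s(y2) have been made.
Application of these relations to the results given in H.3.3 yields
$$b(t) = -0{,}162\,5(11)\ {}^{\circ}\mathrm{C} + 0{,}002\,18(67)(t - 24{,}008\,5\ {}^{\circ}\mathrm{C}) \tag{H.17a}$$
$$u_c^2[b(t)] = (0{,}001\,1\ {}^{\circ}\mathrm{C})^2 + (t - 24{,}008\,5\ {}^{\circ}\mathrm{C})^2(0{,}000\,67)^2 \tag{H.17b}$$
That these expressions give the same results as Equations (H.14) and (H.15) can be checked by repeating the calculation of b(30 °C) and uc[b(30 °C)]. The substitution of t = 30 °C into Equations (H.17a) and (H.17b) yields
$$b(30\ {}^{\circ}\mathrm{C}) = -0{,}149\,4\ {}^{\circ}\mathrm{C}$$
$$u_c[b(30\ {}^{\circ}\mathrm{C})] = 0{,}004\,1\ {}^{\circ}\mathrm{C}$$
which are identical to the results obtained in H.3.4.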
The least-squares method can be used to fit higher-order curves to data points, and is also applicable to cases where the individual data points have uncertainties. Standard texts on the subject should be consulted for details. However, the following examples illustrate two cases where the measured corrections bk are not assumed to be exactly known.
NOTE A pooled estimate of variance sp² based on N series of independent observations of the same random variable is obtained from
$$s_p^2 = \frac{\sum_{i=1}^{N}\nu_is_i^2}{\sum_{i=1}^{N}\nu_i}$$
where si² is the experimental variance of the ith series of ni independent repeated observations [Equation (4) in 4.2.2] and has degrees of freedom νi = ni − 1. The degrees of freedom of sp² is ν = ∑νi, with the sum running over i = 1 to N. The experimental variance sp²⁄m (and the experimental standard deviation sp⁄√m) of the arithmetic mean of m independent observations characterized by the pooled estimate of variance sp² also has ν degrees of freedom.
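The pooling rule in this note amounts to a weighted average of the series variances, with the degrees of freedom as weights. The sketch below is one way to write it; the function names and the two-series input are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
import math


def pooled_variance(variances, sizes):
    """Pool experimental variances s_i^2 of series with n_i observations each.

    Returns (s_p^2, nu), where nu is the sum of the degrees of freedom n_i - 1.
    """
    dofs = [n - 1 for n in sizes]
    nu = sum(dofs)
    s2p = sum(v * s2 for v, s2 in zip(dofs, variances)) / nu
    return s2p, nu


def std_of_mean(s2p, m):
    """Standard deviation of the mean of m observations characterized by s_p^2."""
    return math.sqrt(s2p / m)


# Hypothetical example: two series with variances 4,0 and 1,0 from 3 and 5 observations.
s2p, nu = pooled_variance([4.0, 1.0], [3, 5])
print(f"s_p^2 = {s2p}, nu = {nu}")

# With s_p = 13 nm and m = 5, as in the end-gauge example of H.1:
print(f"s_p / sqrt(m) = {std_of_mean(13.0**2, 5):.1f} nm")
```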
This example is similar to example H.2, the simultaneous measurement of resistance and reactance, in that the data can be analysed in two different ways but each yields essentially the same numerical result. The first approach illustrates once again the need to take the observed correlations between input quantities into account.
The unknown radon (222Rn) activity concentration in a water sample is determined by liquid‑scintillation counting against a radon‑in‑water standard sample having a known activity concentration. The unknown activity concentration is obtained by measuring three counting sources consisting of approximately 5 g of water and 12 g of organic emulsion scintillator in vials of volume 22 ml:
a standard consisting of a mass mS of the standard solution with a known activity concentration;
a matched blank water sample containing no radioactive material, used to obtain the background counting rate;
the sample consisting of an aliquot of mass mx with unknown activity concentration.
Six cycles of measurement of the three counting sources are made in the order standard — blank — sample; and each dead‑time‑corrected counting interval T0 for each source during all six cycles is 60 minutes. Although the background counting rate cannot be assumed to be constant over the entire counting interval (65 hours), it is assumed that the number of counts obtained for each blank may be used as representative of the background counting rate during the measurements of the standard and sample in the same cycle. The data are given in Table H.7, where
tS, tB, tx
are the times from the reference time t = 0 to the midpoint of the dead‑time‑corrected counting intervals T0 = 60 min for the standard, blank, and sample vials, respectively; although tB is given for completeness, it is not needed in the analysis;
CS, CB, Cx
are the number of counts recorded in the dead‑time‑corrected counting intervals T0 = 60 min for the standard, blank, and sample vials, respectively.
The observed counts may be expressed as
$$C_S = C_B + \varepsilon A_ST_0m_Se^{-\lambda t_S} \tag{H.18a}$$
$$C_x = C_B + \varepsilon A_xT_0m_xe^{-\lambda t_x} \tag{H.18b}$$
where
ε is the liquid scintillation detection efficiency for ²²²Rn for a given source composition, assumed to be independent of the activity level;
AS is the activity concentration of the standard at the reference time t = 0;
Ax is the measurand and is defined as the unknown activity concentration of the sample at the reference time t = 0;
mS is the mass of the standard solution;
mx is the mass of the sample aliquot;
λ is the decay constant for ²²²Rn: λ = (ln 2)⁄T1/2 = 1,258 94 × 10⁻⁴ min⁻¹ (T1/2 = 5 505,8 min).
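Table H.7. Counting data for determining the activity concentration of an unknown sample
|Cycle k||Standard tS (min)||Standard CS (counts)||Blank tB (min)||Blank CB (counts)||Sample tx (min)||Sample Cx (counts)|
|1||243,74||15 380||305,56||4 054||367,37||41 432|
|2||984,53||14 978||1 046,10||3 922||1 107,66||38 706|
|3||1 723,87||14 394||1 785,43||4 200||1 846,99||35 860|
|4||2 463,17||13 254||2 524,73||3 830||2 586,28||32 238|
|5||3 217,56||12 516||3 279,12||3 956||3 340,68||29 640|
|6||3 956,83||11 058||4 018,38||3 980||4 079,94||26 356|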
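Equations (H.18a) and (H.18b) indicate that neither the six individual values of CS nor of Cx given in Table H.7 can be averaged directly because of the exponential decay of the activity of the standard and sample, and slight variations in background counts from one cycle to another. Instead, one must deal with the decay-corrected and background-corrected counts (or counting rates defined as the number of counts divided by T0 = 60 min). This suggests combining Equations (H.18a) and (H.18b) to obtain the following expression for the unknown concentration in terms of the known quantities:
$$A_x = f(A_S, m_S, m_x, C_S, C_x, C_B, t_S, t_x, \lambda) = A_S\frac{m_S}{m_x}\frac{(C_x - C_B)e^{\lambda t_x}}{(C_S - C_B)e^{\lambda t_S}} = A_S\frac{m_S}{m_x}\frac{C_x - C_B}{C_S - C_B}e^{\lambda(t_x - t_S)} \tag{H.19}$$
where (Cx − CB)e^(λtx) and (CS − CB)e^(λtS) are, respectively, the background-corrected counts of the sample and the standard at the reference time t = 0 and for the time interval T0 = 60 min. Alternatively, one may simply write
$$A_x = f(A_S, m_S, m_x, R_S, R_x) = A_S\frac{m_S}{m_x}\frac{R_x}{R_S} \tag{H.20}$$
where the background-corrected and decay-corrected counting rates Rx and RS are given by
$$R_x = [(C_x - C_B)/T_0]e^{\lambda t_x} \tag{H.21a}$$
$$R_S = [(C_S - C_B)/T_0]e^{\lambda t_S} \tag{H.21b}$$
Table H.8 summarizes the values of the background-corrected and decay-corrected counting rates RS and Rx calculated from Equations (H.21a) and (H.21b) using the data of Table H.7 and λ = 1,258 94 × 10⁻⁴ min⁻¹ as given earlier. It should be noted that the ratio R = Rx⁄RS is most simply calculated from the expression
$$R = [(C_x - C_B)/(C_S - C_B)]e^{\lambda(t_x - t_S)}$$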
The arithmetic means R̄S, R̄x, and R̄, and their experimental standard deviations s(R̄S), s(R̄x), and s(R̄), are calculated in the usual way [Equations (3) and (5) in 4.2]. The correlation coefficient r(R̄x, R̄S) is calculated from Equation (17) in 5.2.3 and Equation (14) in 5.2.2.
Because of the comparatively small variability of the values of Rx and of RS, the ratio of means R̄x⁄R̄S and the standard uncertainty u(R̄x⁄R̄S) of this ratio are, respectively, very nearly the same as the mean ratio R̄ and its experimental standard deviation s(R̄) as given in the last column of Table H.8 [see H.2.4 and Equation (H.10) therein]. However, in calculating the standard uncertainty u(R̄x⁄R̄S), the correlation between Rx and RS as represented by the correlation coefficient r(R̄x, R̄S) must be taken into account using Equation (16) in 5.2.2. [That equation yields for the relative estimated variance of R̄x⁄R̄S the last three terms of Equation (H.22b).]
It should be recognized that the respective experimental standard deviations of Rx and of RS, √6 s(R̄x) and √6 s(R̄S), indicate a variability in these quantities that is two to three times larger than the variability implied by the Poisson statistics of the counting process; the latter is included in the observed variability of the counts and need not be accounted for separately.
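|Cycle k||Rx (min⁻¹)||RS (min⁻¹)||tx − tS (min)||R = Rx⁄RS|
|Arithmetic mean||R̄x = 652,60||R̄S = 206,09|| ||R̄ = 3,170|
|Experimental standard deviation of mean||s(R̄x) = 6,42||s(R̄S) = 3,79|| ||s(R̄) = 0,046|
|Relative experimental standard deviation of mean||s(R̄x)⁄R̄x = 0,98 × 10⁻²||s(R̄S)⁄R̄S = 1,84 × 10⁻²|| ||s(R̄)⁄R̄ = 1,44 × 10⁻²|
|Ratio of means||R̄x⁄R̄S = 3,167|
|Standard uncertainty of ratio of means||u(R̄x⁄R̄S) = 0,045|
|Relative standard uncertainty of ratio of means||u(R̄x⁄R̄S)∕(R̄x⁄R̄S) = 1,42 × 10⁻²|
|Correlation coefficient||r(R̄x, R̄S) = 0,646|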
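The summary statistics of the corrected counting rates can be recomputed from the counting data of Table H.7. The sketch below assumes that each row of that table is ordered as cycle, tS, CS, tB, CB, tx, Cx, and that the corrected rates are the background-subtracted counts divided by T0 and multiplied by exp(λt); the helper function names are illustrative.

```python
import math

T0 = 60.0                      # counting interval, min
LAM = math.log(2) / 5505.8     # decay constant of Rn-222, 1/min

# (t_S, C_S, C_B, t_x, C_x) per cycle; t_B is not needed in the analysis.
cycles = [
    (243.74, 15380, 4054, 367.37, 41432),
    (984.53, 14978, 3922, 1107.66, 38706),
    (1723.87, 14394, 4200, 1846.99, 35860),
    (2463.17, 13254, 3830, 2586.28, 32238),
    (3217.56, 12516, 3956, 3340.68, 29640),
    (3956.83, 11058, 3980, 4079.94, 26356),
]

# Background-corrected and decay-corrected counting rates, 1/min.
R_S = [(cs - cb) / T0 * math.exp(LAM * ts) for ts, cs, cb, _, _ in cycles]
R_x = [(cx - cb) / T0 * math.exp(LAM * tx) for _, _, cb, tx, cx in cycles]
ratio = [x / s for x, s in zip(R_x, R_S)]


def mean(values):
    return sum(values) / len(values)


def std_of_mean(values):
    m, n = mean(values), len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (n * (n - 1)))


def correlation(a, b):
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)


r_xs = correlation(R_x, R_S)

# Ratio of means with the correlation term included in the relative variance.
ratio_of_means = mean(R_x) / mean(R_S)
rel_x = std_of_mean(R_x) / mean(R_x)
rel_s = std_of_mean(R_S) / mean(R_S)
u_ratio = ratio_of_means * math.sqrt(rel_x**2 + rel_s**2 - 2 * r_xs * rel_x * rel_s)

print(f"mean R_x = {mean(R_x):.2f}, s = {std_of_mean(R_x):.2f}")
print(f"mean R_S = {mean(R_S):.2f}, s = {std_of_mean(R_S):.2f}")
print(f"mean R   = {mean(ratio):.3f}, s = {std_of_mean(ratio):.3f}")
print(f"r = {r_xs:.3f}, ratio of means = {ratio_of_means:.3f} +/- {u_ratio:.3f}")
```

Omitting the correlation term in the last step would give a noticeably larger uncertainty for the ratio of means, which is why the two approaches only agree when the correlation is included.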
To obtain the unknown activity concentration Ax and its combined standard uncertainty uc(Ax) from Equation (H.20) requires AS, mx, and mS and their standard uncertainties. These are given as
Other possible sources of uncertainty are evaluated to be negligible:
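As indicated earlier, Ax and uc(Ax) may be obtained in two different ways from Equation (H.20). In the first approach, Ax is calculated using the arithmetic means R̄x and R̄S, which leads to
$$A_x = A_S\frac{m_S}{m_x}\frac{\bar R_x}{\bar R_S} \tag{H.22a}$$
Application of Equation (16) in 5.2.2 to this expression yields for the combined variance uc²(Ax)
$$\frac{u_c^2(A_x)}{A_x^2} = \frac{u^2(A_S)}{A_S^2} + \frac{u^2(m_S)}{m_S^2} + \frac{u^2(m_x)}{m_x^2} + \frac{u^2(\bar R_x)}{\bar R_x^2} + \frac{u^2(\bar R_S)}{\bar R_S^2} - 2r(\bar R_x, \bar R_S)\frac{u(\bar R_x)u(\bar R_S)}{\bar R_x\bar R_S} \tag{H.22b}$$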
Substitution of the values of the relevant quantities into Equations (H.22a) and (H.22b) yields
The result of the measurement may then be stated as:
Ax = 0,430 0 Bq∕g with a combined standard uncertainty of uc = 0,008 3 Bq∕g.
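In the second approach, which avoids the correlation between R̄x and R̄S, Ax is calculated using the arithmetic mean R̄. Thus
$$A_x = A_S\frac{m_S}{m_x}\bar R \tag{H.23a}$$
The expression for uc²(Ax) is simply
$$\frac{u_c^2(A_x)}{A_x^2} = \frac{u^2(A_S)}{A_S^2} + \frac{u^2(m_S)}{m_S^2} + \frac{u^2(m_x)}{m_x^2} + \frac{u^2(\bar R)}{\bar R^2} \tag{H.23b}$$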
The result of the measurement may then be stated as:
Ax = 0,430 4 Bq∕g with a combined standard uncertainty of uc = 0,008 4 Bq∕g.
The effective degrees of freedom of uc can be evaluated using the Welch‑Satterthwaite formula in the manner illustrated in H.1.6.
As in H.2, of the two results, the second is preferred because it avoids approximating the mean of a ratio of two quantities by the ratio of the means of the two quantities; and it better reflects the measurement procedure used — the data were in fact collected in separate cycles.
Nevertheless, the difference between the values of Ax resulting from the two approaches is clearly small compared with the standard uncertainty ascribed to either one, and the difference between the two standard uncertainties is entirely negligible. Such agreement demonstrates that the two approaches are equivalent when the observed correlations are properly included.
This example provides a brief introduction to analysis of variance (ANOVA) methods. These statistical techniques are used to identify and quantify individual random effects in a measurement so that they may be properly taken into account when the uncertainty of the result of the measurement is evaluated. Although ANOVA methods are applicable to a wide range of measurements, for example, the calibration of reference standards, such as Zener voltage standards and standards of mass, and the certification of reference materials, ANOVA methods by themselves cannot identify systematic effects that might be present.
There are many different models included under the general name of ANOVA. Because of its importance, the specific model discussed in this example is the balanced nested design. The numerical illustration of this model involves the calibration of a Zener voltage standard; the analysis should be relevant to a variety of practical measurement situations.
ANOVA methods are of special importance in the certification of reference materials (RMs) by interlaboratory testing, a topic covered thoroughly in ISO Guide 35  (see H.5.3.2 for a brief description of such RM certification). Since much of the material contained in ISO Guide 35 is in fact broadly applicable, that publication may be consulted for additional details concerning ANOVA, including unbalanced nested designs. References  and  may be similarly consulted.
Consider a nominally 10 V Zener voltage standard that is calibrated against a stable voltage reference over a two-week period. On each of J days during the period, K independent repeated observations of the potential difference VS of the standard are made. If Vjk denotes the kth observation of VS (k = 1, 2, ..., K) on the jth day (j = 1, 2, ..., J), the best estimate of the potential difference of the standard is the arithmetic mean V̄ of the JK observations [see Equation (3) in 4.2.1],
$$\bar V = \frac{1}{JK}\sum_{j=1}^{J}\sum_{k=1}^{K}V_{jk} \tag{H.24a}$$
The experimental standard deviation of the mean s(V̄), which is a measure of the uncertainty of V̄ as an estimate of the potential difference of the standard, is obtained from [see Equation (5) in 4.2.3]
$$s^2(\bar V) = \frac{1}{JK(JK - 1)}\sum_{j=1}^{J}\sum_{k=1}^{K}(V_{jk} - \bar V)^2 \tag{H.24b}$$
NOTE It is assumed throughout this example that all corrections applied to the observations to compensate for systematic effects have negligible uncertainties or their uncertainties are such that they can be taken into account at the end of the analysis. A correction in this latter category, and one that can itself be applied to the mean of the observations at the end of the analysis, is the difference between the certified value (assumed to have a given uncertainty) and the working value of the stable voltage reference against which the Zener voltage standard is calibrated. Thus the estimate of the potential difference of the standard obtained statistically from the observations is not necessarily the final result of the measurement; and the experimental standard deviation of that estimate is not necessarily the combined standard uncertainty of the final result.
The experimental standard deviation of the mean $s(\bar{V})$ as obtained from Equation (H.24b) is an appropriate measure of the uncertainty of $\bar{V}$ only if the day‑to‑day variability of the observations is the same as the variability of the observations made on a single day. If there is evidence that the between‑day variability is significantly larger than can be expected from the within‑day variability, use of this expression could lead to a considerable understatement of the uncertainty of $\bar{V}$. Two questions thus arise: How should one decide if the between‑day variability (characterized by a between‑day component of variance) is significant in comparison with the within‑day variability (characterized by a within‑day component of variance) and, if it is, how should one evaluate the uncertainty of the mean?
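H.5.2.1 Data which allow the above questions to be addressed are given in Table H.9, where
$$\bar{V}_j = \frac{1}{K}\sum_{k=1}^{K} V_{jk} \tag{H.25a}$$
is the arithmetic mean of the $K = 5$ observations made on the $j$th day (there are $J = 10$ such daily means);
$$\bar{V} = \frac{1}{J}\sum_{j=1}^{J} \bar{V}_j \tag{H.25b}$$
is the arithmetic mean of the $J = 10$ daily means and thus the overall mean of the $JK = 50$ observations;
$$s^2(V_{jk}) = \frac{1}{K-1}\sum_{k=1}^{K}\left(V_{jk}-\bar{V}_j\right)^2 \tag{H.25c}$$
is the experimental variance of the $K = 5$ observations made on the $j$th day (there are $J = 10$ such estimates of variance); and
$$s^2(\bar{V}_j) = \frac{1}{J-1}\sum_{j=1}^{J}\left(\bar{V}_j-\bar{V}\right)^2 \tag{H.25d}$$
is the experimental variance of the $J = 10$ daily means (there is only one such estimate of variance).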
|Day, $j$||1||2||3||4||5||6||7||8||9||10|
|$\bar{V}_j$ / V||10,000 172||10,000 116||10,000 013||10,000 144||10,000 106||10,000 031||10,000 060||10,000 125||10,000 163||10,000 041|
|$\bar{V}$ = 10,000 097 V||$s(\bar{V}_j)$ = 57 µV|
|$s_a^2 = K s^2(\bar{V}_j) = 5(57\ \text{µV})^2 = (128\ \text{µV})^2$||$s_b^2 = \overline{s^2(V_{jk})} = (85\ \text{µV})^2$|
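As a cross-check on Table H.9, the overall mean and the experimental standard deviation of the daily means can be recomputed from the ten tabulated daily means. The following is a minimal Python sketch, not part of this Guide: the variable names are illustrative, and the daily means are entered as offsets in microvolts from 10 V to keep the arithmetic readable.

```python
from statistics import mean, stdev

# Daily means from Table H.9, entered as offsets from 10 V in microvolts
# (for example, 10,000 172 V becomes 172 µV). Names are illustrative only.
daily_mean_offsets_uV = [172, 116, 13, 144, 106, 31, 60, 125, 163, 41]

J = len(daily_mean_offsets_uV)  # number of days

# Overall mean: the mean of the J daily means, as an offset from 10 V.
overall_offset_uV = mean(daily_mean_offsets_uV)

# Experimental standard deviation of the daily means (J - 1 degrees of freedom).
s_daily_means_uV = stdev(daily_mean_offsets_uV)

print(f"overall mean   = 10 V + {overall_offset_uV:.1f} µV")
print(f"s(daily means) = {s_daily_means_uV:.1f} µV")
```

Rounded to the nearest microvolt, the two results agree with the tabulated values 10,000 097 V and 57 µV.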
H.5.2.2 The consistency of the within‑day variability and between‑day variability of the observations can be investigated by comparing two independent estimates of $\sigma_W^2$, the within‑day component of variance (that is, the variance of observations made on the same day).
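The first estimate of $\sigma_W^2$, denoted by $s_a^2$, is obtained from the observed variation of the daily means $\bar{V}_j$. Since $\bar{V}_j$ is the average of $K$ observations, its estimated variance $s^2(\bar{V}_j)$, under the assumption that the between‑day component of variance is zero, estimates $\sigma_W^2/K$. It then follows from Equation (H.25d) that
$$s_a^2 = K s^2(\bar{V}_j) = \frac{K}{J-1}\sum_{j=1}^{J}\left(\bar{V}_j-\bar{V}\right)^2 \tag{H.26a}$$
which is an estimate of $\sigma_W^2$ having $\nu_a = J - 1 = 9$ degrees of freedom.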
The second estimate of $\sigma_W^2$, denoted by $s_b^2$, is the pooled estimate of variance obtained from the $J = 10$ individual values of $s^2(V_{jk})$ using the equation of the note to H.3.6, where the ten individual values are calculated from Equation (H.25c). Because the degrees of freedom of each of these values is $\nu_i = K - 1$, the resulting expression for $s_b^2$ is simply their average. Thus
$$s_b^2 = \overline{s^2(V_{jk})} = \frac{1}{J}\sum_{j=1}^{J} s^2(V_{jk}) = \frac{1}{J(K-1)}\sum_{j=1}^{J}\sum_{k=1}^{K}\left(V_{jk}-\bar{V}_j\right)^2 \tag{H.26b}$$
which is an estimate of $\sigma_W^2$ having $\nu_b = J(K - 1) = 40$ degrees of freedom.
The estimates of $\sigma_W^2$ given by Equations (H.26a) and (H.26b) are $s_a^2 = (128\ \text{µV})^2$ and $s_b^2 = (85\ \text{µV})^2$, respectively (see Table H.9). Since the estimate $s_a^2$ is based on the variability of the daily means while the estimate $s_b^2$ is based on the variability of the daily observations, their difference indicates the possible presence of an effect that varies from one day to another but that remains relatively constant when observations are made on any single day. The F‑test is used to test this possibility, and thus the assumption that the between‑day component of variance is zero.
H.5.2.3 The F‑distribution is the probability distribution of the ratio $F(\nu_a, \nu_b) = s_a^2(\nu_a)/s_b^2(\nu_b)$ of two independent estimates, $s_a^2(\nu_a)$ and $s_b^2(\nu_b)$, of the variance $\sigma^2$ of a normally distributed random variable. The parameters $\nu_a$ and $\nu_b$ are the respective degrees of freedom of the two estimates and $0 \le F(\nu_a, \nu_b) < \infty$. Values of $F$ are tabulated for different values of $\nu_a$ and $\nu_b$ and various quantiles of the F‑distribution. A value of $F(\nu_a, \nu_b) > F_{0{,}95}$ or $F(\nu_a, \nu_b) > F_{0{,}975}$ (the critical value) is usually interpreted as indicating that $s_a^2(\nu_a)$ is larger than $s_b^2(\nu_b)$ by a statistically significant amount, and that the probability of a value of $F$ as large as that observed, if the two estimates were estimates of the same variance, is less than 0,05 or 0,025, respectively. (Other critical values may also be chosen, such as $F_{0{,}99}$.)
H.5.2.4 The application of the F‑test to the present numerical example yields
$$F(\nu_a, \nu_b) = \frac{s_a^2}{s_b^2} = \frac{K s^2(\bar{V}_j)}{\overline{s^2(V_{jk})}} = \frac{5(57\ \text{µV})^2}{(85\ \text{µV})^2} = 2{,}25 \tag{H.27}$$
with $\nu_a = J - 1 = 9$ degrees of freedom in the numerator and $\nu_b = J(K - 1) = 40$ degrees of freedom in the denominator. Since $F_{0{,}95}(9, 40) = 2{,}12$ and $F_{0{,}975}(9, 40) = 2{,}45$, it is concluded that there is a statistically significant between‑day effect at the 5 percent level of significance but not at the 2,5 percent level.
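The F‑test decision above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it uses the rounded summary values from Table H.9 and the two critical values quoted in the text rather than computing quantiles of the F‑distribution, and all variable names are assumptions introduced for the example.

```python
# Summary values taken from Table H.9 (rounded as printed there).
J = 10                    # number of days
K = 5                     # observations per day
s_daily_means_uV = 57.0   # experimental standard deviation of the daily means
s_b_uV = 85.0             # pooled within-day standard deviation

# Two estimates of the within-day variance, in µV^2.
s_a_sq = K * s_daily_means_uV**2   # from the scatter of the daily means
s_b_sq = s_b_uV**2                 # from the scatter within each day

F = s_a_sq / s_b_sq
nu_a = J - 1          # numerator degrees of freedom
nu_b = J * (K - 1)    # denominator degrees of freedom

# Critical values for (9, 40) degrees of freedom as quoted in the text.
F_CRIT = {0.95: 2.12, 0.975: 2.45}
significant = {q: F > crit for q, crit in F_CRIT.items()}

print(f"F({nu_a}, {nu_b}) = {F:.2f}")
for q, flag in significant.items():
    print(f"exceeds F_{q}: {flag}")
```

The ratio exceeds the 0,95 quantile but not the 0,975 quantile, which is the mixed outcome described in H.5.2.4.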
H.5.2.5 If the existence of a between‑day effect is rejected because the difference between $s_a^2$ and $s_b^2$ is not viewed as statistically significant (an imprudent decision because it could lead to an underestimate of the uncertainty), the estimated variance $s^2(\bar{V})$ of $\bar{V}$ should be calculated from Equation (H.24b). That relation is equivalent to pooling the estimates $s_a^2$ and $s_b^2$ (that is, taking a weighted average of $s_a^2$ and $s_b^2$, each weighted by its respective degrees of freedom $\nu_a$ and $\nu_b$; see the note to H.3.6) to obtain the best estimate of the variance of the observations, and then dividing that estimate by $JK$, the number of observations, to obtain the best estimate $s^2(\bar{V})$ of the variance of the mean of the observations. Following this procedure one obtains
$$s^2(\bar{V}) = \frac{(J-1)s_a^2 + J(K-1)s_b^2}{JK(JK-1)} = \frac{9(128\ \text{µV})^2 + 40(85\ \text{µV})^2}{(10)(5)(49)} = (13\ \text{µV})^2 \tag{H.28a}$$
$$s(\bar{V}) = 13\ \text{µV} \tag{H.28b}$$
with $s(\bar{V})$ having $JK - 1 = 49$ degrees of freedom.
If it is assumed that all corrections for systematic effects have already been taken into account and that all other components of uncertainty are insignificant, then the result of the calibration can be stated as $V_S = \bar{V}$ = 10,000 097 V (see Table H.9), with a combined standard uncertainty of $s(\bar{V}) = u_c$ = 13 µV, and with $u_c$ having 49 degrees of freedom.
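The pooling described in H.5.2.5 can be checked numerically. The following sketch assumes only the rounded values $s_a$ = 128 µV and $s_b$ = 85 µV from Table H.9; the variable names are illustrative and not part of this Guide.

```python
from math import sqrt

J, K = 10, 5                  # days and observations per day
s_a_uV, s_b_uV = 128.0, 85.0  # the two estimates from Table H.9 (rounded)

nu_a = J - 1                  # degrees of freedom of s_a^2
nu_b = J * (K - 1)            # degrees of freedom of s_b^2

# Weighted average of the two variance estimates, weights = degrees of freedom.
pooled_var = (nu_a * s_a_uV**2 + nu_b * s_b_uV**2) / (nu_a + nu_b)

# Variance and standard deviation of the mean of the JK observations.
var_of_mean = pooled_var / (J * K)
s_mean_uV = sqrt(var_of_mean)

print(f"s(mean) = {s_mean_uV:.1f} µV with {nu_a + nu_b} degrees of freedom")
```

Because the inputs are already rounded, the result agrees with the stated 13 µV only to the nearest microvolt.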
NOTE 1 In practice, there would very likely be additional components of uncertainty that were significant and therefore would have to be combined with the component of uncertainty obtained statistically from the observations (see H.5.1, note).
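NOTE 2 Equation (H.28a) for $s^2(\bar{V})$ can be shown to be equivalent to Equation (H.24b) by writing the double sum, denoted by $S$, in that equation as
$$S = \sum_{j=1}^{J}\sum_{k=1}^{K}\left[(V_{jk}-\bar{V}_j) + (\bar{V}_j-\bar{V})\right]^2 = (J-1)s_a^2 + J(K-1)s_b^2$$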
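H.5.2.6 If the existence of a between‑day effect is accepted (a prudent decision because it avoids a possible underestimate of the uncertainty) and it is assumed to be random, then the variance $s^2(\bar{V}_j)$ calculated from the $J = 10$ daily means according to Equation (H.25d) estimates not $\sigma_W^2/K$ as postulated in H.5.2.2, but $\sigma_W^2/K + \sigma_B^2$, where $\sigma_B^2$ is the between‑day random component of variance. This implies that
$$s^2(\bar{V}_j) = \frac{s_W^2}{K} + s_B^2 \tag{H.29}$$
where $s_W^2$ estimates $\sigma_W^2$ and $s_B^2$ estimates $\sigma_B^2$. Since $\overline{s^2(V_{jk})}$ calculated from Equation (H.26b) depends only on the within‑day variability of the observations, one may take $s_W^2 = \overline{s^2(V_{jk})} = (85\ \text{µV})^2$, which leads to
$$s_B^2 = s^2(\bar{V}_j) - \frac{\overline{s^2(V_{jk})}}{K} = (43\ \text{µV})^2, \quad \text{or} \quad s_B = 43\ \text{µV} \tag{H.31a}$$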
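The estimated variance of $\bar{V}$ is obtained from $s^2(\bar{V}_j)$, Equation (H.25d), because $s^2(\bar{V}_j)$ properly reflects both the within‑day and between‑day random components of variance [see Equation (H.29)]. Thus
$$s^2(\bar{V}) = \frac{s^2(\bar{V}_j)}{J} = \frac{(57\ \text{µV})^2}{10}, \quad \text{or} \quad s(\bar{V}) = 18\ \text{µV} \tag{H.32}$$
with $s(\bar{V})$ having $J - 1 = 9$ degrees of freedom.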
The degrees of freedom of $s_W^2$ (and thus $s_W$) is $J(K - 1) = 40$ [see Equation (H.26b)]. The degrees of freedom of $s_B^2$ (and thus $s_B$) is the effective degrees of freedom of the difference $s_B^2 = s^2(\bar{V}_j) - \overline{s^2(V_{jk})}/K$ [Equation (H.31a)], but its estimation is problematic.
H.5.2.7 The best estimate of the potential difference of the voltage standard is then $V_S = \bar{V}$ = 10,000 097 V, with $s(\bar{V}) = u_c$ = 18 µV as given in Equation (H.32). This value of $u_c$ and its 9 degrees of freedom are to be compared with $u_c$ = 13 µV and its 49 degrees of freedom, the result obtained in H.5.2.5 [Equation (H.28b)] when the existence of a between‑day effect was rejected.
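The alternative analysis, in which the between‑day effect is accepted, can be sketched in the same way. This example recomputes the variance of the daily means from Table H.9 (entered as microvolt offsets from 10 V) and takes the pooled within‑day standard deviation of 85 µV as given; the variable names are illustrative assumptions. Because the inputs are rounded, the between‑day estimate agrees with the value in the text only to about a microvolt.

```python
from math import sqrt
from statistics import variance

# Daily means from Table H.9 as offsets from 10 V, in microvolts.
daily_mean_offsets_uV = [172, 116, 13, 144, 106, 31, 60, 125, 163, 41]
J = len(daily_mean_offsets_uV)  # number of days
K = 5                           # observations per day
s_W_uV = 85.0                   # pooled within-day standard deviation (Table H.9)

# Experimental variance of the daily means; it reflects both the
# within-day and the between-day components of variance.
var_daily_means = variance(daily_mean_offsets_uV)

# Between-day component: what remains after removing the within-day share.
s_B_sq = var_daily_means - s_W_uV**2 / K
s_B_uV = sqrt(s_B_sq)

# Standard deviation of the overall mean, with J - 1 degrees of freedom.
s_mean_uV = sqrt(var_daily_means / J)

print(f"s_B     = {s_B_uV:.1f} µV")
print(f"s(mean) = {s_mean_uV:.1f} µV with {J - 1} degrees of freedom")
```

The larger standard deviation of the mean, about 18 µV instead of 13 µV, comes with far fewer degrees of freedom, which is the trade-off noted in H.5.2.7.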
In a real measurement an apparent between‑day effect should be further investigated, if possible, in order to determine its cause and whether a systematic effect is present that would negate the use of ANOVA methods. As pointed out at the beginning of this example, ANOVA techniques are designed to identify and evaluate components of uncertainty arising from random effects; they cannot provide information about components arising from systematic effects.
H.5.3.1 This voltage standard example illustrates what is generally termed a balanced, one‑stage nested design. It is a one‑stage nested design because there is one level of “nesting” of the observations with one factor, the day on which observations are made, being varied in the measurement. It is balanced because the same number of observations is made on each day. The analysis presented in the example can be used to determine if there is an “operator effect”, an “instrument effect”, a “laboratory effect”, a “sample effect”, or even a “method effect” in a particular measurement. Thus in the example, one might imagine replacing the observations made on the J different days by observations made on the same day but by J different operators; the between‑day component of variance becomes then a component of variance associated with different operators.
H.5.3.2 As noted in H.5, ANOVA methods are widely used in the certification of reference materials (RMs) by interlaboratory testing. Such certification usually involves having a number of independent, equally competent laboratories measure samples of a material for the property for which the material is to be certified. It is generally assumed that the differences between individual results, both within and between laboratories, are statistical in nature regardless of the causes. Each laboratory mean is considered an unbiased estimate of the property of the material, and usually the unweighted mean of the laboratory means is assumed to be the best estimate of that property.
An RM certification might involve I different laboratories, each of which measures the requisite property of J different samples of the material, with each measurement of a sample consisting of K independent repeated observations. Thus the total number of observations is IJK and the total number of samples is IJ. This is an example of a balanced, two‑stage nested design analogous to the one-stage voltage‑standard example above. In this case, there are two levels of “nesting” of the observations with two different factors, sample and laboratory, being varied in the measurement. The design is balanced because each sample is observed the same number of times (K) in each laboratory and each laboratory measures the same number of samples (J). In further analogy with the voltage‑standard example, in the RM case the purpose of the analysis of the data is to investigate the possible existence of a between‑samples effect and a between‑laboratories effect, and to determine the proper uncertainty to assign to the best estimate of the value of the property to be certified. In keeping with the previous paragraph, that estimate is assumed to be the mean of the I laboratory means, which is also the mean of the IJK observations.
H.5.3.3 The importance of varying the input quantities upon which a measurement result depends so that its uncertainty is based on observed data evaluated statistically is pointed out in 3.4.2. Nested designs and the analysis of the resulting data by ANOVA methods can be successfully used in many measurement situations encountered in practice.
Nonetheless, as indicated in 3.4.1, varying all input quantities is rarely feasible due to limited time and resources; at best, in most practical measurement situations, it is only possible to evaluate a few components of uncertainty using ANOVA methods. As pointed out in 4.3.1, many components must be evaluated by scientific judgement using all of the available information on the possible variability of the input quantities in question; in many instances an uncertainty component, such as arises from a between‑samples effect, a between‑laboratories effect, a between‑instruments effect, or a between‑operators effect, cannot be evaluated by the statistical analysis of series of observations but must be evaluated from the available pool of information.
Hardness is an example of a physical concept that cannot be quantified without reference to a method of measurement; it has no unit that is independent of such a method. The quantity “hardness” is unlike classical measurable quantities in that it cannot be entered into algebraic equations to define other measurable quantities (though it is sometimes used in empirical equations that relate hardness to another property for a category of materials). Its magnitude is determined by a conventional measurement, that of a linear dimension of an indentation in a block of the material of interest, or sample block. The measurement is made according to a written standard, which includes a description of the “indentor”, the construction of the machine by which the indentor is applied, and the way in which the machine is to be operated. There is more than one written standard, so there is more than one scale of hardness.
The hardness reported is a function (depending on the scale) of the linear dimension that is measured. In the example given in this subclause, it is a linear function of the arithmetic mean or average of the depths of five repeated indentations, but for some other scales the function is nonlinear.
Realizations of the standard machine are kept as national standards (there is no international standard realization); a comparison between a particular machine and the national standard machine is made using a transfer‑standard block.
In this example, the hardness of a sample block of material is determined on the scale “Rockwell C” using a machine that has been calibrated against the national standard machine. The scale unit of Rockwell‑C hardness is 0,002 mm, with hardness on that scale defined as 100 × (0,002 mm) minus the average of the depths, measured in mm, of five indentations. The value of that quantity divided by the Rockwell scale unit 0,002 mm is called the “HRC hardness index”. In this example, the quantity is called simply “hardness”, symbol $h_{\text{Rockwell C}}$, and the numerical value of hardness expressed in Rockwell units of length is called the “hardness index”, $H_{\text{Rockwell C}}$.
To the average of the depths of the indentations made in the sample block by the machine used to determine its hardness, or calibration machine, must be added corrections to determine the average of the depths of the indentations that would have been made in the same block by the national standard machine. Thus
$$h_{\text{Rockwell C}} = f(\bar{d}, \Delta_c, \Delta_b, \Delta_S) = 100(0{,}002\ \text{mm}) - \bar{d} - \Delta_c - \Delta_b - \Delta_S \tag{H.33a}$$
where
$\bar{d}$ is the average of the depths of five indentations made by the calibration machine in the sample block;
$\Delta_c$ is the correction obtained from a comparison of the calibration machine with the national standard machine using a transfer‑standard block, equal to the average of the depths of $5m$ indentations made by the national standard machine in this block, minus the average of the depths of $5n$ indentations made in the same block by the calibration machine;
$\Delta_b$ is the difference in hardness (expressed as a difference of average depth of indentation) between the two parts of the transfer‑standard block used respectively for indentations by the two machines, assumed zero; and
$\Delta_S$ is the error due to the lack of repeatability of the national standard machine and the incomplete definition of the quantity hardness. Although $\Delta_S$ must be assumed to be zero, it has a standard uncertainty associated with it of $u(\Delta_S)$.
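Since the partial derivatives $\partial f/\partial\bar{d}$, $\partial f/\partial\Delta_c$, $\partial f/\partial\Delta_b$, and $\partial f/\partial\Delta_S$ of the function of Equation (H.33a) are all equal to −1, the estimated variance $u_c^2(h)$, the square of the combined standard uncertainty of the hardness of the sample block as measured by the calibration machine, is simply given by
$$u_c^2(h) = u^2(\bar{d}) + u^2(\Delta_c) + u^2(\Delta_b) + u^2(\Delta_S) \tag{H.34}$$
where for simplicity of notation $h \equiv h_{\text{Rockwell C}}$.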
Uncertainty of repeated observations. Strict repetition of an observation is not possible because a new indentation cannot be made on the site of an earlier one. Since each indentation must be made on a different site, any variation in the results includes the effect of variations in hardness between different sites. Thus $u(\bar{d})$, the standard uncertainty of the average of the depths of five indentations in the sample block by the calibration machine, is taken as $s_p(d_k)/\sqrt{5}$, where $s_p(d_k)$ is the pooled experimental standard deviation of the depths of indentations determined by “repeated” measurements on a block known to have very uniform hardness (see 4.2.4).
Uncertainty of indication. Although the correction to $\bar{d}$ due to the display of the calibration machine is zero, there is an uncertainty in $\bar{d}$ arising from the resolution $\delta$ of the depth display, given by $u^2(\delta) = \delta^2/12$ (see F.2.2.1). The estimated variance of $\bar{d}$ is thus
$$u^2(\bar{d}) = \frac{s_p^2(d_k)}{5} + \frac{\delta^2}{12} \tag{H.35}$$
As indicated in H.6.2, $\Delta_c$ is the correction for the difference between the national standard machine and the calibration machine. This correction may be expressed as $\Delta_c = z'_S - z'$, where $z'_S = \left(\sum_{i=1}^{m}\bar{z}_{S,i}\right)/m$ is the average depth of the $5m$ indentations made by the national standard machine in the transfer‑standard block, and $z' = \left(\sum_{i=1}^{n}\bar{z}_i\right)/n$ is the average depth of the $5n$ indentations made in the same block by the calibration machine. Thus, assuming that for the comparison the uncertainty due to the resolution of the display of each machine is negligible, the estimated variance of $\Delta_c$ is
$$u^2(\Delta_c) = \frac{s_{av}^2(\bar{z}_S)}{m} + \frac{s_{av}^2(\bar{z})}{n} \tag{H.36}$$
where
$s_{av}^2(\bar{z}_S) = \left[\sum_{i=1}^{m} s^2(\bar{z}_{S,i})\right]/m$ is the average of the experimental variances of the means of each of the $m$ series of indentations $z_{S,ik}$ made by the standard machine;
$s_{av}^2(\bar{z}) = \left[\sum_{i=1}^{n} s^2(\bar{z}_i)\right]/n$ is the average of the experimental variances of the means of each of the $n$ series of indentations $z_{ik}$ made by the calibration machine.
OIML International Recommendation R 12, Verification and calibration of Rockwell C hardness standardized blocks, requires that the maximum and minimum depths of indentation obtained from five measurements on the transfer‑standard block shall not differ by more than a fraction $x$ of the average depth of indentation, where $x$ is a function of the hardness level. Let, therefore, the maximum difference in the depths of indentation over the entire block be $xz'$, where $z'$ is as defined in H.6.3.2 with $n = 5$. Also let the maximum difference be described by a triangular probability distribution about the average value $xz'/2$ (on the likely assumption that values near the central value are more probable than extreme values; see 4.3.9). Then, if in Equation (9b) in 4.3.9 $a = xz'/2$, the estimated variance of the correction to the average depth of indentation due to differences of the hardnesses presented respectively to the standard machine and the calibration machine is
$$u^2(\Delta_b) = \frac{(xz')^2}{24} \tag{H.37}$$
As indicated in H.6.2, it is assumed that the best estimate of the correction Δb itself is zero.
The uncertainty of the national standard machine together with the uncertainty due to incomplete definition of the quantity hardness is reported as an estimated standard deviation u(ΔS) (a quantity of dimension length).
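Collection of the individual terms discussed in H.6.3.1 to H.6.3.4 and their substitution into Equation (H.34) yields for the estimated variance of the measurement of hardness
$$u_c^2(h) = \frac{s_p^2(d_k)}{5} + \frac{\delta^2}{12} + \frac{s_{av}^2(\bar{z}_S)}{m} + \frac{s_{av}^2(\bar{z})}{n} + \frac{(xz')^2}{24} + u^2(\Delta_S) \tag{H.38}$$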
The data for this example are summarized in Table H.10.
|Source of uncertainty||Value|
|Average depth $\bar{d}$ of 5 indentations made by the calibration machine in the sample block: 0,072 mm||36,0 Rockwell scale unit|
|Indicated hardness index of the sample block from the 5 indentations: $H_{\text{Rockwell C}} = h_{\text{Rockwell C}}/(0{,}002\ \text{mm}) = [100(0{,}002\ \text{mm}) - 0{,}072\ \text{mm}]/(0{,}002\ \text{mm})$ (see H.6.1)||64,0 HRC|
|Pooled experimental standard deviation $s_p(d_k)$ of the depths of indentations made by the calibration machine in a block having uniform hardness||0,45 Rockwell scale unit|
|Resolution $\delta$ of the display of the calibration machine||0,1 Rockwell scale unit|
|$s_{av}(\bar{z}_S)$, square root of the average of the experimental variances of the means of $m$ series of indentations made by the national standard machine in the transfer‑standard block||0,10 Rockwell scale unit, $m$ = 6|
|$s_{av}(\bar{z})$, square root of the average of the experimental variances of the means of $n$ series of indentations made by the calibration machine in the transfer‑standard block||0,11 Rockwell scale unit, $n$ = 6|
|Permitted fractional variation $x$ of the depth of penetration in the transfer‑standard block||1,5 × 10⁻²|
|Standard uncertainty $u(\Delta_S)$ of the national standard machine and definition of hardness||0,5 Rockwell scale unit|
The scale is Rockwell C, designated HRC. The Rockwell scale unit is 0,002 mm, and thus in Table H.10 and in the following, it is understood that (for example) “36,0 Rockwell scale unit” means 36,0 × (0,002 mm) = 0,072 mm and is simply a convenient way of expressing the data and results.
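If the values for the relevant quantities given in Table H.10 are substituted into Equation (H.38), with $z'$ taken equal to $\bar{d}$ = 36,0 Rockwell scale unit for the purpose of calculating the uncertainty, one obtains the following two expressions:
$$u_c^2(h) = \left[\frac{0{,}45^2}{5} + \frac{0{,}1^2}{12} + \frac{0{,}10^2}{6} + \frac{0{,}11^2}{6} + \frac{(0{,}015 \times 36{,}0)^2}{24} + 0{,}5^2\right](\text{Rockwell scale unit})^2 = 0{,}307\ (\text{Rockwell scale unit})^2$$
$$u_c(h) = 0{,}55\ \text{Rockwell scale unit} = 0{,}001\,1\ \text{mm}$$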
Thus, if it is assumed that Δc = 0, the hardness of the sample block is
$h_{\text{Rockwell C}}$ = 64,0 Rockwell scale unit or 0,128 0 mm with a combined standard uncertainty of $u_c$ = 0,55 Rockwell scale unit or 0,001 1 mm.
The hardness index of the block is $h_{\text{Rockwell C}}/(0{,}002\ \text{mm}) = (0{,}128\,0\ \text{mm})/(0{,}002\ \text{mm})$, or
$H_{\text{Rockwell C}}$ = 64,0 HRC with a combined standard uncertainty of $u_c$ = 0,55 HRC.
In addition to the component of uncertainty due to the national standard machine and the definition of hardness, $u(\Delta_S)$ = 0,5 Rockwell scale unit, the significant components of uncertainty are those of the repeatability of the machine, $s_p(d_k)/\sqrt{5}$ = 0,20 Rockwell scale unit, and the variation of the hardness of the transfer‑standard block, which is $\sqrt{(xz')^2/24}$ = 0,11 Rockwell scale unit. The effective degrees of freedom of $u_c$ can be evaluated using the Welch‑Satterthwaite formula in the manner illustrated in H.1.6.
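The uncertainty budget for the hardness example can be tallied term by term from the entries of Table H.10. The sketch below is illustrative rather than part of this Guide: the dictionary keys and variable names are invented for the example, and it assumes $z'$ = 36,0 Rockwell scale unit (the average depth $\bar{d}$ in the sample block), an assumption that reproduces the 0,11 Rockwell scale unit component quoted above.

```python
from math import sqrt

SCALE_UNIT_MM = 0.002   # one Rockwell scale unit, in mm

# Inputs from Table H.10, all in Rockwell scale units unless noted.
s_p = 0.45              # pooled standard deviation of indentation depths
n_indent = 5            # indentations averaged in the sample block
delta = 0.1             # display resolution of the calibration machine
s_av_std, m = 0.10, 6   # national standard machine: s_av and number of series
s_av_cal, n = 0.11, 6   # calibration machine: s_av and number of series
x = 1.5e-2              # permitted fractional variation of depth (dimensionless)
z_prime = 36.0          # ASSUMPTION: z' taken equal to the average depth d
u_delta_S = 0.5         # national standard machine and definition of hardness

# Variance contributions, one per term of the combined-variance expression.
components = {
    "repeatability": s_p**2 / n_indent,
    "display resolution": delta**2 / 12,
    "standard machine series": s_av_std**2 / m,
    "calibration machine series": s_av_cal**2 / n,
    "block non-uniformity": (x * z_prime)**2 / 24,
    "national standard and definition": u_delta_S**2,
}

u_c_sq = sum(components.values())
u_c = sqrt(u_c_sq)
u_c_mm = u_c * SCALE_UNIT_MM

for name, var in components.items():
    print(f"{name:34s} u = {sqrt(var):.3f} scale unit")
print(f"combined: u_c = {u_c:.2f} scale unit = {u_c_mm:.4f} mm")
```

Listing the square roots of the individual terms makes it easy to see that the national standard machine dominates the budget, followed by repeatability and block non‑uniformity.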