# Annex H Examples

This annex gives six examples, H.1 to H.6, which are worked out in considerable detail in order to illustrate the basic principles presented in this Guide for evaluating and expressing uncertainty in measurement. Together with the examples included in the main text and in some of the other annexes, they should enable the users of this Guide to put these principles into practice in their own work.

Because the examples are for illustrative purposes, they have by necessity been simplified. Moreover, because they and the numerical data used in them have been chosen mainly to demonstrate the principles of this Guide, neither they nor the data should necessarily be interpreted as describing real measurements. While the data are used as given, in order to prevent rounding errors, more digits are retained in intermediate calculations than are usually shown. Thus the stated result of a calculation involving several quantities may differ slightly from the result implied by the numerical values given in the text for these quantities.

It is pointed out in earlier portions of this Guide that classifying the methods used to evaluate components of uncertainty as Type A or Type B is for convenience only; it is not required for the determination of the combined standard uncertainty or expanded uncertainty of a measurement result because all uncertainty components, however they are evaluated, are treated in the same way (see 3.3.4, 5.1.2, and E.3.7). Thus, in the examples, the method used to evaluate a particular component of uncertainty is not specifically identified as to its type. However, it will be clear from the discussion whether a component is obtained from a Type A or a Type B evaluation.

# H.1   End‑gauge calibration

This example demonstrates that even an apparently simple measurement may involve subtle aspects of uncertainty evaluation.

## H.1.1   The measurement problem

The length of a nominally 50 mm end gauge is determined by comparing it with a known standard of the same nominal length. The direct output of the comparison of the two end gauges is the difference d in their lengths:

$$d = l(1 + \alpha\theta) - l_S(1 + \alpha_S\theta_S) \tag{H.1}$$

where l is the measurand, that is, the length at 20 °C of the end gauge being calibrated; lS is the length of the standard at 20 °C as given in its calibration certificate; α and αS are the coefficients of thermal expansion, respectively, of the gauge being calibrated and the standard; θ and θS are the deviations in temperature from the 20 °C reference temperature, respectively, of the gauge and the standard.

## H.1.2   Mathematical model

From Equation (H.1), the measurand is given by

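$$l = \frac{l_S(1 + \alpha_S\theta_S) + d}{1 + \alpha\theta} = l_S + d + l_S(\alpha_S\theta_S - \alpha\theta) + \ldots \tag{H.2}$$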

If the difference in temperature between the end gauge being calibrated and the standard is written as δθ = θ − θS, and the difference in their thermal expansion coefficients as δα = α − αS, Equation (H.2) becomes

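$$l = f(l_S, d, \alpha_S, \theta, \delta\alpha, \delta\theta) = l_S + d - l_S\left[\delta\alpha\cdot\theta + \alpha_S\cdot\delta\theta\right] \tag{H.3}$$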

The differences δθ and δα, but not their uncertainties, are estimated to be zero; and δα, αS, δθ, and θ are assumed to be uncorrelated. (If the measurand were expressed in terms of the variables θ, θS, α, and αS, it would be necessary to include the correlation between θ and θS, and between α and αS.)

It thus follows from Equation (H.3) that the estimate of the value of the measurand l may be obtained from the simple expression lS + d̄, where lS is the length of the standard at 20 °C as given in its calibration certificate and d is estimated by d̄, the arithmetic mean of n = 5 independent repeated observations. The combined standard uncertainty uc(l) of l is obtained by applying Equation (10) in 5.1.2 to Equation (H.3), as discussed below.

NOTE   In this and the other examples, for simplicity of notation, the same symbol is used for a quantity and its estimate.

## H.1.3   Contributory variances

The pertinent aspects of this example as discussed in this and the following subclauses are summarized in Table H.1.

Since it is assumed that δα = 0 and δθ = 0, the application of Equation (10) in 5.1.2 to Equation (H.3) yields

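$$u_c^2(l) = c_S^2 u^2(l_S) + c_d^2 u^2(d) + c_{\alpha_S}^2 u^2(\alpha_S) + c_\theta^2 u^2(\theta) + c_{\delta\alpha}^2 u^2(\delta\alpha) + c_{\delta\theta}^2 u^2(\delta\theta) \tag{H.4}$$

with

$$\begin{aligned}
c_S &= \partial f/\partial l_S = 1 - (\delta\alpha\cdot\theta + \alpha_S\cdot\delta\theta) = 1 \\
c_d &= \partial f/\partial d = 1 \\
c_{\alpha_S} &= \partial f/\partial \alpha_S = -l_S\,\delta\theta = 0 \\
c_\theta &= \partial f/\partial \theta = -l_S\,\delta\alpha = 0 \\
c_{\delta\alpha} &= \partial f/\partial \delta\alpha = -l_S\,\theta \\
c_{\delta\theta} &= \partial f/\partial \delta\theta = -l_S\,\alpha_S
\end{aligned}$$

and thus

$$u_c^2(l) = u^2(l_S) + u^2(d) + l_S^2\theta^2 u^2(\delta\alpha) + l_S^2\alpha_S^2 u^2(\delta\theta) \tag{H.5}$$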

### H.1.3.1   Uncertainty of the calibration of the standard, u(lS)

The calibration certificate gives as the expanded uncertainty of the standard U = 0,075 µm and states that it was obtained using a coverage factor of k = 3. The standard uncertainty is then

$$u(l_S) = (0{,}075\ \mu\text{m})/3 = 25\ \text{nm}$$

### H.1.3.2   Uncertainty of the measured difference in lengths, u(d)

The pooled experimental standard deviation characterizing the comparison of l and lS was determined from the variability of 25 independent repeated observations of the difference in lengths of two standard end gauges and was found to be 13 nm. In the comparison of this example, five repeated observations were taken. The standard uncertainty associated with the arithmetic mean of these readings is then (see 4.2.4)

$$u(\bar{d}) = s(\bar{d}) = (13\ \text{nm})/\sqrt{5} = 5{,}8\ \text{nm}$$

According to the calibration certificate of the comparator used to compare l with lS, its uncertainty “due to random errors” is ±0,01 µm at a level of confidence of 95 percent and is based on 6 replicate measurements; thus the standard uncertainty, using the t‑factor t95(5) = 2,57 for ν = 6 − 1 = 5 degrees of freedom (see Annex G, Table G.2), is

$$u(d_1) = (0{,}01\ \mu\text{m})/2{,}57 = 3{,}9\ \text{nm}$$

The uncertainty of the comparator “due to systematic errors” is given in the certificate as 0,02 µm at the “three sigma level”. The standard uncertainty from this cause may therefore be taken to be

$$u(d_2) = (0{,}02\ \mu\text{m})/3 = 6{,}7\ \text{nm}$$

The total contribution is obtained from the sum of the estimated variances:

$$u^2(d) = u^2(\bar{d}) + u^2(d_1) + u^2(d_2) = 93\ \text{nm}^2$$

or

$$u(d) = 9{,}7\ \text{nm}$$

### H.1.3.3   Uncertainty of the thermal expansion coefficient, u(αS)

The coefficient of thermal expansion of the standard end gauge is given as αS = 11,5 × 10⁻⁶ °C⁻¹ with an uncertainty represented by a rectangular distribution with bounds ±2 × 10⁻⁶ °C⁻¹. The standard uncertainty is then [see Equation (7) in 4.3.7]

$$u(\alpha_S) = (2 \times 10^{-6}\ {}^\circ\text{C}^{-1})/\sqrt{3} = 1{,}2 \times 10^{-6}\ {}^\circ\text{C}^{-1}$$

Since cαS = ∂f/∂αS = −lS δθ = 0 as indicated in H.1.3, this uncertainty contributes nothing to the uncertainty of l in first order. It does, however, have a second‑order contribution that is discussed in H.1.7.

Table H.1: Summary of standard uncertainty components

| Standard uncertainty component u(xi) | Source of uncertainty | Value of standard uncertainty u(xi) | ci ≡ ∂f/∂xi | ui(l) ≡ \|ci\| u(xi) (nm) | Degrees of freedom |
|---|---|---|---|---|---|
| u(lS) | Calibration of standard end gauge | 25 nm | 1 | 25 | 18 |
| u(d) | Measured difference between end gauges | 9,7 nm | 1 | 9,7 | 25,6 |
| u(d̄) | repeated observations | 5,8 nm | | | 24 |
| u(d1) | random effects of comparator | 3,9 nm | | | 5 |
| u(d2) | systematic effects of comparator | 6,7 nm | | | 8 |
| u(αS) | Thermal expansion coefficient of standard end gauge | 1,2 × 10⁻⁶ °C⁻¹ | 0 | 0 | |
| u(θ) | Temperature of test bed | 0,41 °C | 0 | 0 | |
| u(θ̄) | mean temperature of bed | 0,2 °C | | | |
| u(Δ) | cyclic variation of temperature of room | 0,35 °C | | | |
| u(δα) | Difference in expansion coefficients of end gauges | 0,58 × 10⁻⁶ °C⁻¹ | −lS θ | 2,9 | 50 |
| u(δθ) | Difference in temperatures of end gauges | 0,029 °C | −lS αS | 16,6 | 2 |

uc²(l) = Σui²(l) = 1 002 nm²

uc(l) = 32 nm

νeff(l) = 16

### H.1.3.4   Uncertainty of the deviation of the temperature of the end gauge, u(θ)

The temperature of the test bed is reported as (19,9 ± 0,5) °C; the temperature at the time of the individual observations was not recorded. The stated maximum offset, Δ = 0,5 °C, is said to represent the amplitude of an approximately cyclical variation of the temperature under a thermostatic system, not the uncertainty of the mean temperature. The value of the mean temperature deviation

$$\bar{\theta} = 19{,}9\ {}^\circ\text{C} - 20\ {}^\circ\text{C} = -0{,}1\ {}^\circ\text{C}$$

is reported as having a standard uncertainty itself due to the uncertainty in the mean temperature of the test bed of

$$u(\bar{\theta}) = 0{,}2\ {}^\circ\text{C}$$

while the cyclic variation in time produces a U‑shaped (arcsine) distribution of temperatures resulting in a standard uncertainty of

$$u(\Delta) = (0{,}5\ {}^\circ\text{C})/\sqrt{2} = 0{,}35\ {}^\circ\text{C}$$

The temperature deviation θ may be taken equal to θ̄, and the standard uncertainty of θ is obtained from

$$u^2(\theta) = u^2(\bar{\theta}) + u^2(\Delta) = 0{,}165\ {}^\circ\text{C}^2$$

which gives

$$u(\theta) = 0{,}41\ {}^\circ\text{C}$$

Since cθ = ∂f/∂θ = −lS δα = 0 as indicated in H.1.3, this uncertainty also contributes nothing to the uncertainty of l in first order; but it does have a second‑order contribution that is discussed in H.1.7.

### H.1.3.5   Uncertainty of the difference in expansion coefficients, u(δα)

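The estimated bounds on the variability of δα are ±1 × 10⁻⁶ °C⁻¹, with an equal probability of δα having any value within those bounds. The standard uncertainty is

$$u(\delta\alpha) = (1 \times 10^{-6}\ {}^\circ\text{C}^{-1})/\sqrt{3} = 0{,}58 \times 10^{-6}\ {}^\circ\text{C}^{-1}$$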

### H.1.3.6   Uncertainty of the difference in temperature of the gauges, u(δθ)

The standard and the test gauge are expected to be at the same temperature, but the temperature difference could lie with equal probability anywhere in the estimated interval −0,05 °C to +0,05 °C. The standard uncertainty is

$$u(\delta\theta) = (0{,}05\ {}^\circ\text{C})/\sqrt{3} = 0{,}029\ {}^\circ\text{C}$$
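The evaluations in H.1.3.1 to H.1.3.6 all reduce to dividing a stated bound or expanded uncertainty by a suitable factor. The following is a minimal sketch of that arithmetic, not part of the Guide; the helper names `from_expanded`, `rectangular`, and `arcsine` are assumptions introduced here for illustration, and lengths are expressed in nanometres.

```python
import math

# Units: lengths in nanometres, temperatures in degrees Celsius,
# expansion coefficients in 1/degC. Helper names are illustrative only.

def from_expanded(expanded, coverage_factor):
    """Standard uncertainty recovered from U = k * u."""
    return expanded / coverage_factor

def rectangular(half_width):
    """Rectangular distribution with bounds +/- half_width."""
    return half_width / math.sqrt(3)

def arcsine(amplitude):
    """U-shaped (arcsine) distribution with bounds +/- amplitude."""
    return amplitude / math.sqrt(2)

u_lS = from_expanded(75.0, 3)              # H.1.3.1: U = 0,075 um, k = 3
u_dbar = 13.0 / math.sqrt(5)               # H.1.3.2: pooled s = 13 nm, n = 5
u_d1 = from_expanded(10.0, 2.57)           # comparator, random effects
u_d2 = from_expanded(20.0, 3)              # comparator, systematic effects
u_d = math.sqrt(u_dbar**2 + u_d1**2 + u_d2**2)
u_alphaS = rectangular(2e-6)               # H.1.3.3
u_theta = math.hypot(0.2, arcsine(0.5))    # H.1.3.4
u_dalpha = rectangular(1e-6)               # H.1.3.5
u_dtheta = rectangular(0.05)               # H.1.3.6

print(f"u(lS) = {u_lS:.0f} nm, u(d) = {u_d:.1f} nm, u(theta) = {u_theta:.2f} degC")
```

Keeping the unrounded values in variables mirrors the remark in the introduction to this annex that more digits are retained in intermediate calculations than are shown.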

## H.1.4   Combined standard uncertainty

The combined standard uncertainty uc(l) is calculated from Equation (H.5). The individual terms are collected and substituted into this expression to obtain

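$$u_c^2(l) = (25\ \text{nm})^2 + (9{,}7\ \text{nm})^2 + (0{,}05\ \text{m})^2(-0{,}1\ {}^\circ\text{C})^2(0{,}58 \times 10^{-6}\ {}^\circ\text{C}^{-1})^2 + (0{,}05\ \text{m})^2(11{,}5 \times 10^{-6}\ {}^\circ\text{C}^{-1})^2(0{,}029\ {}^\circ\text{C})^2 \tag{H.6a}$$

$$= (25\ \text{nm})^2 + (9{,}7\ \text{nm})^2 + (2{,}9\ \text{nm})^2 + (16{,}6\ \text{nm})^2 = 1\,002\ \text{nm}^2 \tag{H.6b}$$

or

$$u_c(l) = 32\ \text{nm} \tag{H.6c}$$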

The dominant component of uncertainty is obviously that of the standard, u(lS) = 25 nm.
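As a cross-check of Equation (H.5), the sketch below multiplies each standard uncertainty by the magnitude of its sensitivity coefficient and sums the squares. It is an illustration under stated assumptions rather than a prescribed procedure: the dictionary layout and variable names are invented here, lengths are in nanometres, and unrounded input uncertainties are used.

```python
import math

l_S = 50e6         # nominal length of the standard, nm (50 mm)
alpha_S = 11.5e-6  # expansion coefficient of the standard, 1/degC
theta = -0.1       # mean temperature deviation, degC

# (standard uncertainty, sensitivity coefficient) for each contributing input
components = {
    "lS": (75.0 / 3, 1.0),
    "d": (math.sqrt(13.0**2 / 5 + (10.0 / 2.57)**2 + (20.0 / 3)**2), 1.0),
    "delta_alpha": (1e-6 / math.sqrt(3), -l_S * theta),
    "delta_theta": (0.05 / math.sqrt(3), -l_S * alpha_S),
}

# u_i(l) = |c_i| u(x_i), in nm
contributions = {name: abs(c) * u for name, (u, c) in components.items()}
u_c = math.sqrt(sum(v**2 for v in contributions.values()))

for name, value in contributions.items():
    print(f"u_{name}(l) = {value:.1f} nm")
print(f"u_c(l) = {u_c:.0f} nm")
```

The inputs αS and θ are omitted because their sensitivity coefficients vanish when δα = 0 and δθ = 0.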

## H.1.5   Final result

The calibration certificate for the standard end gauge gives lS = 50,000 623 mm as its length at 20 °C. The arithmetic mean d̄ of the five repeated observations of the difference in lengths between the unknown end gauge and the standard gauge is 215 nm. Thus, since l = lS + d̄ (see H.1.2), the length l of the unknown end gauge at 20 °C is 50,000 838 mm. Following 7.2.2, the final result of the measurement may be stated as:

l = 50,000 838 mm with a combined standard uncertainty uc = 32 nm. The corresponding relative combined standard uncertainty is uc/l = 6,4 × 10⁻⁷.

## H.1.6   Expanded uncertainty

Suppose that one is required to obtain an expanded uncertainty U99 = k99uc(l) that provides an interval having a level of confidence of approximately 99 percent. The procedure to use is that summarized in G.6.4, and the required degrees of freedom are indicated in Table H.1. These were obtained as follows:

1. Uncertainty of the calibration of the standard, u(lS) [H.1.3.1]. The calibration certificate states that the effective degrees of freedom of the combined standard uncertainty from which the quoted expanded uncertainty was obtained is νeff(lS) = 18.
2. Uncertainty of the measured difference in lengths, u(d) [H.1.3.2]. Although d̄ was obtained from five repeated observations, because u(d̄) was obtained from a pooled experimental standard deviation based on 25 observations, the degrees of freedom of u(d̄) is ν(d̄) = 25 − 1 = 24 (see H.3.6, note). The degrees of freedom of u(d1), the uncertainty due to random effects on the comparator, is ν(d1) = 6 − 1 = 5 because d1 was obtained from six repeated measurements. The ±0,02 µm uncertainty for systematic effects on the comparator may be assumed to be reliable to 25 percent, and thus the degrees of freedom from Equation (G.3) in G.4.2 is ν(d2) = 8 (see the example of G.4.2). The effective degrees of freedom of u(d), νeff(d), is then obtained from Equation (G.2b) in G.4.1:

   $$\nu_{\text{eff}}(d) = \frac{\left[u^2(\bar{d}) + u^2(d_1) + u^2(d_2)\right]^2}{\dfrac{u^4(\bar{d})}{\nu(\bar{d})} + \dfrac{u^4(d_1)}{\nu(d_1)} + \dfrac{u^4(d_2)}{\nu(d_2)}} = \frac{(9{,}7\ \text{nm})^4}{\dfrac{(5{,}8\ \text{nm})^4}{24} + \dfrac{(3{,}9\ \text{nm})^4}{5} + \dfrac{(6{,}7\ \text{nm})^4}{8}} = 25{,}6$$
3. Uncertainty of the difference in expansion coefficients, u(δα) [H.1.3.5]. The estimated bounds of ±1 × 10⁻⁶ °C⁻¹ on the variability of δα are deemed to be reliable to 10 percent. This gives, from Equation (G.3) in G.4.2, ν(δα) = 50.
4. Uncertainty of the difference in temperatures of the gauges, u(δθ) [H.1.3.6]. The estimated interval −0,05 °C to +0,05 °C for the temperature difference δθ is believed to be reliable only to 50 percent, which from Equation (G.3) in G.4.2 gives ν(δθ) = 2.

The calculation of νeff(l) from Equation (G.2b) in G.4.1 proceeds in exactly the same way as for the calculation of νeff(d) in item 2 above. Thus from Equations (H.6b) and (H.6c) and the values for ν given in items 1 to 4,

$$\nu_{\text{eff}}(l) = \frac{(32\ \text{nm})^4}{\dfrac{(25\ \text{nm})^4}{18} + \dfrac{(9{,}7\ \text{nm})^4}{25{,}6} + \dfrac{(2{,}9\ \text{nm})^4}{50} + \dfrac{(16{,}6\ \text{nm})^4}{2}} = 16{,}7$$

To obtain the required expanded uncertainty, this value is first truncated to the next lower integer, νeff(l) = 16. It then follows from Table G.2 in Annex G that t99(16) = 2,92, and hence U99 = t99(16)uc(l) = 2,92 × (32 nm) = 93 nm. Following 7.2.4, the final result of the measurement may be stated as:

l = (50,000 838 ± 0,000 093) mm, where the number following the symbol ± is the numerical value of an expanded uncertainty U = kuc, with U determined from a combined standard uncertainty uc = 32 nm and a coverage factor k = 2,92 based on the t‑distribution for ν = 16 degrees of freedom, and defines an interval estimated to have a level of confidence of 99 percent. The corresponding relative expanded uncertainty is U/l = 1,9 × 10⁻⁶.
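The effective degrees of freedom used in H.1.6 can be reproduced with a short sketch of the formula of Equation (G.2b), applied first to the parts of u(d) and then to the contributions to uc(l). The function name `welch_satterthwaite` and the variable names are assumptions made for this illustration; the t‑factor is copied from the text rather than computed, and unrounded uncertainties in nanometres are used.

```python
import math

def welch_satterthwaite(components):
    """Effective degrees of freedom for components given as (u_i, nu_i),
    where each u_i already includes its sensitivity coefficient."""
    u_c_squared = sum(u**2 for u, _ in components)
    return u_c_squared**2 / sum(u**4 / nu for u, nu in components)

# Parts of u(d): repeated observations, comparator random, comparator systematic
d_parts = [(13.0 / math.sqrt(5), 24), (10.0 / 2.57, 5), (20.0 / 3, 8)]
nu_d = welch_satterthwaite(d_parts)
u_d = math.sqrt(sum(u**2 for u, _ in d_parts))

# Contributions to u_c(l), in nm: standard, d, delta_alpha, delta_theta
l_parts = [
    (25.0, 18),
    (u_d, nu_d),
    (50e6 * 0.1 * 1e-6 / math.sqrt(3), 50),
    (50e6 * 11.5e-6 * 0.05 / math.sqrt(3), 2),
]
nu_l = welch_satterthwaite(l_parts)

T99_16 = 2.92            # t-factor quoted in the text for 16 degrees of freedom
U99 = T99_16 * 32.0      # expanded uncertainty in nm, using u_c(l) = 32 nm

print(f"nu_eff(d) = {nu_d:.1f}, nu_eff(l) = {nu_l:.1f} -> {math.floor(nu_l)}")
print(f"U99 = {U99:.0f} nm")
```

Truncating rather than rounding the effective degrees of freedom follows the text, which takes the next lower integer before looking up the t‑factor.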

## H.1.7   Second‑order terms

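The note to 5.1.2 points out that Equation (10), which is used in this example to obtain the combined standard uncertainty uc(l), must be augmented when the nonlinearity of the function Y = f(X1, X2, ..., XN) is so significant that the higher‑order terms in the Taylor series expansion cannot be neglected. Such is the case in this example, and therefore the evaluation of uc(l) as presented up to this point is not complete. Application to Equation (H.3) of the expression given in the note to 5.1.2 yields in fact two distinct non‑negligible second‑order terms to be added to Equation (H.5). These terms, which arise from the quadratic term in the expression of the note, are

$$l_S^2 u^2(\delta\alpha) u^2(\theta) + l_S^2 u^2(\alpha_S) u^2(\delta\theta)$$

but only the first of these terms contributes significantly to uc(l):

$$l_S u(\delta\alpha) u(\theta) = (0{,}05\ \text{m})(0{,}58 \times 10^{-6}\ {}^\circ\text{C}^{-1})(0{,}41\ {}^\circ\text{C}) = 11{,}7\ \text{nm}$$

$$l_S u(\alpha_S) u(\delta\theta) = (0{,}05\ \text{m})(1{,}2 \times 10^{-6}\ {}^\circ\text{C}^{-1})(0{,}029\ {}^\circ\text{C}) = 1{,}7\ \text{nm}$$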

The second‑order terms increase uc(l) from 32 nm to 34 nm.
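The size of the two second‑order terms, and their effect on uc(l), can be checked numerically. The sketch below is illustrative only; it assumes lengths in nanometres, uses unrounded input uncertainties, and takes the first‑order variance of 1 002 nm² from H.1.4. Variable names are invented for this example.

```python
import math

l_S = 50e6                                   # nominal length, nm
u_alphaS = 2e-6 / math.sqrt(3)               # 1/degC
u_theta = math.sqrt(0.2**2 + 0.5**2 / 2)     # degC
u_dalpha = 1e-6 / math.sqrt(3)               # 1/degC
u_dtheta = 0.05 / math.sqrt(3)               # degC

first_order_variance = 1002.0                # nm^2, from H.1.4

term_1 = l_S * u_dalpha * u_theta            # l_S u(delta_alpha) u(theta), nm
term_2 = l_S * u_alphaS * u_dtheta           # l_S u(alpha_S) u(delta_theta), nm

u_c = math.sqrt(first_order_variance + term_1**2 + term_2**2)
print(f"terms: {term_1:.1f} nm and {term_2:.1f} nm, u_c(l) = {u_c:.0f} nm")
```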

# H.2   Simultaneous resistance and reactance measurement

This example demonstrates the treatment of multiple measurands or output quantities determined simultaneously in the same measurement and the correlation of their estimates. It considers only the random variations of the observations; in actual practice, the uncertainties of corrections for systematic effects would also contribute to the uncertainty of the measurement results. The data are analysed in two different ways with each yielding essentially the same numerical values.

## H.2.1   The measurement problem

The resistance R and the reactance X of a circuit element are determined by measuring the amplitude V of a sinusoidally‑alternating potential difference across its terminals, the amplitude I of the alternating current passing through it, and the phase‑shift angle φ of the alternating potential difference relative to the alternating current. Thus the three input quantities are V, I, and φ and the three output quantities (the measurands) are the three impedance components R, X, and Z. Since Z² = R² + X², there are only two independent output quantities.

## H.2.2   Mathematical model and data

The measurands are related to the input quantities by Ohm's law:

$$R = \frac{V}{I}\cos\varphi; \qquad X = \frac{V}{I}\sin\varphi; \qquad Z = \frac{V}{I} \tag{H.7}$$

Consider that five independent sets of simultaneous observations of the three input quantities V, I, and φ are obtained under similar conditions (see B.2.15), resulting in the data given in Table H.2. The arithmetic means of the observations and the experimental standard deviations of those means calculated from Equations (3) and (5) in 4.2 are also given. The means are taken as the best estimates of the expected values of the input quantities, and the experimental standard deviations are the standard uncertainties of those means.

Because the means V̄, Ī, and φ̄ are obtained from simultaneous observations, they are correlated and the correlations must be taken into account in the evaluation of the standard uncertainties of the measurands R, X, and Z. The required correlation coefficients are readily obtained from Equation (14) in 5.2.2 using values of s(V̄, Ī), s(V̄, φ̄), and s(Ī, φ̄) calculated from Equation (17) in 5.2.3. The results are included in Table H.2, where it should be recalled that r(xi, xj) = r(xj, xi) and r(xi, xi) = 1.

Table H.2: Values of the input quantities V, I, and φ obtained from five sets of simultaneous observations

| Set number k | V (V) | I (mA) | φ (rad) |
|---|---|---|---|
| 1 | 5,007 | 19,663 | 1,045 6 |
| 2 | 4,994 | 19,639 | 1,043 8 |
| 3 | 5,005 | 19,640 | 1,046 8 |
| 4 | 4,990 | 19,685 | 1,042 8 |
| 5 | 4,999 | 19,678 | 1,043 3 |
| Arithmetic mean | V̄ = 4,9990 | Ī = 19,6610 | φ̄ = 1,044 46 |
| Experimental standard deviation of mean | s(V̄) = 0,0032 | s(Ī) = 0,0095 | s(φ̄) = 0,000 75 |

Correlation coefficients:

| Pair | Value |
|---|---|
| r(V̄, Ī) | −0,36 |
| r(V̄, φ̄) | 0,86 |
| r(Ī, φ̄) | −0,65 |

## H.2.3   Results: approach 1

Approach 1 is summarized in Table H.3.

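The values of the three measurands R, X, and Z are obtained from the relations given in Equation (H.7) using the mean values V̄, Ī, and φ̄ of Table H.2 for V, I, and φ. The standard uncertainties of R, X, and Z are obtained from Equation (16) in 5.2.2 since, as pointed out above, the input quantities V̄, Ī, and φ̄ are correlated. As an example, consider Z = V̄/Ī. Identifying V̄ with x1, Ī with x2, and f with Z = V̄/Ī, Equation (16) in 5.2.2 yields for the combined standard uncertainty of Z

$$u_c^2(Z) = \left[\frac{1}{\bar{I}}\right]^2 u^2(\bar{V}) + \left[\frac{\bar{V}}{\bar{I}^2}\right]^2 u^2(\bar{I}) + 2\left[\frac{1}{\bar{I}}\right]\left[-\frac{\bar{V}}{\bar{I}^2}\right] u(\bar{V})\,u(\bar{I})\,r(\bar{V}, \bar{I}) \tag{H.8a}$$

$$= Z^2\left[\frac{u(\bar{V})}{\bar{V}}\right]^2 + Z^2\left[\frac{u(\bar{I})}{\bar{I}}\right]^2 - 2Z^2\left[\frac{u(\bar{V})}{\bar{V}}\right]\left[\frac{u(\bar{I})}{\bar{I}}\right] r(\bar{V}, \bar{I}) \tag{H.8b}$$

or

$$u_{c,r}^2(Z) = u_r^2(\bar{V}) + u_r^2(\bar{I}) - 2u_r(\bar{V})\,u_r(\bar{I})\,r(\bar{V}, \bar{I}) \tag{H.8c}$$

where u(V̄) = s(V̄), u(Ī) = s(Ī), and the subscript “r” in the last expression indicates that u is a relative uncertainty. Substitution of the appropriate values from Table H.2 into Equation (H.8a) then gives uc(Z) = 0,236 Ω.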

Because the three measurands or output quantities depend on the same input quantities, they too are correlated. The elements of the covariance matrix that describes this correlation may be written in general as

$$u(y_l, y_m) = \sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\partial y_l}{\partial x_i}\frac{\partial y_m}{\partial x_j}\,u(x_i)\,u(x_j)\,r(x_i, x_j) \tag{H.9}$$

where yl = fl(x1, x2, ..., xN) and ym = fm(x1, x2, ..., xN). Equation (H.9) is a generalization of Equation (F.2) in F.1.2.3 when the ql in that expression are correlated. The estimated correlation coefficients of the output quantities are given by r(yl, ym) = u(yl, ym)/[u(yl)u(ym)], as indicated in Equation (14) in 5.2.2. It should be recognized that the diagonal elements of the covariance matrix, u(yl, yl) ≡ u²(yl), are the estimated variances of the output quantities yl (see 5.2.2, Note 2) and that for m = l, Equation (H.9) is identical to Equation (16) in 5.2.2.

To apply Equation (H.9) to this example, the following identifications are made:

$$\begin{aligned}
y_1 &= R & x_1 &= \bar{V} & u(x_i) &= s(x_i) \\
y_2 &= X & x_2 &= \bar{I} & N &= 3 \\
y_3 &= Z & x_3 &= \bar{\varphi}
\end{aligned}$$

The results of the calculations of R, X, and Z and of their estimated variances and correlation coefficients are given in Table H.3.

Table H.3: Calculated values of the output quantities R, X, and Z: approach 1

| Measurand index l | Relationship between estimate of measurand yl and input estimates xi | Value of estimate yl, which is the result of measurement | Combined standard uncertainty uc(yl) of result of measurement |
|---|---|---|---|
| 1 | y1 = R = (V̄/Ī) cos φ̄ | y1 = R = 127,732 Ω | uc(R) = 0,071 Ω; uc(R)/R = 0,06 × 10⁻² |
| 2 | y2 = X = (V̄/Ī) sin φ̄ | y2 = X = 219,847 Ω | uc(X) = 0,295 Ω; uc(X)/X = 0,13 × 10⁻² |
| 3 | y3 = Z = V̄/Ī | y3 = Z = 254,260 Ω | uc(Z) = 0,236 Ω; uc(Z)/Z = 0,09 × 10⁻² |

Correlation coefficients r(yl, ym):

| Pair | Value |
|---|---|
| r(y1, y2) = r(R, X) | −0,588 |
| r(y1, y3) = r(R, Z) | −0,485 |
| r(y2, y3) = r(X, Z) | 0,993 |
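Approach 1 can be sketched in code as follows: form the means and the covariance matrix of the means from the raw observations of Table H.2, then propagate through the partial derivatives of R, X, and Z as in Equation (H.9). This is an illustrative sketch, not a reference implementation; the names `cov_of_means`, `jac`, and `u_cov` are assumptions, the current is converted from milliamperes to amperes, and φ is taken to be in radians.

```python
import math
from statistics import mean

V = [5.007, 4.994, 5.005, 4.990, 4.999]                              # volts
I = [1e-3 * x for x in (19.663, 19.639, 19.640, 19.685, 19.678)]     # amperes
PHI = [1.0456, 1.0438, 1.0468, 1.0428, 1.0433]                       # radians

def cov_of_means(a, b):
    """Estimated covariance of two arithmetic means from paired observations."""
    n, ma, mb = len(a), mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n * (n - 1))

inputs = (V, I, PHI)
v, i, phi = (mean(q) for q in inputs)
cov_x = [[cov_of_means(a, b) for b in inputs] for a in inputs]

z = v / i
estimates = {"R": z * math.cos(phi), "X": z * math.sin(phi), "Z": z}

# Partial derivatives of each output with respect to (V, I, phi)
jac = {
    "R": (math.cos(phi) / i, -z * math.cos(phi) / i, -z * math.sin(phi)),
    "X": (math.sin(phi) / i, -z * math.sin(phi) / i, z * math.cos(phi)),
    "Z": (1 / i, -z / i, 0.0),
}

def u_cov(a, b):
    """Estimated covariance of outputs a and b, following Equation (H.9)."""
    return sum(jac[a][r] * jac[b][c] * cov_x[r][c]
               for r in range(3) for c in range(3))

u = {k: math.sqrt(u_cov(k, k)) for k in jac}
r = {(a, b): u_cov(a, b) / (u[a] * u[b])
     for a, b in (("R", "X"), ("R", "Z"), ("X", "Z"))}

for k in ("R", "X", "Z"):
    print(f"{k} = {estimates[k]:.3f} ohm, u_c = {u[k]:.3f} ohm")
```

Here the product u(xi)u(xj)r(xi, xj) of Equation (H.9) is evaluated directly as the estimated covariance of the two means, which is equivalent.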

## H.2.4   Results: approach 2

Approach 2 is summarized in Table H.4.

Since the data have been obtained as five sets of observations of the three input quantities V, I, and φ, it is possible to compute a value for R, X, and Z from each set of input data, and then take the arithmetic mean of the five individual values to obtain the best estimates of R, X, and Z. The experimental standard deviation of each mean (which is its combined standard uncertainty) is then calculated from the five individual values in the usual way [Equation (5) in 4.2.3]; and the estimated covariances of the three means are calculated by applying Equation (17) in 5.2.3 directly to the five individual values from which each mean is obtained. There are no differences in the output values, standard uncertainties, and estimated covariances provided by the two approaches except for second‑order effects associated with replacing terms such as V̄/Ī and cos φ̄ by the means of V/I and cos φ, respectively.

To demonstrate this approach, Table H.4 gives the values of R, X and Z calculated from each of the five sets of observations. The arithmetic means, standard uncertainties, and estimated correlation coefficients are then directly computed from these individual values. The numerical results obtained in this way are negligibly different from the results given in Table H.3.

Table H.4: Calculated values of the output quantities R, X, and Z: approach 2

| Set number k | R = (V/I) cos φ (Ω) | X = (V/I) sin φ (Ω) | Z = V/I (Ω) |
|---|---|---|---|
| 1 | 127,67 | 220,32 | 254,64 |
| 2 | 127,89 | 219,79 | 254,29 |
| 3 | 127,51 | 220,64 | 254,84 |
| 4 | 127,71 | 218,97 | 253,49 |
| 5 | 127,88 | 219,51 | 254,04 |
| Arithmetic mean | y1 = R̄ = 127,732 | y2 = X̄ = 219,847 | y3 = Z̄ = 254,260 |
| Experimental standard deviation of mean | s(R̄) = 0,071 | s(X̄) = 0,295 | s(Z̄) = 0,236 |

Correlation coefficients r(yl, ym):

| Pair | Value |
|---|---|
| r(y1, y2) = r(R̄, X̄) | −0,588 |
| r(y1, y3) = r(R̄, Z̄) | −0,485 |
| r(y2, y3) = r(X̄, Z̄) | 0,993 |
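Approach 2 is straightforward to sketch: evaluate R, X, and Z for each set of observations, then treat the five values of each output as a sample. The code below is an illustration under the same assumptions as before (current converted to amperes, φ in radians); the helper names `s_mean` and `r_means` are invented here.

```python
import math
from statistics import mean, stdev

V = [5.007, 4.994, 5.005, 4.990, 4.999]                              # volts
I = [1e-3 * x for x in (19.663, 19.639, 19.640, 19.685, 19.678)]     # amperes
PHI = [1.0456, 1.0438, 1.0468, 1.0428, 1.0433]                       # radians

# Individual values of the measurands, one per set of observations
Z = [v / i for v, i in zip(V, I)]
R = [z * math.cos(p) for z, p in zip(Z, PHI)]
X = [z * math.sin(p) for z, p in zip(Z, PHI)]

def s_mean(q):
    """Experimental standard deviation of the mean."""
    return stdev(q) / math.sqrt(len(q))

def r_means(a, b):
    """Estimated correlation coefficient of two means from paired values."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / math.sqrt(saa * sbb)

for name, q in (("R", R), ("X", X), ("Z", Z)):
    print(f"{name}: mean = {mean(q):.3f} ohm, s(mean) = {s_mean(q):.3f} ohm")
print(f"r(R, X) = {r_means(R, X):.3f}")
```

The factor 1/[n(n − 1)] that converts sample covariances into covariances of means cancels in the correlation coefficient, so `r_means` works directly on the sums of products.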

In the terminology of the Note to 4.1.4, approach 2 is an example of obtaining the estimate y from $\bar{Y} = \left(\sum_{k=1}^{n} Y_k\right)/n$, while approach 1 is an example of obtaining y from y = f(X̄1, X̄2, ..., X̄N). As pointed out in that note, in general, the two approaches will give identical results if f is a linear function of its input quantities (provided that the experimentally observed correlation coefficients are taken into account when implementing approach 1). If f is not a linear function, then the results of approach 1 will differ from those of approach 2 depending on the degree of nonlinearity and the estimated variances and covariances of the Xi. This may be seen from the expression

$$\bar{Y} = f(\bar{X}_1, \bar{X}_2, \ldots, \bar{X}_N) + \frac{1}{2}\sum_{i=1}^{N}\sum_{j=1}^{N} \frac{\partial^2 f}{\partial \bar{X}_i\,\partial \bar{X}_j}\,u(\bar{X}_i, \bar{X}_j) + \ldots \tag{H.10}$$

where the second term on the right‑hand side is the second‑order term in the Taylor series expansion of f in terms of the X̄i (see also 5.1.2, note). In the present case, approach 2 is preferred because it avoids the approximation y = f(X̄1, X̄2, ..., X̄N) and better reflects the measurement procedure used: the data were in fact collected in sets.

On the other hand, approach 2 would be inappropriate if the data of Table H.2 represented n1 = 5 observations of the potential difference V, followed by n2 = 5 observations of the current I, and then followed by n3 = 5 observations of the phase φ, and would be impossible if n1 ≠ n2 ≠ n3. (It is in fact poor measurement procedure to carry out the measurements in this way since the potential difference across a fixed impedance and the current through it are directly related.)

If the data of Table H.2 are reinterpreted in this manner so that approach 2 is inappropriate, and if correlations among the quantities V, I, and φ are assumed to be absent, then the observed correlation coefficients have no significance and should be set equal to zero. If this is done in Table H.2, Equation (H.9) reduces to the equivalent of Equation (F.2) in F.1.2.3, namely,

$$u(y_l, y_m) = \sum_{i=1}^{N} \frac{\partial y_l}{\partial x_i}\frac{\partial y_m}{\partial x_i}\,u^2(x_i) \tag{H.11}$$

and its application to the data of Table H.2 leads to the changes in Table H.3 shown in Table H.5.

Table H.5: Changes in Table H.3 under the assumption that the correlation coefficients of Table H.2 are zero

| Combined standard uncertainty uc(yl) of result of measurement | Relative value |
|---|---|
| uc(R) = 0,195 Ω | uc(R)/R = 0,15 × 10⁻² |
| uc(X) = 0,201 Ω | uc(X)/X = 0,09 × 10⁻² |
| uc(Z) = 0,204 Ω | uc(Z)/Z = 0,08 × 10⁻² |

Correlation coefficients r(yl, ym):

| Pair | Value |
|---|---|
| r(y1, y2) = r(R, X) | 0,056 |
| r(y1, y3) = r(R, Z) | 0,527 |
| r(y2, y3) = r(X, Z) | 0,878 |
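The effect of setting the input correlation coefficients to zero can be sketched by keeping only the diagonal terms of the propagation, as in Equation (H.11). The example below uses the rounded means and standard uncertainties of Table H.2, so its results agree with Table H.5 only to within rounding; the names `jac` and `u_cov` are assumptions made for this illustration.

```python
import math

# Means and standard uncertainties of the means (current converted to amperes)
v, i, phi = 4.9990, 19.6610e-3, 1.04446
u_in = (0.0032, 0.0095e-3, 0.00075)

z = v / i
# Partial derivatives of each output with respect to (V, I, phi)
jac = {
    "R": (math.cos(phi) / i, -z * math.cos(phi) / i, -z * math.sin(phi)),
    "X": (math.sin(phi) / i, -z * math.sin(phi) / i, z * math.cos(phi)),
    "Z": (1 / i, -z / i, 0.0),
}

def u_cov(a, b):
    """Covariance of outputs a and b for uncorrelated inputs, Equation (H.11)."""
    return sum(jac[a][k] * jac[b][k] * u_in[k] ** 2 for k in range(3))

u = {k: math.sqrt(u_cov(k, k)) for k in jac}
r = {(a, b): u_cov(a, b) / (u[a] * u[b])
     for a, b in (("R", "X"), ("R", "Z"), ("X", "Z"))}

for k in ("R", "X", "Z"):
    print(f"u_c({k}) = {u[k]:.3f} ohm")
```

Even with uncorrelated inputs the outputs remain correlated, because all three depend on the same V and I.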

# H.3   Calibration of a thermometer

This example illustrates the use of the method of least squares to obtain a linear calibration curve and how the parameters of the fit, the intercept and slope, and their estimated variances and covariance, are used to obtain from the curve the value and standard uncertainty of a predicted correction.

## H.3.1   The measurement problem

A thermometer is calibrated by comparing n = 11 temperature readings tk of the thermometer, each having negligible uncertainty, with corresponding known reference temperatures tR,k in the temperature range 21 °C to 27 °C to obtain the corrections bk = tR,k − tk to the readings. The measured corrections bk and measured temperatures tk are the input quantities of the evaluation. A linear calibration curve

$$b(t) = y_1 + y_2(t - t_0) \tag{H.12}$$

is fitted to the measured corrections and temperatures by the method of least squares. The parameters y1 and y2, which are respectively the intercept and slope of the calibration curve, are the two measurands or output quantities to be determined. The temperature t0 is a conveniently chosen exact reference temperature; it is not an independent parameter to be determined by the least‑squares fit. Once y1 and y2 are found, along with their estimated variances and covariance, Equation (H.12) can be used to predict the value and standard uncertainty of the correction to be applied to the thermometer for any value t of the temperature.

## H.3.2   Least‑squares fitting

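Based on the method of least squares and under the assumptions made in H.3.1 above, the output quantities y1 and y2 and their estimated variances and covariance are obtained by minimizing the sum

$$S = \sum_{k=1}^{n}\left[b_k - y_1 - y_2(t_k - t_0)\right]^2$$

This leads to the following equations for y1, y2, their experimental variances s²(y1) and s²(y2), and their estimated correlation coefficient r(y1, y2) = s(y1, y2)/[s(y1)s(y2)], where s(y1, y2) is their estimated covariance:

$$y_1 = \frac{\left(\sum b_k\right)\left(\sum \theta_k^2\right) - \left(\sum b_k\theta_k\right)\left(\sum \theta_k\right)}{D} \tag{H.13a}$$

$$y_2 = \frac{n\sum b_k\theta_k - \left(\sum b_k\right)\left(\sum \theta_k\right)}{D} \tag{H.13b}$$

$$s^2(y_1) = \frac{s^2\sum \theta_k^2}{D} \tag{H.13c}$$

$$s^2(y_2) = n\frac{s^2}{D} \tag{H.13d}$$

$$r(y_1, y_2) = -\frac{\sum \theta_k}{\sqrt{n\sum \theta_k^2}} \tag{H.13e}$$

$$s^2 = \frac{\sum\left[b_k - b(t_k)\right]^2}{n - 2} \tag{H.13f}$$

$$D = n\sum \theta_k^2 - \left(\sum \theta_k\right)^2 = n\sum\left(\theta_k - \bar{\theta}\right)^2 = n\sum\left(t_k - \bar{t}\right)^2 \tag{H.13g}$$

where all sums are from k = 1 to n, θk = tk − t0, θ̄ = (Σθk)/n, and t̄ = (Σtk)/n; [bk − b(tk)] is the difference between the measured or observed correction bk at the temperature tk and the correction b(tk) predicted by the fitted curve b(t) = y1 + y2(t − t0) at tk. The variance s² is a measure of the overall uncertainty of the fit, where the factor n − 2 reflects the fact that because two parameters, y1 and y2, are determined by the n observations, the degrees of freedom of s² is ν = n − 2 (see G.3.3).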

## H.3.3   Calculation of results

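The data to be fitted are given in the second and third columns of Table H.6. Taking t0 = 20 °C as the reference temperature, application of Equations (H.13a) to (H.13g) yields

$$\begin{aligned}
y_1 &= -0{,}171\,2\ {}^\circ\text{C} & s(y_1) &= 0{,}002\,9\ {}^\circ\text{C} \\
y_2 &= 0{,}002\,18 & s(y_2) &= 0{,}000\,67 \\
r(y_1, y_2) &= -0{,}930 & s &= 0{,}003\,5\ {}^\circ\text{C}
\end{aligned}$$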

Table H.6: Data used to obtain a linear calibration curve for a thermometer by the method of least squares

| Reading number k | Thermometer reading tk (°C) | Observed correction bk = tR,k − tk (°C) | Predicted correction b(tk) (°C) | Difference between observed and predicted correction bk − b(tk) (°C) |
|---|---|---|---|---|
| 1 | 21,521 | −0,171 | −0,167 9 | −0,003 1 |
| 2 | 22,012 | −0,169 | −0,166 8 | −0,002 2 |
| 3 | 22,512 | −0,166 | −0,165 7 | −0,000 3 |
| 4 | 23,003 | −0,159 | −0,164 6 | +0,005 6 |
| 5 | 23,507 | −0,164 | −0,163 5 | −0,000 5 |
| 6 | 23,999 | −0,165 | −0,162 5 | −0,002 5 |
| 7 | 24,513 | −0,156 | −0,161 4 | +0,005 4 |
| 8 | 25,002 | −0,157 | −0,160 3 | +0,003 3 |
| 9 | 25,503 | −0,159 | −0,159 2 | +0,000 2 |
| 10 | 26,010 | −0,161 | −0,158 1 | −0,002 9 |
| 11 | 26,511 | −0,160 | −0,157 0 | −0,003 0 |

The fact that the slope y2 is more than three times larger than its standard uncertainty provides some indication that a calibration curve and not a fixed average correction is required.

The calibration curve may then be written as

$$b(t) = -0{,}171\,2(29)\ {}^\circ\text{C} + 0{,}002\,18(67)\,(t - 20\ {}^\circ\text{C}) \tag{H.14}$$

where the numbers in parentheses are the numerical values of the standard uncertainties referred to the corresponding last digits of the quoted results for the intercept and slope (see 7.2.2). This equation gives the predicted value of the correction b(t) at any temperature t, and in particular the value b(tk) at t = tk. These values are given in the fourth column of the table while the last column gives the differences between the measured and predicted values, bk − b(tk). An analysis of these differences can be used to check the validity of the linear model; formal tests exist (see Reference [8]), but are not considered in this example.
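The fit described in H.3.2 and H.3.3 can be reproduced with a short sketch that evaluates Equations (H.13a) to (H.13g) on the data of Table H.6. This is an illustration only; the variable names are assumptions, and a general-purpose least-squares routine would serve equally well.

```python
import math

T = [21.521, 22.012, 22.512, 23.003, 23.507, 23.999,
     24.513, 25.002, 25.503, 26.010, 26.511]          # thermometer readings, degC
B = [-0.171, -0.169, -0.166, -0.159, -0.164, -0.165,
     -0.156, -0.157, -0.159, -0.161, -0.160]          # observed corrections, degC
T0 = 20.0                                             # reference temperature, degC

n = len(T)
theta = [t - T0 for t in T]
sum_b, sum_th = sum(B), sum(theta)
sum_bth = sum(b * th for b, th in zip(B, theta))
sum_th2 = sum(th * th for th in theta)

D = n * sum_th2 - sum_th**2                           # (H.13g)
y1 = (sum_b * sum_th2 - sum_bth * sum_th) / D         # intercept, (H.13a)
y2 = (n * sum_bth - sum_b * sum_th) / D               # slope, (H.13b)

residuals = [b - (y1 + y2 * th) for b, th in zip(B, theta)]
s2 = sum(e * e for e in residuals) / (n - 2)          # (H.13f)
s_y1 = math.sqrt(s2 * sum_th2 / D)                    # (H.13c)
s_y2 = math.sqrt(n * s2 / D)                          # (H.13d)
r12 = -sum_th / math.sqrt(n * sum_th2)                # (H.13e)

print(f"y1 = {y1:.4f} degC, s(y1) = {s_y1:.4f} degC")
print(f"y2 = {y2:.5f}, s(y2) = {s_y2:.5f}")
print(f"r(y1, y2) = {r12:.3f}, s = {math.sqrt(s2):.4f} degC")
```

The list `residuals` corresponds to the last column of Table H.6 and can be inspected for structure when checking the linear model.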

## H.3.4   Uncertainty of a predicted value

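The expression for the combined standard uncertainty of the predicted value of a correction can be readily obtained by applying the law of propagation of uncertainty, Equation (16) in 5.2.2, to Equation (H.12). Noting that b(t) = f(y1, y2) and writing u(y1) = s(y1) and u(y2) = s(y2), one obtains

$$u_c^2[b(t)] = u^2(y_1) + (t - t_0)^2 u^2(y_2) + 2(t - t_0)\,u(y_1)\,u(y_2)\,r(y_1, y_2) \tag{H.15}$$

The estimated variance uc²[b(t)] is a minimum at tmin = t0 − u(y1)r(y1, y2)/u(y2), which in the present case is tmin = 24,008 5 °C.

As an example of the use of Equation (H.15), consider that one requires the thermometer correction and its uncertainty at t = 30 °C, which is outside the temperature range in which the thermometer was actually calibrated. Substituting t = 30 °C in Equation (H.14) gives

$$b(30\ {}^\circ\text{C}) = -0{,}149\,4\ {}^\circ\text{C}$$

while Equation (H.15) becomes

$$u_c^2[b(30\ {}^\circ\text{C})] = (0{,}002\,9\ {}^\circ\text{C})^2 + (10\ {}^\circ\text{C})^2(0{,}000\,67)^2 + 2(10\ {}^\circ\text{C})(0{,}002\,9\ {}^\circ\text{C})(0{,}000\,67)(-0{,}930) = 17{,}1 \times 10^{-6}\ {}^\circ\text{C}^2$$

or

$$u_c[b(30\ {}^\circ\text{C})] = 0{,}004\,1\ {}^\circ\text{C}$$

Thus the correction at 30 °C is −0,149 4 °C, with a combined standard uncertainty of uc = 0,004 1 °C, and with uc having ν = n − 2 = 9 degrees of freedom.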
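The prediction at 30 °C can be sketched as a small function of the fitted parameters. The function name `predict` and its argument names are assumptions introduced for illustration; the numerical inputs are the rounded values quoted in H.3.3.

```python
import math

def predict(t, y1, y2, u_y1, u_y2, r12, t0=20.0):
    """Predicted correction b(t) and its combined standard uncertainty,
    following Equations (H.12) and (H.15)."""
    dt = t - t0
    b = y1 + y2 * dt
    variance = u_y1**2 + dt**2 * u_y2**2 + 2 * dt * u_y1 * u_y2 * r12
    return b, math.sqrt(variance)

b30, u30 = predict(30.0, y1=-0.1712, y2=0.00218,
                   u_y1=0.0029, u_y2=0.00067, r12=-0.930)
print(f"b(30 degC) = {b30:.4f} degC, u_c = {u30:.4f} degC")
```

The negative correlation between intercept and slope reduces the variance substantially here; omitting the third term would overstate the uncertainty.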

## H.3.5   Elimination of the correlation between the slope and intercept

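Equation (H.13e) for the correlation coefficient r(y1, y2) implies that if t0 is so chosen that $\sum_{k=1}^{n}\theta_k = \sum_{k=1}^{n}(t_k - t_0) = 0$, then r(y1, y2) = 0 and y1 and y2 will be uncorrelated, thereby simplifying the computation of the standard uncertainty of a predicted correction. Since $\sum_{k=1}^{n}\theta_k = 0$ when $t_0 = \bar{t} = \left(\sum_{k=1}^{n} t_k\right)/n$, and t̄ = 24,008 5 °C in the present case, repeating the least‑squares fit with t0 = t̄ = 24,008 5 °C would lead to values of y1 and y2 that are uncorrelated. (The temperature t̄ is also the temperature at which u²[b(t)] is a minimum; see H.3.4.) However, repeating the fit is unnecessary because it can be shown that

$$b(t) = y_1' + y_2(t - \bar{t}) \tag{H.16a}$$

$$u_c^2[b(t)] = u^2(y_1') + (t - \bar{t})^2 u^2(y_2) \tag{H.16b}$$

$$r(y_1', y_2) = 0 \tag{H.16c}$$

where

$$\begin{aligned}
y_1' &= y_1 + y_2(\bar{t} - t_0) \\
\bar{t} &= t_0 - s(y_1)\,r(y_1, y_2)/s(y_2) \\
s^2(y_1') &= s^2(y_1)\left[1 - r^2(y_1, y_2)\right]
\end{aligned}$$

and in writing Equation (H.16b), the substitutions u(y1′) = s(y1′) and u(y2) = s(y2) have been made [see Equation (H.15)].

Application of these relations to the results given in H.3.3 yields

$$b(t) = -0{,}162\,5(11)\ {}^\circ\text{C} + 0{,}002\,18(67)\,(t - 24{,}008\,5\ {}^\circ\text{C}) \tag{H.17a}$$

$$u_c^2[b(t)] = (0{,}001\,1\ {}^\circ\text{C})^2 + (t - 24{,}008\,5\ {}^\circ\text{C})^2(0{,}000\,67)^2 \tag{H.17b}$$

That these expressions give the same results as Equations (H.14) and (H.15) can be checked by repeating the calculation of b(30 °C) and uc[b(30 °C)]. The substitution of t = 30 °C into Equations (H.17a) and (H.17b) yields

$$b(30\ {}^\circ\text{C}) = -0{,}149\,4\ {}^\circ\text{C}$$

$$u_c[b(30\ {}^\circ\text{C})] = 0{,}004\,1\ {}^\circ\text{C}$$

which are identical to the results obtained in H.3.4. The estimated covariance between two predicted corrections b(t1) and b(t2) may be obtained from Equation (H.9) in H.2.3.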

## H.3.6   Other considerations

The least‑squares method can be used to fit higher‑order curves to data points, and is also applicable to cases where the individual data points have uncertainties. Standard texts on the subject should be consulted for details [8]. However, the following examples illustrate two cases where the measured corrections bk are not assumed to be exactly known.

1. Let each tk have negligible uncertainty, let each of the n values tR,k be obtained from a series of m repeated readings, and let the pooled estimate of variance for such readings based on a large amount of data obtained over several months be sp². Then the estimated variance of each tR,k is sp²/m = u0² and each observed correction bk = tR,k − tk has the same standard uncertainty u0. Under these circumstances (and under the assumption that there is no reason to believe that the linear model is incorrect), u0² replaces s² in Equations (H.13c) and (H.13d).

   NOTE   A pooled estimate of variance sp² based on N series of independent observations of the same random variable is obtained from

   $$s_p^2 = \frac{\sum_{i=1}^{N} \nu_i s_i^2}{\sum_{i=1}^{N} \nu_i}$$

   where si² is the experimental variance of the ith series of ni independent repeated observations [Equation (4) in 4.2.2] and has degrees of freedom νi = ni − 1. The degrees of freedom of sp² is $\nu = \sum_{i=1}^{N}\nu_i$. The experimental variance sp²/m (and the experimental standard deviation sp/√m) of the arithmetic mean of m independent observations characterized by the pooled estimate of variance sp² also has ν degrees of freedom.

2. Suppose that each tk has negligible uncertainty, that a correction εk is applied to each of the n values tR,k, and that each correction has the same standard uncertainty ua. Then the standard uncertainty of each bk = tR,k − tk is also ua, and s²(y1) is replaced by s²(y1) + ua² and s²(y1′) is replaced by s²(y1′) + ua².
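The pooled estimate of variance described in the note to item 1 can be sketched as a small function. The function name `pooled_variance` and the example series below are hypothetical and chosen only to show the weighting by degrees of freedom; they are not data from this Guide.

```python
import math
from statistics import variance

def pooled_variance(series):
    """Pooled variance s_p^2 and its degrees of freedom for a list of series,
    each weighted by nu_i = n_i - 1."""
    dofs = [len(obs) - 1 for obs in series]
    weighted = sum(nu * variance(obs) for nu, obs in zip(dofs, series))
    nu_total = sum(dofs)
    return weighted / nu_total, nu_total

# Hypothetical repeated readings (three series of the same quantity)
series = [
    [10.1, 10.3, 9.9, 10.2],
    [10.0, 10.4, 10.1],
    [9.8, 10.2, 10.0, 10.1, 9.9],
]
sp2, nu = pooled_variance(series)

m = 5                                   # assumed number of readings averaged
u0 = math.sqrt(sp2 / m)                 # standard uncertainty of a mean of m readings
print(f"s_p^2 = {sp2:.4f}, nu = {nu}, u0 = {u0:.4f}")
```

The uncertainty `u0` inherits the pooled degrees of freedom `nu`, not m − 1, which is the point made at the end of the note.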

# H.4   Measurement of activity

This example is similar to example H.2, the simultaneous measurement of resistance and reactance, in that the data can be analysed in two different ways but each yields essentially the same numerical result. The first approach illustrates once again the need to take the observed correlations between input quantities into account.

## H.4.1   The measurement problem

The unknown radon (²²²Rn) activity concentration in a water sample is determined by liquid‑scintillation counting against a radon‑in‑water standard sample having a known activity concentration. The unknown activity concentration is obtained by measuring three counting sources consisting of approximately 5 g of water and 12 g of organic emulsion scintillator in vials of volume 22 ml:

- Source (a): a standard consisting of a mass mS of the standard solution with a known activity concentration;
- Source (b): a matched blank water sample containing no radioactive material, used to obtain the background counting rate;
- Source (c): the sample consisting of an aliquot of mass mx with unknown activity concentration.

Six cycles of measurement of the three counting sources are made in the order standard — blank — sample; and each dead‑time‑corrected counting interval T0 for each source during all six cycles is 60 minutes. Although the background counting rate cannot be assumed to be constant over the entire counting interval (65 hours), it is assumed that the number of counts obtained for each blank may be used as representative of the background counting rate during the measurements of the standard and sample in the same cycle. The data are given in Table H.7, where

• $t_S$, $t_B$, $t_x$ are the times from the reference time $t = 0$ to the midpoint of the dead‑time‑corrected counting intervals $T_0 = 60$ min for the standard, blank, and sample vials, respectively; although $t_B$ is given for completeness, it is not needed in the analysis;
• $C_S$, $C_B$, $C_x$ are the number of counts recorded in the dead‑time‑corrected counting intervals $T_0 = 60$ min for the standard, blank, and sample vials, respectively.

The observed counts may be expressed as

$$C_S = C_B + \varepsilon A_S T_0 m_S e^{-\lambda t_S} \tag{H.18a}$$

$$C_x = C_B + \varepsilon A_x T_0 m_x e^{-\lambda t_x} \tag{H.18b}$$

where

• $\varepsilon$ is the liquid scintillation detection efficiency for $^{222}$Rn for a given source composition, assumed to be independent of the activity level;
• $A_S$ is the activity concentration of the standard at the reference time $t = 0$;
• $A_x$ is the measurand and is defined as the unknown activity concentration of the sample at the reference time $t = 0$;
• $m_S$ is the mass of the standard solution;
• $m_x$ is the mass of the sample aliquot;
• $\lambda$ is the decay constant for $^{222}$Rn: $\lambda = (\ln 2)/T_{1/2} = 1{,}258\,94 \times 10^{-4}\ \text{min}^{-1}$ ($T_{1/2} = 5\,505{,}8$ min).

Table H.7 — Counting data for determining the activity concentration of an unknown sample

| Cycle $k$ | Standard $t_S$ (min) | Standard $C_S$ (counts) | Blank $t_B$ (min) | Blank $C_B$ (counts) | Sample $t_x$ (min) | Sample $C_x$ (counts) |
|---|---|---|---|---|---|---|
| 1 | 243,74 | 15 380 | 305,56 | 4 054 | 367,37 | 41 432 |
| 2 | 984,53 | 14 978 | 1 046,10 | 3 922 | 1 107,66 | 38 706 |
| 3 | 1 723,87 | 14 394 | 1 785,43 | 4 200 | 1 846,99 | 35 860 |
| 4 | 2 463,17 | 13 254 | 2 524,73 | 3 830 | 2 586,28 | 32 238 |
| 5 | 3 217,56 | 12 516 | 3 279,12 | 3 956 | 3 340,68 | 29 640 |
| 6 | 3 956,83 | 11 058 | 4 018,38 | 3 980 | 4 079,94 | 26 356 |

Equations (H.18a) and (H.18b) indicate that neither the six individual values of CS nor of Cx given in Table H.7 can be averaged directly because of the exponential decay of the activity of the standard and sample, and slight variations in background counts from one cycle to another. Instead, one must deal with the decay‑corrected and background‑corrected counts (or counting rates defined as the number of counts divided by T0 = 60 min). This suggests combining Equations (H.18a) and (H.18b) to obtain the following expression for the unknown concentration in terms of the known quantities:

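$$A_x = f(A_S, m_S, m_x, C_S, C_x, C_B, t_S, t_x, \lambda) = A_S \frac{m_S}{m_x} \frac{(C_x - C_B)\,e^{\lambda t_x}}{(C_S - C_B)\,e^{\lambda t_S}} = A_S \frac{m_S}{m_x} \frac{C_x - C_B}{C_S - C_B}\, e^{\lambda (t_x - t_S)} \tag{H.19}$$

where $(C_x - C_B)\,e^{\lambda t_x}$ and $(C_S - C_B)\,e^{\lambda t_S}$ are, respectively, the background‑corrected counts of the sample and the standard at the reference time $t = 0$ and for the time interval $T_0 = 60$ min. Alternatively, one may simply write

$$A_x = f(A_S, m_S, m_x, R_S, R_x) = A_S \frac{m_S}{m_x} \frac{R_x}{R_S} \tag{H.20}$$

where the background‑corrected and decay‑corrected counting rates $R_x$ and $R_S$ are given by

$$R_x = \frac{C_x - C_B}{T_0}\, e^{\lambda t_x} \tag{H.21a}$$

$$R_S = \frac{C_S - C_B}{T_0}\, e^{\lambda t_S} \tag{H.21b}$$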
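As a check on Equations (H.21a) and (H.21b), the following Python sketch applies them to the data of Table H.7. The variable and function names are illustrative choices, not notation from this Guide, and the printed values are rounded for display only.

```python
import math

LAMBDA = 1.25894e-4  # decay constant of 222Rn in min^-1 (value given in H.4.1)
T0 = 60.0            # dead-time-corrected counting interval in min

# Table H.7: (t_S, C_S, t_B, C_B, t_x, C_x) for cycles k = 1 to 6
TABLE_H7 = [
    (243.74, 15380, 305.56, 4054, 367.37, 41432),
    (984.53, 14978, 1046.10, 3922, 1107.66, 38706),
    (1723.87, 14394, 1785.43, 4200, 1846.99, 35860),
    (2463.17, 13254, 2524.73, 3830, 2586.28, 32238),
    (3217.56, 12516, 3279.12, 3956, 3340.68, 29640),
    (3956.83, 11058, 4018.38, 3980, 4079.94, 26356),
]


def corrected_rate(counts, background, t_mid):
    """Background-corrected counting rate, decay-corrected to t = 0."""
    return (counts - background) / T0 * math.exp(LAMBDA * t_mid)


rates = []  # one (R_x, R_S, R) tuple per cycle
for t_s, c_s, _t_b, c_b, t_x, c_x in TABLE_H7:
    r_x = corrected_rate(c_x, c_b, t_x)   # Equation (H.21a)
    r_s = corrected_rate(c_s, c_b, t_s)   # Equation (H.21b)
    rates.append((r_x, r_s, r_x / r_s))

for k, (r_x, r_s, ratio) in enumerate(rates, start=1):
    print(f"k={k}  R_x={r_x:.2f}  R_S={r_s:.2f}  R={ratio:.4f}")
```

The blank count of each cycle is subtracted from both the standard and the sample counts of the same cycle, which is the assumption stated in H.4.1; $t_B$ is read from the table but not used.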

## H.4.2   Analysis of data

Table H.8 summarizes the values of the background‑corrected and decay‑corrected counting rates $R_S$ and $R_x$ calculated from Equations (H.21a) and (H.21b) using the data of Table H.7 and $\lambda = 1{,}258\,94 \times 10^{-4}\ \text{min}^{-1}$ as given earlier. It should be noted that the ratio $R = R_x/R_S$ is most simply calculated from the expression

$$R = \frac{C_x - C_B}{C_S - C_B}\, e^{\lambda (t_x - t_S)}$$

The arithmetic means $\bar{R}_S$, $\bar{R}_x$, and $\bar{R}$, and their experimental standard deviations $s(\bar{R}_S)$, $s(\bar{R}_x)$, and $s(\bar{R})$, are calculated in the usual way [Equations (3) and (5) in 4.2]. The correlation coefficient $r(\bar{R}_x, \bar{R}_S)$ is calculated from Equation (17) in 5.2.3 and Equation (14) in 5.2.2.

Because of the comparatively small variability of the values of $R_x$ and of $R_S$, the ratio of means $\bar{R}_x/\bar{R}_S$ and the standard uncertainty $u(\bar{R}_x/\bar{R}_S)$ of this ratio are, respectively, very nearly the same as the mean ratio $\bar{R}$ and its experimental standard deviation $s(\bar{R})$ as given in the last column of Table H.8 [see H.2.4 and Equation (H.10) therein]. However, in calculating the standard uncertainty $u(\bar{R}_x/\bar{R}_S)$, the correlation between $R_x$ and $R_S$ as represented by the correlation coefficient $r(\bar{R}_x, \bar{R}_S)$ must be taken into account using Equation (16) in 5.2.2. [That equation yields for the relative estimated variance of $\bar{R}_x/\bar{R}_S$ the last three terms of Equation (H.22b).]

It should be recognized that the respective experimental standard deviations of $R_x$ and of $R_S$, $\sqrt{6}\,s(\bar{R}_x)$ and $\sqrt{6}\,s(\bar{R}_S)$, indicate a variability in these quantities that is two to three times larger than the variability implied by the Poisson statistics of the counting process; the latter is included in the observed variability of the counts and need not be accounted for separately.

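Table H.8 — Calculation of decay-corrected and background-corrected counting rates

| Cycle $k$ | $R_x$ (min$^{-1}$) | $R_S$ (min$^{-1}$) | $t_x - t_S$ (min) | $R = R_x/R_S$ |
|---|---|---|---|---|
| 1 | 652,46 | 194,65 | 123,63 | 3,352 0 |
| 2 | 666,48 | 208,58 | 123,13 | 3,195 3 |
| 3 | 665,80 | 211,08 | 123,12 | 3,154 3 |
| 4 | 655,68 | 214,17 | 123,11 | 3,061 5 |
| 5 | 651,87 | 213,92 | 123,12 | 3,047 3 |
| 6 | 623,31 | 194,13 | 123,11 | 3,210 7 |

$\bar{R}_x = 652{,}60$; $s(\bar{R}_x) = 6{,}42$; $s(\bar{R}_x)/\bar{R}_x = 0{,}98 \times 10^{-2}$

$\bar{R}_S = 206{,}09$; $s(\bar{R}_S) = 3{,}79$; $s(\bar{R}_S)/\bar{R}_S = 1{,}84 \times 10^{-2}$

$\bar{R} = 3{,}170$; $s(\bar{R}) = 0{,}046$; $s(\bar{R})/\bar{R} = 1{,}44 \times 10^{-2}$

$\bar{R}_x/\bar{R}_S = 3{,}167$; $u(\bar{R}_x/\bar{R}_S) = 0{,}045$; $u(\bar{R}_x/\bar{R}_S)/(\bar{R}_x/\bar{R}_S) = 1{,}42 \times 10^{-2}$

Correlation coefficient: $r(\bar{R}_x, \bar{R}_S) = 0{,}646$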
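The summary statistics below Table H.8 can be reproduced from its columns with a short Python sketch. The helper name `std_of_mean` is an illustrative choice; the inputs are the rounded values printed in the table, so the last digit of a result may differ slightly from a calculation that carries more digits.

```python
import math
from statistics import mean, stdev

# Columns of Table H.8 (cycles k = 1 to 6)
R_x = [652.46, 666.48, 665.80, 655.68, 651.87, 623.31]
R_S = [194.65, 208.58, 211.08, 214.17, 213.92, 194.13]
R = [3.3520, 3.1953, 3.1543, 3.0615, 3.0473, 3.2107]
n = len(R_x)


def std_of_mean(values):
    """Experimental standard deviation of the arithmetic mean."""
    return stdev(values) / math.sqrt(len(values))


mean_x, mean_S, mean_R = mean(R_x), mean(R_S), mean(R)
s_mean_x, s_mean_S, s_mean_R = std_of_mean(R_x), std_of_mean(R_S), std_of_mean(R)

# Estimated covariance of the two means, then the correlation coefficient
cov_means = sum((x - mean_x) * (s - mean_S) for x, s in zip(R_x, R_S)) / (n * (n - 1))
r = cov_means / (s_mean_x * s_mean_S)

print(f"mean R_x = {mean_x:.2f}, s = {s_mean_x:.2f}")
print(f"mean R_S = {mean_S:.2f}, s = {s_mean_S:.2f}")
print(f"mean R   = {mean_R:.3f}, s = {s_mean_R:.3f}")
print(f"r = {r:.3f}")
```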

## H.4.3   Calculation of final results

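To obtain the unknown activity concentration $A_x$ and its combined standard uncertainty $u_c(A_x)$ from Equation (H.20) requires $A_S$, $m_x$, and $m_S$ and their standard uncertainties. These are given as

$$A_S = 0{,}136\,8\ \text{Bq/g}; \quad u(A_S) = 0{,}001\,8\ \text{Bq/g}; \quad u(A_S)/A_S = 1{,}32 \times 10^{-2}$$

$$m_S = 5{,}019\,2\ \text{g}; \quad u(m_S) = 0{,}005\ \text{g}; \quad u(m_S)/m_S = 0{,}10 \times 10^{-2}$$

$$m_x = 5{,}057\,1\ \text{g}; \quad u(m_x) = 0{,}001\,0\ \text{g}; \quad u(m_x)/m_x = 0{,}02 \times 10^{-2}$$

Other possible sources of uncertainty are evaluated to be negligible: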

• standard uncertainties of the decay times, $u(t_{S,k})$ and $u(t_{x,k})$;
• standard uncertainty of the decay constant of $^{222}$Rn, $u(\lambda) = 1 \times 10^{-7}\ \text{min}^{-1}$. (The significant quantity is the decay factor $\exp[\lambda(t_x - t_S)]$, which varies from 1,015 63 for cycles $k = 4$ and 6 to 1,015 70 for cycle $k = 1$. The standard uncertainty of these values is $u = 1{,}2 \times 10^{-5}$);
• uncertainty associated with the possible dependence of the detection efficiency of the scintillation counter on the source used (standard, blank, and sample);
• uncertainty of the correction for counter dead‑time and of the correction for the dependence of counting efficiency on activity level.

### H.4.3.1   Results: approach 1

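As indicated earlier, $A_x$ and $u_c(A_x)$ may be obtained in two different ways from Equation (H.20). In the first approach, $A_x$ is calculated using the arithmetic means $\bar{R}_x$ and $\bar{R}_S$, which leads to

$$A_x = A_S \frac{m_S}{m_x} \frac{\bar{R}_x}{\bar{R}_S} \tag{H.22a}$$

Application of Equation (16) in 5.2.2 to this expression yields for the combined variance $u_c^2(A_x)$

$$\frac{u_c^2(A_x)}{A_x^2} = \frac{u^2(A_S)}{A_S^2} + \frac{u^2(m_S)}{m_S^2} + \frac{u^2(m_x)}{m_x^2} + \frac{u^2(\bar{R}_x)}{\bar{R}_x^2} + \frac{u^2(\bar{R}_S)}{\bar{R}_S^2} - 2\,r(\bar{R}_x, \bar{R}_S)\,\frac{u(\bar{R}_x)\,u(\bar{R}_S)}{\bar{R}_x \bar{R}_S} \tag{H.22b}$$

where, as noted in H.4.2, the last three terms give $u^2(\bar{R}_x/\bar{R}_S)/(\bar{R}_x/\bar{R}_S)^2$, the estimated relative variance of $\bar{R}_x/\bar{R}_S$. Consistent with the discussion of H.2.4, the results in Table H.8 show that $\bar{R}$ is not exactly equal to $\bar{R}_x/\bar{R}_S$; and that the standard uncertainty $u(\bar{R}_x/\bar{R}_S)$ of $\bar{R}_x/\bar{R}_S$ is not exactly equal to the standard uncertainty $s(\bar{R})$ of $\bar{R}$.

Substitution of the values of the relevant quantities into Equations (H.22a) and (H.22b) yields

$$A_x = 0{,}430\,0\ \text{Bq/g}; \quad \frac{u_c(A_x)}{A_x} = 1{,}93 \times 10^{-2}; \quad u_c(A_x) = 0{,}008\,3\ \text{Bq/g}$$

The result of the measurement may then be stated as:

$A_x = 0{,}430\,0$ Bq/g with a combined standard uncertainty of $u_c = 0{,}008\,3$ Bq/g.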
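The effect of the correlation term in approach 1 can be illustrated with a short Python sketch of the last three terms of Equation (H.22b), using the summary values of Table H.8. The function name is an illustrative choice, and the uncorrelated case is computed only for comparison; it is not a result quoted in this Guide.

```python
import math

# Summary values from Table H.8
mean_x, s_x = 652.60, 6.42   # mean of R_x and its experimental standard deviation
mean_S, s_S = 206.09, 3.79   # mean of R_S and its experimental standard deviation
r = 0.646                    # correlation coefficient of the two means


def relative_u_of_ratio(mean_num, u_num, mean_den, u_den, corr):
    """Relative standard uncertainty of a ratio of two correlated estimates."""
    rel_num = u_num / mean_num
    rel_den = u_den / mean_den
    rel_var = rel_num**2 + rel_den**2 - 2.0 * corr * rel_num * rel_den
    return math.sqrt(rel_var)


ratio = mean_x / mean_S
rel_u = relative_u_of_ratio(mean_x, s_x, mean_S, s_S, r)
u_ratio = ratio * rel_u

# For comparison only: the value that would result if the correlation were ignored
rel_u_uncorrelated = relative_u_of_ratio(mean_x, s_x, mean_S, s_S, 0.0)

print(f"ratio of means = {ratio:.3f}")
print(f"relative u with correlation    = {rel_u:.4f}")
print(f"relative u without correlation = {rel_u_uncorrelated:.4f}")
```

Because the two means are positively correlated, the correlation term reduces the relative uncertainty of the ratio; ignoring it would overstate the uncertainty in this example.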

### H.4.3.2   Results: approach 2

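In the second approach, which avoids the correlation between $\bar{R}_x$ and $\bar{R}_S$, $A_x$ is calculated using the arithmetic mean $\bar{R}$. Thus

$$A_x = A_S \frac{m_S}{m_x} \bar{R} = 0{,}430\,4\ \text{Bq/g} \tag{H.23a}$$

The expression for $u_c^2(A_x)$ is simply

$$\frac{u_c^2(A_x)}{A_x^2} = \frac{u^2(A_S)}{A_S^2} + \frac{u^2(m_S)}{m_S^2} + \frac{u^2(m_x)}{m_x^2} + \frac{u^2(\bar{R})}{\bar{R}^2} \tag{H.23b}$$

which yields

$$\frac{u_c(A_x)}{A_x} = 1{,}95 \times 10^{-2}; \quad u_c(A_x) = 0{,}008\,4\ \text{Bq/g}$$

The result of the measurement may then be stated as:

$A_x = 0{,}430\,4$ Bq/g with a combined standard uncertainty of $u_c = 0{,}008\,4$ Bq/g.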

The effective degrees of freedom of uc can be evaluated using the Welch‑Satterthwaite formula in the manner illustrated in H.1.6.

As in H.2, of the two results, the second is preferred because it avoids approximating the mean of a ratio of two quantities by the ratio of the means of the two quantities; and it better reflects the measurement procedure used — the data were in fact collected in separate cycles.

Nevertheless, the difference between the values of Ax resulting from the two approaches is clearly small compared with the standard uncertainty ascribed to either one, and the difference between the two standard uncertainties is entirely negligible. Such agreement demonstrates that the two approaches are equivalent when the observed correlations are properly included.

# H.5   Analysis of variance

This example provides a brief introduction to analysis of variance (ANOVA) methods. These statistical techniques are used to identify and quantify individual random effects in a measurement so that they may be properly taken into account when the uncertainty of the result of the measurement is evaluated. Although ANOVA methods are applicable to a wide range of measurements, for example, the calibration of reference standards, such as Zener voltage standards and standards of mass, and the certification of reference materials, ANOVA methods by themselves cannot identify systematic effects that might be present.

There are many different models included under the general name of ANOVA. Because of its importance, the specific model discussed in this example is the balanced nested design. The numerical illustration of this model involves the calibration of a Zener voltage standard; the analysis should be relevant to a variety of practical measurement situations.

ANOVA methods are of special importance in the certification of reference materials (RMs) by interlaboratory testing, a topic covered thoroughly in ISO Guide 35 [19] (see H.5.3.2 for a brief description of such RM certification). Since much of the material contained in ISO Guide 35 is in fact broadly applicable, that publication may be consulted for additional details concerning ANOVA, including unbalanced nested designs. References [15] and [20] may be similarly consulted.

## H.5.1   The measurement problem

Consider a nominally 10 V Zener voltage standard that is calibrated against a stable voltage reference over a two‑week period. On each of $J$ days during the period, $K$ independent repeated observations of the potential difference $V_S$ of the standard are made. If $V_{jk}$ denotes the $k$th observation of $V_S$ ($k = 1, 2, ..., K$) on the $j$th day ($j = 1, 2, ..., J$), the best estimate of the potential difference of the standard is the arithmetic mean $\bar{V}$ of the $JK$ observations [see Equation (3) in 4.2.1],

$$\bar{V} = \frac{1}{JK} \sum_{j=1}^{J} \sum_{k=1}^{K} V_{jk} \tag{H.24a}$$

The experimental standard deviation of the mean $s(\bar{V})$, which is a measure of the uncertainty of $\bar{V}$ as an estimate of the potential difference of the standard, is obtained from [see Equation (5) in 4.2.3]

$$s^2(\bar{V}) = \frac{1}{JK(JK - 1)} \sum_{j=1}^{J} \sum_{k=1}^{K} (V_{jk} - \bar{V})^2 \tag{H.24b}$$

NOTE   It is assumed throughout this example that all corrections applied to the observations to compensate for systematic effects have negligible uncertainties or their uncertainties are such that they can be taken into account at the end of the analysis. A correction in this latter category, and one that can itself be applied to the mean of the observations at the end of the analysis, is the difference between the certified value (assumed to have a given uncertainty) and the working value of the stable voltage reference against which the Zener voltage standard is calibrated. Thus the estimate of the potential difference of the standard obtained statistically from the observations is not necessarily the final result of the measurement; and the experimental standard deviation of that estimate is not necessarily the combined standard uncertainty of the final result.

The experimental standard deviation of the mean $s(\bar{V})$ as obtained from Equation (H.24b) is an appropriate measure of the uncertainty of $\bar{V}$ only if the day‑to‑day variability of the observations is the same as the variability of the observations made on a single day. If there is evidence that the between‑day variability is significantly larger than can be expected from the within‑day variability, use of this expression could lead to a considerable understatement of the uncertainty of $\bar{V}$. Two questions thus arise: How should one decide if the between‑day variability (characterized by a between‑day component of variance) is significant in comparison with the within‑day variability (characterized by a within‑day component of variance) and, if it is, how should one evaluate the uncertainty of the mean?

## H.5.2   A numerical example

H.5.2.1   Data which allow the above questions to be addressed are given in Table H.9, where

$J = 10$ is the number of days on which potential‑difference observations were made;

$K = 5$ is the number of potential‑difference observations made on each day;

$$\bar{V}_j = \frac{1}{K} \sum_{k=1}^{K} V_{jk} \tag{H.25a}$$

is the arithmetic mean of the $K = 5$ potential‑difference observations made on the $j$th day (there are $J = 10$ such daily means);

$$\bar{V} = \frac{1}{J} \sum_{j=1}^{J} \bar{V}_j = \frac{1}{JK} \sum_{j=1}^{J} \sum_{k=1}^{K} V_{jk} \tag{H.25b}$$

is the arithmetic mean of the $J = 10$ daily means and thus the overall mean of the $JK = 50$ observations;

$$s^2(V_{jk}) = \frac{1}{K - 1} \sum_{k=1}^{K} (V_{jk} - \bar{V}_j)^2 \tag{H.25c}$$

is the experimental variance of the $K = 5$ observations made on the $j$th day (there are $J = 10$ such estimates of variance); and

$$s^2(\bar{V}_j) = \frac{1}{J - 1} \sum_{j=1}^{J} (\bar{V}_j - \bar{V})^2 \tag{H.25d}$$

is the experimental variance of the $J = 10$ daily means (there is only one such estimate of variance).

Table H.9 — Summary of voltage standard calibration data obtained on $J = 10$ days, with each daily mean $\bar{V}_j$ and experimental standard deviation $s(V_{jk})$ based on $K = 5$ independent repeated observations

| Quantity, day $j$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| $\bar{V}_j$/V | 10,000 172 | 10,000 116 | 10,000 013 | 10,000 144 | 10,000 106 | 10,000 031 | 10,000 060 | 10,000 125 | 10,000 163 | 10,000 041 |
| $s(V_{jk})$/µV | 60 | 77 | 111 | 101 | 67 | 93 | 80 | 73 | 88 | 86 |

$\bar{V} = 10{,}000\,097$ V; $s(\bar{V}_j) = 57$ µV

$s_a^2 = K s^2(\bar{V}_j) = 5(57\ \mu\text{V})^2 = (128\ \mu\text{V})^2$; $s_b^2 = \overline{s^2(V_{jk})} = (85\ \mu\text{V})^2$

H.5.2.2   The consistency of the within‑day variability and between‑day variability of the observations can be investigated by comparing two independent estimates of $\sigma_W^2$, the within‑day component of variance (that is, the variance of observations made on the same day).

The first estimate of $\sigma_W^2$, denoted by $s_a^2$, is obtained from the observed variation of the daily means $\bar{V}_j$. Since $\bar{V}_j$ is the average of $K$ observations, its estimated variance $s^2(\bar{V}_j)$, under the assumption that the between‑day component of variance is zero, estimates $\sigma_W^2/K$. It then follows from Equation (H.25d) that

$$s_a^2 = K s^2(\bar{V}_j) = \frac{K}{J - 1} \sum_{j=1}^{J} (\bar{V}_j - \bar{V})^2 \tag{H.26a}$$

which is an estimate of $\sigma_W^2$ having $\nu_a = J - 1 = 9$ degrees of freedom.

The second estimate of $\sigma_W^2$, denoted by $s_b^2$, is the pooled estimate of variance obtained from the $J = 10$ individual values of $s^2(V_{jk})$ using the equation of the note to H.3.6, where the ten individual values are calculated from Equation (H.25c). Because the degrees of freedom of each of these values is $\nu_i = K - 1$, the resulting expression for $s_b^2$ is simply their average. Thus

$$s_b^2 = \overline{s^2(V_{jk})} = \frac{1}{J} \sum_{j=1}^{J} s^2(V_{jk}) = \frac{1}{J(K - 1)} \sum_{j=1}^{J} \sum_{k=1}^{K} (V_{jk} - \bar{V}_j)^2 \tag{H.26b}$$

which is an estimate of $\sigma_W^2$ having $\nu_b = J(K - 1) = 40$ degrees of freedom.

The estimates of $\sigma_W^2$ given by Equations (H.26a) and (H.26b) are $s_a^2 = (128\ \mu\text{V})^2$ and $s_b^2 = (85\ \mu\text{V})^2$, respectively (see Table H.9). Since the estimate $s_a^2$ is based on the variability of the daily means while the estimate $s_b^2$ is based on the variability of the daily observations, their difference indicates the possible presence of an effect that varies from one day to another but that remains relatively constant when observations are made on any single day. The F‑test is used to test this possibility, and thus the assumption that the between‑day component of variance is zero.

H.5.2.3   The F‑distribution is the probability distribution of the ratio $F(\nu_a, \nu_b) = s_a^2(\nu_a)/s_b^2(\nu_b)$ of two independent estimates, $s_a^2(\nu_a)$ and $s_b^2(\nu_b)$, of the variance $\sigma^2$ of a normally distributed random variable [15]. The parameters $\nu_a$ and $\nu_b$ are the respective degrees of freedom of the two estimates and $0 \le F(\nu_a, \nu_b) < \infty$. Values of $F$ are tabulated for different values of $\nu_a$ and $\nu_b$ and various quantiles of the F‑distribution. A value of $F(\nu_a, \nu_b) > F_{0,95}$ or $F(\nu_a, \nu_b) > F_{0,975}$ (the critical value) is usually interpreted as indicating that $s_a^2(\nu_a)$ is larger than $s_b^2(\nu_b)$ by a statistically significant amount; and that the probability of a value of $F$ as large as that observed, if the two estimates were estimates of the same variance, is less than 0,05 or 0,025, respectively. (Other critical values may also be chosen, such as $F_{0,99}$.)

H.5.2.4   The application of the F‑test to the present numerical example yields

$$F(\nu_a, \nu_b) = \frac{s_a^2}{s_b^2} = \frac{K s^2(\bar{V}_j)}{\overline{s^2(V_{jk})}} = \frac{5(57\ \mu\text{V})^2}{(85\ \mu\text{V})^2} = 2{,}25 \tag{H.27}$$

with $\nu_a = J - 1 = 9$ degrees of freedom in the numerator and $\nu_b = J(K - 1) = 40$ degrees of freedom in the denominator. Since $F_{0,95}(9, 40) = 2{,}12$ and $F_{0,975}(9, 40) = 2{,}45$, it is concluded that there is a statistically significant between‑day effect at the 5 percent level of significance but not at the 2,5 percent level.
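The two variance estimates and the F‑test decision can be sketched in Python from the rows of Table H.9. The variable names are illustrative, the daily means are entered as offsets from 10 V in microvolts (which leaves the variances unchanged), and the critical values are simply the tabulated ones quoted in H.5.2.4. Because the sketch does not round the intermediate values, it gives F ≈ 2,26 rather than the 2,25 obtained from the rounded values 57 µV and 85 µV; the conclusion is the same.

```python
import math
from statistics import mean, variance

J, K = 10, 5  # number of days, observations per day

# Table H.9: daily means as offsets from 10 V, in microvolts
daily_means_uV = [172, 116, 13, 144, 106, 31, 60, 125, 163, 41]
# Table H.9: within-day experimental standard deviations, in microvolts
daily_s_uV = [60, 77, 111, 101, 67, 93, 80, 73, 88, 86]

s2_daily_means = variance(daily_means_uV)       # Equation (H.25d), in (uV)^2
s2_a = K * s2_daily_means                       # Equation (H.26a), J - 1 = 9 dof
s2_b = mean([s**2 for s in daily_s_uV])         # Equation (H.26b), J(K - 1) = 40 dof

F = s2_a / s2_b                                 # Equation (H.27)

# Critical values quoted in H.5.2.4 for (9, 40) degrees of freedom
F_095, F_0975 = 2.12, 2.45
significant_at_5_percent = F > F_095
significant_at_2_5_percent = F > F_0975

print(f"s_a = {math.sqrt(s2_a):.0f} uV, s_b = {math.sqrt(s2_b):.0f} uV, F = {F:.2f}")
print(significant_at_5_percent, significant_at_2_5_percent)
```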

H.5.2.5   If the existence of a between‑day effect is rejected because the difference between $s_a^2$ and $s_b^2$ is not viewed as statistically significant (an imprudent decision because it could lead to an underestimate of the uncertainty), the estimated variance $s^2(\bar{V})$ of $\bar{V}$ should be calculated from Equation (H.24b). That relation is equivalent to pooling the estimates $s_a^2$ and $s_b^2$ (that is, taking a weighted average of $s_a^2$ and $s_b^2$, each weighted by its respective degrees of freedom $\nu_a$ and $\nu_b$; see H.3.6, note) to obtain the best estimate of the variance of the observations; and dividing that estimate by $JK$, the number of observations, to obtain the best estimate $s^2(\bar{V})$ of the variance of the mean of the observations. Following this procedure one obtains

$$s^2(\bar{V}) = \frac{(J - 1)s_a^2 + J(K - 1)s_b^2}{JK(JK - 1)} = \frac{9(128\ \mu\text{V})^2 + 40(85\ \mu\text{V})^2}{(10)(5)(49)} = (13\ \mu\text{V})^2, \text{ or} \tag{H.28a}$$

$$s(\bar{V}) = 13\ \mu\text{V} \tag{H.28b}$$

with $s(\bar{V})$ having $JK - 1 = 49$ degrees of freedom.

If it is assumed that all corrections for systematic effects have already been taken into account and that all other components of uncertainty are insignificant, then the result of the calibration can be stated as $V_S = \bar{V} = 10{,}000\,097$ V (see Table H.9), with a combined standard uncertainty of $s(\bar{V}) = u_c = 13$ µV, and with $u_c$ having 49 degrees of freedom.

NOTE 1   In practice, there would very likely be additional components of uncertainty that were significant and therefore would have to be combined with the component of uncertainty obtained statistically from the observations (see H.5.1, note).

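NOTE 2   Equation (H.28a) for $s^2(\bar{V})$ can be shown to be equivalent to Equation (H.24b) by writing the double sum, denoted by $S$, in that equation as

$$S = \sum_{j=1}^{J} \sum_{k=1}^{K} \left[ (V_{jk} - \bar{V}_j) + (\bar{V}_j - \bar{V}) \right]^2 = (J - 1)s_a^2 + J(K - 1)s_b^2$$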

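H.5.2.6   If the existence of a between‑day effect is accepted (a prudent decision because it avoids a possible underestimate of the uncertainty) and it is assumed to be random, then the variance $s^2(\bar{V}_j)$ calculated from the $J = 10$ daily means according to Equation (H.25d) estimates not $\sigma_W^2/K$ as postulated in H.5.2.2, but $\sigma_W^2/K + \sigma_B^2$, where $\sigma_B^2$ is the between‑day random component of variance. This implies that

$$s^2(\bar{V}_j) = \frac{s_W^2}{K} + s_B^2 \tag{H.29}$$

where $s_W^2$ estimates $\sigma_W^2$ and $s_B^2$ estimates $\sigma_B^2$. Since $\overline{s^2(V_{jk})}$ calculated from Equation (H.26b) depends only on the within‑day variability of the observations, one may take $s_W^2 = \overline{s^2(V_{jk})}$. Thus the ratio $K s^2(\bar{V}_j)/\overline{s^2(V_{jk})}$ used for the F‑test in H.5.2.4 becomes

$$F = \frac{K s^2(\bar{V}_j)}{\overline{s^2(V_{jk})}} = \frac{s_W^2 + K s_B^2}{s_W^2} = \frac{5(57\ \mu\text{V})^2}{(85\ \mu\text{V})^2} = 2{,}25 \tag{H.30}$$

which leads to

$$s_B^2 = \frac{K s^2(\bar{V}_j) - \overline{s^2(V_{jk})}}{K} = (43\ \mu\text{V})^2, \text{ or } s_B = 43\ \mu\text{V} \tag{H.31a}$$

$$s_W^2 = \overline{s^2(V_{jk})} = (85\ \mu\text{V})^2, \text{ or } s_W = 85\ \mu\text{V} \tag{H.31b}$$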

The estimated variance of $\bar{V}$ is obtained from $s^2(\bar{V}_j)$, Equation (H.25d), because $s^2(\bar{V}_j)$ properly reflects both the within‑day and between‑day random components of variance [see Equation (H.29)]. Thus

$$s^2(\bar{V}) = \frac{s^2(\bar{V}_j)}{J} = \frac{(57\ \mu\text{V})^2}{10}, \text{ or } s(\bar{V}) = 18\ \mu\text{V} \tag{H.32}$$

with $s(\bar{V})$ having $J - 1 = 9$ degrees of freedom.

The degrees of freedom of $s_W^2$ (and thus $s_W$) is $J(K - 1) = 40$ [see Equation (H.26b)]. The degrees of freedom of $s_B^2$ (and thus $s_B$) is the effective degrees of freedom of the difference $s_B^2 = s^2(\bar{V}_j) - \overline{s^2(V_{jk})}/K$ [Equation (H.31a)], but its estimation is problematic.

H.5.2.7   The best estimate of the potential difference of the voltage standard is then $V_S = \bar{V} = 10{,}000\,097$ V, with $s(\bar{V}) = u_c = 18$ µV as given in Equation (H.32). This value of $u_c$ and its 9 degrees of freedom are to be compared with $u_c = 13$ µV and its 49 degrees of freedom, the result obtained in H.5.2.5 [Equation (H.28b)] when the existence of a between‑day effect was rejected.
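The two treatments compared in H.5.2.5 to H.5.2.7 can be placed side by side in a short Python sketch based on the rows of Table H.9. The variable names are illustrative, and the daily means are entered as offsets from 10 V in microvolts. Intermediate values are not rounded here, which is why the between‑day component comes out as 43 µV, as in Equation (H.31a), rather than the 42 µV that the rounded values 57 µV and 85 µV alone would give.

```python
import math
from statistics import mean, variance

J, K = 10, 5  # number of days, observations per day

daily_means_uV = [172, 116, 13, 144, 106, 31, 60, 125, 163, 41]  # offsets from 10 V
daily_s_uV = [60, 77, 111, 101, 67, 93, 80, 73, 88, 86]

s2_means = variance(daily_means_uV)            # s^2 of the daily means, Eq. (H.25d)
s2_within = mean([s**2 for s in daily_s_uV])   # pooled within-day variance, Eq. (H.26b)

# Treatment 1: between-day effect rejected, pooled estimate of Eq. (H.28a)
s2_a = K * s2_means
u_pooled = math.sqrt(((J - 1) * s2_a + J * (K - 1) * s2_within) / (J * K * (J * K - 1)))
dof_pooled = J * K - 1

# Treatment 2: between-day effect accepted, Eqs. (H.29) to (H.32)
s_B = math.sqrt(s2_means - s2_within / K)      # between-day component, Eq. (H.31a)
s_W = math.sqrt(s2_within)                     # within-day component, Eq. (H.31b)
u_nested = math.sqrt(s2_means / J)             # Eq. (H.32)
dof_nested = J - 1

print(f"rejected: u_c = {u_pooled:.0f} uV with {dof_pooled} degrees of freedom")
print(f"accepted: u_c = {u_nested:.0f} uV with {dof_nested} degrees of freedom")
print(f"s_B = {s_B:.0f} uV, s_W = {s_W:.0f} uV")
```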

In a real measurement an apparent between‑day effect should be further investigated, if possible, in order to determine its cause and whether a systematic effect is present that would negate the use of ANOVA methods. As pointed out at the beginning of this example, ANOVA techniques are designed to identify and evaluate components of uncertainty arising from random effects; they cannot provide information about components arising from systematic effects.

## H.5.3   The role of ANOVA in measurement

H.5.3.1   This voltage standard example illustrates what is generally termed a balanced, one‑stage nested design. It is a one‑stage nested design because there is one level of “nesting” of the observations with one factor, the day on which observations are made, being varied in the measurement. It is balanced because the same number of observations is made on each day. The analysis presented in the example can be used to determine if there is an “operator effect”, an “instrument effect”, a “laboratory effect”, a “sample effect”, or even a “method effect” in a particular measurement. Thus in the example, one might imagine replacing the observations made on the J different days by observations made on the same day but by J different operators; the between‑day component of variance becomes then a component of variance associated with different operators.

H.5.3.2   As noted in H.5, ANOVA methods are widely used in the certification of reference materials (RMs) by interlaboratory testing. Such certification usually involves having a number of independent, equally competent laboratories measure samples of a material for the property for which the material is to be certified. It is generally assumed that the differences between individual results, both within and between laboratories, are statistical in nature regardless of the causes. Each laboratory mean is considered an unbiased estimate of the property of the material, and usually the unweighted mean of the laboratory means is assumed to be the best estimate of that property.

An RM certification might involve I different laboratories, each of which measures the requisite property of J different samples of the material, with each measurement of a sample consisting of K independent repeated observations. Thus the total number of observations is IJK and the total number of samples is IJ. This is an example of a balanced, two‑stage nested design analogous to the one-stage voltage‑standard example above. In this case, there are two levels of “nesting” of the observations with two different factors, sample and laboratory, being varied in the measurement. The design is balanced because each sample is observed the same number of times (K) in each laboratory and each laboratory measures the same number of samples (J). In further analogy with the voltage‑standard example, in the RM case the purpose of the analysis of the data is to investigate the possible existence of a between‑samples effect and a between‑laboratories effect, and to determine the proper uncertainty to assign to the best estimate of the value of the property to be certified. In keeping with the previous paragraph, that estimate is assumed to be the mean of the I laboratory means, which is also the mean of the IJK observations.

H.5.3.3   The importance of varying the input quantities upon which a measurement result depends so that its uncertainty is based on observed data evaluated statistically is pointed out in 3.4.2. Nested designs and the analysis of the resulting data by ANOVA methods can be successfully used in many measurement situations encountered in practice.

Nonetheless, as indicated in 3.4.1, varying all input quantities is rarely feasible due to limited time and resources; at best, in most practical measurement situations, it is only possible to evaluate a few components of uncertainty using ANOVA methods. As pointed out in 4.3.1, many components must be evaluated by scientific judgement using all of the available information on the possible variability of the input quantities in question; in many instances an uncertainty component, such as arises from a between‑samples effect, a between‑laboratories effect, a between‑instruments effect, or a between‑operators effect, cannot be evaluated by the statistical analysis of series of observations but must be evaluated from the available pool of information.

# H.6   Measurements on a reference scale: hardness

Hardness is an example of a physical concept that cannot be quantified without reference to a method of measurement; it has no unit that is independent of such a method. The quantity “hardness” is unlike classical measurable quantities in that it cannot be entered into algebraic equations to define other measurable quantities (though it is sometimes used in empirical equations that relate hardness to another property for a category of materials). Its magnitude is determined by a conventional measurement, that of a linear dimension of an indentation in a block of the material of interest, or sample block. The measurement is made according to a written standard, which includes a description of the “indentor”, the construction of the machine by which the indentor is applied, and the way in which the machine is to be operated. There is more than one written standard, so there is more than one scale of hardness.

The hardness reported is a function (depending on the scale) of the linear dimension that is measured. In the example given in this subclause, it is a linear function of the arithmetic mean or average of the depths of five repeated indentations, but for some other scales the function is nonlinear.

Realizations of the standard machine are kept as national standards (there is no international standard realization); a comparison between a particular machine and the national standard machine is made using a transfer‑standard block.

## H.6.1   The measurement problem

In this example, the hardness of a sample block of material is determined on the scale “Rockwell C” using a machine that has been calibrated against the national standard machine. The scale unit of Rockwell‑C hardness is 0,002 mm, with hardness on that scale defined as 100 × (0,002 mm) minus the average of the depths, measured in mm, of five indentations. The value of that quantity divided by the Rockwell scale unit 0,002 mm is called the “HRC hardness index”. In this example, the quantity is called simply “hardness”, symbol $h_{\text{Rockwell C}}$, and the numerical value of hardness expressed in Rockwell units of length is called the “hardness index”, $H_{\text{Rockwell C}}$.
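The definition above can be written as a small Python sketch. The function names are illustrative, no corrections are applied, and the depth of 0,072 mm is the average depth that appears later in Table H.10.

```python
SCALE_UNIT_MM = 0.002  # Rockwell scale unit in mm


def rockwell_c_hardness_mm(mean_depth_mm):
    """Hardness h as a length: 100 scale units minus the average indentation depth."""
    return 100 * SCALE_UNIT_MM - mean_depth_mm


def hardness_index(mean_depth_mm):
    """Hardness index H: hardness expressed in Rockwell scale units."""
    return rockwell_c_hardness_mm(mean_depth_mm) / SCALE_UNIT_MM


h = rockwell_c_hardness_mm(0.072)   # hardness in mm
H = hardness_index(0.072)           # hardness index, about 64 HRC
```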

## H.6.2   Mathematical model

To the average of the depths of the indentations made in the sample block by the machine used to determine its hardness, or calibration machine, must be added corrections to determine the average of the depths of the indentations that would have been made in the same block by the national standard machine. Thus

$$h_{\text{Rockwell C}} = f(\bar{d}, \Delta_c, \Delta_b, \Delta_S) = 100(0{,}002\ \text{mm}) - \bar{d} - \Delta_c - \Delta_b - \Delta_S \tag{H.33a}$$

$$H_{\text{Rockwell C}} = h_{\text{Rockwell C}}/(0{,}002\ \text{mm}) \tag{H.33b}$$

where

• $\bar{d}$ is the average of the depths of five indentations made by the calibration machine in the sample block;
• $\Delta_c$ is the correction obtained from a comparison of the calibration machine with the national standard machine using a transfer‑standard block, equal to the average of the depths of $5m$ indentations made by the national standard machine in this block, minus the average of the depths of $5n$ indentations made in the same block by the calibration machine;
• $\Delta_b$ is the difference in hardness (expressed as a difference of average depth of indentation) between the two parts of the transfer‑standard block used respectively for indentations by the two machines, assumed zero; and
• $\Delta_S$ is the error due to the lack of repeatability of the national standard machine and the incomplete definition of the quantity hardness. Although $\Delta_S$ must be assumed to be zero, it has a standard uncertainty associated with it of $u(\Delta_S)$.

Since the partial derivatives $\partial f/\partial \bar{d}$, $\partial f/\partial \Delta_c$, $\partial f/\partial \Delta_b$, and $\partial f/\partial \Delta_S$ of the function of Equation (H.33a) are all equal to −1, the combined variance $u_c^2(h)$ of the hardness of the sample block as measured by the calibration machine is simply given by

$$u_c^2(h) = u^2(\bar{d}) + u^2(\Delta_c) + u^2(\Delta_b) + u^2(\Delta_S) \tag{H.34}$$

where for simplicity of notation $h \equiv h_{\text{Rockwell C}}$.

## H.6.3   Contributory variances

### H.6.3.1   Uncertainty of the average depth of indentation $\bar{d}$ of the sample block, $u(\bar{d})$

Uncertainty of repeated observations. Strict repetition of an observation is not possible because a new indentation cannot be made on the site of an earlier one. Since each indentation must be made on a different site, any variation in the results includes the effect of variations in hardness between different sites. Thus $u(\bar{d})$, the standard uncertainty of the average of the depths of five indentations in the sample block by the calibration machine, is taken as $s_p(d_k)/\sqrt{5}$, where $s_p(d_k)$ is the pooled experimental standard deviation of the depths of indentations determined by “repeated” measurements on a block known to have very uniform hardness (see 4.2.4).

Uncertainty of indication. Although the correction to $\bar{d}$ due to the display of the calibration machine is zero, there is an uncertainty in $\bar{d}$ due to the uncertainty of the indication of depth due to the resolution $\delta$ of the display given by $u^2(\delta) = \delta^2/12$ (see F.2.2.1). The estimated variance of $\bar{d}$ is thus

$$u^2(\bar{d}) = \frac{s^2(d_k)}{5} + \frac{\delta^2}{12} \tag{H.35}$$

### H.6.3.2   Uncertainty of the correction for the difference between the two machines, u(Δc)

As indicated in H.6.2, $\Delta_c$ is the correction for the difference between the national standard machine and the calibration machine. This correction may be expressed as $\Delta_c = z_S - z$, where $z_S = \left(\sum_{i=1}^{m} \bar{z}_{S,i}\right)/m$ is the average depth of the $5m$ indentations made by the national standard machine in the transfer‑standard block; and $z = \left(\sum_{i=1}^{n} \bar{z}_i\right)/n$ is the average depth of the $5n$ indentations made in the same block by the calibration machine. Thus, assuming that for the comparison the uncertainty due to the resolution of the display of each machine is negligible, the estimated variance of $\Delta_c$ is

$$u^2(\Delta_c) = \frac{s_{av}^2(\bar{z}_S)}{m} + \frac{s_{av}^2(\bar{z})}{n} \tag{H.36}$$

where

• $s_{av}^2(\bar{z}_S) = \left[\sum_{i=1}^{m} s^2(\bar{z}_{S,i})\right]/m$ is the average of the experimental variances of the means of each of the $m$ series of indentations $z_{S,ik}$ made by the standard machine;
• $s_{av}^2(\bar{z}) = \left[\sum_{i=1}^{n} s^2(\bar{z}_i)\right]/n$ is the average of the experimental variances of the means of each of the $n$ series of indentations $z_{ik}$ made by the calibration machine.

NOTE   The variances $s_{av}^2(\bar{z}_S)$ and $s_{av}^2(\bar{z})$ are pooled estimates of variance (see the discussion of Equation (H.26b) in H.5.2.2).

### H.6.3.3   Uncertainty of the correction due to variations in the hardness of the transfer‑standard block, u(Δb)

OIML International Recommendation R 12, Verification and calibration of Rockwell C hardness standardized blocks, requires that the maximum and minimum depths of indentation obtained from five measurements on the transfer‑standard block shall not differ by more than a fraction $x$ of the average depth of indentation, where $x$ is a function of the hardness level. Let, therefore, the maximum difference in the depths of indentation over the entire block be $xz$, where $z$ is as defined in H.6.3.2 with $n = 5$. Also let the maximum difference be described by a triangular probability distribution about the average value $xz/2$ (on the likely assumption that values near the central value are more probable than extreme values; see 4.3.9). Then, if in Equation (9b) in 4.3.9 $a = xz/2$, the estimated variance of the correction to the average depth of indentation due to differences of the hardnesses presented respectively to the standard machine and the calibration machine is

$$u^2(\Delta_b) = \frac{(xz)^2}{24} \tag{H.37}$$

As indicated in H.6.2, it is assumed that the best estimate of the correction Δb itself is zero.
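Equation (H.37) follows from the variance $a^2/6$ of a triangular distribution of half‑width $a$, just as the resolution term $\delta^2/12$ in H.6.3.1 follows from the variance $a^2/3$ of a rectangular distribution of half‑width $a = \delta/2$. A minimal Python sketch, with illustrative function names and the values of $x$, $z$ and $\delta$ taken from the numerical example in H.6.5, is:

```python
def triangular_variance(half_width):
    """Variance of a symmetric triangular distribution of half-width a: a^2 / 6."""
    return half_width**2 / 6.0


def rectangular_variance(half_width):
    """Variance of a rectangular (uniform) distribution of half-width a: a^2 / 3."""
    return half_width**2 / 3.0


x = 1.5e-2    # permitted fractional variation of depth (Table H.10)
z = 36.0      # average depth in Rockwell scale units (taken equal to d-bar in H.6.5)
delta = 0.1   # display resolution in Rockwell scale units (Table H.10)

u2_delta_b = triangular_variance(x * z / 2)        # equals (x z)^2 / 24, Eq. (H.37)
u2_resolution = rectangular_variance(delta / 2)    # equals delta^2 / 12

print(f"u^2(Delta_b) = {u2_delta_b:.5f} (scale unit)^2")
print(f"u^2(delta)   = {u2_resolution:.6f} (scale unit)^2")
```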

### H.6.3.4   Uncertainty of the national standard machine and the definition of hardness, u(ΔS)

The uncertainty of the national standard machine together with the uncertainty due to incomplete definition of the quantity hardness is reported as an estimated standard deviation u(ΔS) (a quantity of dimension length).

## H.6.4   The combined standard uncertainty, uc(h)

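Collection of the individual terms discussed in H.6.3.1 to H.6.3.4 and their substitution into Equation (H.34) yields for the estimated variance of the measurement of hardness

$$u_c^2(h) = \frac{s^2(d_k)}{5} + \frac{\delta^2}{12} + \frac{s_{av}^2(\bar{z}_S)}{m} + \frac{s_{av}^2(\bar{z})}{n} + \frac{(xz)^2}{24} + u^2(\Delta_S) \tag{H.38}$$

and the combined standard uncertainty is $u_c(h)$.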

## H.6.5   Numerical example

The data for this example are summarized in Table H.10.

Table H.10: Summary of data for determining the hardness of a sample block on the scale Rockwell C

| Source of uncertainty | Value |
|---|---|
| Average depth $\bar{d}$ of 5 indentations made by the calibration machine in the sample block: 0,072 mm | 36,0 Rockwell scale unit |
| Indicated hardness index of the sample block from the 5 indentations: $H_{\text{Rockwell C}} = h_{\text{Rockwell C}}/(0{,}002\ \text{mm}) = [100(0{,}002\ \text{mm}) - 0{,}072\ \text{mm}]/(0{,}002\ \text{mm})$ (see H.6.1) | 64,0 HRC |
| Pooled experimental standard deviation $s_p(d_k)$ of the depths of indentations made by the calibration machine in a block having uniform hardness | 0,45 Rockwell scale unit |
| Resolution $\delta$ of the display of the calibration machine | 0,1 Rockwell scale unit |
| $s_{\mathrm{av}}(\bar{z}_S)$, square root of the average of the experimental variances of the means of $m$ series of indentations made by the national standard machine in the transfer‑standard block | 0,10 Rockwell scale unit, $m = 6$ |
| $s_{\mathrm{av}}(\bar{z})$, square root of the average of the experimental variances of the means of $n$ series of indentations made by the calibration machine in the transfer‑standard block | 0,11 Rockwell scale unit, $n = 6$ |
| Permitted fractional variation $x$ of the depth of penetration in the transfer‑standard block | $1{,}5 \times 10^{-2}$ |
| Standard uncertainty $u(\Delta_S)$ of the national standard machine and definition of hardness | 0,5 Rockwell scale unit |

The scale is Rockwell C, designated HRC. The Rockwell scale unit is 0,002 mm, and thus in Table H.10 and in the following, it is understood that (for example) “36,0 Rockwell scale unit” means 36,0 × (0,002 mm) = 0,072 mm and is simply a convenient way of expressing the data and results.
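The unit conventions above can be illustrated with a short sketch. It assumes all corrections are zero and uses only the relation quoted in Table H.10 (hardness = 100 scale units minus the indentation depth); the function names are hypothetical and chosen for illustration.

```python
SCALE_UNIT_MM = 0.002  # one Rockwell scale unit expressed in millimetres

def depth_in_scale_units(depth_mm: float) -> float:
    """Convert an indentation depth from mm to Rockwell scale units."""
    return depth_mm / SCALE_UNIT_MM

def hardness_mm(depth_mm: float) -> float:
    """Hardness as a length: 100 scale units minus the depth (corrections taken as zero)."""
    return 100 * SCALE_UNIT_MM - depth_mm

def hardness_index(depth_mm: float) -> float:
    """Dimensionless hardness index (HRC): hardness divided by the scale unit."""
    return hardness_mm(depth_mm) / SCALE_UNIT_MM

d_bar_mm = 0.072  # average depth of the 5 indentations, from Table H.10

print(f"depth    = {depth_in_scale_units(d_bar_mm):.1f} Rockwell scale unit")
print(f"hardness = {hardness_mm(d_bar_mm):.4f} mm")
print(f"index    = {hardness_index(d_bar_mm):.1f} HRC")
```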

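If the values for the relevant quantities given in Table H.10 are substituted into Equation (H.38), one obtains the following two expressions:

$$u_c^2(h) = \left[\frac{0{,}45^2}{5} + \frac{0{,}1^2}{12} + \frac{0{,}10^2}{6} + \frac{0{,}11^2}{6} + \frac{0{,}1^2}{24} + \frac{(0{,}015 \times 36{,}0)^2}{24} + 0{,}5^2\right](\text{Rockwell scale unit})^2 \approx 0{,}308\ (\text{Rockwell scale unit})^2$$

$$u_c(h) = 0{,}55\ \text{Rockwell scale unit} = 0{,}001\,1\ \text{mm}$$

where for the purpose of the calculation of uncertainty it is adequate to take $z = \bar{d} = 36{,}0$ Rockwell scale unit.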
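The substitution into Equation (H.38) can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic only: the dictionary keys are descriptive labels invented here, and the values are those of Table H.10 in Rockwell scale units.

```python
import math

SCALE_UNIT_MM = 0.002  # one Rockwell scale unit in mm

# Table H.10 data (Rockwell scale units unless noted).
s_p, n_indent = 0.45, 5        # pooled std. dev. and number of indentations
delta = 0.1                    # display resolution
s_av_zs, m = 0.10, 6           # national standard machine
s_av_z, n = 0.11, 6            # calibration machine
x, z = 1.5e-2, 36.0            # fractional variation and average depth
u_delta_s = 0.5                # standard machine and definition of hardness

# One variance term per summand of Equation (H.38).
variance_terms = {
    "repeatability": s_p ** 2 / n_indent,
    "resolution (sample block)": delta ** 2 / 12,
    "standard machine series": s_av_zs ** 2 / m,
    "calibration machine series": s_av_z ** 2 / n,
    "resolution (comparison)": delta ** 2 / 24,
    "block hardness variation": (x * z) ** 2 / 24,
    "standard machine and definition": u_delta_s ** 2,
}

uc2 = sum(variance_terms.values())
uc = math.sqrt(uc2)

for name, var in variance_terms.items():
    print(f"{name:32s} {var:.5f}")
print(f"u_c(h) = {uc:.2f} Rockwell scale unit = {uc * SCALE_UNIT_MM:.4f} mm")
```

Listing the terms individually makes it easy to see that the last term dominates the sum.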

Thus, if it is assumed that Δc = 0, the hardness of the sample block is

$h_{\text{Rockwell C}}$ = 64,0 Rockwell scale unit or 0,128 0 mm with a combined standard uncertainty of $u_c$ = 0,55 Rockwell scale unit or 0,001 1 mm.

The hardness index of the block is $h_{\text{Rockwell C}}/(0{,}002\ \text{mm}) = (0{,}128\,0\ \text{mm})/(0{,}002\ \text{mm})$, or

$H_{\text{Rockwell C}}$ = 64,0 HRC with a combined standard uncertainty of $u_c$ = 0,55 HRC.

In addition to the component of uncertainty due to the national standard machine and the definition of hardness, $u(\Delta_S)$ = 0,5 Rockwell scale unit, the significant components of uncertainty are those of the repeatability of the machine, $s_p(d_k)/\sqrt{5}$ = 0,20 Rockwell scale unit; and the variation of the hardness of the transfer‑standard block, which is $\sqrt{(xz)^2/24}$ = 0,11 Rockwell scale unit. The effective degrees of freedom of $u_c$ can be evaluated using the Welch‑Satterthwaite formula in the manner illustrated in H.1.6.
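As a rough illustration of that last step, the Welch‑Satterthwaite formula for uncorrelated components with unit sensitivity coefficients can be sketched as below. The function name is hypothetical, and the degrees of freedom in the example are placeholders chosen for illustration only, since this example does not state them. Only the three significant components named above are included.

```python
import math

def welch_satterthwaite(components):
    """Effective degrees of freedom nu_eff = u_c^4 / sum(u_i^4 / nu_i).

    components: iterable of (u_i, nu_i) pairs for uncorrelated components
    whose sensitivity coefficients are already folded into u_i.
    Pass math.inf as nu_i for a component treated as exactly known.
    """
    components = list(components)
    uc2 = sum(u ** 2 for u, _ in components)
    denom = sum(u ** 4 / nu for u, nu in components)
    if denom == 0.0:  # every component has infinite degrees of freedom
        return math.inf
    return uc2 ** 2 / denom

# Standard uncertainties from the text (Rockwell scale units).
# The degrees of freedom are HYPOTHETICAL placeholders, not Guide values.
example = [
    (0.50, math.inf),  # u(Delta_S), treated here as exactly known
    (0.20, 20),        # repeatability, placeholder nu
    (0.11, math.inf),  # block hardness variation, treated as exactly known
]

nu_eff = welch_satterthwaite(example)
print(f"nu_eff = {nu_eff:.0f}")
```

A component with infinite degrees of freedom adds nothing to the denominator, so the effective degrees of freedom are governed by the components evaluated from limited data.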