Calibration
<strong>Calibration</strong><br />
Learning Objectives<br />
After completing this module, the student will be able to<br />
• explain the purpose of calibration<br />
• find a calibration curve using the Excel function trendline<br />
• write a macro in Excel<br />
• explain the meaning of $R^2$<br />
• explain sources of error when estimating the independent<br />
variable value<br />
• find a confidence interval for the independent variable value<br />
Knowledge and Skills<br />
• trendline calculation<br />
• linear regression<br />
• coefficient of determination<br />
• calibration<br />
Prerequisites<br />
• linear equation<br />
• average and standard deviation<br />
• normal distribution<br />
Citation: Neuhauser, C. <strong>Calibration</strong><br />
Created: October 18, 2009 Revisions:<br />
Copyright: © 2009 Neuhauser. This is an open‐access article distributed under the terms of the Creative Commons Attribution<br />
Non‐Commercial Share Alike License, which permits unrestricted use, distribution, and reproduction in any medium, and allows<br />
others to translate, make remixes, and produce new stories based on this work, provided the original author and source are<br />
credited and the new work will carry the same license.<br />
Funding: This work was partially supported by a HHMI Professors grant from the Howard Hughes Medical Institute.<br />
Pre‐assessment<br />
Before completing the module, test whether you have mastered the prerequisites.<br />
Linear Equation<br />
1. Find the equation of a horizontal line that goes through the point (2,4).<br />
2. Find the equation of a vertical line that goes through the point (‐1,3).<br />
3. Determine the equation of the line passing through (‐2,1) and (3,‐1/2).<br />
4. Determine the equation of the line passing through (1,‐2) and (‐2,4).<br />
5. Determine the equation of the line with slope 3 and vertical intercept (0,2).<br />
6. Determine the equation of the line passing through (‐1,‐1) and parallel to the line passing through<br />
(0,1) and (3,0).<br />
7. Graph the line given by the equation y = 2x + 1.<br />
8. Graph the line given by the equation 3x − 4y + 1 = 0.<br />
Average and Standard Deviation<br />
9. Find the average and sample standard deviation of the following data set: 2,4,5,6,6,7,8<br />
10. Write down the equation for calculating the average and the sample standard deviation of a data set<br />
of size n: $x_1, x_2, \ldots, x_n$.<br />
Normal Distribution<br />
11. Suppose X is normally distributed with mean 2 and standard deviation 1. Find (a) the 75th percentile,<br />
(b) the 95th percentile, and (c) the 99th percentile.<br />
12. Suppose X is normally distributed with mean 3 and variance 4. Find the probability that X is between<br />
1 and 4, that is, find P(1 ≤ X ≤ 4).<br />
13. Suppose X is normally distributed with mean ‐1 and standard deviation 4. Find an interval centered<br />
about the mean so that with probability 0.95 X is contained in that interval.<br />
14. Suppose that the number of seeds a plant produces is normally distributed with mean 142 and<br />
standard deviation 31. Find the probability that a randomly sampled plant will produce more than<br />
200 seeds.<br />
<strong>Calibration</strong><br />
According to the NIST handbook<br />
(http://www.itl.nist.gov/div898/handbook/pmd/section1/pmd133.htm), “[t]he goal of calibration is to<br />
quantitatively convert measurements made on one of two measurement scales to the other<br />
measurement scale.” The relationship between two measurements is used to convert one measurement<br />
into the other measurement. You saw one such example in your chemistry lab where you measured<br />
absorbance to find the concentration of an unknown sample. In this case, the relationship between<br />
absorbance and concentration was linear. You derived the relationship by measuring absorbance of<br />
standard samples of known concentration. The resulting line is called a calibration curve. The basis for the<br />
calibration curve is Beer’s Law, which states that there is a direct linear relationship between<br />
absorbance (A) and concentration (c). When we graph absorbance as a function of concentration, a<br />
straight line with positive slope provides a good fit. To illustrate this, the following table lists<br />
absorbance measurements of standard samples:<br />
Concentration [μmol L⁻¹]    Absorbance<br />
0     0<br />
20    0.2356<br />
40    0.4725<br />
60    0.7127<br />
80    0.9507<br />
If we graph the data points and fit a straight line through the points (Figure 1), we find that the equation<br />
of the straight line is $A = 0.0119c - 0.0014$.<br />
Figure 1: Straight line fit<br />
This curve is called a standard curve and is used to infer the unknown concentration of a solution. For<br />
instance, if we find that the absorbance A of an unknown solution is 0.6386, we find for the<br />
concentration c<br />
$$c = \frac{0.6386 - (-0.0014)}{0.0119} \approx 53.8$$<br />
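The fit and the back-calculation above can also be reproduced outside Excel. The following is a minimal sketch in Python, assuming NumPy is available; the function name `concentration_from_absorbance` is hypothetical and chosen only for illustration. It fits a least-squares line to the standard samples from the table and then inverts the line for the unknown sample.

```python
import numpy as np

# Standard samples from the table above
concentration = np.array([0.0, 20.0, 40.0, 60.0, 80.0])
absorbance = np.array([0.0, 0.2356, 0.4725, 0.7127, 0.9507])

# Least-squares straight line; for degree 1, polyfit returns [slope, intercept]
slope, intercept = np.polyfit(concentration, absorbance, 1)


def concentration_from_absorbance(a_measured):
    """Invert A = slope * c + intercept to estimate the concentration c."""
    return (a_measured - intercept) / slope


c_unknown = concentration_from_absorbance(0.6386)
print(f"A = {slope:.4f} c + ({intercept:.4f})")
print(f"estimated concentration: {c_unknown:.1f}")
```

Because the code keeps the unrounded slope and intercept, the estimate may differ slightly in later decimal places from a hand calculation that uses the rounded coefficients.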
The data in our example fits Beer’s Law extremely well. The data was generated using a Virtual Lab on<br />
Spectrophotometry (http://www.chm.davidson.edu/vce/Spectrophotometry/UnknownSolution.html).<br />
When data are obtained in actual lab experiments, measurement errors need to be taken into account.<br />
A Model for Linear <strong>Calibration</strong><br />
We assume in the following that we measure a signal y that depends linearly on a quantity x. We call the<br />
quantity x the independent variable and the quantity y the dependent variable. We assume that we<br />
measure x without error and that the quantity y is measured with an error ε that is normally distributed<br />
with mean 0 and standard deviation σ. The relationship between the two quantities is then<br />
$$y = a + bx + \varepsilon$$<br />
To get a sense for the measurement uncertainty when inferring the quantity x from the measurement y,<br />
we begin with simulating an experiment in which we have a set of n standard samples and for each<br />
sample we measure the signal m times.<br />
In‐class Activity 1<br />
In the spreadsheet <strong>Calibration</strong>Workbook under the tab “Simulation,” you will find the simulation of<br />
standard samples with values x = 10,20,40,60,80 and 90 and where the intercept a = 0 and the slope<br />
b = 1. Each signal is measured 3 times. The simulated data are in the gray‐colored box. The input<br />
parameters for the slope, the intercept, and the standard deviation s.d. for the error are in the yellow‐colored<br />
box. The trendline is calculated using the Excel function LINEST. (This function is difficult to use<br />
and you will not need to learn how at this point.)<br />
To investigate how the estimated value of the independent variable x depends on the error ε, we<br />
proceed as follows. We assume that the (unknown) value of the independent variable x is equal to 50<br />
(Cell F11). Using the equation $y = a + bx + \varepsilon$ with a = 0 and b = 1 and an error s.d. of 1, we can calculate the<br />
measured value of the quantity y (Cell F12). We can then use the estimated trendline to find the<br />
estimate for x (Cell F13) . The graph displays the simulated data from the calibration experiment, the<br />
trendline, and the data point corresponding to the unknown sample.<br />
When you press F9, you will see that Excel runs another simulation. By repeatedly pressing F9, you can<br />
get a sense for the variability of the estimated value of x in our simulation experiment. It is tedious to<br />
record manually the values of repeated simulations. Excel has a feature, called Macro, that records<br />
repeated key strokes. Let’s write a macro to record the outcome of repeated simulations for the<br />
estimate of x.<br />
(a) To write a macro to simulate values of x, proceed as follows:<br />
1. Open the Developer tab and click on Record Macro in the Code group.<br />
2. Give the macro a name and select a key, for instance, Ctrl‐a works.<br />
3. Select the Home tab.<br />
4. Copy the value of x from Cell F13.<br />
5. Paste the value of x into Cell Q3 as Paste Value.<br />
6. Click on Insert in the Cells group and click on Shift cells down in the Insert window.<br />
7. Go to the Developer tab and click on Stop Recording in the Code group.<br />
If you press Ctrl‐a, the simulated values will be copied into your spreadsheet in Column Q. Repeat the<br />
simulation 100 times. (The numbers in Column P help you keep track of the simulations.) Sort the<br />
simulated values from Smallest to Largest. Find the middle 90%.<br />
(b) Repeat the simulation when x = 15. Are the inferred values of x more or less spread out compared<br />
to when x = 50?<br />
(c) Change the standard deviation to see how an increase/decrease in the measurement error affects<br />
the uncertainty in the calculation of x.<br />
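The repeated simulation in this activity can be sketched in code as well. The following Python example is an illustration under stated assumptions, not a copy of the workbook: it assumes NumPy, uses the same standard values, three replicates, intercept 0, and slope 1, and the names `simulate_estimates` and `middle_90` as well as the fixed `seed` are hypothetical. The middle 90% is approximated by the 5th and 95th percentiles rather than by sorting and discarding values by hand.

```python
import numpy as np


def simulate_estimates(x_true, sigma, n_sims=100, a=0.0, b=1.0, seed=0):
    """Repeat the calibration experiment and return the estimates of x."""
    rng = np.random.default_rng(seed)
    # Standard samples, each measured 3 times
    x_std = np.repeat([10.0, 20.0, 40.0, 60.0, 80.0, 90.0], 3)
    estimates = np.empty(n_sims)
    for i in range(n_sims):
        # Simulated calibration signals: y = a + b*x + error
        y_std = a + b * x_std + rng.normal(0.0, sigma, size=x_std.size)
        # Fitted trendline; polyfit returns [slope, intercept]
        b_hat, a_hat = np.polyfit(x_std, y_std, 1)
        # One noisy measurement of the unknown sample
        y_unknown = a + b * x_true + rng.normal(0.0, sigma)
        # Invert the fitted line to estimate x
        estimates[i] = (y_unknown - a_hat) / b_hat
    return estimates


def middle_90(values):
    """Return the 5th and 95th percentiles of the simulated estimates."""
    return np.percentile(values, [5, 95])


lo, hi = middle_90(simulate_estimates(x_true=50.0, sigma=1.0))
print(f"middle 90% of estimates for x = 50: {lo:.2f} to {hi:.2f}")
```

Calling `simulate_estimates` with `x_true=15.0` or with a different `sigma` corresponds to parts (b) and (c).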
Figure 2: Screenshot of the simulation. The input parameters are listed in the yellow box; the simulated data are<br />
listed in the gray box; the estimated values of the slope and vertical intercept are listed in the green box together<br />
with the calculation of the unknown quantity x based on the measurement of the unknown sample y. The graph<br />
displays the simulated data (blue symbols), the trendline (black line), and the unknown measurement (red data<br />
point).<br />
Linear Regression<br />
When two quantities are linearly related, such as absorbance and concentration, a straight line provides<br />
a good fit. In Excel, a straight line can be fitted using the Trendline option. The Trendline option is under<br />
the Layout tab in the Chart Tools. When clicking on the blue triangle under Trendline and choosing More<br />
Trendline Options, a window opens that offers additional options, such as Display Equation on chart<br />
and Display R‐squared value on chart. We already know the meaning of the Equation. We will now look<br />
at the meaning of R‐squared.<br />
Assume a linear model $y = a + bx + \varepsilon$ where the error has mean 0 and standard deviation σ. We<br />
obtained data points $(x_j, y_j)$, $j = 1, 2, \ldots, n$, and used the Trendline option to fit a straight line. This results<br />
in estimates for the slope and the intercept. We denote the estimated value of the intercept by $\hat{a}$ and<br />
the estimated value of the slope by $\hat{b}$.<br />
How does Excel estimate the slope and the intercept?<br />
The method that Excel uses to estimate the slope and the intercept is called the method of least squares.<br />
The method says: Find $\hat{a}$ and $\hat{b}$ so that the expression<br />
$$\sum_{j=1}^{n} \left[ y_j - (\hat{a} + \hat{b} x_j) \right]^2$$<br />
is as small as possible. We say that the sum of the squared deviations is minimized. Expressions for the<br />
estimated intercept and slope can be given. It is not important to memorize the expressions.<br />
The least squares line (or linear regression line) is given by<br />
$$y = \hat{a} + \hat{b} x$$<br />
with<br />
$$\hat{b} = \frac{\sum_{j=1}^{n} (x_j - \bar{x})(y_j - \bar{y})}{\sum_{j=1}^{n} (x_j - \bar{x})^2}, \qquad \hat{a} = \bar{y} - \hat{b}\,\bar{x}$$<br />
where $\bar{x}$ and $\bar{y}$ denote the averages of the $x_j$ and the $y_j$, respectively.<br />
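The expressions for the estimated slope and intercept translate directly into a few lines of code. The following Python sketch assumes NumPy; the function name `least_squares_line` is hypothetical. It evaluates the two formulas on the absorbance data from the beginning of the module and compares the result with NumPy's built-in fit as a cross-check.

```python
import numpy as np


def least_squares_line(x, y):
    """Return (a_hat, b_hat) for the least squares line y = a_hat + b_hat * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_bar, y_bar = x.mean(), y.mean()
    # Slope: sum of cross-deviations divided by sum of squared x-deviations
    b_hat = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # Intercept: the line passes through the point of averages
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat


x = [0, 20, 40, 60, 80]
y = [0, 0.2356, 0.4725, 0.7127, 0.9507]
a_hat, b_hat = least_squares_line(x, y)

# Cross-check against NumPy's fit, which returns [slope, intercept]
b_ref, a_ref = np.polyfit(x, y, 1)
print(f"formula: a = {a_hat:.4f}, b = {b_hat:.4f}")
print(f"polyfit: a = {a_ref:.4f}, b = {b_ref:.4f}")
```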
To measure how good the fit is we calculate a quantity called the coefficient of determination, which is<br />
abbreviated as $R^2$. For each data point $(x_j, y_j)$, we can define $\hat{y}_j = \hat{a} + \hat{b} x_j$. We introduce the deviation of<br />
the measured y‐values from their mean, $y_j - \bar{y}$, which we can write as<br />
$$y_j - \bar{y} = (y_j - \hat{y}_j) + (\hat{y}_j - \bar{y})$$<br />
A somewhat lengthy calculation shows that the total sum of squared deviations $\sum_{j=1}^{n} (y_j - \bar{y})^2$ can be<br />
written as a part that is explained by the linear model (Explained) and a part that reflects the stochastic<br />
errors (Unexplained)<br />
$$\underbrace{\sum_{j=1}^{n} (y_j - \bar{y})^2}_{\text{Total}} = \underbrace{\sum_{j=1}^{n} (\hat{y}_j - \bar{y})^2}_{\text{Explained}} + \underbrace{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}_{\text{Unexplained}}$$<br />
The ratio between the explained variation and the total variation is the coefficient of determination<br />
$$R^2 = \frac{\text{Explained}}{\text{Total}} = \frac{\sum_{j=1}^{n} (\hat{y}_j - \bar{y})^2}{\sum_{j=1}^{n} (y_j - \bar{y})^2}$$<br />
The coefficient of determination $R^2$ is the proportion of variation that is explained by the model.<br />
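The split of the total variation into an explained and an unexplained part can be checked numerically. The following Python sketch assumes NumPy; the function name `r_squared` is hypothetical. It computes the three sums of squares for the absorbance data from the beginning of the module and forms their ratio.

```python
import numpy as np


def r_squared(x, y):
    """Return (R^2, total, explained, unexplained) for a straight-line fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b_hat, a_hat = np.polyfit(x, y, 1)          # [slope, intercept]
    y_hat = a_hat + b_hat * x                    # fitted values
    y_bar = y.mean()
    total = np.sum((y - y_bar) ** 2)             # total variation
    explained = np.sum((y_hat - y_bar) ** 2)     # explained by the line
    unexplained = np.sum((y - y_hat) ** 2)       # left over in the residuals
    return explained / total, total, explained, unexplained


x = [0, 20, 40, 60, 80]
y = [0, 0.2356, 0.4725, 0.7127, 0.9507]
r2, total, explained, unexplained = r_squared(x, y)
print(f"total = {total:.6f}, explained + unexplained = {explained + unexplained:.6f}")
print(f"R^2 = {r2:.5f}")
```

For these data the residuals are very small, so the value of $R^2$ comes out very close to 1.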
In‐class Activity 2<br />
Return to the spreadsheet <strong>Calibration</strong>Workbook. Under the tab “Simulation,” you have already worked<br />
on the simulation of standard samples with values x = 10,20,40,60,80 and 90 and where the intercept<br />
a = 0 and the slope b = 1. Each signal is measured 3 times. The simulated data are in the gray‐colored<br />
box. The graph has a small textbox where the equation of the trendline and the coefficient of<br />
determination are listed. You will see that when you increase the standard deviation, the coefficient of<br />
determination decreases. Give a verbal explanation as to why you would expect this.<br />
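The trend described in this activity can also be observed in a small simulation. The following Python sketch is an illustration under assumptions: it uses NumPy, the same standard values with three replicates, intercept 0 and slope 1, and a hypothetical function name `mean_r_squared` with a fixed `seed`. It averages $R^2$ over many simulated calibration experiments for several values of the standard deviation.

```python
import numpy as np


def mean_r_squared(sigma, n_sims=200, a=0.0, b=1.0, seed=0):
    """Average R^2 over repeated simulated calibration experiments."""
    rng = np.random.default_rng(seed)
    # Standard samples, each measured 3 times
    x = np.repeat([10.0, 20.0, 40.0, 60.0, 80.0, 90.0], 3)
    values = np.empty(n_sims)
    for i in range(n_sims):
        y = a + b * x + rng.normal(0.0, sigma, size=x.size)
        b_hat, a_hat = np.polyfit(x, y, 1)       # [slope, intercept]
        y_hat = a_hat + b_hat * x
        explained = np.sum((y_hat - y.mean()) ** 2)
        total = np.sum((y - y.mean()) ** 2)
        values[i] = explained / total
    return values.mean()


for sigma in (1.0, 5.0, 20.0):
    print(f"s.d. = {sigma:5.1f}  average R^2 = {mean_r_squared(sigma):.3f}")
```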
The Chemistry <strong>Calibration</strong> Lab<br />
In your <strong>Calibration</strong> Lab, you were asked to prepare a calibration curve. The spreadsheet<br />
<strong>Calibration</strong>Lab.xlsx will help you do the analysis. Open the spreadsheet. The <strong>Calibration</strong> Lab Analysis<br />
sheet is set up so that you can enter your data into the yellow cells. To calculate the calibration curve,<br />
enter the data from the absorbance measurements of the standard samples into C4:C21 (Step 2). The<br />
spreadsheet will calculate the slope and intercept in the cells I19 and I20, respectively, (see blue cells<br />
and Step 3). In Step 4, the spreadsheet calculates the coefficient of determination. Compare the values<br />
in the cell to the textbox in the figure that has the same information.<br />
(a) To include the uncertainty of the calibration curve in your lab report, record the coefficient of<br />
determination together with the equation of the trendline. Explain in words the meaning of the<br />
coefficient of determination.<br />
(b) In the chemistry lab, you then determined the concentration of an unknown sample based on the<br />
calibration curve. Enter the three measurements into cells B25‐B27 (Step 5). The spreadsheet is set up<br />
so that it calculates the estimated concentration. Use paper and pencil to verify the result in Cell B31<br />
(estimated concentration) of the spreadsheet.<br />
(c) While the theory is beyond this course, the spreadsheet is set up to calculate a confidence interval<br />
for the estimated concentration $x^*$. In Cell K25, you can set the confidence level, for instance 95%. The<br />
lower and upper limits of the confidence interval are listed in Cells K27 and K28, respectively. Record the<br />
confidence interval. Cell K26 contains the value of half the length of the confidence interval, which<br />
we denote by $C_x$. We can thus report the result also as $x^* \pm C_x$.<br />
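For orientation, a widely used textbook approximation for the half-length of such an interval is sketched below. This is an assumption for illustration: the module does not state which formula the spreadsheet uses, and the symbols $m$ (number of replicate measurements of the unknown sample), $\bar{y}^*$ (their average), $s$ (residual standard deviation of the fit), and $t_{n-2}$ (the critical value of the t distribution with $n-2$ degrees of freedom at the chosen confidence level) are introduced here and do not appear in the original text.

```latex
% Estimated value from the fitted line
x^* = \frac{\bar{y}^* - \hat{a}}{\hat{b}}

% Residual standard deviation of the calibration fit (n calibration points)
s = \sqrt{\frac{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}{n - 2}}

% Approximate half-length of the confidence interval
C_x \approx t_{n-2}\, \frac{s}{|\hat{b}|}
      \sqrt{\frac{1}{m} + \frac{1}{n}
            + \frac{(\bar{y}^* - \bar{y})^2}{\hat{b}^2 \sum_{j=1}^{n} (x_j - \bar{x})^2}}
```

In this approximation the interval is narrowest when the measured signal lies near the average of the calibration signals and widens toward the ends of the calibration range.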
If you want to read more about Linear <strong>Calibration</strong>, consult the statistics and data analysis paper by<br />
Burke, S. Regression and <strong>Calibration</strong>. LC GC Europe Online Supplement.<br />
Homework<br />
1. Find a linear regression line through the given points and compute the coefficient of determination<br />
x ‐3.0 ‐2.0 ‐1.0 0.0 1.0 2.0<br />
y ‐6.3 ‐5.6 ‐3.3 0.1 1.7 2.1<br />
2. To determine whether the frequency of chirping crickets depends on temperature, the following<br />
data were obtained by Pierce, 1949 (The Songs of Insects. Cambridge, Mass. Harvard University<br />
Press):<br />
Temperature (F) 69 70 72 75 81 82 83 84 89 93<br />
Chirps/sec 15 15 16 16 17 17 16 18 20 29<br />
Fit a linear trendline and find the coefficient of determination.<br />
3. To determine the glucose in a wine sample an enzyme spectroscopy method is used. The calibration<br />
curve is obtained from the following data<br />
Added glucose, [glucose] (mM) 0.000 0.050 0.100 0.200 0.300 0.400<br />
Absorbance 0.231 0.279 0.314 0.423 0.540 0.665<br />
(a) Find the equation of the calibration curve and the coefficient of determination.<br />
(b) Suppose the absorbance of an unknown sample is measured as 0.356. Use the calibration curve<br />
to estimate the glucose level.<br />