12.07.2015 Views

Analytical Chemistry - DePauw University


Chapter 4 Evaluating Analytical Data

of the individual distributions. Figure 4.28b shows the result of entering the command

> densityplot(penny, xlab = "Mass of Pennies (g)", main = "Kernel Density Plot of Data in Table 4.13")

The circles at the bottom of the plot show the mass of each penny in the data set. This display provides a more convincing picture that the data in Table 4.13 are normally distributed, although we can see evidence of a small clustering of pennies with a mass of approximately 3.06 g.

We analyze samples to characterize the parent population. To reach a meaningful conclusion about a population, the samples must be representative of the population. One important requirement is that the samples must be random. A dot chart provides a simple visual display that allows us to look for non-random trends. Figure 4.28c shows the result of entering

> dotchart(penny, xlab = "Mass of Pennies (g)", ylab = "Penny Number", main = "Dotchart of Data in Table 4.13")

In this plot the masses of the 100 pennies are arranged along the y-axis in the order of sampling. If we see a pattern in the data along the y-axis, such as a trend toward smaller masses as we move from the first penny to the last penny, then we have clear evidence of non-random sampling. Because our data do not show a pattern, we have more confidence in the quality of our data.

The last plot we will consider is a box plot, which is a useful way to identify potential outliers without making any assumptions about the data's distribution. A box plot contains four pieces of information about a data set: the median, the middle 50% of the data, the smallest value and the largest value within a set distance of the middle 50% of the data, and possible outliers. Figure 4.28d shows the result of entering

> bwplot(penny, xlab = "Mass of Pennies (g)", main = "Boxplot of Data in Table 4.13")

The black dot (•) is the data set's median. The rectangular box shows the range of masses for the middle 50% of the pennies.
This also is known as the interquartile range, or IQR. The dashed lines, which are called "whiskers," extend to the smallest value and the largest value that are within ±1.5×IQR of the rectangular box. Potential outliers are shown as open circles (º). For normally distributed data the median will be near the center of the box and the whiskers will be equidistant from the box. As is often true in statistics, the converse is not true: finding that a box plot is perfectly symmetric does not prove that the data are normally distributed.

The box plot in Figure 4.28d is consistent with the histogram (Figure 4.28a) and the kernel density plot (Figure 4.28b). Together, the three plots provide evidence that the data in Table 4.13 are normally distributed. The potential outlier, with a mass of 3.198 g, is not sufficiently far away from the upper whisker to be of concern, particularly as the size of the data set

Note that the dispersion of points along the x-axis is not uniform, with more points occurring near the center of the x-axis than at either end. This pattern is as expected for a normal distribution.

To find the interquartile range you first find the median, which divides the data in half. The median of each half provides the limits for the box. The IQR is the median of the upper half of the data minus the median of the lower half of the data. For the data in Table 4.13 the median is 3.098. The median for the lower half of the data is 3.068 and the median for the upper half of the data is 3.115. The IQR is 3.115 – 3.068 = 0.047. You can use the command "summary(penny)" in R to obtain these values.

The lower "whisker" extends to the first data point with a mass larger than

3.068 – 1.5 × IQR = 3.068 – 1.5 × 0.047 = 2.9975

which for these data is 2.998 g. The upper "whisker" extends to the last data point with a mass smaller than

3.115 + 1.5 × IQR = 3.115 + 1.5 × 0.047 = 3.1855

which for these data is 3.181 g.
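The whisker arithmetic above can be reproduced in a few lines of base R. The first part uses only the quartile values quoted in the text. The second part is a minimal sketch of the median-of-halves procedure for an arbitrary numeric vector; the function name whisker_limits, the choice to leave the median out of both halves when the number of values is odd, and the short demo vector are all assumptions made for illustration. The demo values are not the data in Table 4.13.

```r
# Quartile values quoted in the text for the data in Table 4.13
q1 <- 3.068   # median of the lower half of the data
q3 <- 3.115   # median of the upper half of the data

iqr <- q3 - q1                   # interquartile range, 0.047
lower_limit <- q1 - 1.5 * iqr    # whisker cannot extend below this mass
upper_limit <- q3 + 1.5 * iqr    # whisker cannot extend above this mass

# Hypothetical helper (not part of R or lattice): applies the
# median-of-halves procedure to any numeric vector with at least 2 values.
# Assumption: for an odd number of values the median itself is left out
# of both halves.
whisker_limits <- function(x) {
  stopifnot(is.numeric(x), length(x) >= 2)
  x <- sort(x)
  n <- length(x)
  half <- floor(n / 2)
  q1 <- median(x[1:half])              # median of the lower half
  q3 <- median(x[(n - half + 1):n])    # median of the upper half
  iqr <- q3 - q1
  lo <- q1 - 1.5 * iqr
  hi <- q3 + 1.5 * iqr
  inside <- x[x >= lo & x <= hi]       # points the whiskers may reach
  list(q1 = q1, q3 = q3, iqr = iqr,
       lower_whisker = min(inside),    # smallest value inside the limits
       upper_whisker = max(inside),    # largest value inside the limits
       outliers = x[x < lo | x > hi])  # points drawn as open circles
}

# Made-up illustrative values, NOT the penny masses
demo <- whisker_limits(c(1, 2, 3, 4, 5, 6, 7, 20))
```

With the quoted quartiles, upper_limit evaluates to 3.1855, so a mass of 3.198 g lies beyond it, which is consistent with that penny being drawn as a potential outlier. If the masses in Table 4.13 are stored in a vector named penny, as the plotting commands assume, whisker_limits(penny) would apply the same procedure to them.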
