01.03.2013 Views

Applied Statistics Using SPSS, STATISTICA, MATLAB and R


1 Introduction

This can be done, for instance, with the help of a random number generator. In practice this “simple” task might not be so simple after all (as when we conduct statistical studies in a human population). The sampling topic is discussed in several books, e.g. (Blom G, 1989) and (Anderson TW, Finn JD, 1996). Examples of statistical malpractice, namely by poor sampling, can be found in (Jaffe AJ, Spirer HF, 1987). The sampling issue is part of the planning phase of the statistical investigation. The reader can find a good explanation of this topic in (Montgomery DC, 1984) and (Blom G, 1989).
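As a minimal sketch of drawing a sample with a random number generator, the Python snippet below selects units without replacement from a numbered list. The list of 1000 units, the sample size of 50 and the helper name `simple_random_sample` are illustrative assumptions, not taken from the text.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seeded generator so the draw can be reproduced
    return rng.sample(population, n)


# Hypothetical sampling frame: units numbered 1..1000.
population = list(range(1, 1001))
sample = simple_random_sample(population, n=50, seed=42)
print(sorted(sample)[:5])  # a few of the selected unit numbers
```

The hard part in practice is usually building the numbered list itself, which is where studies of human populations tend to run into trouble.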

In the case of temporal data a subtler point has to be addressed. Imagine that we are presented with a list (sequence) of voltage values generated by thermal noise in an electrical resistance. This sequence should be considered as an instance of a random process capable of producing an infinite number of such sequences. Statistics can then be computed either for the ensemble of instances or for the time sequence of the voltage values. For instance, one could compute a mean voltage value in two different ways: first, assuming a sample of voltage sequences randomly drawn from the ensemble is available, one could compute the mean voltage value at, say, t = 3 seconds, for all sequences; and, secondly, assuming one such sequence lasting 10 seconds is available, one could compute the mean voltage value for the duration of the sequence. In the first case, the sample mean is an estimate of an ensemble mean (at t = 3 s); in the second case, the sample mean is an estimate of a temporal mean. Fortunately, in a vast number of situations, corresponding to what are called ergodic random processes, one can derive ensemble statistics from temporal statistics, i.e., one can limit the statistical study to the study of only one time sequence. This applies to the first two examples of random processes previously mentioned (as a matter of fact, thermal noise and dice tossing are ergodic processes; Brownian motion is not).
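The two ways of computing a mean voltage can be illustrated with a small simulation. The sketch below models thermal noise as zero-mean Gaussian white noise, which is an idealising assumption. The sampling rate, number of sequences and unit standard deviation are likewise arbitrary choices made for illustration.

```python
import random
import statistics

SAMPLE_RATE = 100   # samples per second (assumed)
DURATION = 10       # length of each sequence in seconds
N_SEQUENCES = 500   # sequences drawn from the ensemble (assumed)


def noise_sequence(rng, n_samples, sigma=1.0):
    """One realisation of the process: independent zero-mean Gaussian values."""
    return [rng.gauss(0.0, sigma) for _ in range(n_samples)]


rng = random.Random(0)
n_samples = SAMPLE_RATE * DURATION
ensemble = [noise_sequence(rng, n_samples) for _ in range(N_SEQUENCES)]

# Ensemble mean: average over all sequences at the single instant t = 3 s.
index_3s = 3 * SAMPLE_RATE
ensemble_mean = statistics.fmean(seq[index_3s] for seq in ensemble)

# Temporal mean: average over the full 10 s of one single sequence.
temporal_mean = statistics.fmean(ensemble[0])

print(f"ensemble mean at t = 3 s: {ensemble_mean:.3f}")
print(f"temporal mean of one sequence: {temporal_mean:.3f}")
```

Under this model both estimates should lie close to the true mean of zero, which is what ergodicity leads one to expect. For a non-ergodic process such as Brownian motion the two averages would not, in general, estimate the same quantity.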

1.3 Random Variables

A random dataset presents the values of random variables. These establish a mapping between an event domain and some conveniently chosen value domain (often a subset of ℜ). A good understanding of what the random variables are and which mappings they represent is an essential preliminary condition for any statistical analysis. A rigorous definition of a random variable (sometimes abbreviated to r.v.) can be found in Appendix A.

Usually the value domain of a random variable has a direct correspondence to the outcomes of a random experiment, but this is not compulsory. Table 1.4 lists random variables corresponding to the examples of the previous section. Italicised capital letters are used to represent random variables, sometimes with an identifying subscript. The Table 1.4 mappings between the event and the value domain are:

X_F: {commerce, industry, services} → {1, 2, 3}.

X_E: {bad, mediocre, fair, good, excellent} → {1, 2, 3, 4, 5}.

X_R: [90 Ω, 110 Ω] → [90, 110].
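The three mappings listed above can be written out as a short Python sketch: the two categorical variables become lookup tables and the resistance variable becomes a function that drops the unit. The names `X_F`, `X_E` and `X_R` mirror the subscripted symbols, and the list of observed outcomes is hypothetical.

```python
# Categorical outcomes mapped to integer codes.
X_F = {"commerce": 1, "industry": 2, "services": 3}
X_E = {"bad": 1, "mediocre": 2, "fair": 3, "good": 4, "excellent": 5}


def X_R(resistance_ohm: float) -> float:
    """Map a resistance in [90 Ω, 110 Ω] to the bare number in [90, 110]."""
    if not 90.0 <= resistance_ohm <= 110.0:
        raise ValueError("outcome outside the event domain [90 Ω, 110 Ω]")
    return float(resistance_ohm)


# Hypothetical observed outcomes turned into values of the random variable.
outcomes = ["industry", "services", "commerce", "services"]
values = [X_F[o] for o in outcomes]
print(values)  # [2, 3, 1, 3]
```

The integer codes of `X_F` are only labels, whereas those of `X_E` also carry an order, so knowing which mapping a variable represents determines which statistics are meaningful for it.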
