EurOCEAN 2000 - Vlaams Instituut voor de Zee
EurOCEAN 2000 - Vlaams Instituut voor de Zee
EurOCEAN 2000 - Vlaams Instituut voor de Zee
You also want an ePaper? Increase the reach of your titles
YUMPU automatically turns print PDFs into web optimized ePapers that Google loves.
multi-parameter data requires multi-variate data analysis techniques. Whereas multivariate<br />
statistical approaches can be very successful if the appropriate technique can be found (Carr et<br />
al. 1996), this is often not simple, and invalid assumptions about distributions can cause major<br />
problems. Artificial neural networks (ANNs) on the other hand, do not require a-priori<br />
knowledge of un<strong>de</strong>rlying distributions. Once trained they can make i<strong>de</strong>ntifications in near realtime<br />
and have been shown to have consi<strong>de</strong>rable potential for i<strong>de</strong>ntifying phytoplankton from<br />
AFC data (Boddy et al., 1994; Wilkins et al. 1996).<br />
Species i<strong>de</strong>ntification is not always possible based on light scatter and fluorescence<br />
characteristics, due either to similarities in the optical characteristics amongst species or to<br />
certain species having a wi<strong>de</strong> range of optical characteristics, such as with clumped cells or<br />
chains. One possible solution to these problems is to <strong>de</strong>velop species-specific oligonucleoti<strong>de</strong><br />
probes that hybridise only to their target species regardless of their morphology or life cycle<br />
stage. Flow cytometry or epifluorescence microscopy can <strong>de</strong>tect a fluorescent label attached to<br />
these probes and the probe-conferred fluorescence i<strong>de</strong>ntifies the target, thus verifying species<br />
i<strong>de</strong>ntification. Fluorescent rRNA probes can discriminate amongst components of the plankton<br />
which do not contain autofluorescent pigments including bacteria (Amann et al., 1995) and<br />
protozoa (Rice et al. 1997). Developing probes for species or wi<strong>de</strong>r taxonomic groups is of<br />
fundamental importance in providing comprehensive <strong>de</strong>scriptions of microbial communities<br />
and for assessing biodiversity at various taxonomic levels.<br />
The measurements of light scatter and fluorescence taken by the flow cytometer characterise<br />
the cells and can therefore be used to calculate optical and chemical properties. The time of<br />
flight of a particle/cell through the measurement beam is related to a length scale of the<br />
particle, and the amounts of forward and si<strong>de</strong> scattering of light by individual cells is related to<br />
shape of the cell in suspension. The red chlorophyll fluorescence is related to cell chlorophyll<br />
content (Jonker et al., 1995).<br />
ARTIFICIAL NEURAL NETWORKS (ANNS)<br />
I<strong>de</strong>ntification of microbial populations from flow cytometric observables is done using Radial<br />
Basis Function (RBF) ANNs. RBF ANNs can be trained relatively rapidly, and can<br />
discriminate unknown taxa from those taxa upon which a network has been trained. Detailed<br />
<strong>de</strong>scription of RBF ANNs is provi<strong>de</strong>d by Haykin (1994). The i<strong>de</strong>ntification success of RFB<br />
ANNs is at least as good as other ANN paradigms and non-neural classifiers (Wilkins et al.<br />
1996). Briefly, RBF networks have three layers of no<strong>de</strong>s. Flow cytometric data are input<br />
initially to the first layer, which serves merely to distribute these data to the hid<strong>de</strong>n layer no<strong>de</strong>s<br />
(HLNs). The HLNs each represent a non-linear Gaussian basis function or kernel, the output<br />
value of which <strong>de</strong>pends on the distance between the input data pattern and its centre. Several<br />
HLNs represent the data distributions for each species, and it is essential that there are<br />
sufficient HLNs to a<strong>de</strong>quately fill the data input space. The outputs from the HLNs are passed<br />
to the output layer, which contains one no<strong>de</strong> for each phytoplankton species. The output no<strong>de</strong><br />
with the highest value indicates the predicted i<strong>de</strong>ntity of the input data.<br />
ANNs learn from examples. To train an ANN, many patterns of flow cytometric characteristics<br />
are drawn at random from the data files for each of the target species and presented to the<br />
network together with the i<strong>de</strong>ntity. The success of network training <strong>de</strong>pends on the number of<br />
HLNs and a variety of other network parameters. These must be optimised by experimentally<br />
altering each of these factors in turn. The duration of training can be anything from a few<br />
minutes to half an hour or more, <strong>de</strong>pen<strong>de</strong>nt on computer hardware, the number of species and<br />
511