
Predictions from data

$$\text{difference} = \sum_{k} \bigl(F''(k) - F'(k)\bigr)^{2}$$

The basic cycle may be described as follows:

1. Give initial values to the parameters ε and δ for the community, as well as the community richness, R.
2. Use the Pielou transform to produce the expected sample of this community.
3. Compare the resulting theoretical sample with the one in hand via the least squares measure.
4. If the match is worse than before, reset the most recently changed value for ε, δ, or R in the opposite direction. If the match is better, continue as before.

Presently, the entire process is embedded in one of two computer programs, depending on the method, and the complete richness estimation process may take anywhere from 2 to 200 cycles to complete. At present a human must execute the search algorithm, a process that can itself be automated, cutting the estimation time from an hour or two down to a millisecond. The algorithm itself systematically cycles through ε, δ, and R, changing each until no further improvement is seen, then switching to the next parameter. At no point do the corresponding parameter values for the sample at hand play any role. The sample richness R′ plays an implicit role, however, through the values of the sample function F′ at each abundance category.

5.3.1 The two-step method with an example

The first procedure described here is called the two-step method. It proceeds in two main steps:

Step 1. Find a best fit for the sample histogram with the logistic-J distribution. The program called BestFit does this, taking the sample histogram as input, then comparing these data with the numbers generated from a theoretical (logistic-J) sample distribution with (sample) parameter values input by the user of the program. The values of ε′ and δ′ thus arrived at can be varied systematically over the parameter space to discover a global minimum in solution space. In most cases the method of steepest descent finds the minimum without having to search the entire space.
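The basic cycle described above (steps 1 to 4) can be sketched as a search that changes one parameter at a time and keeps a change only when the least squares score improves. This is a minimal illustration, not the code of either program: the function names, the fixed step sizes, the evaluation budget, and above all `toy_transform` (an arbitrary placeholder standing in for the Pielou transform) are assumptions made for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, float]


def least_squares(theory: Sequence[float], sample: Sequence[float]) -> float:
    """Sum over abundance categories k of (F''(k) - F'(k))**2."""
    return sum((t - s) ** 2 for t, s in zip(theory, sample))


def cyclic_search(
    expected_sample: Callable[[Params], Sequence[float]],
    sample: Sequence[float],
    start: Params,
    steps: Params,
    max_evals: int = 10_000,
) -> Tuple[Params, float]:
    """Vary one parameter at a time until no single step improves the score."""
    params = dict(start)
    best = least_squares(expected_sample(params), sample)
    evals = 0
    improved = True
    while improved and evals < max_evals:
        improved = False
        for name in list(params):
            for direction in (1.0, -1.0):
                while evals < max_evals:
                    trial = dict(params)
                    trial[name] = params[name] + direction * steps[name]
                    score = least_squares(expected_sample(trial), sample)
                    evals += 1
                    if score >= best:
                        break  # no better: stop, then try the opposite direction
                    params, best, improved = trial, score, True
    return params, best


def toy_transform(p: Params) -> List[float]:
    # Placeholder only (assumption): NOT the Pielou transform, just a smooth
    # function of eps, delta and R so that the search has something to fit.
    return [p["R"] * (1.0 / (k + p["eps"]) - p["delta"]) for k in range(1, 11)]


if __name__ == "__main__":
    sample_in_hand = toy_transform({"eps": 0.5, "delta": 0.02, "R": 40.0})
    start = {"eps": 1.0, "delta": 0.0, "R": 30.0}
    steps = {"eps": 0.05, "delta": 0.005, "R": 1.0}
    fitted, score = cyclic_search(toy_transform, sample_in_hand, start, steps)
    print(fitted, score)
```

With fixed step sizes the search stops at the best point reachable on the step grid, which need not be the exact minimum; shrinking the steps after each pass is one possible refinement.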
The measure of fit is the chi-square score divided by the number of degrees of freedom, as determined by the program. This method of scoring helps to minimize jumps in score values that would otherwise result when the program changes the number of degrees of freedom.

Step 2. One then inputs the best-fit logistic-J parameter values into the program CommRich, along with the sample intensity estimate, r, made by the biologist. The user then conducts a directed search through solution space by systematically varying the community parameter values ε and δ, as well as the community richness R, as described above; for each set of values thus arrived at, the program computes values for the expected sample and compares the theoretical sample with the best-fit curve from Step 1, using the least squares formula as a measure of similarity. The underlying algorithm uses the smallest least squares score found so far as the basis for further improvements in the score. Any change in a parameter value that leads to an improved score is adopted as the starting point for the next step. The change is not selected arbitrarily, but on the basis of producing the greatest improvement of the score, as it steadily descends toward zero. At the end of the convergence process one reads off not only statistically accurate estimates for ε and δ in the community, but also its richness, R, as a byproduct of the process.

Time to convergence during either fitting process depends strongly on the starting parameter values. But a form of binary search may be employed that speeds the process up, completing in a time that is proportional to the logarithm of the size of the parameter space being searched.

An example of the method in action is provided by data sent to me by M. G. M. Jansen, a Dutch biologist who has been conducting an extensive sampling program for Lepidoptera inhabiting coastal salt marshes in the Netherlands. Table 5.3 displays the data from one of Jansen’s samples. For each abundance, the column “no. spp.” gives the observed number of species and the column “pred.” gives the number of species predicted for that abundance. The table shows observed abundances for some 45 species, the remainder having abundances 31, 32, 41, 67, 103, 103, 1121, and 2073.

| abund. | no. spp. | pred. | abund. | no. spp. | pred. | abund. | no. spp. | pred. |
|---|---|---|---|---|---|---|---|---|
| 1 | 15 | 15.24 | 11 | 1 | 0.87 | 21 | 1 | 0.42 |
| 2 | 7 | 5.75 | 12 | 0 | 0.79 | 22 | 0 | 0.40 |
| 3 | 3 | 3.60 | 13 | 1 | 0.73 | 23 | 0 | 0.38 |
| 4 | 3 | 2.62 | 14 | 0 | 0.67 | 24 | 0 | 0.36 |
| 5 | 2 | 2.05 | 15 | 0 | 0.62 | 25 | 1 | 0.34 |
| 6 | 1 | 1.68 | 16 | 2 | 0.58 | 26 | 0 | 0.33 |
| 7 | 2 | 1.43 | 17 | 0 | 0.54 | 27 | 0 | 0.31 |
| 8 | 2 | 1.23 | 18 | 0 | 0.50 | 28 | 0 | 0.30 |
| 9 | 1 | 1.09 | 19 | 0 | 0.47 | 29 | 0 | 0.29 |
| 10 | 1 | 0.97 | 20 | 1 | 0.45 | 30 | 1 | 0.28 |

Table 5.3. Sample abundances vs predicted ones
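Observed and predicted counts such as those in Table 5.3 are the kind of input that a chi-square score per degree of freedom compares. The sketch below shows one way such a score might be computed; it is not the BestFit program. The pooling rule (merge adjacent abundance categories until the expected count reaches `min_expected`), the threshold of 5, and the degrees-of-freedom formula (pooled classes minus 1 minus `n_fitted` fitted parameters) are all assumptions made for the example, and only the 30 tabulated categories are used.

```python
from typing import List, Sequence, Tuple


def pool_categories(
    observed: Sequence[float],
    expected: Sequence[float],
    min_expected: float = 5.0,
) -> Tuple[List[float], List[float]]:
    """Merge adjacent categories until each pooled class expects >= min_expected."""
    pooled_obs: List[float] = []
    pooled_exp: List[float] = []
    acc_o = acc_e = 0.0
    for o, e in zip(observed, expected):
        acc_o += o
        acc_e += e
        if acc_e >= min_expected:
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)
            acc_o = acc_e = 0.0
    if acc_e > 0.0:
        if pooled_exp:
            # Fold the leftover tail into the last complete class.
            pooled_obs[-1] += acc_o
            pooled_exp[-1] += acc_e
        else:
            pooled_obs.append(acc_o)
            pooled_exp.append(acc_e)
    return pooled_obs, pooled_exp


def chi_square_per_df(
    observed: Sequence[float],
    expected: Sequence[float],
    n_fitted: int = 2,  # assumption: two fitted distribution parameters
    min_expected: float = 5.0,
) -> float:
    """Chi-square score divided by the (assumed) number of degrees of freedom."""
    obs, exp = pool_categories(observed, expected, min_expected)
    df = len(obs) - 1 - n_fitted
    if df < 1:
        raise ValueError("too few pooled classes for this many fitted parameters")
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
    return chi2 / df


# Observed and predicted species counts for abundances 1..30 (Table 5.3).
observed = [15, 7, 3, 3, 2, 1, 2, 2, 1, 1,
            1, 0, 1, 0, 0, 2, 0, 0, 0, 1,
            1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
predicted = [15.24, 5.75, 3.60, 2.62, 2.05, 1.68, 1.43, 1.23, 1.09, 0.97,
             0.87, 0.79, 0.73, 0.67, 0.62, 0.58, 0.54, 0.50, 0.47, 0.45,
             0.42, 0.40, 0.38, 0.36, 0.34, 0.33, 0.31, 0.30, 0.29, 0.28]

if __name__ == "__main__":
    print(chi_square_per_df(observed, predicted))
```

Dividing by the degrees of freedom keeps scores comparable when a change in the predicted counts alters how many pooled classes the rule produces.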
