12.07.2015 Views

Summary - People.stat.sfu.ca - Simon Fraser University

An assessment of the effect of hardness on the dose-response curves to sulphates through the use of model averaging.

Carl James Schwarz (P.Stat. (Can), PStat (USA))
Department of Statistics and Actuarial Science
Simon Fraser University
Burnaby, BC V5A 1S6
cschwarz@stat.sfu.ca
2011-07-20

Summary

Model averaging is a method to determine the relative support for various hypotheses (models) about the effect of hardness on the dose-response curve to changing levels of sulphate. For example, one hypothesis may be that the dose-response curve against sulphate concentration is invariant over different hardness levels; another may be that the dose-response curve against sulphate differs over the hardness levels. The Akaike Information Criterion (AIC) measures the tradeoff between model fit (how well do the dose-response curves match the observed data) and model complexity (how many parameters are needed to describe the curve). The AIC provides a way to determine the relative weight to be attached to various models given the observed data. Furthermore, the model weights can be used to provide estimates of benchmark doses (BMD, such as LCxx) that weight the estimates provided by the different models according to their support from the observed data.

The model averaging paradigm was applied to mortality and growth responses over a number of species collected in dose-response experiments at various levels of hardness. For the majority of responses, the majority of model weight was given to models where the dose-response curve varies over the hardness levels. Only in two cases (Lemna frond growth and final weight) was there substantial support for models where the dose-response curve was invariant across hardness levels.
In these cases, natural variation in the data and a limited response to sulphate made it difficult to determine the effect of hardness on the dose-response curve. In one case (Rainbow Trout mortality) support was almost evenly split between a model where the dose-response curve was invariant to hardness and a model where the dose-response curve was “protective” as hardness increased. In this case, extra-binomial variation was detected, which made it difficult to distinguish the effect of hardness on the dose-response curve.

Finally, model averaged estimates of the LC/IC10, LC/IC25, and LC/IC50 were obtained. These estimates are formed as a weighted average of the estimates from the individual models, and the model averaged standard error accounts for both the uncertainty in the estimates for each individual model and the variation in estimates among the models examined.

1. Introduction

Dose-response studies are often used to estimate the risks associated with exposure to environmental hazards. For example, the B.C. Ministry of Environment recently commissioned a large study to investigate the response (mortality, growth, biomass) to sulphates under different water hardness from a number of species (e.g. Rainbow Trout, Daphnia).

In each study (combination of species, response, and hardness level), the dose-response curve is estimated using a statistical model that relates the dose of sulphate to the response. For example, a probit regression model may be used when the response is mortality; a log-logistic model may be used if the response is biomass. The fitted curve is then used to estimate a benchmark dose (BMD) such as the LCxx (the dose at which xx% additional mortality occurs over baseline mortality) or the ICxx (the dose at which the response (e.g. weight) is reduced by xx% from baseline). The estimates of BMD are model based because a direct estimation of the BMD may require using several hundred or thousands of organisms at a wide range of doses.

Many software packages (e.g. CETIS) provide a large number of dose-response curves (models) that may be fit to the same set of data, and each curve leads to a different estimate of the BMD. The risk manager is then faced with the problem of deciding which model should be used, or equivalently, how to incorporate uncertainty in the BMD from different models that may fit the data equally well.

The analyst could choose the model that leads to the lowest BMD under the belief that this provides a conservative estimate of the BMD. Or, the analyst could choose the single best fitting model and use the associated BMD. Both of these strategies have the flaw that slight changes in the data or a new set of models could lead to a different “best” model being selected.
As well, the selection of the “best fitting” model depends on the criteria used to define the fit of the model to the data, and different criteria could lead to different choices of the best model.

This report discusses a third option, model averaging, where the BMDs are “averaged” based on the support the data provide for each model. Burnham and Anderson (2002) and Anderson (2008) provide a comprehensive reference on this approach. Bailer et al. (2005) and Bailer, Noble and Wheeler (2005) provide examples applied to risk assessment using dose-response models.


2. Studies used.

There are two groups of studies used in this report, generally called the Environment Canada (EC) studies and the Nautilus Environmental (NA) studies (named after the organizations that performed the studies).

In the Environment Canada studies, typically three water hardness values were tested on various freshwater species of aquatic organisms. The tests were done at a low water hardness (50 mg/L), a medium water hardness (100 mg/L), and a high water hardness (250 mg/L). Details of the experimental protocol are found in Buday and Schroeder (2011). We use the data from Buday and Schroeder (2011) to assess if there is evidence of an effect of water hardness on the dose-response relationship between sulphate and the various endpoints measured.

Raw data were provided as an Excel workbook, together with the raw output sheets (in pdf format) from the analyses done by Buday and Schroeder (2011) using the CETIS software, and additional pdf files from the Saskatchewan Research Council, which performed some of the work under sub-contract from Environment Canada.

In the Nautilus Environmental (NA) study, there were between one and four hardness levels (ranging from 15 to 320 mg/L) and a variety of freshwater species of aquatic organisms. We use the data from Elphick et al. (2010) to also assess if there is evidence of an effect of water hardness on the dose-response relationship between sulphate and the various endpoints measured. Only those organisms where at least two levels of water hardness were studied are used in this paper.

The raw data were extracted from copies of the raw output sheets (in pdf format) from the analyses done by Elphick et al. (2010) using the CETIS software.
Only the organisms where at least two water hardness levels were tested were used.

It is assumed that all the data presented are valid, and no examination of the raw data for outliers or other anomalous points has been done.

The sampling protocol for each aquatic organism is presented in detail in Buday and Schroeder (2011) and Elphick et al. (2010). A brief summary is presented in Table 1. All tests were performed at various levels of water hardness and usually five or six nominal concentrations of sulphate. In the Environment Canada studies, the actual sulphate concentration was measured at the start and the end of the experiment and the average of the two values was used as the actual sulphate concentration. In the Nautilus studies, the nominal sulphate levels as recorded on the CETIS sheets were used directly. In Elphick et al. (2010, Table 2), a comparison of the measured vs. nominal sulphate levels showed relatively good agreement. Most experiments also had a control (nominal zero concentration of sulphate).


3. Theory of Model averaging

"All models are wrong, but some are useful" (Box and Draper, 1987) is an apt description for statistical modeling of many biological systems. While a Probit (see below) model may be an adequate approximation to the underlying dose-response relationship, the Probit model is not the "truth" and is necessarily "wrong" as no biological system follows such a simple dose-response curve. Consequently, estimates of BMDs are model dependent and reported measures of precision (e.g. SE) are conditional on the choice of this (wrong) model!

There are often several models that give essentially the same fit to the data, but could give rise to different estimates of the BMD. Model averaging (e.g. Burnham and Anderson, 2002; Anderson, 2008) is a way to recognize that all models are only approximations to reality and that there may be different models giving different answers.

The basis behind model averaging is the use of AIC (Akaike Information Criterion) [1], which measures a combination of model fit and complexity. For example, if two models give the same fit to the data, but one model requires 2 parameters and the other model requires 50 parameters, then "Ockham's Razor" says that the model with fewer parameters is preferred. Similarly, as you increase the number of parameters, the fit of a model must improve (more parameters give a more flexible model), but is the improvement "worthwhile" in light of the increase in complexity of the model? AIC [2] values are a function of likelihood (model fit) and number of parameters (complexity):

$$\mathrm{AIC} = -2 \times \log(\text{likelihood}) + 2 \times (\#\ \text{parameters})$$

Models with (arithmetically) smaller AIC values are preferred [3]. The actual numerical value of AIC is not important (nor interpretable), but the differences in AIC among competing models are important and lead to a measure of relative support for these models.
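
The AIC formula above is simple enough to compute by hand, but a short sketch makes the fit-versus-complexity tradeoff concrete. The function names and the numbers below are illustrative assumptions, not taken from the report. The report states that a small-sample correction (AICc) was applied but does not give its form, so the `aicc` function uses the commonly quoted correction term and should be read as an assumption.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """AIC = -2 * log(likelihood) + 2 * (# parameters)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """AIC plus the usual small-sample correction 2k(k+1)/(n-k-1).

    Assumption: this is the standard textbook form; the report does not
    spell out which correction it used.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic(log_likelihood, n_params) + correction


# Hypothetical example: two models with the same fit (log-likelihood -100),
# one with 2 parameters and one with 50 parameters.
aic_small = aic(-100.0, 2)    # 204.0
aic_large = aic(-100.0, 50)   # 300.0

# The smaller AIC is preferred, so the 2-parameter model wins.
preferred = "2-parameter model" if aic_small < aic_large else "50-parameter model"
```

With equal fit, the penalty term alone separates the two models, which is the "Ockham's Razor" argument in numerical form.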
Once a set of models is fit to the data, the AIC is computed for each model, and the models are sorted by the AIC values from the model with the best support from the data (lowest AIC) to the model with the worst support from the data (highest AIC). The difference in AIC (termed delta AIC) is found as:

$$\Delta\mathrm{AIC}_{model} = \mathrm{AIC}_{model} - \mathrm{AIC}_{best\ model}$$

[By definition, $\Delta\mathrm{AIC} = 0$ for the model with the best support.] If two models have similar AIC values (usually within 2 or 3 units of each other), then these models are similar in their support by the data. Models that differ by more than 5 AIC units from the model best supported by the data are usually not thought of as being competitive. AIC not only rewards goodness of fit, but also includes a penalty that is an increasing function of the number of estimated parameters. This penalty discourages overfitting (increasing the number of free parameters in the model improves the goodness of the fit, regardless of the number of free parameters in the data-generating process).

[1] There are other model averaging criteria such as BIC (Bayesian Information Criterion). The same general principles apply to these other criteria.
[2] AIC values are usually corrected for small sample sizes. This is not detailed in this report but has been done in the examples that follow. The small sample corrected version of AIC is referred to as AICc.
[3] For example, a model with an AIC of -10 is preferred over a model with an AIC of -5.

Model weights are computed for each model based on a normalized function of the $\Delta\mathrm{AIC}$:

$$w_{model} = \frac{\exp(-\Delta\mathrm{AIC}_{model}/2)}{\sum_{models}\exp(-\Delta\mathrm{AIC}_{model}/2)}$$

These model weights range from 0 to 1 and sum to 1 over the models considered.

These model weights provide a way to combine BMDs over competing models. Each model provides an estimate of the BMD and a weighted average (based on the model weight) is the "best" estimate for this BMD:

$$\overline{\mathrm{BMD}} = \sum_{models} w_{model}\,\mathrm{BMD}_{model}$$

The SE from each model can also be combined and, if the estimates of the BMD vary considerably among models, an extra component of variation to account for this variation in estimates of BMD is also included:

$$se(\overline{\mathrm{BMD}}) = \sum_{models} w_{model}\sqrt{se_{model}^{2} + (\mathrm{BMD}_{model} - \overline{\mathrm{BMD}})^{2}}$$

Confidence intervals are formed in the usual way based on the estimate and its standard error. For example, an approximate 95% confidence interval is found as

$$\overline{\mathrm{BMD}} \pm 1.96\, se(\overline{\mathrm{BMD}})$$

The AIC paradigm is quite different from hypothesis testing and p-value approaches. The hypothesis testing and p-value approaches assume that one of the two models under consideration is correct. This is not biologically supported. For example, suppose you are interested in the effect of water hardness on the dose-response curve. The null hypothesis says that there is NO effect of water hardness. This hypothesis is clearly biologically wrong: there must be an effect of water hardness. It may be small and not detectable from the data, but the hypothesis of no effect is "wrong" on biological grounds. The AIC paradigm recognizes that all models are wrong, and provides a way to "quantify" if the fit is adequate compared to a more complex model.

Models do not have to be "nested" or of the same type to use the AIC paradigm.
For example, the model set for a mortality dose-response curve could include a Probit model, a logistic model, a Gompertz model etc. However, all models must be likelihood based, so non-parametric models (such as the non-parametric Karber-Spearman method for finding the LC50) are not directly usable. Likelihood methods are a scientifically defensible way to fit statistical models that use all of the information in the data. Many existing methods are likelihood models in disguise (e.g. least squares for linear regression with normally distributed data is a likelihood fit).

One key assumption of the AIC paradigm is that the models chosen in the model set are sensible approximations to reality. For example, if the data showed an increasing effect of dose on mortality and only models that allowed for a decreasing effect of dose on mortality are fit, AIC will still rank these silly models and give their relative ranking. Consequently, the fit of the model should also be ascertained by the analyst (this usually is done via residual plots and other methods).

The models in the model set should be specified in advance and the temptation to “data dredge” should be avoided. “Data dredging” would involve looking at the data and adding models that fit this particular dataset well, but have no a priori biological rationale. The danger is that the added models based on inspection of the data may be a good fit for this particular set of data, but minor changes in the data set would lead you to choose a different model to add. In reality, some model specification is driven by a preliminary look at the data (e.g. is hormesis present?), and as long as the class of models added is very general, this should be acceptable.

Because BMDs are typically computed as a function of the model parameters, there is some ambiguity in how the BMDs should be averaged. For example, should the BMD be averaged on the log-scale with the average then anti-logged, or should the averaging take place directly on the anti-log scale? There is no biologically definitive answer (for example, concentrations of sulphate are measured on the mg/L scale, but pH is measured on a logarithmic scale). The two approaches can lead to slightly different answers because the log() function is a non-linear transform, but the two methods should lead to similar results.
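
The averaging formulas of this section, and the question of which scale to average on, can be illustrated with a short sketch. All inputs below (three AIC values and three log-scale BMD estimates with standard errors) are hypothetical, the function names are illustrative, and the standard-error combination follows the Burnham and Anderson style formula with the square root inside the weighted sum, which is an assumption about the form the report intends.

```python
import math


def akaike_weights(aic_values):
    """Normalized exp(-deltaAIC/2) weights; they sum to 1."""
    best = min(aic_values)
    relative = [math.exp(-(a - best) / 2.0) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


def model_average(estimates, std_errors, weights):
    """Weighted average and a standard error that adds among-model variation."""
    avg = sum(w * e for w, e in zip(weights, estimates))
    se = sum(
        w * math.sqrt(s ** 2 + (e - avg) ** 2)
        for w, e, s in zip(weights, estimates, std_errors)
    )
    return avg, se


# Hypothetical model set: AIC values and log-scale BMD estimates.
aics = [345.0, 347.0, 355.0]
log_bmd = [4.6, 5.0, 5.6]
log_se = [0.40, 0.45, 0.60]

weights = akaike_weights(aics)
avg_log, se_log = model_average(log_bmd, log_se, weights)

# Average on the log scale, then anti-log the estimate and interval limits.
bmd_from_log = math.exp(avg_log)
ci_lower = math.exp(avg_log - 1.96 * se_log)
ci_upper = math.exp(avg_log + 1.96 * se_log)

# Alternative: average the anti-logged estimates directly.
bmd_from_raw = sum(w * math.exp(b) for w, b in zip(weights, log_bmd))
```

In this sketch the two averages differ (the direct average is the larger one because the exponential is convex), and the back-transformed interval is wider above the estimate than below it.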
As many models used in this project fit models where sulphates are measured on the log() scale, the model averaging will take place on the log-scale with a final anti-log() taken at the end of the process. This will lead to asymmetric confidence intervals on the anti-log scale. For example, from Table 4, the estimated BMD on the log-scale is 4.84 (SE 0.51). This gives a 95% confidence interval on the log-scale of (3.85, 5.84), which leads to a 95% confidence interval on the anti-log scale of $(e^{3.85} = 47,\ e^{5.84} = 342)$.

Wheeler and Bailer (2007) discuss an alternate way to use model averaging where the dose-response curves are model averaged and the model-averaged curve is used to find the BMD, rather than model averaging the BMDs directly. This approach has not been applied in this report.

In some cases, a model may fit the observed data reasonably well, but is unable to provide an estimate of the BMD. This usually happens for one of two reasons. First, the model should (in theory) provide an estimate of the BMD, but sparsity in the data leads to a model fit where the BMD no longer exists. For example, the mortality in the observed dose range in the study is relatively constant but, because of natural variability, the observed mortality declines with dose (e.g. 2/10 die at dose 100, 1/10 die at dose 200, and 0/2 die at dose 300). A fitted Probit model would lead to a model where the mortality declines as a function of dose and would never reach 50% mortality (the LC50), and so no estimate of the BMD is available.

Second, the model may fit the observed data well, but cannot be extrapolated outside the observed range of the data. This is most common with isotonic models where the isotonicity constraint can be applied within the observed range of the data, but it is unclear how to extrapolate outside the observed range of doses. For example, suppose that in the observed range of doses, the mortality rate ranged from 0% to 40% (at the highest observed dose). It is not clear how to estimate the LC50 as this endpoint is outside the range of the observed doses. All that is known is that the LC50 is higher than the highest observed dose, but no estimate is available.

It is valid to include models where no BMD can be determined in the model set and to obtain a model weight for such a model. This is a valid comparison of competing models in a general sense: which models are supported by the data? For specific BMDs, the model may or may not be able to provide an estimate (e.g. it may be able to provide an estimate of the LC10, but not of the LC50). In cases where the model cannot provide an estimate of a BMD, it is assigned a model weight of 0 when that BMD is averaged (even though its weighting in the model selection may be higher). This is not contradictory as the two analyses are answering two different questions: (1) which model is the best tradeoff in fit and complexity for the given data, and (2) how much credence should be given to each model's estimate of the BMD.

Of course, cases where all of the high ranking models fail to provide estimates of the BMD while the low ranking models are able to provide estimates of the BMD indicate more serious problems with the study. Most likely the BMD is well outside the range of the observed data and extrapolations may be pure fiction!

4. Models used.

There are two classes of responses in this study: quantal responses, where the mortality of organisms is measured as a function of dose, and continuous responses (e.g. biomass) measured as a function of dose.

4.1 Mortality Responses.

For the mortality responses, Probit models (Bliss, 1934) were used.
The basic Probit model assumes that the number of deaths follows a binomial distribution where the probability of mortality is “linked” to a linear function through the normal distribution. For example, consider the Probit model for a fixed hardness level. The statistical model is:

$$Dead_{ij} \sim \mathrm{Binomial}(BatchSize_{ij},\ p_i)$$
$$p_i = \Phi\left(\beta_0 + \beta_1 \log(D_i)\right)$$

where $Dead_{ij}$ is the number of dead organisms observed in the $j$th batch out of the initial $BatchSize_{ij}$ units on test at dose level (sulphate) $D_i$; $\beta_0, \beta_1$ are the intercept and slope in the Probit model; and $\Phi$ is the cumulative normal distribution. [The original papers on Probit analysis added 5 to the linear functions to avoid negative numbers in hand computations, but this is no longer required when using computers.] The parameters are estimated using maximum likelihood (e.g. via Proc Probit in SAS). Estimates of the LCxx values (i.e. at what concentration will a fraction xx of organisms die) can be found once estimates of the slope and intercept are found, by solving the equation

$$xx/100 = \Phi\left(\hat{\beta}_0 + \hat{\beta}_1 \log(D)\right)$$

for the dose $D$.

Maximum likelihood estimates are asymptotically the best possible estimates and extract the maximum amount of information from the data. Estimates of precision (i.e. standard errors) can be found automatically for the parameters of the likelihood equations and by the delta method (Taylor series expansion) for the LCxx values.

The formulation above assumes that the probability of death will decline to zero as the sulphate dose declines to 0. Probit models have been developed to deal with non-zero natural responses. In the original papers, the observed mortality at control doses was treated as a fixed known natural response and the Probit analysis applied only to mortalities above this level. This approach ignored the uncertainty in the estimate, and the resulting estimates and standard errors from the remainder of the fit did not account for this. A more modern approach is to let the natural response rate be another parameter to be estimated in the model along with the slope and intercept of the Probit function. Again consider the Probit model for a fixed hardness level. The statistical model is:

$$Dead_{ij} \sim \mathrm{Binomial}(BatchSize_{ij},\ p_i)$$
$$p_i = NR + (1 - NR)\,\Phi\left(\beta_0 + \beta_1 \log(D_i)\right)$$

where NR is the natural response (mortality) in the absence of sulphate (the control batches), i.e. the fraction of units expected to die in the absence of an effect of sulphate. The parameters are again estimated using maximum likelihood (e.g. Proc Probit in SAS).

Note that in models with a very small dose-response effect, there is some ambiguity in the parameterization.
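
The Probit mortality curve and the LCxx inversion described above can be sketched in a few lines. This is a minimal illustration, not the report's SAS fitting code: the parameter values are hypothetical, the function names are made up for the example, and log() is taken to be the natural logarithm, which is an assumption. It evaluates a curve for given parameters and does not estimate them.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, inv_cdf is its inverse


def probit_mortality(dose, beta0, beta1, natural_response=0.0):
    """p = NR + (1 - NR) * Phi(beta0 + beta1 * log(dose)), for dose > 0."""
    linear = beta0 + beta1 * math.log(dose)
    return natural_response + (1.0 - natural_response) * _STD_NORMAL.cdf(linear)


def lc(xx, beta0, beta1):
    """Dose D that solves xx/100 = Phi(beta0 + beta1 * log(D))."""
    if beta1 <= 0:
        raise ValueError("no LCxx exists when the slope is not positive")
    return math.exp((_STD_NORMAL.inv_cdf(xx / 100.0) - beta0) / beta1)


# Hypothetical fitted values for one hardness level.
beta0_hat, beta1_hat = -6.0, 1.2

lc50 = lc(50, beta0_hat, beta1_hat)   # Phi^{-1}(0.5) = 0, so log(LC50) = 6 / 1.2 = 5
lc10 = lc(10, beta0_hat, beta1_hat)

# Mortality at the LC50 without and with a 10% natural response.
p_no_nr = probit_mortality(lc50, beta0_hat, beta1_hat)
p_with_nr = probit_mortality(lc50, beta0_hat, beta1_hat, natural_response=0.10)
```

With a natural response of 10%, total mortality at the dose solving the LC50 equation is 0.10 + 0.90 × 0.5 = 0.55 rather than 0.50, because the Probit term applies only to organisms that survive the natural response.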
The ambiguity arises because it is then very hard to distinguish between a natural response and a model with a slope close to 0, as both will give similar fits to the data. In cases like this, it may be better to drop the natural response terms.

Because of the natural response, estimation of the LCxx values must be done with care. For example, the LC25 value refers to the dose that results in 25% mortality among the organisms that survive the natural response. Suppose that the estimated natural response is 13%. Consequently, only 87% of the organisms would survive in the absence of sulphates. The LC25 refers to the additional 25% of 87% = 22% mortality above the natural response, for a total mortality of 13% + 22% = 35%. The estimated LC25 value is now found by solving:

$$.35 = .13 + .87\,\Phi\left(\hat{\beta}_0 + \hat{\beta}_1 \log(D)\right)$$

which again leads to

$$.25 = \Phi\left(\hat{\beta}_0 + \hat{\beta}_1 \log(D)\right)$$

i.e. the LC25 does not correspond to the dose that leads to an overall mortality of .25.

In some cases, the LCxx values cannot be estimated. For example, if the Probit model has an estimated slope < 0, then the predicted mortality rate declines with dose. [A non-positive estimate of the slope typically occurs with sparse data where, just by chance, fewer mortalities occurred at higher doses than at lower doses.] Even if the Probit model does fit, the dose-response curve may be so shallow that the estimated LCxx value is well beyond the range of the observed doses in the study. For example, suppose that mortality rates range from 0 to 10% in the range of doses in the study. The estimated LC50 value will be far to the right of the observed doses. Extrapolation well beyond the observed range of doses may be inadvisable; consequently, any LCxx value that is more than 2x the maximum observed dose in the study is “deleted”.

A goodness-of-fit statistic of the Probit model (both with and without a natural response) to the data is found by comparing the observed and expected counts:

$$X^2 = \sum \frac{\left(Dead_{ij} - BatchSize_{ij}\,\hat{p}_{ij}\right)^2}{BatchSize_{ij}\,\hat{p}_{ij}} + \sum \frac{\left(Alive_{ij} - BatchSize_{ij}\,(1 - \hat{p}_{ij})\right)^2}{BatchSize_{ij}\,(1 - \hat{p}_{ij})}$$

where $\hat{p}_{ij}$ is the predicted probability of death for each batch. If the assumptions of the model are satisfied, this statistic should follow a $\chi^2_{df}$ distribution where the df is found appropriately. If the $X^2$ statistic is extreme, it indicates a lack-of-fit. There are two common reasons for lack-of-fit. First, the model itself can be wrong (e.g. the response is not linear on the Probit scale). Second, the structural model may be valid (i.e. the response is linear on the Probit scale), but the data are more variable than expected from a binomial response. The latter is termed overdispersion. For example, consider the number of organisms that die in batches of 30 organisms where the underlying mortality rate is 30%. Statistical theory indicates that under the binomial model, the average number that would die would be 9 = 30(.3), but the actual number that die would typically range from about 4 to 14 (roughly two standard deviations either side of the average).
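
Both the goodness-of-fit statistic and the binomial spread in the batch example above are easy to compute directly. The sketch below is illustrative only: the function names and the single-batch inputs are hypothetical, and reading the quoted 4 to 14 range as the mean plus or minus two standard deviations is an interpretation, not something the report states.

```python
import math


def pearson_chi_square(dead, batch_size, p_hat):
    """Sum of (observed - expected)^2 / expected over dead and alive counts."""
    total = 0.0
    for d, n, p in zip(dead, batch_size, p_hat):
        expected_dead = n * p
        expected_alive = n * (1.0 - p)
        total += (d - expected_dead) ** 2 / expected_dead
        total += ((n - d) - expected_alive) ** 2 / expected_alive
    return total


def binomial_mean_sd(n, p):
    """Mean and standard deviation of a Binomial(n, p) count."""
    return n * p, math.sqrt(n * p * (1.0 - p))


# Batch of 30 organisms with a 30% mortality rate, as in the text.
mean_dead, sd_dead = binomial_mean_sd(30, 0.3)
typical_low = mean_dead - 2.0 * sd_dead    # just under 4
typical_high = mean_dead + 2.0 * sd_dead   # just over 14

# Hypothetical single batch: 6 of 10 dead when the model predicts 30%.
x2_one_batch = pearson_chi_square([6], [10], [0.3])   # 9/3 + 9/7
```

Counts that regularly fall outside the typical range, such as 1 or 17 deaths in batches of 30, inflate the statistic beyond what the binomial model allows.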
If the observed number that died instead ranged from 1 to 17, this would indicate overdispersion, even though the average number that die would still be 9. Typical causes of overdispersion are non-independence in the fate of the organisms. For example, if all the organisms are placed in the same test tube, a local contaminant could raise or lower the mortality rate of this batch from the projected 30%.

The consequence of overdispersion is that estimates remain unbiased, but the reported standard errors (and p-values derived from them) are understated, i.e. the results appear to be more precise than they really are.

Corrections for overdispersion were incorporated directly in the model through the random effect Probit models (Gibbons et al., 1994; Gibbons and Hedeker, 1994). In the random effect model, latent (unobserved) random noise is added to the Probit function:

$$Dead_{ij} \sim \mathrm{Binomial}(BatchSize_{ij},\ p_i)$$
$$p_i = NR + (1 - NR)\,\Phi\left(\beta_0 + \beta_1 \log(D_i) + \epsilon_{ij}\right)$$
$$\epsilon_{ij} \sim N(0, \sigma^2)$$

for non-control doses of sulphate, and

$$Dead_{ij} \sim \mathrm{Binomial}(BatchSize_{ij},\ p_i)$$
$$p_i = \Phi\left(\Phi^{-1}(NR) + \epsilon_{ij}\right)$$
$$\epsilon_{ij} \sim N(0, \sigma^2)$$

for control doses of sulphate, where $\epsilon_{ij}$ is a latent random effect that comes from a normal distribution with mean 0 and variance $\sigma^2$, i.e. adding extra variation in the mortality rate at a specified dose. So even if the expected mortality rate at a particular dose is 30%, the random effect (applied at the batch level) could shift this higher or lower. This model can also be fit using maximum likelihood (e.g. Proc Nlmixed in SAS). Estimates from the fitted model automatically incorporate the effects of the excess random variation in their standard errors.

The primary goal of this paper is to investigate the effect of hardness levels on the dose-response curve. We accomplish this by fitting two (or more) models to the combined data from the three hardness levels. In the first model (the Separate response model), a separate Probit curve is fit to each hardness level. So, if the basic Probit model is used with 3 hardness levels, this model will require 6 parameters (an intercept and a slope for each hardness level). This can be done in a single model fit rather than (the equivalent) running three separate models (one for each hardness level). In the second model (the Common response model), the data are pooled over all hardness levels and a single Probit model is fit. This model has 2 parameters. In some cases, additional models were run where individual Probit curves were fit (one for each hardness level), but the curves were constrained to have a common LC10, LC25, or LC50 value (e.g. refer to Jeske et al., 2009). For these models, penalized maximum likelihood was used where the penalty is computed as the difference among the LCxx values from the individual curves. This penalty declines to zero as the common LCxx value is achieved across all the curves.

The Separate response model is very general. Each hardness level has its own dose-response curve and these curves do not have to have the same shape.
Consequently, it is possible that the dose-response curve for a lower hardness level gives rise to higher estimated mortalities than the dose-response curve for a higher hardness level. An intermediate model, between that of separate curves for each hardness level and a common curve for all hardness levels, is the Monotonic-Separate response curve where the Probit curves are “parallel” at different hardness levels and increasing hardness is always “protective”, i.e. a higher hardness value does not lead to an increase in mortality at any sulphate dose. More formally,

$$Dead_{ij} \sim \mathrm{Binomial}(BatchSize_{ij},\ p_i)$$
$$p_i = NR + (1 - NR)\,\Phi\left(\beta_0 + \beta_1 \log(D_i) + \delta_{hardness}\right)$$

where $\delta_{hardness}$ is the “shift” in the curve due to different hardness levels, and constraints are placed on these parameters to ensure that the dose-response curve never decreases as hardness increases. A schematic of the results from such a model is:


[Figure: schematic of Survival (vertical axis, 0 to 1) against log(Sulphate) (horizontal axis, 1 to 9) for three hardness levels.]

Notice that as hardness increases (the three lines from left to right), the mortality decreases at any sulphate level, i.e. hardness is “protective”. The shape of the three curves is identical; all that happens is that the S-shape is shifted to the right as hardness increases.

The suite of potential Probit models fit is described by a 3 part “code”. First is the modeling of the effects of hardness as either a separate model for each hardness (Separate), or a model with a separate curve for each hardness but a common LCxx (CommonLCxx), or a single common model across hardness levels (Common), or a model with shifted-to-the-right (“protective”) dose-response curves as hardness increases (SeparateMono). Next, does the model assume no natural response (NoNR), a common natural response over all hardness levels (CNR), or a separate natural response for each hardness level (SNR)? Finally, does the model include a random effect (RE) or exclude random effects (NoRE) to account for overdispersion? For example, a Probit model identified as CommonLC10, NoNR, NoRE corresponds to fitting the model with separate curves for each hardness level but a common LC10, no natural responses, and no random effects.

Other possible choices for quantal responses are logistic, Gompertz, log-logistic, etc., but these lead to very similar dose-response curves, especially for estimating BMDs in the .10 to .90 range (Ritz, 2010), and so were not fit.

Many of the analyses from EC and NA of the individual studies used isotonic regression (see next section) applied to the observed mortalities. This method would be applicable if there is evidence of a structural lack of fit in the Probit model (i.e. the response is not linear on the Probit scale), but no large lack of fit was detected in any of the studies. The isotonic model treats a natural response as simply another set of data values.
In these<strong>ca</strong>ses, the LCxx values from isotonic regression are not directly comparable to those fromthe maximum likelihood Probit approach with a natural response. In the isotonic method,


no natural response is assumed and so the LCxx value includes the natural response in total mortality.

The choice of model to be fit to a particular study depended on a preliminary inspection of the data (see Schwarz, 2011). For mortality studies that are very sparse (few animals on test), only simple models (i.e. without natural responses or random effects) can be fit, as more complex models will fail to fit because of a lack of clear effect.

For example, consider the plots of fitted models for the EC-RT mortality data found in Figures 1a-1f and summarized in Table 2-EC-RT. The most general model, the Probit, Separate, SNR, RE model (Figure 1a), has a separate dose-response curve for each hardness level along with a separate natural response for each hardness level. This model has 10 parameters (a slope, intercept, and natural response for each of 3 hardness levels plus 1 parameter for the variance of the random effects). The dose-response curve for hardness level 50 is to the left of the dose-response curve for hardness 100, which in turn is to the left of the dose-response curve for hardness 250 in the range of doses studied in this experiment. This ordering is NOT enforced by this model and occurred “naturally” as the data are fairly strong. However, the natural responses do not follow this same ordering: the natural response at hardness 250 is between those of hardness 50 and 100. This model is the most flexible and so has the best fit to the data (largest likelihood value of -165.6), with an AICc value of 356.3.

Figure 1b plots the dose-response curves for the Probit, Separate, CNR, RE model where there are three separate dose-response curves for the three hardness levels, but now all three dose-response curves have the same natural response. This model has 8 parameters (slope and intercept for each of the 3 dose-response curves plus one parameter for the common natural response plus one parameter for the variance of the random effects).
This model is less complex than the previous model (fewer parameters), but will fit the data less well (has a lower likelihood value of -165.8). However, the reduction in fit compared to the previous model is .03, which is a small reduction in fit for a reduction by 2 in the number of parameters, because the three separate natural responses from the model in Figure 1a seem to be too flexible as the three natural responses are not very different. Consequently, the AICc of 350.7 is smaller than the AICc of the previous model, indicating a model with more support from the data.

Figure 1c plots the dose-response curves for the Probit, SeparateMono, CNR, RE model where the three dose-response curves are “parallel” on the probit scale (which leads to S-shaped curves on the mortality scale that are shifted left or right). While the dose-response curves for this model look very similar to those in Figure 1b (each hardness has a separate dose-response curve), the fit is not as good (the likelihood for this model (-165.9) is slightly less than the likelihood for the previous model (-165.8)). However, this model has fewer parameters (6 in total, being the slope and intercept for the first curve plus the variance of the random effects plus the common natural response plus the two shifts for hardness levels 100 and 250). Consequently, the AICc is much improved (345.5 for this model vs. 350.7 for the previous model) as the loss in fit (difference in likelihood)


is inconsequential relative to the reduction in complexity. This model has better support from the data than the previous two models.

Figures 1d, 1e, and 1f display the fit of models where the LC50, LC25, and LC10 are forced to be the same across all three dose-response curves while the overall dose-response curve still varies across hardness levels. Because the natural response is allowed to vary among the hardness levels, the common LC50 does NOT occur at a total mortality of .50; rather, .50 of the remaining mortality over and above the natural response occurs at the same LCxx values. For example, at hardness 50, the natural response is about .18. Consequently, 50% of the remaining mortality (.50(.82)=.41) occurs at a total mortality of .41+.18=.59, which is at a sulphate level just over 1000. Similarly, for hardness 100, the natural response is about .10 and 50% of the remaining mortality (.50(.90)=.45) leads to a total mortality of approximately .10+.45=.55, which also occurs at a sulphate level just over 1000.

Counting parameters in these models is not straightforward because of the constraint of equal LCxx values. For each of these models, the first curve fit to the smallest hardness level has 3 parameters (slope, intercept, natural response). For each successive hardness concentration, there is one more parameter for the natural response and only 1 more parameter to fit a curve that goes through the same LCxx value as the first curve. Adding one more parameter for the variance of the random effects gives a total of 8 parameters. The log-likelihood is worse than that of the most general model (Figure 1a), but the reduction in fit is offset by the reduction in the number of parameters, and so the AICc gives more support to these models compared to the most general model.

Finally, Figure 1f fits the Probit, Common, CNR, RE model where a single dose-response curve is fit for all hardness levels.
This model has only 4 parameters and has the worst fit to the data (smallest likelihood of -168.1), but the reduction in fit is again offset by the large reduction in the number of parameters required for the fit. The AICc indicates that this model has the most support of the models considered in Table 2-EC-RT.

Figures 2a-2c illustrate what happens when the SeparateMono model is fit to data that are not monotonic as the hardness level increases. The fit shown in Figure 2a shows that the apparent mortality at hardness 15 is lower than the mortality at hardness 80. The Probit, Separate, NoNR, NoRE model (and all other models where separate curves are fit) does not enforce a “protective” effect of hardness. In Figure 2b, the SeparateMono model is fit, which enforces a “protective” effect of hardness. Consequently, a single curve is drawn. In fact, this model reduces to the Probit, Common, NoNR, NoRE model. [This will only happen in cases with two hardness levels.] Table 2-NA-TA-mortality shows that the likelihood values for the Common and SeparateMono models are the same (implying an identical fit), but the SeparateMono model has an extra parameter (the effect of hardness 80 relative to hardness 15, which happens to be estimated at 0) and so has less support from the data. Because the Separate model does not enforce the “protective” effect of hardness, it has more support from the data than either of the two other models. In fact, the model with a common LC50 point seems to have the highest support from the data,


but there is still substantial support for other models. The sparsity of the data makes it difficult to distinguish among the various models fit to the data.

4.2 Continuous responses:

There is no common model suitable for modeling weight, reproduction, frond number, or other non-binomial endpoints. The CETIS software has a wide suite of potential models (e.g. the Gompertz), but in the majority of the cases here, the CETIS software uses a linear interpolation method (ICPIN). This is also known as isotonic regression (Barlow et al., 1972). The basic premise is that the response variable should decline with increasing sulphate levels. However, because of sampling fluctuation, the observed curve may not show the monotonic decline with increasing sulphate levels.

Basically, isotonic regression works from left to right through the data. If the mean response at the next X value is higher than the current fitted Y value, then the previous data and the new Y are pooled, a new mean is computed, and the algorithm moves to the next X value. This is a “non-parametric” method, but can be shown to be the maximum likelihood approach under monotonicity of the sulphate effect. The R function isoreg() can be used to fit these models. The likelihood, assuming that the data values are normally distributed at a particular dose level, can be found from a transformation of the sum-of-squares of the residuals from the fit.

Estimates of the ICxx values are found by linear interpolation on the log(dose) scale. ICxx responses are measured from the mean response at the lowest observable dose rather than at dose 0. For example, if a study used doses 100, 200, 400, 800, 1600 for sulphate, the baseline response is estimated from the dose 100 mean.
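The pooling and interpolation steps described above can be sketched in a few lines. This is a minimal Python illustration only, not the CETIS (ICPIN) or R isoreg() implementation; the function names `pava_decreasing` and `icxx`, the equal default weights, and the example dose means are assumptions introduced here.

```python
import math

def pava_decreasing(means, weights=None):
    """Pool-adjacent-violators fit constrained to be non-increasing in dose.

    means   : observed mean response at each dose, ordered by increasing dose
    weights : number of observations behind each mean (assumed equal if omitted)
    """
    if weights is None:
        weights = [1.0] * len(means)
    blocks = []  # each block is [pooled mean, total weight, number of doses pooled]
    for m, w in zip(means, weights):
        blocks.append([float(m), float(w), 1])
        # Pool with the previous block while the newest block is higher than it.
        while len(blocks) > 1 and blocks[-1][0] > blocks[-2][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    fitted = []
    for m, _, count in blocks:
        fitted.extend([m] * count)
    return fitted

def icxx(doses, fitted, xx):
    """Dose giving an xx% reduction from the fitted response at the lowest dose.

    Interpolates linearly on the log(dose) scale. Returns None when the target
    response is never reached within the observed doses (including a flat fit).
    """
    target = fitted[0] * (1.0 - xx / 100.0)
    for i in range(len(doses) - 1):
        hi, lo = fitted[i], fitted[i + 1]
        if hi > lo and hi >= target >= lo:
            frac = (hi - target) / (hi - lo)
            log_d = math.log(doses[i]) + frac * (math.log(doses[i + 1]) - math.log(doses[i]))
            return math.exp(log_d)
    return None

# Hypothetical dose means (not data from this report).
doses = [100, 200, 400, 800, 1600]
observed = [100, 104, 90, 92, 60]
fit = pava_decreasing(observed)
print(fit)                   # pooled, non-increasing means
print(icxx(doses, fit, 25))  # falls between doses 800 and 1600
print(icxx(doses, fit, 50))  # None: a 50% reduction is not reached in the observed range
```

In this hypothetical example the baseline is the fitted mean at dose 100, matching the convention described above, and the IC50 is not reported because the fitted curve never drops to half of the baseline within the observed doses.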
Because different starting doses were used for different hardness levels, the baseline response may differ among these studies solely because of different initial doses and not because of hardness effects. [A similar problem occurs with functional curves fit as discussed later in this section.]

Standard errors (and confidence limits) for the ICxx values are found using a bootstrap method. Several hundred bootstrap samples were generated with replacement from the observed data. For each bootstrap sample, the isotonic regression model was fit and the estimate of the ICxx value determined. The 2.5th and 97.5th percentiles of the bootstrap estimates were used as the 95% confidence interval for the parameter. Note that it is impossible to estimate any ICxx value that exceeds the largest dose observed in the experiment because there is no information from the data on the shape of the curve after the largest observed dose. In these cases, no estimate is reported. Similarly, in some cases, the isotonic regression line is completely flat and no estimate of the ICxx values can be computed.

Isotonic regression models were fit where a single curve was common for all hardness levels (denoted as IR.Common) or a separate curve was fit for each hardness level (IR.Separate). It is not possible to fit an isotonic regression model with a separate curve for each hardness level but a common ICxx value. I am also unaware of any method that could be used to enforce (declining) monotonicity in the effects of increasing sulphate


The function fit is

$$Y = \frac{A(1 + E \cdot X)}{1 + \exp(-C(X - D))}$$

and the estimated values from the fit were A=104.42, C=-.004325, D=1279, and E=.0006324.

CETIS estimates the IC50 as 1509.9 mg/L. The value of Y at this point is:

$$Y = \frac{A(1 + E \cdot X)}{1 + \exp(-C(X - D))} = \frac{104.42(1 + .0006324 \cdot 1509.9)}{1 + \exp(-(-.004325)(1509.9 - 1279))} = 55.06$$

This is NOT 50% of 104.42 (the value of mean biomass when dose = 0). Rather, it is 50% of the mean biomass at dose = 93. At dose = 93, the response is

$$Y = \frac{A(1 + E \cdot X)}{1 + \exp(-C(X - D))} = \frac{104.42(1 + .0006324 \cdot 93)}{1 + \exp(-(-.004325)(93 - 1279))} = 110.12$$

For both the 3-parameter log-logistic and the 4-parameter logistic hormesis model, the baseline values for this report were taken as the expected response at dose=0. Consequently, estimates of BMDs may differ in the report from those reported by CETIS.
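The arithmetic above can be checked numerically. The sketch below evaluates the fitted hormesis function at the CETIS IC50 (1509.9) and at dose 93 using the parameter estimates quoted in the text; the function name `hormesis_logistic` is an assumption introduced here, and small differences from the reported 55.06 and 110.12 are to be expected because the published parameter estimates are rounded.

```python
import math

def hormesis_logistic(x, a, c, d, e):
    """Y = A(1 + E*X) / (1 + exp(-C(X - D))), the function given in the text."""
    return a * (1.0 + e * x) / (1.0 + math.exp(-c * (x - d)))

# Parameter estimates as quoted in the text (rounded).
A, C, D, E = 104.42, -0.004325, 1279.0, 0.0006324

y_ic50 = hormesis_logistic(1509.9, A, C, D, E)  # response at the CETIS IC50
y_low = hormesis_logistic(93.0, A, C, D, E)     # response at the lowest dose, 93

print(round(y_ic50, 2), round(y_low, 2))
# The CETIS IC50 is the dose at which the response is half of the response
# at dose 93, so this ratio should be close to 0.5.
print(round(y_ic50 / y_low, 3))
```

If the baseline is instead taken as the expected response at dose 0, as was done for this report, the 50% target changes and so does the estimated IC50.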


5. Results

The model selection table for each species/response listed in Table 1 is presented in Table 2. Plots of the models fit to the raw data are available in Schwarz (2011) and on the web at http://www.stat.sfu.ca/~cschwarz/Consulting/MOE/2011-Sulphates/EC-analyses and http://www.stat.sfu.ca/~cschwarz/Consulting/MOE/2011-Sulphates/NA-analyses. As also shown in the previous report (Schwarz, 2011), models with separate dose-response curves for each hardness level are given the majority of model weight in all but three cases. In two cases (EC-Lemna frond numbers and EC-Lemna frond weight), there was substantial support for models where the dose-response curve was invariant across hardness levels. In these cases, natural variation in the data and a limited response to sulphate (i.e. the response was almost flat across the levels of sulphate) made it difficult to determine the effect of hardness on the dose-response curve. In one case (Rainbow Trout mortality), support was almost evenly split between a model where the dose-response curve was invariant to hardness and one where the dose-response curve was “protective” as hardness increased. In this case, extra-binomial variation (the random effect in the probit model) was detected, which also made it difficult to distinguish the effect of hardness on the dose-response curve.
This implies that, except in these latter cases, there is strong evidence that hardness influences the dose-response curve against sulphate.

A summary of the model averaged estimates of the LCxx/ICxx is presented in Table 3, with complete details of the individual estimates from each model for each study available on the web at http://www.stat.sfu.ca/~cschwarz/Consulting/MOE/Sulphate-2011/Reports/Appendix-2011-07-20. For example, Table 4 presents an extract of the model averaging for the LC10 value for EC-Rainbow Trout at hardness 50. Both the common curve over all hardness levels and the separate curve with a monotonic (protective) effect of hardness have substantial support, with minor support for the other models. Estimates of the LC10 (on the log-scale) range from 4.73 (123 on the anti-log scale) to 5.22 (185 on the anti-log scale). The weighted average LC10 is 4.84 (on the log-scale), corresponding to 127 on the anti-log scale as reported in Table 3. The model averaged SE incorporates the variability in the estimates among the models fit to the data. Because the two top models are “contradictory” (one has no effect of hardness while the other has a protective effect of hardness), the model averaged estimates of the LC10 at the three hardness levels (127, 163, and 213 at hardness 50, 100, and 250 respectively) are not the same, but are closer together than the estimates from the SeparateMono model alone (99, 181, and 257 for hardness 50, 100, and 250 respectively, as extracted from the Appendix). The model averaged standard errors are larger than the standard errors for any single model to account for this model uncertainty.

In some cases, no estimates of the BMD are provided (e.g. estimates of LCxx values at hardness levels 50 and 100 for the mortality studies of EC-Chinook eggs).
Examination of the actual data shows that observed mortality was so low that no model was able to provide sensible estimates of the LCxx values at lower hardness levels.
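The model-averaging step used for the LCxx/ICxx estimates can be illustrated as follows. The report does not state the exact formula used for the model averaged standard error, so this sketch assumes the unconditional standard error of Burnham and Anderson (2002); the function name `model_average` and all numeric inputs are hypothetical and are not taken from Table 4.

```python
import math

def model_average(estimates, std_errors, weights):
    """Weighted average of per-model estimates and its unconditional SE.

    Assumes the Burnham and Anderson (2002) form:
        avg = sum(w_i * est_i)
        se  = sum(w_i * sqrt(se_i^2 + (est_i - avg)^2))
    """
    total = sum(weights)
    w = [x / total for x in weights]  # renormalise in case weights do not sum to 1
    avg = sum(wi * ei for wi, ei in zip(w, estimates))
    se = sum(wi * math.sqrt(si ** 2 + (ei - avg) ** 2)
             for wi, ei, si in zip(w, estimates, std_errors))
    return avg, se

# Hypothetical log-scale LC10 estimates from two competing models.
log_lc10 = [4.80, 4.90]
log_se = [0.10, 0.12]
aicc_weights = [0.55, 0.45]

avg, se = model_average(log_lc10, log_se, aicc_weights)
print(round(avg, 3), round(se, 3))
print(round(math.exp(avg), 1))  # back-transformed to the dose scale
```

The extra term `(est_i - avg)^2` is what lets disagreement among the models inflate the reported uncertainty beyond a simple weighted average of the individual standard errors.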


In some cases, model averaged estimates have very large standard errors. For example, the model averaged LC10 for EC-Fathead Minnows mortality at hardness 250 is 3200 mg/L with a SE=16000! In this case, the observed mortality in the best fitting model at the highest dose was very small, based on only a few organisms, and the extrapolation is not very reliable.

Conversely, the observed standard errors may appear to be very small (e.g. the estimate of the IC10 for EC-Fathead Minnows weight at hardness 100 is 1400 (SE 7)). In this case, the best fitting model is the 4-parameter logistic hormesis model (see fit below). The observed response increases with dose up to 1000 and then declines so steeply to 0 at higher doses that the fitted curve must be very sharp.

Interpretation of the results then follows a two-step process. First, examine the model selection tables (Table 2) to assess the support for models with a common dose-response curve across all hardness levels vs. models with a separate dose-response curve by hardness level. The majority of these tables indicate that there is strong evidence that the dose-response curve varies by hardness. Next, consider the model averaged estimated LCxx/ICxx values for each hardness level. Even if there is strong support for an effect of hardness, the effect of hardness may be small enough that a common LCxx/ICxx value across the hardness levels may be appropriate. This is not contradictory, as the model selection in Table 2 is comparing the models over the entire dose-response curve, while the estimates in Table 4 are for specific points on the dose-response curve.

6. Discussion

As outlined by Wheeler and Bailer (2005), model averaging provides a way to incorporate model uncertainty into the risk assessment process. Simply selecting the single “best” model may give a false sense of precision (i.e. single model reported standard errors typically underreport the true uncertainty in the BMD).

Model averaging is not a panacea. Estimates of BMDs within the observed range of the doses in a study will typically be very similar across a wide range of models as all of the models must come “close” to the observed data. However, extrapolations that are far outside the observed dose ranges of the data will typically be very sensitive to the choice of models.

It would be possible to extend the above modeling approach by incorporating both the effects of hardness and sulphate upon the observed responses and deriving a single dose-response curve that incorporates both hardness and sulphate levels. The advantage of this more complex approach is that a (model averaged) prediction equation for the BMD can be established as a function of any hardness rather than relying on the observed hardnesses in the study.
Unfortunately, in most cases, only a few levels of hardness were studied and so the models for the effect of hardness must be very simple (e.g. linear) and extrapolations outside the observed ranges of hardness will be unwise.

References

Anderson, D. (2008) Model Based Inference in the Life Sciences: A Primer on Evidence. Springer: New York.

Bailer AJ, Wheeler M, Dankovick D, Noble R, Bena J (2005) Incorporating uncertainty and variability in the assessment of occupational hazards. Int J Risk Assess Manage 5:344-357.

Bailer, Noble, and Wheeler (2005) Model uncertainty and risk estimation for experimental studies of quantal responses. Risk Analysis, 25, 291-299.

Barlow, R. E., Bartholomew, D. J., Bremner, J. M., and Brunk, H. D. (1972) Statistical inference under order restrictions. Wiley, London.

Bliss, C.I. (1934). The method of probits. Science 79 (2037): 38-39.

Box, G.E.P. and Draper, N.R. (1987). Empirical Model-Building and Response Surfaces, p. 424, Wiley.

Buday, C. and Schroeder, G. (2011). Biological assessment of sulphate at three water hardness values using various freshwater chronic toxicity tests. Report prepared for the B.C. Ministry of Environment, dated January 2011.

Burnham, K.P. and Anderson, D.R. (2002) Model selection and multi-model inference: a practical information-theoretic approach. Springer, New York.

Elphick, J., Davies, M., Gilron, G., Canaria, E.C., Lo, B. and Bailey, H. C. (2010). An aquatic toxicological evaluation of sulphate: the case for considering hardness as a modifying factor in setting water quality guidelines. Environmental Toxicology and Chemistry. Online early. DOI: 10.1002/etc.363.

Gibbons, R. D. and Hedeker, D. (1994). Application of random effects probit regression models. Journal of Clinical and Consulting Psychology 62, 285-296.

Gibbons, R. D., Hedeker, D., Charles, S. C. and Frisch, P. (1994). A random effects probit model for predicting medical malpractice claims. Journal of the American Statistical Association 89, 760-767.

Jeske, D.R., Xu, H.K., Blessinger, T., Jensen, P., Trumble, J. (2009) Testing for the Equality of EC50 Values in the Presence of Unequal Slopes With Application to Toxicity of Selenium Types. Journal of Agricultural, Biological and Environmental Statistics 14, 469-483.

Ritz, C. (2010). Toward a unified approach to dose-response modeling in ecotoxicology. Environmental Toxicology and Chemistry, 20, 220-229.

Schwarz, C. J. (2011) A statistical examination of the effect of water hardness on the dose-response of fresh water aquatic species to sulphate. Report prepared for the BC Ministry of Environment, dated 2011-03-28.

Wheeler, M.W. and Bailer, A.J. (2007). Properties of Model-Averaged BMDL’s: A study of Model averaging in Dichotomous Response Risk Estimation. Risk Analysis, 27, 659-670.


Figure 1a. The Probit, Separate, SNR, RE model as fit to the EC-RT mortality data.


Figure 1b. The Probit, Separate, CNR, RE model as fit to the EC-RT mortality data.


Figure 1c. The Probit, SeparateMono, CNR, RE model as fit to the EC-RT mortality data.


Figure 1d. The Probit, Common LC50, SNR, RE model fit to the EC-RT mortality data.


Figure 1f. The Probit, Common LC50, SNR, RE model fit to the EC-RT mortality data.


Figure 1g. The Probit, Common, CNR, RE model fit to the EC-RT mortality data.


Figure 2a. An illustration of a model fit (Probit, Separate, NoNR, NoRE) where the effect of hardness is not monotonic. This is the fit for the NA-TA-mortality data. The dose-response curve at hardness 15 leads to a lower apparent mortality than the same sulphate dose at hardness 80. The Probit, Separate, NoNR, NoRE model does not enforce “protective” effects of hardness.


Figure 2b. The Probit, SeparateMono, NoNR, NoRE model fit to the NA-TA-mortality data.


Figure 2c. The Probit, Common LC50, NoNR, NoRE model fit to the NA-TA-mortality data.


Table 1. Summary of sampling protocols for the experiments conducted.

Aquatic species | Response | Sampling protocol at each combination of water hardness and sulphate levels

Environment Canada Studies
Rainbow Trout | Survival of eggs to 21 days. | Triplicate batches of 30 eggs were incubated and the number of mortalities from each batch was recorded.
Chinook | Survival of eggs to 28 days. | Triplicate batches of 30 eggs were incubated and the number of mortalities from each batch was recorded.
Hyalella | Survival and growth of organisms to 28 days. | Quintuplicate batches (except for 10 batches in the case of control doses of sulphate in soft water) of 15 Hyalella were incubated and the number of mortalities from each batch was recorded. The mean weight of each batch of the organisms at the end of the experiment was measured.
Mussels | Survival and growth of organisms to 28 days. | Triplicate batches of 3, 3, or 4 mussels were incubated and the number of mortalities in each batch was recorded. Wet weight at the beginning and end of the experiment was measured.
Bullfrog tadpoles | Survival and growth to 28 days. | Triplicate batches of 5 tadpoles were incubated and the number of mortalities in each batch was recorded. The change in weight over the 28 days was also recorded.
Fathead minnows | Survival and growth to 7 days. | Quadruplicate batches of 10 minnows were incubated and the number of mortalities in each batch was recorded. The final mean weight in each batch was also recorded.
Lemna | Frond growth and increase in weight. | Quadruplicate replicates of Lemna were incubated and the number of new fronds and final weight were recorded for each surviving organism.

Nautilus Studies
Daphnia | Survival for 6 days and reproduction. | 10 individual organisms were incubated and the status (dead/alive) and reproductive output was recorded.
Rotifer | Reproduction after 49 hours. | 8 individual organisms were incubated and the population growth was recorded.
Fathead minnows | Survival and growth to 7 days. | Triplicate batches of 10 minnows were incubated and the number of mortalities was recorded. The final mean weight in each batch was also recorded.
Bullfrog tadpoles | Survival and growth to 28 days. | Triplicate batches of 5 tadpoles were incubated and the number of mortalities in each batch was recorded. The final biomass was also recorded.
Algae | Cell yield | Four to 10 batches of 10,000 cells were incubated and the percentage increase in the number of cells was recorded.


Table 2-EC-CH-mortality. Summary of AIC model selection for Chinook mortality conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | mortality | CH | Probit, Separate, NoNR, NoRE | 6 | 54 | -73.8 | 161.5 | 0.0 | 1.00
EC | mortality | CH | Probit, Common, NoNR, NoRE | 2 | 54 | -91.8 | 187.9 | 26.4 | 0.00
EC | mortality | CH | Probit, SeparateMono, NoNR, NoRE | 4 | 54 | -91.8 | 192.4 | 31.0 | 0.00


Table 2-EC-FH-mortality. Summary of AIC model selection for Fathead Minnow mortality conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | mortality | FH | Probit, Separate, NoNR, NoRE | 6 | 72 | -83.2 | 179.6 | 0.0 | 0.95
EC | mortality | FH | Probit, SeparateMono, NoNR, NoRE | 4 | 72 | -88.4 | 185.4 | 5.8 | 0.05
EC | mortality | FH | Probit, CommonLC10, NoNR, NoRE | 4 | 72 | -117.6 | 243.8 | 64.2 | 0.00
EC | mortality | FH | Probit, CommonLC50, NoNR, NoRE | 4 | 72 | -129.1 | 266.8 | 87.2 | 0.00
EC | mortality | FH | Probit, Common, NoNR, NoRE | 2 | 72 | -195.4 | 395.0 | 215.4 | 0.00
EC | mortality | FH | Probit, CommonLC25, NoNR, NoRE | 4 | 72 | -210.1 | 428.9 | 249.3 | 0.00

CommonLC=Common LCxx value; NoNR=no natural response; NoRE=no random effects


Table 2-EC-FH-weight. Summary of AIC model selection for Fathead Minnow growth conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | weight | FH | LH4p.Separate | 15 | 66 | 48.2 | -56.8 | 0.0 | 1.00
EC | weight | FH | IR.Separate | 11 | 66 | 20.7 | -14.6 | 42.2 | 0.00
EC | weight | FH | LL3p.Separate | 12 | 66 | 20.2 | -10.5 | 46.3 | 0.00
EC | weight | FH | LL3p.Mono | 6 | 66 | 2.8 | 7.7 | 64.5 | 0.00
EC | weight | FH | IR.Common | 4 | 66 | -37.8 | 84.2 | 141.0 | 0.00
EC | weight | FH | LL3p.Common | 4 | 66 | -39.9 | 88.5 | 145.3 | 0.00
EC | weight | FH | LH4p.Common | 5 | 66 | -39.3 | 89.6 | 146.4 | 0.00

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


Table 2-EC-HY-mortality. Summary of AIC model selection for Hyalella mortality conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | mortality | HY | Probit, Separate*, SNR, NoRE | 7 | 100 | -142.4 | 300.0 | 0.0 | 1.00
EC | mortality | HY | Probit, Common, CNR, NoRE | 3 | 100 | -153.9 | 314.0 | 14.0 | 0.00
EC | mortality | HY | Probit, SeparateMono, CNR, NoRE | 5 | 100 | -160.2 | 331.1 | 31.1 | 0.00

CNR=common natural response; SNR=separate natural response; NoRE=no random effects.
* No dose-response curve as a function of sulphates could be fit for the medium hardness, and so only a natural response was modeled at this hardness.


Table 2-EC-HY-weight. Summary of AIC model selection for Hyalella weight conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | weight | HY | IR.Separate | 13 | 100 | 33.9 | -37.6 | 0.0 | 0.69
EC | weight | HY | LL3p.Common | 4 | 100 | 21.6 | -34.7 | 2.9 | 0.16
EC | weight | HY | LH4p.Common | 5 | 100 | 21.6 | -32.5 | 5.1 | 0.05
EC | weight | HY | IR.Common | 7 | 100 | 23.6 | -32.0 | 5.6 | 0.04
EC | weight | HY | LH4p.Separate | 15 | 100 | 33.6 | -31.4 | 6.2 | 0.03
EC | weight | HY | LL3p.Mono | 6 | 100 | 21.6 | -30.2 | 7.4 | 0.02
EC | weight | HY | LL3p.Separate | 12 | 100 | 27.6 | -27.6 | 10.0 | 0.00

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


Table 2-EC-LM-frond. Summary of AIC model selection for Lemna frond growth conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | frond | LM | IR.Common | 4 | 71 | -335.0 | 678.5 | 0.0 | 0.62
EC | frond | LM | LH4p.Common | 5 | 71 | -334.6 | 680.1 | 1.6 | 0.28
EC | frond | LM | LH4p.Separate | 15 | 71 | -322.2 | 683.1 | 4.6 | 0.06
EC | frond | LM | LL3p.Common | 4 | 71 | -338.4 | 685.4 | 6.9 | 0.02
EC | frond | LM | LL3p.Mono | 6 | 71 | -336.5 | 686.3 | 7.8 | 0.01
EC | frond | LM | IR.Separate | 11 | 71 | -333.2 | 692.9 | 14.4 | 0.00
EC | frond | LM | LL3p.Separate | 12 | 71 | -335.5 | 700.4 | 21.9 | 0.00

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


Table 2-EC-MY-mortality. Summary of AIC model selection for Mussels mortality conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | mortality | MY | Probit, CommonLC10, NoNR, NoRE | 4 | 54 | -43.9 | 96.7 | 0.0 | 0.67
EC | mortality | MY | Probit, SeparateMono, NoNR, NoRE | 4 | 54 | -45.3 | 99.5 | 2.8 | 0.17
EC | mortality | MY | Probit, Separate, NoNR, NoRE | 6 | 54 | -43.7 | 101.2 | 4.6 | 0.07
EC | mortality | MY | Probit, CommonLC50, NoNR, NoRE | 4 | 54 | -46.6 | 102.1 | 5.4 | 0.05
EC | mortality | MY | Probit, Common, NoNR, NoRE | 2 | 54 | -49.0 | 102.3 | 5.6 | 0.04
EC | mortality | MY | Probit, CommonLC25, NoNR, NoRE | 4 | 54 | -48.0 | 104.8 | 8.1 | 0.01

CommonLC=Common LCxx value; NoNR=no natural response; NoRE=no random effects


Table 2-EC-RT-mortality. Summary of AIC model selection for Rainbow Trout mortality conducted by EC.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
EC | mortality | RT | Probit, Common, CNR, RE | 4 | 54 | -168.1 | 345.1 | 0.0 | 0.49
EC | mortality | RT | Probit, SeparateMono, CNR, RE | 6 | 54 | -165.9 | 345.5 | 0.4 | 0.40
EC | mortality | RT | Probit, Separate, CNR, RE | 8 | 54 | -165.8 | 350.7 | 5.6 | 0.03
EC | mortality | RT | Probit, Separate, CNR, RE | 8 | 54 | -165.8 | 350.7 | 5.6 | 0.03
EC | mortality | RT | Probit, CommonLC10, SNR, RE | 8 | 54 | -165.9 | 351.0 | 5.9 | 0.03
EC | mortality | RT | Probit, CommonLC25, SNR, RE | 8 | 54 | -166.5 | 352.2 | 7.1 | 0.01
EC | mortality | RT | Probit, CommonLC50, SNR, RE | 8 | 54 | -167.1 | 353.4 | 8.3 | 0.01

CommonLC=Common LCxx value; SNR=separate natural response; CNR=common natural response; RE=random effects
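The AICc and AICc weight columns of the Rainbow Trout table above can be reproduced from the log-likelihood, the number of parameters, and the number of data values. The sketch below uses the standard small-sample AICc formula and Akaike weights (Burnham and Anderson, 2002); the function names `aicc` and `akaike_weights` are assumptions introduced here, and results agree with the table only to within rounding of the published log-likelihoods.

```python
import math

def aicc(loglik, k, n):
    """Small-sample corrected AIC: -2 logL + 2k + 2k(k+1)/(n-k-1)."""
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

def akaike_weights(aicc_values):
    """Relative support for each model: exp(-delta/2), normalised to sum to 1."""
    best = min(aicc_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(rel)
    return [r / total for r in rel]

n = 54  # number of data values in the Rainbow Trout mortality table

# (model, log-likelihood, number of parameters) as listed in the table.
models = [
    ("Common, CNR, RE", -168.1, 4),
    ("SeparateMono, CNR, RE", -165.9, 6),
    ("Separate, CNR, RE", -165.8, 8),
]
for name, loglik, k in models:
    print(name, round(aicc(loglik, k, n), 1))

# AICc column copied from the table (all seven rows) to recover the weights.
table_aicc = [345.1, 345.5, 350.7, 350.7, 351.0, 352.2, 353.4]
weights = akaike_weights(table_aicc)
print([round(w, 2) for w in weights])
```

The same calculation applied to the most general model in the text (log-likelihood -165.6 with 10 parameters) gives an AICc close to the 356.3 quoted there.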


Table 2-NA-AL-cell. Summary of AIC model selection for Algae cell increases conducted by NA.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
NA | cell.incre | AL | LL3p.Separate | 12 | 80 | -329.8 | 688.2 | 0.0 | 0.96
NA | cell.incre | AL | LL3p.Mono | 6 | 80 | -340.8 | 694.8 | 6.6 | 0.04
NA | cell.incre | AL | LL3p.Common | 4 | 80 | -353.7 | 715.9 | 27.7 | 0.00
NA | cell.incre | AL | IR.Common | 15 | 80 | -346.6 | 730.6 | 42.4 | 0.00
NA | cell.incre | AL | IR.Separate | 25 | 80 | -329.4 | 732.9 | 44.7 | 0.00

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


Table 2-NA-DA-mortality. Summary of AIC model selection for Daphnia mortality conducted by NA.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
NA | mortality | DA | Probit, Separate, NoNR, NoRE | 8 | 320 | -71.1 | 158.6 | 0.0 | 0.65
NA | mortality | DA | Probit, SeparateMono, NoNR, NoRE | 5 | 320 | -74.9 | 160.0 | 1.3 | 0.34
NA | mortality | DA | Probit, CommonLC50, NoNR, NoRE | 5 | 320 | -77.8 | 165.9 | 7.2 | 0.02
NA | mortality | DA | Probit, Common, NoNR, NoRE | 2 | 320 | -84.1 | 172.3 | 13.6 | 0.00
NA | mortality | DA | Probit, CommonLC10, NoNR, NoRE | 5 | 320 | -82.5 | 175.2 | 16.5 | 0.00
NA | mortality | DA | Probit, CommonLC25, NoNR, NoRE | 5 | 320 | -83.2 | 176.5 | 17.9 | 0.00

CommonLC=Common LCxx value; NoNR=no natural response; NoRE=no random effects


Table 2-NA-DA-repro. Summary of model selection for Daphnia reproduction conducted by NA.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
NA | repro | DA | LL3p.Separate | 16 | 318 | -1349.3 | 2732.3 | 0.0 | 1.00
NA | repro | DA | LL3p.Mono | 7 | 318 | -1406.5 | 2827.4 | 95.1 | 0.00
NA | repro | DA | LL3p.Common | 4 | 318 | -1458.6 | 2925.3 | 193.0 | 0.00
NA | repro | DA | IR.Separate | 113 | 318 | -1337.4 | 3027.2 | 294.9 | 0.00
NA | repro | DA | IR.Common | 88 | 318 | -1445.4 | 3135.2 | 402.9 | 0.00

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


Table 2-NA-FH-mortality. Summary of model selection for Fathead Minnow mortality conducted by NA.

Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight
NA | mortality | FH | Probit, Separate, NoNR, NoRE | 8 | 95 | -129.7 | 277.1 | 0.0 | 0.53
NA | mortality | FH | Probit, SeparateMono, NoNR, NoRE | 5 | 95 | -133.3 | 277.3 | 0.2 | 0.47
NA | mortality | FH | Probit, CommonLC50, NoNR, NoRE | 5 | 95 | -163.5 | 337.6 | 60.5 | 0.00
NA | mortality | FH | Probit, Common, NoNR, NoRE | 2 | 95 | -179.0 | 362.1 | 85.0 | 0.00
NA | mortality | FH | Probit, CommonLC10, NoNR, NoRE | 5 | 95 | -225.8 | 462.4 | 185.3 | 0.00
NA | mortality | FH | Probit, CommonLC25, NoNR, NoRE | 5 | 95 | -267.4 | 545.4 | 268.3 | 0.00

CommonLC=Common LCxx value; NoNR=no natural response; NoRE=no random effects


Table 2-NA-FH-weight. Summary of AIC model selection for Fathead Minnow weight conducted by NA.

| Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight |
|---|---|---|---|---|---|---|---|---|---|
| NA | weight | FH | LL3p.Mono | 7 | 91 | 33.3 | -51.2 | 0.0 | 0.55 |
| NA | weight | FH | LL3p.Separate | 16 | 91 | 45.1 | -50.8 | 0.4 | 0.45 |
| NA | weight | FH | IR.Separate | 27 | 91 | 52.1 | -26.2 | 25.0 | 0.00 |
| NA | weight | FH | LL3p.Common | 4 | 91 | -1.1 | 10.6 | 61.8 | 0.00 |
| NA | weight | FH | IR.Common | 10 | 91 | 4.3 | 14.2 | 65.4 | 0.00 |

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


Table 2-NA-RO-reproduction. Summary of AIC model selection for Rotifers reproduction conducted by NA.

| Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight |
|---|---|---|---|---|---|---|---|---|---|
| NA | growth | RO | IR.Separate | 19 | 140 | -130.1 | 304.4 | 0.0 | 0.79 |
| NA | growth | RO | LL3p.Separate | 16 | 140 | -135.8 | 308.0 | 3.6 | 0.13 |
| NA | growth | RO | IR.Common | 8 | 140 | -146.5 | 310.2 | 5.8 | 0.04 |
| NA | growth | RO | LL3p.Mono | 7 | 140 | -148.1 | 311.1 | 6.7 | 0.03 |
| NA | growth | RO | LL3p.Common | 4 | 140 | -152.6 | 313.4 | 9.0 | 0.01 |

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


Table 2-NA-TA-weight. Summary of model selection for Tadpole weight gain conducted by NA.

| Group | Response | Species | Model Name | # Parameters | Number data values | Log likelihood | AICc | ΔAIC | AICc weight |
|---|---|---|---|---|---|---|---|---|---|
| NA | weight | TA | IR.Separate | 7 | 30 | -140.8 | 300.7 | 0.0 | 0.71 |
| NA | weight | TA | LL3p.Common | 4 | 30 | -147.4 | 304.4 | 3.7 | 0.11 |
| NA | weight | TA | LL3p.Separate | 8 | 30 | -140.8 | 304.5 | 3.8 | 0.11 |
| NA | weight | TA | LH4p.Common | 5 | 30 | -147.4 | 307.3 | 6.6 | 0.03 |
| NA | weight | TA | LL3p.Mono | 5 | 30 | -147.4 | 307.3 | 6.6 | 0.03 |
| NA | weight | TA | IR.Common | 6 | 30 | -146.5 | 308.7 | 8.0 | 0.01 |
| NA | weight | TA | LH4p.Separate | 10 | 30 | -139.3 | 310.2 | 9.5 | 0.01 |

LH4p=4-parameter logistic hormesis models; LL3p = 3-parameter log-logistic model; IR=isotonic regression model.


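Table 3. Summary of model averaged LCxx/ICxx values. Estimates that are far outside the range of the observed doses are not reported. The benchmark doses (BMD) are LC/IC10, LC/IC25, and LC/IC50; each has four columns (MA est, MA se, MA LCL, MA UCL).

| Group | Species | Response | Hardness | LC/IC10 MA est | LC/IC10 MA se | LC/IC10 MA LCL | LC/IC10 MA UCL | LC/IC25 MA est | LC/IC25 MA se | LC/IC25 MA LCL | LC/IC25 MA UCL | LC/IC50 MA est | LC/IC50 MA se | LC/IC50 MA LCL | LC/IC50 MA UCL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EC | CH | mortality | 50 | . | . | . | . | . | . | . | . | . | . | . | . |
| | | | 100 | . | . | . | . | . | . | . | . | . | . | . | . |
| | | | 250 | 1248 | 151 | 984 | 1582 | 2678 | 518 | 1833 | 3914 | . | . | . | . |
| | FH | mortality | 50 | 295 | 42 | 223 | 391 | 499 | 53 | 405 | 615 | 896 | 86 | 742 | 1083 |
| | | | 100 | 961 | 98 | 786 | 1175 | 1307 | 100 | 1126 | 1518 | 1840 | 141 | 1583 | 2139 |
| | | | 250 | 3206 | 15697 | 0 | 4.72E7 | 3359 | 32706 | 0 | 6.52E11 | 3642 | 54068 | 0 | 1.58E16 |
| | | weight | 50 | 931 | 162 | 662 | 1309 | 1004 | 142 | 762 | 1324 | 1111 | 117 | 904 | 1365 |
| | | | 100 | 1397 | 7 | 1383 | 1411 | 1408 | 7 | 1394 | 1422 | 1428 | 7 | 1414 | 1442 |
| | | | 250 | 2969 | 12 | 2946 | 2992 | 2999 | 12 | 2975 | 3023 | 3053 | 15 | 3024 | 3083 |
| | HY | mortality | 50 | 1401 | 233 | 1011 | 1941 | 2177 | 305 | 1654 | 2865 | 3550 | 898 | 2162 | 5828 |
| | | | 100 | 2235 | 344 | 1653 | 3023 | 3824 | 630 | 2769 | 5281 | . | . | . | . |
| | | | 250 | 2235 | 344 | 1653 | 3023 | 3824 | 630 | 2769 | 5281 | . | . | . | . |
| | | weight | 50 | 1170 | 386 | 613 | 2234 | 1739 | 439 | 1060 | 2854 | 2404 | 243 | 1972 | 2931 |
| | | | 100 | 682 | 303 | 286 | 1628 | 1030 | 297 | 585 | 1814 | 2323 | 205 | 1954 | 2762 |
| | | | 250 | 437 | 238 | 150 | 1271 | 1198 | 321 | 709 | 2026 | 1929 | 391 | 1297 | 2870 |
| | LM | frond | 50 | 2143 | 2526 | 213 | 21598 | 2824 | 804 | 1617 | 4934 | 3381 | 717 | 2232 | 5123 |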


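Table 3 (continued).

| Group | Species | Response | Hardness | LC/IC10 MA est | LC/IC10 MA se | LC/IC10 MA LCL | LC/IC10 MA UCL | LC/IC25 MA est | LC/IC25 MA se | LC/IC25 MA LCL | LC/IC25 MA UCL | LC/IC50 MA est | LC/IC50 MA se | LC/IC50 MA LCL | LC/IC50 MA UCL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EC (cont.) | LM (cont.) | frond (cont.) | 100 | 2243 | 2488 | 255 | 19715 | 2758 | 953 | 1401 | 5429 | 3462 | 13737 | 1 | 8261144 |
| | | | 250 | 2314 | 2524 | 273 | 19614 | 2919 | 472 | 2127 | 4007 | 3550 | 783 | 2304 | 5469 |
| | | weight | 50 | . | . | . | . | . | . | . | . | . | . | . | . |
| | | | 100 | . | . | . | . | . | . | . | . | . | . | . | . |
| | | | 250 | . | . | . | . | . | . | . | . | . | . | . | . |
| | MY | mortality | 50 | 104 | 83 | 21 | 502 | 548 | 307 | 183 | 1641 | 2128 | 1387 | 593 | 7635 |
| | | | 100 | 196 | 171 | 36 | 1086 | 2619 | 2199 | 505 | 13574 | . | . | . | . |
| | | | 250 | 212 | 203 | 32 | 1389 | 2619 | 2199 | 505 | 13574 | . | . | . | . |
| | RT | mortality | 50 | 127 | 64 | 47 | 342 | 320 | 125 | 149 | 687 | 893 | 352 | 412 | 1933 |
| | | | 100 | 163 | 74 | 67 | 398 | 420 | 130 | 229 | 769 | 1197 | 380 | 642 | 2232 |
| | | | 250 | 192 | 98 | 71 | 520 | 493 | 186 | 236 | 1032 | 1406 | 542 | 661 | 2993 |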


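Table 3 (continued).

| Group | Species | Response | Hardness | LC/IC10 MA est | LC/IC10 MA se | LC/IC10 MA LCL | LC/IC10 MA UCL | LC/IC25 MA est | LC/IC25 MA se | LC/IC25 MA LCL | LC/IC25 MA UCL | LC/IC50 MA est | LC/IC50 MA se | LC/IC50 MA LCL | LC/IC50 MA UCL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NA | AL | cell.incre | 10 | 879 | 137 | 647 | 1194 | 1111 | 121 | 898 | 1375 | 1404 | 104 | 1215 | 1623 |
| | | | 80 | 2561 | 71 | 2426 | 2705 | 2631 | 46 | 2543 | 2722 | 2701 | 27 | 2649 | 2754 |
| | | | 320 | 2526 | 70 | 2393 | 2666 | 2649 | 36 | 2579 | 2722 | 2779 | 18 | 2744 | 2815 |
| | DA | mortality | 40 | 327 | 62 | 226 | 473 | 487 | 78 | 356 | 666 | 758 | 118 | 559 | 1029 |
| | | | 80 | 464 | 107 | 295 | 729 | 731 | 127 | 521 | 1027 | 1214 | 206 | 871 | 1692 |
| | | | 160 | 855 | 168 | 582 | 1255 | 1114 | 153 | 851 | 1459 | 1497 | 191 | 1165 | 1923 |
| | | | 320 | 749 | 111 | 560 | 1003 | 1053 | 132 | 824 | 1347 | 1537 | 201 | 1190 | 1987 |
| | | repro | 40 | 158 | 205 | 12 | 2016 | 272 | 198 | 65 | 1132 | 468 | 172 | 227 | 962 |
| | | | 80 | 714 | 230 | 380 | 1342 | 895 | 183 | 600 | 1335 | 1122 | 116 | 917 | 1373 |
| | | | 160 | 1184 | 43 | 1104 | 1271 | 1223 | 23 | 1178 | 1269 | 1263 | 6 | 1250 | 1275 |
| | | | 320 | 253 | 216 | 47 | 1350 | 425 | 250 | 134 | 1347 | 717 | 287 | 327 | 1573 |
| | FH | mortality | 40 | 302 | 60 | 204 | 445 | 652 | 97 | 487 | 872 | 1533 | 226 | 1148 | 2047 |
| | | | 80 | 426 | 79 | 297 | 612 | 989 | 150 | 735 | 1332 | 2522 | 431 | 1804 | 3526 |
| | | | 160 | 1075 | 196 | 752 | 1536 | 2365 | 399 | 1699 | 3290 | 5679 | 1238 | 3705 | 8706 |
| | | | 320 | 2318 | 698 | 1285 | 4181 | 7462 | 4855 | 2085 | 26707 | 5109 | 650 | 3982 | 6556 |
| | | weight | 40 | 600 | 167 | 347 | 1034 | 869 | 160 | 606 | 1246 | 1260 | 176 | 958 | 1657 |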


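Table 3 (continued).

| Group | Species | Response | Hardness | LC/IC10 MA est | LC/IC10 MA se | LC/IC10 MA LCL | LC/IC10 MA UCL | LC/IC25 MA est | LC/IC25 MA se | LC/IC25 MA LCL | LC/IC25 MA UCL | LC/IC50 MA est | LC/IC50 MA se | LC/IC50 MA LCL | LC/IC50 MA UCL |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NA (cont.) | FH (cont.) | weight (cont.) | 80 | 1330 | 262 | 904 | 1958 | 1845 | 246 | 1421 | 2396 | 2559 | 290 | 2049 | 3195 |
| | | | 160 | 2102 | 523 | 1291 | 3423 | 2809 | 483 | 2006 | 3933 | 3752 | 568 | 2789 | 5048 |
| | | | 320 | 3041 | 535 | 2154 | 4294 | 716 | 4409 | 0 | 1.241E8 | 4304 | 2095 | 1658 | 11175 |
| | RO | growth | 40 | 731 | 333 | 300 | 1786 | 996 | 279 | 575 | 1725 | 1213 | 295 | 754 | 1953 |
| | | | 80 | 347 | 304 | 63 | 1927 | 1797 | 774 | 772 | 4182 | 2191 | 528 | 1366 | 3513 |
| | | | 160 | 734 | 321 | 312 | 1729 | 1304 | 1485 | 140 | 12154 | 2235 | 424 | 1541 | 3242 |
| | | | 320 | 848 | 938 | 97 | 7420 | 1070 | 1955 | 30 | 38443 | 1915 | 3049 | 84 | 43401 |
| | TA | mortality | 15 | 648 | 209 | 345 | 1218 | 1109 | 228 | 741 | 1658 | 1963 | 459 | 1242 | 3104 |
| | | | 80 | 186 | 99 | 65 | 531 | 554 | 170 | 304 | 1009 | 1868 | 551 | 1047 | 3330 |
| | | weight | 15 | 1246 | 469 | 596 | 2604 | 1441 | 320 | 932 | 2228 | 1828 | 95 | 1651 | 2025 |
| | | | 80 | 1276 | 428 | 662 | 2462 | 1385 | 332 | 866 | 2214 | 1577 | 269 | 1128 | 2204 |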


Table 4. Example of model averaging for estimates of LC10 for Rainbow Trout at hardness 50 conducted by EC.

| Report | Response | Species | Model | Estimate | Standard Error | AICc | Delta AICc | AIC weight | Model Average Estimate | Model Average SE | 95% ci lower bound | 95% ci upper bound |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| EC | mortality | RT | Probit, Common, CNR, RE | 5.03 | 0.42 | 345.1 | 0.0 | 0.51 | . | . | . | . |
| EC | mortality | RT | Probit, SeparateMono, CNR, RE | 4.60 | 0.49 | 345.5 | 0.4 | 0.41 | . | . | . | . |
| EC | mortality | RT | Probit, Separate, CNR, RE | 4.73 | 0.55 | 350.7 | 5.6 | 0.03 | . | . | . | . |
| EC | mortality | RT | Probit, CommonLC10, SNR, RE | 4.99 | 0.41 | 351.0 | 5.9 | 0.03 | . | . | . | . |
| EC | mortality | RT | Probit, CommonLC25, SNR, RE | 5.22 | 0.45 | 352.2 | 7.1 | 0.01 | . | . | . | . |
| EC | mortality | RT | Probit, CommonLC50, SNR, RE | 4.97 | 0.86 | 353.4 | 8.3 | 0.01 | . | . | . | . |
| EC | mortality | RT | Probit, Separate, SNR, RE | 4.84 | 0.58 | 356.3 | 11.2 | 0.00 | . | . | . | . |
| EC | mortality | RT | 99-Model Averaged | . | . | . | . | . | 4.84 | 0.51 | 3.85 | 5.84 |
| EC | mortality | RT | 99-Model Averaged on antilog | . | . | . | . | . | 127 | 64 | 47 | 342 |
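The arithmetic behind the two model-averaged rows of Table 4 can be sketched as follows. This is a minimal illustration, not the report's own code. It assumes that the weights are Akaike weights computed from the AICc column, that the model-averaged standard error is the common unconditional form sum of w_i * sqrt(se_i^2 + (est_i - avg)^2), that the interval is estimate ± 1.96 SE, and that the estimates are on the natural-log scale so that the antilog row is obtained with exp() and a delta-method standard error. The inputs are the rounded table values, so the results agree with the table only to within rounding.

```python
import math

# (estimate on the log scale, standard error, AICc) for the seven models in Table 4
fits = [
    (5.03, 0.42, 345.1),  # Probit, Common, CNR, RE
    (4.60, 0.49, 345.5),  # Probit, SeparateMono, CNR, RE
    (4.73, 0.55, 350.7),  # Probit, Separate, CNR, RE
    (4.99, 0.41, 351.0),  # Probit, CommonLC10, SNR, RE
    (5.22, 0.45, 352.2),  # Probit, CommonLC25, SNR, RE
    (4.97, 0.86, 353.4),  # Probit, CommonLC50, SNR, RE
    (4.84, 0.58, 356.3),  # Probit, Separate, SNR, RE
]

# Akaike weights from the AICc differences
best = min(a for _, _, a in fits)
rel = [math.exp(-0.5 * (a - best)) for _, _, a in fits]
total = sum(rel)
weights = [r / total for r in rel]

# Weighted average of the per-model estimates
ma_est = sum(w * est for w, (est, _, _) in zip(weights, fits))

# Assumed unconditional SE: within-model variance plus between-model spread
ma_se = sum(
    w * math.sqrt(se ** 2 + (est - ma_est) ** 2)
    for w, (est, se, _) in zip(weights, fits)
)

# Normal-approximation 95% interval on the log scale
lcl = ma_est - 1.96 * ma_se
ucl = ma_est + 1.96 * ma_se

# Back-transform (assumes natural logs); delta-method SE on the original scale
lc10 = math.exp(ma_est)
lc10_se = lc10 * ma_se

print(f"log scale: est={ma_est:.2f} se={ma_se:.2f} ci=({lcl:.2f}, {ucl:.2f})")
print(f"antilog:   est={lc10:.0f} se={lc10_se:.0f} "
      f"ci=({math.exp(lcl):.0f}, {math.exp(ucl):.0f})")
```

The interval on the original scale is obtained by exponentiating the log-scale limits rather than by adding and subtracting the antilog standard error, which is why it is asymmetric around 127.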
