HIERARCHAL INDUCTIVE PROCESS MODELING AND ANALYSIS

Youri Noël Nelson

A Thesis Submitted to the University of North Carolina Wilmington in Partial Fulfillment of the Requirements for the Degree of Master of Science

Department of Mathematics and Statistics

University of North Carolina Wilmington

2011

Approved by

Advisory Committee

Michael Freeze

Xin Lu

Wei Feng, Chair

Stuart Borrett, Co-Chair

Accepted by

Dean, Graduate School


TABLE OF CONTENTS

ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
1 INTRODUCTION
2 METHOD
    2.1 HIPM Description
        2.1.1 Measure of Fit
        2.1.2 Entities specification and model library
    2.2 Experiment Design
3 COMPUTATIONAL RESULTS
    3.1 Increase in number of time-series input
    3.2 Value of Information
    3.3 Summary
4 ANALYTICAL ANALYSIS
    4.1 Most recurrent models
    4.2 Preliminaries
    4.3 Model A
    4.4 Model B
    4.5 Model C
    4.6 Effects of increasing the number of constraints
5 CONCLUSION
REFERENCES
APPENDIX
    A. Sample CIAO data - 1997
    B. Full entity specification file
    C. Full Ross Sea generic model library
    D. Models selected in both experiment 8 and 19
    E. Models selected in both experiment 8 and 21


ABSTRACT

Understanding phytoplankton dynamics in the Ross Sea polynya may yield useful knowledge in the search for a solution to the world's rising carbon dioxide levels. Modeling such dynamics is a lengthy and tedious process that can be helped by computational tools like HIPM. This system relies on knowledge that is already available, in the form of time series data and a process library, to construct and then evaluate models. In this research, models were ranked by sum of squared error from lowest to highest, the lowest being the best-fit model. Some of the questions that arise from the use of HIPM concern the amount and value of the time series provided to the software, from which we formulated two hypotheses. Will having more time series improve the output of the system? Will time series for different variables provide different quality of output? Through 31 experiments and mathematical analysis, we began to answer these questions. The computational results showed that our first hypothesis does not always hold true, which is thought to be because of the way the fit is measured. The mathematical analysis, on the other hand, showed many variations across the experiments in the structure of the zooplankton equation, which may indicate that the process library needs to be better defined and that the system needs to take into consideration not only the phytoplankton species Phaeocystis antarctica but also diatoms. This thesis provides the start of an answer to these questions, but further research is still needed.


DEDICATION

This thesis is dedicated to all my friends and family who have supported me in this incredible journey I started 5 years ago. More importantly, I want to dedicate it to our Lord and Savior, as I certainly would not be here today without his help, support and comfort.

“I can do anything through God who strengthens me.” (Philippians 4:13)

I also want to dedicate this to my nephew Noah Nelson and my niece Sarah Nelson for always putting a smile on my face during the tough times, for their unconditional love and for making me want to persevere always. I love you beyond words.

Thank you, Christel & Douglas Nelson, Lara Nelson, Celio & Elise Nelson, Sven Diebold, Andrew & Robin Nelson, Ed & Pat Nelson, Joann Nelson, Philip Varvaris, Luke Brown, Taylor Jackson and Bud Edwards (for always being there at the right place at the right time) and all my other friends and family members who are not named here but are present in my heart and to whom I am so grateful for all the words of encouragement and support throughout the years.


ACKNOWLEDGMENTS

I would like to thank Dr. Feng, Dr. Borrett, Dr. Simmons, Dr. Freeze and Dr. Lu for all their help and support in this endeavor and process, as well as my friend Brevin Rock for his advice on completing a Master's thesis.


LIST OF TABLES

1 Example of entity definition and instantiation (P)
2 Example of process definition (Growth)
3 Data contained in CIAO set
4 Cutoff Value Chart
5 Model A Parameter Values
6 Model B Parameter Values
7 Model C Parameter Values


LIST OF FIGURES

1 Initial Conceptual Model
2 Tree diagram representing the process library
3 Map of the Ross Sea
4 reMSE summary - Part 1
5 reMSE summary - Part 2
6 reMSE summary - Part 3
7 Good fit Models VS. Number of inputted time-series
8 Mean Activation Values Graph


LIST OF SYMBOLS

$P$ = Amount of phytoplankton present in the system (mg Chla/m$^3$)
$D$ = Detritus concentration (mg C/m$^3$)
$F$ = Iron concentration (µM)
$Z$ = Zooplankton concentration (mg C/m$^3$)
$N$ = Nitrate concentration (µM)
$E_{\mathrm{ice}}(t)$ = Sea ice concentration
$E_{T_{H_2O}}(t)$ = Temperature of the water (°C)
$E_{PUR}(t)$ = Photosynthetically usable radiation (µmol photons m$^{-2}$ s$^{-1}$)
$E_{T_{H_2O}\,\max}$ = Maximum water temperature
$E_{T_{H_2O}\,\min}$ = Minimum water temperature
$a_i$ = Optimal parameters of the system selected by the HIPM software


1 INTRODUCTION

Biology, mathematics, physics, ecology and every other science share a common objective: to explain and describe the world that surrounds us. All of these fields build upon collections of observations to explain recurring phenomena. To explain and depict some of these phenomena, scientists make use of models, which can take a variety of forms including conceptual, formal, physical and diagrammatic (Haefner, 2005).

Models are widely used in science, and researchers continue to look for tools or techniques that will enhance and optimize their ability to construct new models or improve existing ones. The type of modeling technique differs with the task. For instance, in his book Haefner (2005) uses a Forrester diagram, a qualitative model formulation, to model a hypothetical agro-ecosystem. Another example comes from biology: to describe predator-prey interactions, one can use differential equation models like those formulated by Lotka and Volterra (Berryman 1992). Models are useful for studying systems because they let researchers conduct experiments and test theories that would otherwise be unethical or impossible to perform on the system itself, and because they enable researchers to predict the behavior of varying components of an ecosystem.

Model construction is a difficult and lengthy endeavor. For a given system there may be many different combinations of processes (e.g. grazing, decay, growth) that could provide a plausible explanation for the behavior being studied. Exploring and evaluating all these possibilities is therefore a tedious task. In the past, limitations in computational power restricted scientists' ability to investigate more complex models: certain known or suspected processes were left out to simplify calculations. As computational power increased, so did our capacity to evaluate more intricate models (Oreskes 2000). In addition, numerical models of natural systems are non-unique: there are multiple ways to represent the same dynamic. Creating computational tools that quickly and automatically evaluate multiple models seemed a promising way to search through the extensive model space. The success of machine learning and data mining in commercial domains led scientists to investigate the field of automated modeling for that purpose (Fayyad et al., 1996).

The act of gathering small pieces of information and combining them with prior knowledge to formulate a complex overview of an object or process under study is called induction. Induction avoids searching the entire space of possible equations by piecing together only the meaningful terms; for instance, a predator-prey model will need terms specifying growth and death (Todorovski et al. 2005). Inductive modeling methods (e.g. LAGRAMGE, HIPM, ARIMA, FUSE) use the principles of induction to construct models of the studied system. Methods used for commercial applications, such as the Knowledge Discovery in Databases (KDD) process, were insufficient for scientific purposes because they only described the observed system behavior and did not explain it (Langley et al. 2006). A simple example is the modeling of water consumption in a city: a water company could easily create a numerical model based on previous years that gives a good estimate of projected water consumption over time, but the model may not explain why consumption fluctuates the way it does. In other words, the commercial methods produce models that are useful for making accurate predictions about a system but are very limited in explaining which processes drive the system's behavior; these methods did not explore the realm of all possible models. Thus, induction methods had to be enhanced to automate the task of building and evaluating multiple models (Dzeroski et al. 1995).

In this thesis, I used the hierarchal inductive process modeling technique, which is encoded as a computer algorithm called HIPM (Langley et al. 2006; Bridewell et al. 2005; Dzeroski et al. 1995; Borrett et al. 2007). Inductive process modeling methods such as HIPM (Bridewell et al. 2008; Borrett et al. 2007; Langley et al. 2006; Todorovski et al. 2005) search through two spaces. The first space is made up of mathematical formulations and alternative model structures, which consist of entities, processes and the connections binding the two; the second space is made up of parameter values (Borrett et al. 2007). The system takes three inputs: a hierarchy of generic processes, a set of entities, and a set of observed time series of the entities' variables (Todorovski et al. 2005). A process is a certain action on the system, defined by means of fragments of mathematical equations and the rules for combining these fragments with the rest of the equations. An entity is an object that groups the properties of an organism or nutrient by means of variables and parameters. HIPM performs one of two searches for the model structure: a heuristic search or an exhaustive search. With the exhaustive search option selected, HIPM creates all the possible model structures from the given background knowledge and selects the best set of parameters for each model structure. Finally, the system ranks the models based on their sum of squared error (Todorovski et al. 2005).

This system allows complex system dynamics to be represented as models. For example, in the study of photosynthesis regulation it generated a model that reproduced both the qualitative shape and the quantitative details of the time series data while incorporating processes that made biological sense (Langley et al. 2006). In our case we studied the phytoplankton dynamics in the aquatic ecosystem of the Ross Sea.

In this thesis I used the HIPM tool combined with the appropriate process library to study the phytoplankton dynamics in the Ross Sea ecosystem. Here the term process library is defined as the collection of processes (e.g. grazing, decay, growth) and entities (e.g. phytoplankton, zooplankton, nitrate), together with their relations to one another. It is best represented by Figure 2.


Figure 1: This schematic represents the interactions between the entities and the exogenous variables driving the model. Here, P, Z, D, NO3 and Fe are the state variables. PUR, T and Ice are the exogenous variables acting on the system and influencing the state variables. The arrows represent the action of one variable on another (Borrett, unpublished research).

Arrigo, Borrett, Bridewell and Langley used HIPM and the Ross Sea process library to create and search a space of over 1120 possible model structures to explain the temporal dynamics of phytoplankton and nitrogen in the Ross Sea ecosystem. All models contained five state variables: phytoplankton, zooplankton, detritus, nitrogen and iron. Time series for both phytoplankton and nitrogen were available and were given to HIPM along with the process library. Their initial research found that 200 model structures were of good fit, where good fit was defined as a sum of squared error less than or equal to 0.2. From a computer scientist's standpoint, reducing the search space from 1120 model structures to 200 is a great accomplishment; for a biologist, however, the solution is not specific enough and offers few insights into the ecosystem dynamics. There is a need for ways to constrain the search further, bringing down the number of good-fit models and making the output useful to biologists.

Figure 2: A tree diagram representing the process library constructed for the Ross Sea ecosystem problem. The interaction between processes and entities is defined in the library as explained in Section 2.1.2 (Borrett et al. 2007).

Superficially, HIPM appears related to equation discovery, a subfield of machine learning (Langley, 1995; Mitchell, 1997) that investigates collections of measurements and observations, using different computational methods, in search of quantitative laws (Todorovski, 2003). For example, the LAGRAMGE system takes as input a dependent variable and background knowledge encoded as a grammar specifying the space of possible equations, and outputs the best equation for that variable; it can only perform the search for one variable at a time (Dzeroski et al. 1993; Todorovski 2003). Equation discovery is in turn related to the methods used in Ljung's work (1993) on system identification, which are further removed from inductive process modeling.

The main assumption behind system identification is that the model structure is known and that the primary concern is finding adequate parameter values; equation discovery focuses on both the structure and the parameter values (Todorovski et al. 1998). Both approaches produce descriptive models that summarize and predict the data, but they fail to search through the space of alternative explanations: they do not take into account models with theoretical variables or consider alternative processes to explain certain dynamics (Bridewell et al. 2005).

The Southern Ocean covers an area equivalent to about 10% of the global ocean and is a key element of the global ocean system, as it links all major ocean basins and facilitates the global distribution of its deep water; it is considered to play an important part in the global carbon (C) cycle (Arrigo et al. 2003). The Ross Sea polynya (an area of open water surrounded by sea ice) is one of the most productive ecosystems in the Southern Ocean, as it experiences some of the largest phytoplankton blooms in the region (Arrigo et al. 1994, 1998, 2000, 2003). Phytoplankton productivity is important to the carbon cycle because photosynthesis removes carbon dioxide (CO2) from surface water, and part of that carbon is then exported to deep ocean water. What makes the Ross Sea polynya so interesting for ecologists, compared to other locations such as Terra Nova Bay, is the type of phytoplankton dominating the ecosystem. In the Ross Sea polynya, Phaeocystis antarctica dominates, as opposed to diatoms (species such as Fragilariopsis spp.) in Terra Nova Bay. Phaeocystis antarctica is thought to resist grazing more than other phytoplankton species, which could imply that more carbon is taken from shallow water into the depths as the uneaten phytoplankton, full of CO2, sinks to the bottom (Tagliabue and Arrigo 2003). Deep ocean water has a longer residence time than shallow water, meaning that carbon trapped in deep ocean water is effectively removed from atmospheric circulation for a much longer time than the carbon contained in surface water.

Figure 3: Map of the southwestern Ross Sea showing the Ross Sea polynya, located north of the Ross Ice Shelf, and the Terra Nova Bay polynya, located on the western continental shelf (Arrigo et al. 2003).

Thus, there is an incentive to understand the ecological processes that control phytoplankton productivity and community composition (which species dominates) in the Ross Sea. Fluctuations in the phytoplankton population could potentially affect CO2 levels in the atmosphere (Carlson et al. 1998), and if we can figure out why Phaeocystis antarctica is predominant, it would be useful information to scientists as they entertain the idea of altering phytoplankton populations around the world to create carbon sinks, providing a temporary solution to our CO2 problem. All these elements initiated the search for the best process explanation of the phytoplankton dynamics in the Ross Sea: by determining which processes act upon the system and which entities are most important, scientists will accumulate knowledge that may prove valuable in the fight against rising CO2 levels.

As mentioned, the tool that I have chosen for model search relies on measurements and observations of one or more variables of a system to make inferences about the remaining variables, for which no data are available, and about the processes at work in the system. In Borrett's study, the only state variables for which he had measurements and observations were phytoplankton and nitrate. Ultimately the goal is to select model structures that are good approximations of the natural system and give good insight into the processes at work in it. However, here I was faced with an under-constrained optimization problem: there was no data available for three of the state variables. Indeed, one of the big challenges of using HIPM for this particular ecosystem was that the data used to conduct the search are very expensive to collect, and iron (Fe) is especially difficult to measure. From this last statement arise two questions. Does knowing data for more than one state variable significantly narrow down the number of possible good-fit models? Will knowledge about certain variables have better optimization power than knowledge about others? For example, if we could only afford to collect data for one of the five variables in the system, would phytoplankton give us better model output (fewer good-fit models) in HIPM than zooplankton, or would it be detritus?

These are important questions because, as scientists try to advance their knowledge of the Ross Sea, they need to make educated decisions on what information to collect in an effort to optimize the use of resources.

This thesis is structured in five parts. First, I described the method used to gather the data for my analysis, which includes the HIPM software as well as an overview of the data sets. I then went into the quantitative analysis, looking strictly at the results generated by the HIPM software and discussing what they tell us from an ecological standpoint. In Section 4, I entered the analytical part of the analysis, picking and studying some of the best-fit models selected during the quantitative analysis. I then discussed these analytical results and, in the next section, tied them back to the biology in an effort to link the qualitative and quantitative research. Through this analysis we saw how we can help HIPM's model selection method as well as assist scientists in finding the model that most accurately explains the processes at work in the observed ecosystem.


2 METHOD

The method employed in this thesis involves constructing process models from continuous data. To assist in this task we used a piece of software named HIPM. It is the output and model selection efficiency of this software that we are investigating. To better understand the task at hand, it is important to define what HIPM does, as well as the steps we took to test its efficiency.

2.1 HIPM Description

Ecologists rely heavily on system modeling to build ecological theory and to guide environmental assessment and management (Borrett et al. 2007). Typically scientists build and study a couple of models, basing the model structure on previous research or making a judgement call on which entities and processes should or should not be included. One of the aspirations and problems of modeling natural systems is to capture the essence of the system necessary for the model's purpose by figuring out what can be left out. In that regard, deciding which entities and processes should be included, and what the best mathematical formulation and parameter values are for a given structure, becomes an essential part of this search. Choosing from among the possible model structures presents an intricate and time-consuming challenge for ecologists who want to navigate this space (Borrett et al. 2007). In searching through this space of possible models, we are guided by the claim made by Langley et al. (1987), which we support, that we must look for models that fit real-life observations. In summary, we are faced with the problem of constructing models anchored in domain theory, conducting a time-consuming search and linking the models to empirical data (Borrett et al. 2007). The HIPM software, whose name stands for Hierarchal Inductive Process Modeling, is intended to remedy these issues. This scientific approach (Langley et al. 2005) assumes the following:


• Given: Time-series data for continuous variables.

• Given: Background knowledge about the entities of the system; in other words constraints on variables and other parameters driving these entities.

• Given: Background knowledge on the type of processes that may be involved in driving the ecosystem as well as the constraints that may exist for the said processes.

Then the task for the software is to perform a search through the structure and parameter space defined by the process-entity library to find the models that best fit the data. HIPM operates in four phases.

1. In an exhaustive search, it first finds all the possible instantiations of the generic processes for all variables. This means that the system finds all the possible combinations of processes that can affect a given variable (an example is given in Section 2.1.2). For our purposes we used the exhaustive search option programmed into the software, but a heuristic search option is also available.

2. The system then assembles the model structures. In other words, it puts together, into a generic model, one instantiation of generic processes for each variable present in the system. It uses the constraints given by the users to determine which instantiations can be linked together into a generic model; the program goes through an exhaustive search to find all the possible models. In our study it produces 1120 model structures, due mainly to the large number of different grazing processes that are potentially present in the ecosystem.

3. It searches for the parameter values for each model using the constraints defined by the users. To infer these parameters, the system picks a random set of values that respect the constraints and, using the Levenberg-Marquardt gradient descent method, finds a local optimum. To avoid entrapment in local minima, the system restarts the parameter estimation from multiple random points, retaining only the parameters that produce the lowest error. In our experiment we set the number of restarts to 128. This technique has been found to produce reasonable matches to time series in multiple systems (Langley et al. 2007).

4. It evaluates the performance of the produced model structures (predicted values) against the data series (observed values) by calculating the relative mean squared error (reMSE, defined in Section 2.1.1); models with the lowest reMSE are considered the best-fit models.
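The four phases can be illustrated with a small, self-contained sketch. It is not HIPM's implementation: the two-role process library, the shared parameter range and the synthetic data are hypothetical, only 8 restarts are used instead of 128, and a simple coordinate search stands in for the Levenberg-Marquardt method so that the example needs nothing beyond the Python standard library.

```python
import itertools
import random

# Hypothetical toy library: each role offers alternative process forms.
# Every form maps (parameter, time) to a contribution to the trajectory.
LIBRARY = {
    "growth": {"linear": lambda a, t: a * t, "quadratic": lambda a, t: a * t * t},
    "loss": {"none": lambda b, t: 0.0, "constant": lambda b, t: -b},
}
BOUNDS = (0.0, 5.0)  # assumed range constraint shared by both parameters


def enumerate_structures(library):
    """Phases 1-2: choose one process form per role, in every combination."""
    roles = sorted(library)
    for choice in itertools.product(*(sorted(library[r]) for r in roles)):
        yield dict(zip(roles, choice))


def simulate(structure, params, times):
    growth = LIBRARY["growth"][structure["growth"]]
    loss = LIBRARY["loss"][structure["loss"]]
    return [growth(params[0], t) + loss(params[1], t) for t in times]


def sse(structure, params, times, observed):
    predicted = simulate(structure, params, times)
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def local_search(structure, start, times, observed):
    """Stand-in local optimizer (coordinate search), not Levenberg-Marquardt."""
    params, step = list(start), 1.0
    best = sse(structure, params, times, observed)
    while step > 1e-6:
        improved = False
        for i in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                # Clip the trial value so it respects the parameter range.
                trial[i] = min(max(trial[i] + delta, BOUNDS[0]), BOUNDS[1])
                err = sse(structure, trial, times, observed)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2.0
    return params, best


def fit_with_restarts(structure, times, observed, restarts=8, seed=0):
    """Phase 3: restart from random feasible points, keep the lowest error."""
    rng = random.Random(seed)
    results = []
    for _ in range(restarts):
        start = [rng.uniform(*BOUNDS), rng.uniform(*BOUNDS)]
        results.append(local_search(structure, start, times, observed))
    return min(results, key=lambda r: r[1])


def rank_models(times, observed):
    """Phase 4: rank the fitted structures from lowest to highest error."""
    fitted = []
    for structure in enumerate_structures(LIBRARY):
        params, err = fit_with_restarts(structure, times, observed)
        fitted.append((err, structure, params))
    return sorted(fitted, key=lambda item: item[0])


times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [2.0 * t - 1.0 for t in times]  # synthetic data, not CIAO data
ranking = rank_models(times, observed)
```

With two alternatives for each of two roles the sketch enumerates four structures; the same combinatorial growth, applied to the much larger Ross Sea library, is what produces the 1120 structures mentioned above.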

2.1.1 Measure of Fit

As mentioned above, HIPM evaluates and selects the best model structure and set of parameters according to a fitness measure. The system currently uses the sum of squared errors (SSE) to evaluate fitness (Bridewell et al. 2007), which is defined as follows:

\[
\sum_{i=1}^{n} \mathrm{SSE}\left(x_i, x_i^{\mathrm{obs}}\right) = \sum_{i=1}^{n} \sum_{k=1}^{m} \left(x_{i,k} - x_{i,k}^{\mathrm{obs}}\right)^2
\]

where $x_1, \ldots, x_n$ are the variables being fitted, with $m$ observed values for each. To take into account the modeling of variables of varying scale, the system uses a relative mean squared error that we define in the following way:

\[
\mathrm{reMSE} = \frac{\displaystyle\sum_{i=1}^{n} \frac{\mathrm{SSE}\left(x_i, x_i^{\mathrm{obs}}\right)}{s^2\left(x_i^{\mathrm{obs}}\right)}}{nm}
\]

Here $s^2(x_i^{\mathrm{obs}})$ is the sample variance of the observations for $x_i$. Throughout this thesis we refer to the relative mean squared error as reMSE. The biggest asset of this rescaling is the ability to compare values across data sets. Typically, an reMSE of 1.0 or above signifies that the model performs poorly; conversely, the lower the reMSE, the better the fit.
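As a rough illustration of the two definitions above, the following sketch computes the SSE and the reMSE for $n$ variables with $m$ observations each. The function names and the sample numbers are hypothetical and are not taken from HIPM or from the CIAO data.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def sse(predicted, observed):
    """Sum of squared errors for one variable."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def remse(predicted, observed):
    """Relative mean squared error.

    predicted, observed: lists of n series, each holding m values.
    Each variable's SSE is divided by the sample variance of its
    observations, and the total is divided by n * m.
    """
    n = len(observed)
    m = len(observed[0])
    total = 0.0
    for pred_i, obs_i in zip(predicted, observed):
        total += sse(pred_i, obs_i) / variance(obs_i)
    return total / (n * m)


# Two made-up variables on very different scales, four observations each.
observed = [[1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
perfect = [list(series) for series in observed]
mean_only = [[2.5] * 4, [25.0] * 4]  # predicts each variable's observed mean

perfect_score = remse(perfect, observed)
mean_score = remse(mean_only, observed)
```

A model that only predicts each variable's observed mean scores $(m-1)/m$ under this definition, just below 1, which is consistent with the rule of thumb that an reMSE of 1.0 or above indicates a poor model.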

2.1.2 Entities specification and model library<br />

Each entity of a system is defined by a combination of variables and parameters<br />

which makes them actors but also receivers of action in the model. A distinction is<br />

to be made between generic entity and instantiated entity. Indeed, a formal generic<br />

entity has a name and a set of properties which can include both variables and<br />

parameters.<br />

In a given model the parameters of the instantiated entity will not<br />

change whereas the variables do. Every variable in the entity has a name and a<br />

rule that determines how multiple processes and their subprocesses are combined<br />

(e.g. summed, minimum, product, etc...). For the parameters there is a name<br />

and a range that constrains their possible values. On the other hand, instantiated<br />

entities have their variables associated with either time-series or they are given initial<br />

values and the parameters have been assigned real values. A field is also included<br />

to indicate the parent generic entity (Borrett et al. 2007). One given generic entity<br />

can be instantiated multiple times, the generic entity can be thought of as a blue<br />

print for the instantiated entities. For example in our system we defined the entity<br />

phytoplankton as presented in Table 1. Here our generic entity’s name is “P”; it contains the<br />

variables “conc”, “growth_rate” and “growth_lim”, each with the rule determining how<br />

the processes acting on it will be aggregated; the next part of the entity definition is<br />

the list of parameters that are of concern for this entity, such as “max_growth” with<br />

possible values in the (0.4, 0.8) range. Following the definition of the generic entity in<br />

Table 1 is an instantiated entity, “phyto”, whose first field, “pe”, refers to the parent generic entity. The<br />

variables are then either given the name of a time-series to which the model will be<br />

fitted, such as “PHA_c” for “conc”, which refers to the phytoplankton column<br />

of the CIAO data set, or an initial value, such as 0 for “growth_rate”, indicating<br />

that this particular state variable will not be fitted to a time-series.<br />

The mention<br />

“system” as opposed to “exogenous” simply states that this variable depends<br />

on the system, as opposed to being independent of it like solar radiation<br />

or water temperature. The full instantiated entity library can be found in Appendix<br />

B and the generic entity library in Appendix C.<br />
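The relationship between a generic entity and its instances can be sketched with two small Python classes. This is only an illustration of the structure described above, using a subset of the values from Table 1; the class names, field names and the `out_of_range` helper are hypothetical and do not reproduce HIPM's actual interface.<br />

```python
from dataclasses import dataclass


@dataclass
class GenericEntity:
    name: str
    variables: dict   # variable name -> aggregation rule ("sum", "prod", "min")
    parameters: dict  # parameter name -> (low, high) allowed range


@dataclass
class InstantiatedEntity:
    parent: GenericEntity  # field pointing back to the generic "blueprint"
    name: str
    variables: dict        # variable name -> (kind, time-series name or initial value)
    parameters: dict       # parameter name -> fixed real value

    def out_of_range(self):
        """Return the parameters whose value falls outside the parent's range."""
        bad = []
        for key, value in self.parameters.items():
            low, high = self.parent.parameters[key]
            if not low <= value <= high:
                bad.append(key)
        return bad


generic_p = GenericEntity(
    "P",
    {"conc": "sum", "growth_rate": "prod", "growth_lim": "min"},
    {"max_growth": (0.4, 0.8), "death_rate": (0.02, 0.04)},
)
phyto = InstantiatedEntity(
    generic_p,
    "phyto",
    {"conc": ("system", "PHA_c"), "growth_rate": ("system", 0)},
    {"max_growth": 0.59, "death_rate": 0.025},
)
```

Keeping the ranges on the generic entity and the values on the instance means the same blueprint can be instantiated several times with different parameter values.<br />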

For HIPM to be fully functional there needs to be a library of processes. Processes<br />

are the physical, chemical, or biological actions that drive change in dynamic models.<br />

Just as we made a distinction between generic and instantiated entities, we<br />

make a distinction between generic processes and instantiated processes. Every generic<br />

process is defined by a name, the entities that can tie into the process, the<br />

subprocesses that are tied to that process, and one or more equations. The<br />

generic process can also include a set of Boolean conditions that determine whether the<br />

process is active, making the process dynamic by turning it on and off<br />

depending on whether the conditions are satisfied (Borrett et al. 2007). For instance,<br />

we could set the photosynthetic process to occur only if an environmental light<br />

variable is greater than zero. An example of a generic process is given in Table 2. It is<br />

named “growth”, and any of the entities “P”, “N”, “D” and “E” can take a role in the<br />

process. Next comes a list of the subprocesses linked to this process, each with the<br />

entities that can take a role in it, and finally the equation that defines<br />

this process; this equation calls on the “conc” and “growth_rate” variables that all<br />

entities must have. The instantiated process will take on a specific name and will be<br />

bound to a specific instantiated entity, one of P, N, D or E. The instantiated entity<br />

will take its role in the equation of the instantiated process. All the instantiated<br />

processes will be aggregated according to the rule defined in the generic entity. It<br />

is this organization in terms of entities and processes that drives inductive process<br />

modeling. It makes for an easier construction of systems of equations by building them from<br />

fragments.<br />
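The aggregation rule and the Boolean activation conditions can be illustrated with a short sketch. It assumes that every active process contributes one numeric term to a variable and that the terms are combined by the variable's rule; the names `RULES` and `combine` are hypothetical and do not come from the HIPM source.<br />

```python
import math

# Aggregation rules named as in the generic entity definition.
RULES = {
    "sum": sum,
    "prod": math.prod,
    "min": min,
}


def combine(rule, contributions, active=None):
    """Combine the contributions of several processes to one variable.

    active: optional list of booleans, one per contribution, standing in for
    the Boolean conditions that switch a process on or off.
    """
    if active is not None:
        contributions = [c for c, on in zip(contributions, active) if on]
    return RULES[rule](contributions)


# Three limitation terms combined with the "min" rule.
growth_lim = combine("min", [0.8, 0.3, 0.5])

# Three terms combined with the "sum" rule; the third process is inactive.
conc_change = combine("sum", [0.2, -0.05, -0.01], active=[True, True, False])
```

Building the right-hand side of each equation from such fragments is what allows the same library to generate many candidate model structures.<br />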

Table 1: An example of a generic entity definition, with its variables and parameters,<br />

followed by an example of an instantiated entity, more<br />

specifically phytoplankton (P), in which the variable “conc” is given a time series<br />

and the other variables are given initial values.<br />

pe = lib.add_generic_entity("P",<br />

{ "conc":"sum",<br />

"growth_rate":"prod",<br />

"growth_lim":"min"},<br />

{ "max_growth": (0.4,0.8),<br />

"exude_rate": (0.001,0.2),<br />

"death_rate": (0.02,0.04),<br />

"Ek_max":(1,100),<br />

"sinking_rate":(0.0001,0.25),<br />

"biomin":(0.02,0.04),<br />

"PhotoInhib":(200,1500),});<br />

p1 = entity_instance (pe,<br />

"phyto",<br />

{ "conc": ("system", "PHA_c", (0,600)),<br />

"growth_rate": ("system", 0, (0,1)),<br />

"growth_lim": ("system", 1, (0,1))},<br />

{ "max_growth":0.59,<br />

"exude_rate":0.19,<br />

"death_rate":0.025,<br />

"Ek_max":30,<br />

"biomin":0.025,<br />

"PhotoInhib":200 } );<br />



Table 2: Defining a process - Growth<br />

lib.add_generic_process(<br />

"growth", "",<br />

[("P",[pe],1,1), ("N",[no3,fe],1,100),<br />

("D",[de],1,1), ("E",[ee],1,1)],<br />

[("limited_growth", ["P","N","E"], 0),<br />

("exudation",["P"],1),<br />

("nutrient_uptake",["P","N"],0)],<br />

{},<br />

{},<br />

{"P.conc": "P.growth_rate * P.conc"} );<br />

To sum it up, HIPM’s power resides in its knowledge of the modeled domain as<br />

well as its ability to estimate parameters (Bridewell et al. 2007).<br />

2.2 Experiment Design<br />

Having now established how HIPM works, let us consider the problem at hand.<br />

Though in theory HIPM is an extremely powerful tool which permits a search<br />

through a wide structure and parameter space, previous research has demonstrated<br />

that a more thorough investigation of HIPM’s output is necessary to evaluate its<br />

potential and usefulness to biologists.<br />

In our example of the Ross Sea ecosystem,<br />

with the process-entity library set up as described, the search space represents 1120<br />

possible models; each model can take on a wide variety of parameter sets depending<br />

on the constraints given to the software. The phytoplankton dynamic models of the<br />

Ross Sea have five variables: Phytoplankton (P), Zooplankton (Z), Detritus (D),<br />

Nitrate (N) and Iron (F). In previous research, real-life time series for phytoplankton<br />

and nitrate were available to us for this particular ecosystem, thus the<br />

data was fed to HIPM. By doing so, HIPM came out with about 200 possible models<br />

that have an reMSE of less than or equal to 0.2, which from a computer science standpoint<br />

is a good improvement: we reduce the search space from 1120 possible<br />

models to 200 models. However, for a biologist that is still quite a large number of<br />

models approximating the ecosystem studied; going through and testing every<br />

one of these 200 models would be extremely time-consuming. Therefore, it is clear<br />

that we need to lower this number of possible models to a point deemed<br />

reasonable and useful to biologists. Logically, we assume that increasing the number of<br />

constraints (i.e. adding a real-life time series for a variable for which we had no previous<br />

empirical data) would help model discrimination in HIPM. But this would imply<br />

that the scientist would have to go into the field and collect time series for one of<br />

the variables in the system. That process being very expensive, can HIPM be used<br />

to make an informed decision about which variable would yield the most discriminatory<br />

power, if there is any difference between variables at all? This is what we are<br />

investigating, and in light of these elements we have formulated two hypotheses:<br />

• Hypothesis 1: Increasing the number of constraints: increasing the number of<br />

time-series for which we have data in HIPM for model selection will induce<br />

better fits. In other words, the increase in number of known time-series of<br />

system variables leads to better model discrimination and therefore better<br />

model selection.<br />

• Hypothesis 2: Variables yield different values of information: some variables<br />

will have more discriminatory power and restrict the best fit models more than<br />

others.<br />

To test our two hypotheses it was imperative to employ a full data set including<br />

time-series for all variables of the system, in order to compare the results depending<br />

upon whether certain time-series are included or not as constraints for HIPM. Since<br />

no full data set with real-life data was available, we turned to a simulated data set<br />

called the “Coupled Ice and Ocean model” data set, otherwise referred to as the CIAO<br />

data set. This data set is generated from a three-dimensional ecosystem model that<br />

spans the entire water column and multiple stations across the Ross Sea. However,<br />

for our purposes only a portion of this data, the top 5 meters at the Ross Sea Polynya<br />

station 01, is used. The type of information contained in the CIAO data set is stated<br />

in Table 3.<br />

Table 3: Information included in the CIAO data set.<br />

NOTE: A sample of the CIAO 1997 data can be found in Appendix A.<br />

Symbol | Units | Description<br />

JDAY | Day | Day of the measurements<br />

TEMP | °C | Temperature of the water<br />

DPML | m | Mixed layer depth<br />

AI | | Sea ice concentration<br />

NITR | µM | Nitrate concentration<br />

PHOS | mg Chla/m³ | Phosphate concentration<br />

SILC | µM | Silicate concentration<br />

IRON | nM or µM | Iron concentration<br />

PARL | µmol photons m⁻² s⁻¹ | Solar radiation used by organisms in photosynthesis<br />

PHA | mg Chla/m³ | Phaeo chlorophyll concentration<br />

DIAT | mg Chla/m³ | Diatom chlorophyll concentration<br />

ZOO | mg C/m³ | Zooplankton concentration<br />

DET | mg C/m³ | Detritus concentration<br />

PURL | µmol photons m⁻² s⁻¹ | Photosynthetically usable radiation<br />

In addition to a full data set, it is necessary to have a working library that, as<br />

stated in Section 2.1.2, defines both entities and processes for HIPM. The process-entity<br />

library that we used is available in Appendices B and C; it was previously<br />

put together by Bridewell, Borrett, Langley and Arrigo.<br />

All the processes and<br />

subprocesses in which the instantiated entities can take a role in our study are<br />

represented in Figure 2.<br />


Having the background knowledge necessary for HIPM to conduct successful runs,<br />

we designed thirty-one experiments; each experiment represents a possible combination<br />

of time-series constraints that could be entered into the software.<br />

For example, if we had time-series for iron and nitrate and fed the information into<br />

HIPM, they would act as additional constraints in the model selection process. To<br />

be selected, models have to exhibit behavior close to the given time-series. All the<br />

experiments are summarized in Table 4.<br />
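The thirty-one experiments correspond to the non-empty subsets of the five entities (2^5 - 1 = 31). As a sketch, they can be enumerated in Python as follows; the variable names are illustrative, and the ordering by subset size is an assumption that appears to match the experiment IDs used in Table 4.<br />

```python
from itertools import combinations

ENTITIES = ["P", "Z", "D", "N", "F"]

# Every non-empty subset of the five entities, smallest subsets first.
experiments = [
    combo
    for size in range(1, len(ENTITIES) + 1)
    for combo in combinations(ENTITIES, size)
]

# Binary encoding: 1 if a time-series is supplied for the entity, 0 otherwise.
binary_rows = [
    [1 if entity in combo else 0 for entity in ENTITIES]
    for combo in experiments
]
```

With this ordering, `experiments[k]` corresponds to experiment ID k + 1; for instance `experiments[19]` is the combination (P, D, F) of experiment 20.<br />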



3 COMPUTATIONAL RESULTS<br />

The main goal of this paper is to determine how to optimize the use we make of<br />

HIPM to assist scientists in their decision-making process when it comes to selecting<br />

the model that most accurately represents an ecosystem. The first need is to narrow<br />

down the number of possible good fit models capable of describing the system. We<br />

attempted this by feeding additional time series about the state variables into HIPM,<br />

thus providing more constraints; did the assumption that this narrows the selection hold true?<br />

Secondly, if<br />

adding more constraints to HIPM does reduce that number, do observations for a<br />

specific state variable hold more reducing power than those for the other state variables?<br />

The data collected helped us answer these questions as well as discuss the efficiency<br />

of HIPM in its current state.<br />

Thirty-one different experiments were performed, each returning a measure of<br />

fit value (reMSE) for every one of the 1120 models tested. This<br />

makes for a large amount of data to analyze. To get a better idea of what this data<br />

looks like, the measure of fit values of models that had an reMSE between 0 and<br />

2 were ranked from lowest to highest and graphed (see Figures 4, 5<br />

and 6). We did not look at reMSE values higher than 2.0 since, as stated previously,<br />

models with an reMSE higher than 1.0 are typically classified as poorly performing<br />

models, as this indicates a very large difference between observed and expected values.<br />

We estimated that the (0,2) range would be sufficient for our purpose, as it would<br />

encompass most models. Based on these initial results we decided to pick an reMSE<br />

of 0.5 as our good fit model cutoff; any model under that cutoff is considered a good<br />

fit. This choice of cutoff was made because several of the graphs seemed to exhibit a<br />

turning point or slight step pattern around this reMSE value, as portrayed in<br />

the graphs for experiments 1, 5 and 20.<br />
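The ranking and counting step described above can be sketched in a few lines of Python. The function name and the sample values are made up for illustration; only the 0.5 cutoff and the upper plotting limit of 2.0 come from the text.<br />

```python
def rank_and_count(remse_values, cutoff=0.5, plot_max=2.0):
    """Rank reMSE values for plotting and count the good fit models."""
    ranked = sorted(v for v in remse_values if v <= plot_max)
    good = sum(1 for v in ranked if v < cutoff)
    return ranked, good


# Made-up reMSE values for six models of one experiment (illustration only).
values = [0.97, 0.12, 3.40, 0.51, 1.80, 0.48]
ranked, n_good = rank_and_count(values)
```

Here `ranked` holds the five values at or below 2.0 in increasing order, and `n_good` counts the two values strictly under the 0.5 cutoff.<br />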

[Figure 4: twelve panels plotting ranked reMSE values (vertical axis, 0.0 to 2.0) against model rank (horizontal axis) for experiments 1 to 12. Panels and good fit model counts: 1 [P] 197; 2 [Z] 101; 3 [D] 366; 4 [N] 439; 5 [F] 509; 6 [P,Z] 5; 7 [P,D] 61; 8 [P,N] 25; 9 [P,F] 79; 10 [Z,D] 8; 11 [Z,N] 1; 12 [Z,F] 0.]<br />

Figure 4: reMSE values are ranked from lowest to highest. The value reMSE = 0.5 signifies<br />

the good fit model cutoff; any models under that value are considered good fit models.<br />

The experimental setup for each run, as well as the ID number, is indicated in the<br />

top right corner.<br />

[Figure 5: twelve panels plotting ranked reMSE values (vertical axis, 0.0 to 2.0) against model rank (horizontal axis) for experiments 13 to 24. Panels and good fit model counts: 13 [D,N] 67; 14 [D,F] 190; 15 [N,F] 128; 16 [P,Z,D] 0; 17 [P,Z,N] 0; 18 [P,Z,F] 0; 19 [P,D,N] 13; 20 [P,D,F] 177; 21 [P,N,F] 15; 22 [Z,D,N] 0; 23 [Z,D,F] 0; 24 [Z,N,F] 3.]<br />

Figure 5: reMSE values are ranked from lowest to highest. The value reMSE = 0.5 signifies<br />

the good fit model cutoff; any models under that value are considered good fit models.<br />

The experimental setup for each run, as well as the ID number, is indicated in the<br />

top right corner.<br />

[Figure 6: seven panels plotting ranked reMSE values (vertical axis, 0.0 to 2.0) against model rank (horizontal axis) for experiments 25 to 31. Panels and good fit model counts: 25 [D,N,F] 39; 26 [P,Z,D,N] 0; 27 [P,Z,D,F] 0; 28 [P,Z,N,F] 2; 29 [P,D,N,F] 5; 30 [Z,D,N,F] 0; 31 [P,Z,D,N,F] 0.]<br />

Figure 6: reMSE values are ranked from lowest to highest. The value reMSE = 0.5 signifies<br />

the good fit model cutoff; any models under that value are considered good fit models.<br />

The experimental setup for each run, as well as the ID number, is indicated in the<br />

top right corner.<br />



3.1 Increase in number of time-series input<br />

One of the first observations made when looking at the data set is that,<br />

in general, the more time-series were used in HIPM, the smaller the number<br />

of good fit models, as represented in Figure 7.<br />

Number of Good Fit Models (reMSE


dramatically, with very small or nonexistent variance, and get very close or equal<br />

to zero. This suggests there may be some issues in the selection process, which could<br />

originate from over-constraining the system or from a need to improve the process-entity<br />

library. Furthermore, looking at Figure 6 we observe that there are no models<br />

with an reMSE lower than 1.5, which means that all models have performed poorly<br />

given the constraints. At first glance, and momentarily putting aside the observed<br />

behavior for four and five time-series constraints, we can conclude that adding up<br />

to three time-series constraints produces the desired effect and reduces the<br />

number of good fit models. But the conclusion of this initial examination does not<br />

always hold true, as a closer look at the data reveals.<br />

At this point my research entered the field of exploratory statistics as opposed to<br />

hypothesis-testing statistics; conventional statistical tools such as p-values or confidence<br />

intervals were not suitable to evaluate the hypotheses. The data has been reformatted<br />

into the more reader-friendly Table 4, which represents each experiment in a binary format:<br />

the instantiated entities given time-series for a run received a 1 and the ones with<br />

only initial values received a 0. In addition, the number of good fit models<br />

under each reMSE cutoff value from 0.1 to 1 was counted in order to analyze the<br />

individual effect, on the model selection process, of adding a time-series constraint for<br />

each entity. To do so, we select the subset of Table 4 for which a certain entity<br />

has the value of 1. For example, for P, we select the subset of rows where P has a<br />

value of 1. By doing so we are only looking at the runs in which the constraints on<br />

P had a role, excluding the experiments where P was not constrained.<br />

Looking carefully at Table 4, we notice that in experiment 1, when given<br />

observations only for phytoplankton, the number of good fit models under 0.5 reMSE<br />

is 197. In experiments 7 and 9, this number dropped to 61 and 79 respectively, with<br />

the addition of observations for detritus in one case and iron in the other. However,<br />

notice that in experiment 20, where observations for phytoplankton, detritus and<br />


Table 4: This table represents each experiment in binary form, 1 signifying that a<br />

time-series was given for this entity and 0 that no time-series was given for this run.<br />

We counted the number of models present under each reMSE cutoff value.<br />

Columns: experiment ID; data constraints (P, Z, D, N, F); number of models under each reMSE cutoff (.1 to 1)<br />

ID P Z D N F .1 .2 .3 .4 .5 .6 .7 .8 .9 1<br />

1 1 0 0 0 0 11 122 161 183 197 213 233 253 284 336<br />

2 0 1 0 0 0 14 38 46 72 101 141 186 236 296 487<br />

3 0 0 1 0 0 95 184 248 331 366 404 441 501 552 602<br />

4 0 0 0 1 0 67 188 301 376 439 482 517 531 547 605<br />

5 0 0 0 0 1 167 361 414 452 509 537 563 594 628 1094<br />

6 1 1 0 0 0 0 0 1 4 5 9 14 19 20 22<br />

7 1 0 1 0 0 0 18 35 45 61 75 88 94 100 102<br />

8 1 0 0 1 0 0 0 15 18 25 30 35 38 45 48<br />

9 1 0 0 0 1 0 37 60 73 79 91 110 123 143 158<br />

10 0 1 1 0 0 0 1 5 5 8 10 11 14 18 18<br />

11 0 1 0 1 0 0 0 0 1 1 1 1 1 1 5<br />

12 0 1 0 0 1 0 0 0 0 0 3 10 14 16 23<br />

13 0 0 1 1 0 2 8 13 48 67 93 120 142 167 178<br />

14 0 0 1 0 1 59 94 140 156 190 232 255 276 295 311<br />

15 0 0 0 1 1 23 40 57 89 128 151 179 218 252 290<br />

16 1 1 1 0 0 0 0 0 0 0 1 4 5 7 7<br />

17 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0<br />

18 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0<br />

19 1 0 1 1 0 0 0 0 9 13 19 22 24 24 25<br />

20 1 0 1 0 1 44 91 132 149 177 226 253 280 295 312<br />

21 1 0 0 1 1 0 0 0 11 15 17 25 25 27 31<br />

22 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0<br />

23 0 1 1 0 1 0 0 0 0 0 3 4 7 7 7<br />

24 0 1 0 1 1 0 0 1 1 3 4 5 6 7 8<br />

25 0 0 1 1 1 3 13 21 27 39 51 65 87 100 114<br />

26 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0<br />

27 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0<br />

28 1 1 0 1 1 0 0 1 1 2 2 3 4 5 6<br />

29 1 0 1 1 1 0 0 0 2 5 10 13 17 17 18<br />

30 0 1 1 1 1 0 0 0 0 0 1 2 2 2 2<br />

31 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0<br />

iron were used, the number of good fit models is 177, which is higher than<br />

that of experiments 7 and 9. This counterexample demonstrates that more<br />

observations are not always synonymous with fewer models. Yet, according to Table<br />

4, there are some experiments for which this assumption did hold true. Indeed, in<br />

experiment 8 both phytoplankton and nitrate are known and HIPM output 25 good<br />

fit models under 0.5 reMSE; when iron observations were added in experiment 21 we<br />

count 15 models selected, and when detritus was added in experiment 29 this<br />

number dropped to 5. Thus, in some cases more information will provide further<br />

restriction in the number of models selected.<br />

If observations and measurements are available for one or two state variables and we want to<br />

collect data about one of the remaining state variables, it is clear that our choice of<br />

which state variable to measure should be highly influenced by the restriction power of<br />

each variable, which can be estimated from data already known. To place this back<br />

into context, in experiment 9 data for both phytoplankton and iron were used in<br />

the selection process and the output was 79 good fit models under 0.5 reMSE; if<br />

we wanted to decrease this number, adding detritus would be an unwise choice, as<br />

experiment 20 then outputs 177 good fit models. However, this is not always the case. For instance,<br />

in experiment 8, where phytoplankton and nitrate data are known, the output was<br />

25 good fit models; with the addition of detritus this number went down to 13<br />

in experiment 19, in line with the assumption that more time-series data would<br />

constrain the model selection process further.<br />

However, the way the reMSE is calculated may be the reason why the results<br />

contradict our assumptions. The reMSE is the average of the fit for each of the<br />

variables being fitted. For example, if for a model I was fitting both phytoplankton<br />

and iron and they both had an reMSE of 0.6, their average would be 0.6 and the model<br />

would not be selected as a good fit model; but if in the next experiment we added<br />

detritus and it had a fit of 0.1, the overall average would then be about 0.43, which would<br />

put the model into the good fit model category. This explains why more time-series<br />

does not always mean fewer models being selected. Hence, different entities will yield<br />

different restriction powers based on the pre-existing knowledge. It seems that the<br />

value of the information varies according to the data previously available; but is<br />

there a specific state variable that tends to provide more restriction than the others<br />

regardless of the case or, conversely, is there a state variable that tends to provide very<br />

little additional information in the model selection process?<br />
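The averaging effect in the example above can be checked with a short calculation. The sketch below assumes a simple unweighted mean of the per-variable fits; the function name and the dictionary layout are illustrative only.<br />

```python
def mean_remse(per_variable_fits):
    """Average the per-variable reMSE values of one model."""
    return sum(per_variable_fits.values()) / len(per_variable_fits)


CUTOFF = 0.5

two_series = mean_remse({"P": 0.6, "F": 0.6})              # (0.6 + 0.6) / 2
three_series = mean_remse({"P": 0.6, "F": 0.6, "D": 0.1})  # (0.6 + 0.6 + 0.1) / 3

rejected_before = two_series >= CUTOFF  # model is not a good fit with P and F only
accepted_after = three_series < CUTOFF  # same model passes once D is added
```

The mean drops from 0.6 to roughly 0.43 even though the fit to phytoplankton and iron has not improved, so the model crosses the 0.5 cutoff purely because of the well-fitted third series.<br />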

3.2 Value of Information<br />

It seems realistic to think that each state variable could yield varying restrictive<br />

power when it comes to model selection through HIPM. To verify this assumption<br />

we used subsets of Table 4 to create Figure 8: five subsets were created, one for each of the<br />

state variables, using the experiments for which the observations and measurements<br />

for that variable were known. I evaluated the overall impact of each state variable<br />

over all experiments by first looking at the mean number of good fit models for each<br />

reMSE cutoff. Realizing that the variation in these subsets was great because of<br />

the presence of so many zeros, we decided to look at the median number of models<br />

for each of the different reMSE cutoff values in order to quantify the discriminatory<br />

power of each entity. We will refer to these values as Median Activation Values, in<br />

reference to the Bayesian Statistics Activation Probabilities that inspired this approach.<br />

The lower the Median Activation Value of an entity, the more restrictive power it holds.<br />

The activation value refers to the median number of models selected across<br />

all the runs that included that entity. In our case we are looking for the state variable<br />

that would reduce the number of good fit models the most. This is determined<br />

by the activation value: the lower it is, the more discriminatory power<br />

the variable holds at that particular cutoff. Figure 8 shows that zooplankton<br />

has the most discriminatory power for cutoffs between 0.4 and 1. Nitrate comes in<br />

second, with an overlap with iron for reMSE cutoffs between 0.4 and 0.6, but for<br />

higher cutoffs nitrate has a lower Median Activation Value than iron. As far as<br />

[Figure 8: plot of Median Activation Values (vertical axis, 0 to 25) against reMSE cutoff (horizontal axis, 0.2 to 1.0), with one series for each entity: P, Z, D, N, F.]<br />

Figure 8: Median Activation Values by different cutoffs. The lower the median activation<br />

value, the more discriminatory power that entity holds at that particular cutoff.<br />

Between 0.4 and 1, zooplankton consistently has the lowest median activation value.<br />

phytoplankton and detritus are concerned, they seem to have similar behavior with<br />

high activation values. Note that for cutoffs less than 0.4 no one entity<br />

seems to have greater discriminatory power. Based on this graph alone it would<br />

seem that zooplankton is the one entity that yields the most information when it<br />

comes to model selection and therefore would be the entity worth measuring in the<br />

field.<br />
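As a sketch of how a Median Activation Value could be computed, the following Python code takes the good fit model counts at the 0.5 cutoff from Table 4 and, for each entity, takes the median over the experiments that include that entity. It assumes that the experiment IDs follow the subset ordering produced by `itertools.combinations`, and it reads the definition above as a plain median; the names used are illustrative.<br />

```python
from itertools import combinations
from statistics import median

ENTITIES = ["P", "Z", "D", "N", "F"]

# Good fit model counts at the 0.5 cutoff for experiments 1 to 31 (Table 4).
COUNTS_AT_05 = [
    197, 101, 366, 439, 509, 5, 61, 25, 79, 8, 1, 0, 67, 190, 128,
    0, 0, 0, 13, 177, 15, 0, 0, 3, 39, 0, 0, 2, 5, 0, 0,
]

# Rebuild the constraint sets in the same order as the experiment IDs.
experiments = [
    combo
    for size in range(1, len(ENTITIES) + 1)
    for combo in combinations(ENTITIES, size)
]


def median_activation(entity):
    """Median good fit count over the experiments that constrain `entity`."""
    subset = [n for combo, n in zip(experiments, COUNTS_AT_05) if entity in combo]
    return median(subset)


medians = {entity: median_activation(entity) for entity in ENTITIES}
```

Under this reading, the values at the 0.5 cutoff are 5 for P, 0 for Z, 6.5 for D, and 4 for both N and F, which agrees with zooplankton having the lowest value and with nitrate and iron overlapping around this cutoff.<br />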

That said, Table 4 also reveals a worrisome number of data constraint combinations<br />

which yield no models with an reMSE less than one. An example is<br />

experiment 22, which yields no good fit models under the reMSE cutoff of 1. Another case<br />

that is cause for worry is the one where time-series are given for all 5 entities, for which<br />

we would expect at least one model to be selected within the 0.1 to 1 range<br />

of reMSE cutoffs. This observation raises the possibility that there is an underlying<br />

issue with the model selection process. Incidentally, all of the combinations<br />

that yield no models under the reMSE cutoff of 1 are experiments in which we gave<br />

Z a time-series constraint, which may say something about the processes that drive<br />

zooplankton; the library may be in need of improvement.<br />

3.3 Summary<br />

This analysis of the results allows us to make the following observations:<br />

• In most cases, increasing the number of time-series constraints up to 3 seemed<br />

to reduce the number of good fit models under a 0.5 cutoff. If we consider<br />

experiments 1 through 25, there was only one case (experiment 20) for which<br />

the number of good fit models increased and six cases for which the number of<br />

good fit models went to zero (experiments 12, 16, 17, 18, 22 and 23), which can be<br />

interpreted as a deficiency in the library. Overall, this result is due to the fact<br />

that the reMSE is an average of the fit of all the state variables for which we<br />

have time-series.<br />

• The decision as to which data to collect next should take into consideration the<br />

previously acquired time-series. Recommendations may differ based on the state<br />

variables previously measured and used with HIPM in the model selection<br />

process. The reason for this is once again the way the reMSE is calculated:<br />

how well the previously included time-series fitted a<br />

particular model determines whether the addition of another time-series<br />

will move that model into or out of the pool of good fit models.<br />

• For an reMSE cutoff of less than 0.4 the Median Activation Values for all 5<br />

entities blend together and are not useful. However, for reMSE cutoffs equal to<br />

0.4 or greater the Median Activation Values seem to indicate that zooplankton<br />

yields the most discriminatory power, which could be due to the numerous<br />

experiments for which we obtain zero good fit models. These zeros could be<br />

the result of one of two things: either the discriminatory power of zooplankton is<br />

superior or, the most plausible answer at this time, zooplankton<br />

is not appropriately defined in the process library.<br />

That being said, there are a couple of elements that raise questions in regard<br />

to the accuracy of the model selection process or the process-entity library. These<br />

elements are:<br />

• The behavior observed in Figure 7 with time-series for four or five of the entities,<br />

namely an average number of good fit models very close or equal to zero as well<br />

as a spike in reMSE, fosters doubt as to the accuracy of the selection process<br />

when provided too many constraints.<br />

• The lack of good fit models, under reMSE cutoffs ranging from 0.1 to 1, for<br />

many of the experiments that included zooplankton time-series constraints.<br />

This leads us to conjecture that when HIPM is given more than 3 data-series the<br />

system becomes overconstrained, thus preventing it from accurately selecting models.<br />

Another conjecture is that the entity zooplankton as defined in the process-entity<br />

library needs to be reviewed; it could be this element alone that is at the origin of<br />

this issue in the model selection. More specifically, one of the assumptions of the<br />

system is that zooplankton feed very little, if at all, on Phaeocystis antarctica, as it<br />

is more resistant to grazing in comparison to diatoms, which are more typically grazed<br />

upon by zooplankton. The way the process library is currently set up, diatoms are<br />

not taken into consideration, which could in turn affect how well zooplankton<br />

performs when fitting models.<br />


4 ANALYTICAL ANALYSIS<br />

The quantitative analysis of HIPM's results enabled us to make some useful observations. Since the main purpose of this software for a biologist is to approximate the observed natural system and then use the resulting model to perform experiments, we chose to analyze the two models with an reMSE of less than 0.5 that came up most frequently over all 31 runs.<br />

4.1 Most recurrent models<br />

The initial concept driving the models is represented in Figure 1. Phytoplankton plays a role in both Zooplankton and Detritus concentrations; it is acted upon by both Nitrate and Iron, which are in turn acted upon by Detritus. The environmental factors act on the Phytoplankton concentration as well as on Nitrate and Iron. Model A came up 13 times and Model B came up 11 times over all runs. We analyzed these models to determine whether their behavior makes sense from an ecological standpoint and whether they could give us information on how to improve the HIPM selection process. Each model is composed of five differential equations, one for each of the five principal concentrations: Phytoplankton (P), Zooplankton (Z), Detritus (D), Nitrate (N) and Iron (F). All these entities are acted upon by the sets of parameters listed in Tables 5 and 6.<br />

There is also a set of exogenous variables acting on the system, defined as follows: $E_{PUR}(t)$ is the photosynthetically usable radiation, $E_{T_{H_2O}}(t)$ is the temperature of the water and $E_{ice}(t)$ is the sea ice concentration.<br />


Table 5: This table summarizes all the parameters that play a role in Model A<br />

Model A<br />

ID Name Value<br />

a 0 phyto.max growth 0.8<br />

a 1 phyto.Ek max 12.033<br />

a 2 phyto.PhotoInhib 771.158<br />

a 3 arrigoetal1998 w photoinhibition coefficient 13.2302<br />

a 4 NO3 monod lim coefficient 0.00099718<br />

a 5 Fe monod lim coefficient 0.000394882<br />

a 6 phyto.exude rate 0.0228636<br />

a 7 NO3.toCratio 6.6<br />

a 8 Fe.toCratio 308026<br />

a 9 phyto.death rate 0.0311617<br />

a 10 environment.beta 0.327204<br />

a 11 zoo.death rate 0.270568<br />

a 12 zoo.assim eff 0.167516<br />

a 13 zoo.gmax 0.403535<br />

a 14 grazing ivlev delta coefficient 0.997648<br />

a 15 detritus.remin rate 0.0335311<br />

a 16 zoo.respiration rate 0.0103725<br />

a 17 phyto.sinking rate 0.015739<br />

a 18 detritus.sinking rate 0.074487<br />

a 19 NO3.avg deep conc 31<br />

a 20 NO3 linear temp control max mixing rate 0.729376<br />

a 21 Fe.avg deep conc 0.00045<br />

a 22 Fe linear temp control max mixing rate 0.00794959<br />



Model A<br />

\[ \frac{dP}{dt} = \Big[ \underbrace{\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]}_{\text{P Growth Rate}}\,(1 - a_6) - a_9 - a_{17} \Big] P - \Big( \underbrace{a_{13}\big(1 - e^{-a_{14}P}\big)}_{\text{Z Grazing Rate}}\,Z \Big) \tag{1} \]<br />

\[ \frac{dZ}{dt} = \Big[ a_{12}\,\underbrace{a_{13}\big(1 - e^{-a_{14}P}\big)}_{\text{Z Grazing Rate}} - a_{11} - a_{16} \Big] Z \tag{2} \]<br />

\[ \frac{dD}{dt} = \big((1 - a_{10})\,a_{11}P\big) + \big((1 - a_{10})\,a_{11}Z\big) + \Big((1 - a_{10})(1 - a_{12})\,\underbrace{a_{13}\big(1 - e^{-a_{14}P}\big)}_{\text{Z Grazing Rate}}\,Z\Big) - D(a_{15} + a_{18}) \tag{3} \]<br />

\[ \frac{dN}{dt} = \Big[ \underbrace{(a_{19} - N)\,a_{20}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}}_{\text{N Mixing Rate}} \Big] - \Big[ \frac{P}{(a_7 \cdot 12.0107)}\,\underbrace{\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]}_{\text{P Growth Rate}} \Big] + \frac{a_{15}D}{(a_7 \cdot 12.0107)} \tag{4} \]<br />

\[ \frac{dF}{dt} = \Big[ \underbrace{(a_{21} - F)\,a_{22}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}}_{\text{F Mixing Rate}} \Big] - \Big[ \frac{P}{(a_8 \cdot 12.0107)}\,\underbrace{\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]}_{\text{P Growth Rate}} \Big] + \frac{a_{15}D}{(a_8 \cdot 12.0107)} \tag{5} \]<br />

Where,<br />

\[ M(t) = \underbrace{\min\Big\{ \frac{F}{(F + a_5)},\ \frac{N}{(N + a_4)},\ \big(e^{-E_{PUR}(t)/a_2}\big)\Big(1 - e^{-E_{PUR}(t)\,\big(1 + a_3\,e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\big)/a_1}\Big) \Big\}}_{\text{Phytoplankton Growth Limitation}} \]<br />
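To make the structure of equations (1)-(5) concrete, the following is a minimal sketch of the Model A right-hand side in Python. It is not HIPM code. The parameter list is transcribed from Table 5, while the function names, the argument names, the constant name `K_12` and all numerical inputs in the usage lines are assumptions made for illustration only. The light term follows the form of M(t) as reconstructed above.<br />

```python
import math

# Parameter values transcribed from Table 5 (Model A), indexed a[0]..a[22].
A = [0.8, 12.033, 771.158, 13.2302, 0.00099718, 0.000394882, 0.0228636,
     6.6, 308026, 0.0311617, 0.327204, 0.270568, 0.167516, 0.403535,
     0.997648, 0.0335311, 0.0103725, 0.015739, 0.074487, 31,
     0.729376, 0.00045, 0.00794959]

K_12 = 12.0107  # constant as written in equations (4) and (5)


def growth_limitation(N, F, e_pur, a=A):
    """M(t): the smallest of the iron, nitrate and light limitation terms."""
    iron = F / (F + a[5])
    nitrate = N / (N + a[4])
    b = math.exp(1.089 - 2.12 * math.log10(a[1]))
    light = math.exp(-e_pur / a[2]) * (
        1.0 - math.exp(-e_pur * (1.0 + a[3] * math.exp(e_pur * b)) / a[1]))
    return min(iron, nitrate, light)


def model_a_rhs(state, e_pur, e_temp, e_ice, temp_min, temp_max, a=A):
    """Right-hand sides of (1)-(5) for state = (P, Z, D, N, F)."""
    P, Z, D, N, F = state
    growth = ((1.0 - e_ice) * a[0] * math.exp(0.06933 * e_temp)
              * growth_limitation(N, F, e_pur, a))
    grazing = a[13] * (1.0 - math.exp(-a[14] * P))
    mixing = (temp_max - e_temp) / (temp_max - temp_min)
    dP = (growth * (1.0 - a[6]) - a[9] - a[17]) * P - grazing * Z
    dZ = (a[12] * grazing - a[11] - a[16]) * Z
    dD = ((1.0 - a[10]) * a[11] * P + (1.0 - a[10]) * a[11] * Z
          + (1.0 - a[10]) * (1.0 - a[12]) * grazing * Z
          - D * (a[15] + a[18]))
    dN = ((a[19] - N) * a[20] * mixing
          - P / (a[7] * K_12) * growth + a[15] * D / (a[7] * K_12))
    dF = ((a[21] - F) * a[22] * mixing
          - P / (a[8] * K_12) * growth + a[15] * D / (a[8] * K_12))
    return dP, dZ, dD, dN, dF


# Hypothetical state and forcing values, chosen only to exercise the function.
example_state = (1.0, 0.5, 0.2, 20.0, 0.0003)
example_rhs = model_a_rhs(example_state, e_pur=20.0, e_temp=0.0,
                          e_ice=0.2, temp_min=-2.0, temp_max=2.0)
```

Writing the growth, grazing and mixing terms once and reusing them mirrors the underbraced groups in the equations, which makes it easier to see which terms are shared between the five state variables.<br />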


Table 6: This table summarizes all the parameters that play a role in Model B<br />

Model B<br />

ID Name Value<br />

a 0 phyto.max growth 0.561196<br />

a 1 phyto.Ek max 37.5096<br />

a 2 phyto.PhotoInhib 394.809<br />

a 3 arrigoetal1998 w photoinhibition coefficient 10.7433<br />

a 4 nut lim exp coefficient 0.784127<br />

a 5 monod lim coefficient 0.000722964<br />

a 6 phyto.exude rate 0.168121<br />

a 7 NO3.toCratio 6.6<br />

a 8 Fe.toCratio 335345<br />

a 9 phyto.death rate 0.0293637<br />

a 10 environment.beta 0.473748<br />

a 11 zoo.death rate 0.00199206<br />

a 12 zoo.assim eff 0.307847<br />

a 13 zoo.gmax 0.350046<br />

a 14 zoo.gcap 288.23<br />

a 15 zoo.glim 19.0002<br />

a 16 phyto.biomin 0.0201679<br />

a 17 detritus.remin rate 0.03<br />

a 18 zoo.respiration rate 0.0234653<br />

a 19 phyto.sinking rate 0.00273829<br />

a 20 detritus.sinking rate 0.00390565<br />

a 21 NO3.avg deep conc 31<br />

a 22 NO3 linear temp control max mixing rate 0.00307192<br />

a 23 Fe.avg deep conc 0.00045<br />

a 24 Fe linear temp control max mixing rate 0.00444523<br />



Model B<br />

\[ \frac{dP}{dt} = \Big[ \underbrace{\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]}_{\text{P Growth Rate}}\,P\,(1 - a_6) \Big] - (a_9 P) - \big(H(t)Z\big) - (a_{19}P) \tag{6} \]<br />

\[ \frac{dZ}{dt} = \big(a_{12}H(t)Z\big) - a_{11}Z^2 - a_{18}Z \tag{7} \]<br />

\[ \frac{dD}{dt} = \big((1 - a_{10})\,a_9 P\big) + \big((1 - a_{10})\,a_{11}Z^2\big) + \big((1 - a_{10})(1 - a_{12})\,H(t)Z\big) - D(a_{17} + a_{20}) \tag{8} \]<br />

\[ \frac{dN}{dt} = \Big[\frac{a_{17}D}{(a_7 \cdot 12.0107)}\Big] + \Big[(a_{21} - N)\,\underbrace{\Big(a_{22}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}\Big)}_{\text{N Mixing Rate}}\Big] - \Big[\frac{P}{(a_7 \cdot 12.0107)}\,\underbrace{\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]}_{\text{P Growth Rate}}\Big] \tag{9} \]<br />

\[ \frac{dF}{dt} = \Big[\frac{D\,a_{17}}{(a_8 \cdot 12.0107)}\Big] + \Big[\underbrace{(a_{23} - F)\,a_{24}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}}_{\text{F Mixing Rate}}\Big] - \Big[\frac{P}{(a_8 \cdot 12.0107)}\,\underbrace{\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]}_{\text{P Growth Rate}}\Big] \tag{10} \]<br />

Where,<br />

\[ M(t) = \underbrace{\min\Big\{ \frac{F}{(F + a_5)},\ \big(1 - e^{-a_5 N}\big),\ \big(e^{-E_{PUR}(t)/a_2}\big)\Big(1 - e^{-E_{PUR}(t)\,\big(1 + a_3\,e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\big)/a_1}\Big) \Big\}}_{\text{Phytoplankton Growth Limitation}} \]<br />

\[ H(t) = \underbrace{\max\Big\{ 0,\ \frac{a_{13}(P - a_{16} - a_{15})}{a_{14} + (P - a_{16} - a_{15})} \Big\}}_{\text{Zooplankton Grazing Rate}} \]<br />


4.2 Preliminaries<br />

Both Models A and B have complex structures which differ from the more theoretical models studied by mathematicians. Since solving these differential equations directly is extremely difficult, we decided to take a more indirect approach by looking at the bounds of the functions, using the positivity lemma and the comparison argument which follow.<br />

Lemma 1 A Positivity Lemma. Let $W(t)$ be a smooth function over a domain $[0, T]$, $T \in \mathbb{R}$. If $W$ satisfies $W'(t) + M(t)W(t) \ge 0$ in $(0, T]$ and $W(0) \ge 0$, where $M(t)$ is a bounded function in $[0, T]$, then $W(t) \ge 0$ on $[0, T]$.<br />

Proof: We prove this lemma by contradiction. Assume that the statement $W(t) \ge 0$ in $[0, T]$ were not true. Then there would exist a point $t_0 \in [0, T]$ such that $W(t_0)$ is a negative minimum of $W$ on $[0, T]$. Since $W(0) \ge 0$, we have $t_0 \in (0, T]$, which means that<br />

\[ W'(t_0) + M(t_0)W(t_0) \ge 0. \]<br />

Since $W$ reaches its minimum value at $t_0$, we have $W'(t_0) = 0$ if $t_0 \ne T$ and $W'(t_0) \le 0$ if $t_0 = T$. This ensures that<br />

\[ M(t_0)W(t_0) \ge 0, \]<br />

which contradicts our assumption that $W(t_0) < 0$ when $M(t_0) > 0$.<br />

For the case of $M(t_0) \le 0$, we let $V(t) = e^{-\gamma t}W(t)$ for some constant $\gamma$ with $\gamma > -M(t)$ in $(0, T]$. Then $V$ satisfies the relation $V'(t) + (\gamma + M(t))V(t) \ge 0$ in $(0, T]$ and $V(0) \ge 0$, where $\gamma + M(t) > 0$ for all $t \in (0, T]$. From the above arguments we have $V(t) \ge 0$ in $[0, T]$. It follows from $W(t) = e^{\gamma t}V(t)$ that $W(t) \ge 0$ on $[0, T]$. □<br />



As an application of Lemma 1, we have the following comparison argument for the respective solutions $u_1$ and $u_2$ of the initial-value problem<br />

\[ u_i' = f_i(t, u_i) \ \text{ in } (0, T], \qquad u_i(0) = u_{i,0}, \tag{11} \]<br />

where $i = 1, 2$, and $f_1$ and $f_2$ are continuous functions in $[0, T] \times \mathbb{R}$.<br />

Lemma 2 The Comparison Argument. Assume that both $\frac{\partial f_1}{\partial u}$ and $\frac{\partial f_2}{\partial u}$ are continuous in $[0, T] \times \mathbb{R}$. If $f_1(t, u) \le f_2(t, u)$ in $(0, T] \times \mathbb{R}$ and $u_{1,0} \le u_{2,0}$, then the respective solutions $u_1$ and $u_2$ of (11) satisfy $u_1(t) \le u_2(t)$ on $[0, T]$.<br />

Proof: Let $W = u_2 - u_1$, and let $M = M(t)$ be any bounded function on $[0, T]$. Then by (11), $W$ satisfies<br />

\[ W'(t) + M(t)W(t) = M(t)[u_2(t) - u_1(t)] + f_2(t, u_2(t)) - f_1(t, u_1(t)) \ \text{ in } (0, T], \qquad W(0) = u_{2,0} - u_{1,0} \ge 0. \]<br />

Since $\frac{\partial f_1}{\partial u}$ is continuous in $u$, by the mean value theorem [2],<br />

\[ f_2(t, u_2) - f_1(t, u_1) = [f_2(t, u_2) - f_1(t, u_2)] + [f_1(t, u_2) - f_1(t, u_1)] \ge \frac{\partial f_1}{\partial u}(t, \hat{\eta})(u_2 - u_1), \]<br />

where $\hat{\eta} = \hat{\eta}(t)$ is an intermediate value between $u_1$ and $u_2$. Hence, for the bounded function $M(t) = -\frac{\partial f_1}{\partial u}(t, \hat{\eta}(t))$, $W$ satisfies $W'(t) + M(t)W(t) \ge 0$ in $(0, T]$. It is known from Lemma 1 that $W \ge 0$, i.e. $u_2(t) \ge u_1(t)$ on $[0, T]$. This proves Lemma 2. □<br />
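As an illustration of the Comparison Argument, the sketch below integrates two scalar problems with $f_1 \le f_2$ and equal initial values, and checks that the numerical solutions stay ordered. The right-hand sides, the step count and the helper name `rk4` are all chosen for this example and are not taken from the thesis; the integrator is a plain fourth-order Runge-Kutta scheme.<br />

```python
import math


def rk4(f, u0, t_end, steps):
    """Integrate u' = f(t, u) from t = 0 with classical 4th-order Runge-Kutta."""
    h = t_end / steps
    t, u = 0.0, u0
    path = [u]
    for _ in range(steps):
        k1 = f(t, u)
        k2 = f(t + h / 2, u + h * k1 / 2)
        k3 = f(t + h / 2, u + h * k2 / 2)
        k4 = f(t + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
        path.append(u)
    return path


def f1(t, u):
    return -u          # smaller right-hand side


def f2(t, u):
    return -u + 1.0    # f1(t, u) <= f2(t, u) for every (t, u)


u1 = rk4(f1, 0.5, 5.0, 500)   # exact solution: 0.5 * exp(-t)
u2 = rk4(f2, 0.5, 5.0, 500)   # exact solution: 1 - 0.5 * exp(-t)
ordered = all(x <= y for x, y in zip(u1, u2))
```

Here `ordered` records whether $u_1(t) \le u_2(t)$ holds at every grid point, which is what Lemma 2 predicts for the exact solutions.<br />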



In addition to these two lemmas, let us introduce a method for solving a first-order linear differential equation.<br />

Proposition 1 Suppose $u$ is a function that satisfies<br />

\[ \frac{du}{dt} = \alpha u + \beta, \qquad u(0) = u_0, \]<br />

with constants $\alpha \ne 0$ and $\beta$. Then<br />

\[ u(t) = \Big(u_0 + \frac{\beta}{\alpha}\Big)e^{\alpha t} - \frac{\beta}{\alpha}. \]<br />

Proof: $\frac{du}{dt} = \alpha u + \beta$ is a first-order linear differential equation and can be solved using the method of integrating factors:<br />

\[ (e^{-\alpha t}u)' = \beta e^{-\alpha t}, \]<br />

\[ e^{-\alpha t}u = \int \beta e^{-\alpha t}\,dt = -\frac{\beta}{\alpha}e^{-\alpha t} + C, \]<br />

\[ u = -\frac{\beta}{\alpha} + Ce^{\alpha t}. \]<br />

Using the initial condition $u(0) = u_0$ we find $C = u_0 + \frac{\beta}{\alpha}$ and hence the following solution:<br />

\[ u = \Big(u_0 + \frac{\beta}{\alpha}\Big)e^{\alpha t} - \frac{\beta}{\alpha}. \]<br />

This proves Proposition 1. □<br />
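The closed form in Proposition 1 can be checked numerically. The sketch below evaluates the formula and measures how well it satisfies $u' = \alpha u + \beta$ using a central finite difference. The function names and the sample values of $\alpha$, $\beta$ and $u_0$ are assumptions chosen for illustration.<br />

```python
import math


def linear_ode_solution(alpha, beta, u0, t):
    """Closed form of u' = alpha*u + beta, u(0) = u0, valid for alpha != 0."""
    return (u0 + beta / alpha) * math.exp(alpha * t) - beta / alpha


def ode_residual(alpha, beta, u0, t, h=1e-5):
    """Central-difference estimate of u'(t) minus (alpha*u(t) + beta)."""
    du = (linear_ode_solution(alpha, beta, u0, t + h)
          - linear_ode_solution(alpha, beta, u0, t - h)) / (2.0 * h)
    return du - (alpha * linear_ode_solution(alpha, beta, u0, t) + beta)


# Hypothetical coefficients: with alpha < 0 the solution relaxes to -beta/alpha.
alpha, beta, u0 = -0.5, 2.0, 1.0
start_value = linear_ode_solution(alpha, beta, u0, 0.0)
long_run_value = linear_ode_solution(alpha, beta, u0, 200.0)
residual_at_3 = ode_residual(alpha, beta, u0, 3.0)
```

For negative $\alpha$ the exponential term dies out and the solution approaches $-\beta/\alpha$, which is the mechanism behind the constant limits that appear in the bounds derived later in this chapter.<br />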



4.3 Model A<br />

Our analysis of Model A begins with iron and nitrate, two entities whose equations have very similar structure and differ only in their variables and parameters.<br />

\[ \frac{dN}{dt} = \Big[ (a_{19} - N)\,\overbrace{a_{20}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}}^{\le a_{20}} \Big] - \Big[ \frac{P}{(a_7 \cdot 12.0107)}\,\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big] \Big] + \frac{a_{15}D}{(a_7 \cdot 12.0107)} \]<br />

We chose a very wide upper bound in order to keep the bounds simple yet still informative. Thus, for the upper bound we decided to drop the subtracted term:<br />

\[ \frac{dN}{dt} \le (a_{19} - N)\,a_{20} = a_{19}a_{20} - N a_{20}. \]<br />

Then, solving for $N^u(t)$,<br />

\[ \frac{dN^u}{dt} + N^u a_{20} = a_{19}a_{20}. \]<br />

Using the integrating factor $e^{a_{20}t}$ and Proposition 1 we get<br />

\[ N^u(t) = a_{19} + R^u e^{-a_{20}t}, \]<br />

where $R^u = N_0 - a_{19}$. As far as the lower bound is concerned, we chose it to be zero. Summarizing the bounds, we get<br />

\[ 0 \le N(t) \le a_{19} + R^u e^{-a_{20}t}. \tag{12} \]<br />

When $t \to \infty$ we get the following,<br />

\[ 0 \le N(t) \le a_{19}. \tag{13} \]<br />
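The upper bound in (12) is easy to evaluate directly. The sketch below uses the Table 5 values of $a_{19}$ and $a_{20}$; the initial concentration `N0` is a hypothetical value chosen for illustration and does not come from the thesis data.<br />

```python
import math

A19 = 31.0       # NO3.avg deep conc, Table 5
A20 = 0.729376   # NO3 max mixing rate, Table 5


def nitrate_upper_bound(n0, t, a19=A19, a20=A20):
    """Right-hand side of (12): a19 + (N0 - a19) * exp(-a20 * t)."""
    return a19 + (n0 - a19) * math.exp(-a20 * t)


N0 = 25.0  # hypothetical initial nitrate concentration
bound_samples = [(t, nitrate_upper_bound(N0, t)) for t in (0.0, 1.0, 5.0, 20.0)]
```

When `N0` is below $a_{19}$ the bound rises monotonically from `N0` toward $a_{19}$, at a speed set by $a_{20}$, which is the behavior summarized in (13).<br />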

This result tells us that the Nitrate concentration in this model will not exceed the value of parameter 19, which is the Nitrate average deep concentration. (12) also tells us that the maximum rate of decline of Nitrate will be that of parameter 20, which represents the Nitrate maximum mixing rate. This means that the accuracy of the Nitrate concentration is extremely dependent on how well the parameters are selected. Since Iron has the same equation structure, the same analysis applies:<br />

\[ \frac{dF}{dt} = \Big[ (a_{21} - F)\,\overbrace{a_{22}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}}^{\le a_{22}} \Big] - \Big[ \frac{P}{(a_8 \cdot 12.0107)}\,\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big] \Big] + \frac{a_{15}D}{(a_8 \cdot 12.0107)} \]<br />

Using the same method as for Nitrate, the upper bound for Iron is<br />

\[ F^u(t) = a_{21} + Q^u e^{-a_{22}t}, \]<br />

where $Q^u = F_0 - a_{21}$. Summarizing the bounds, we get<br />

\[ 0 \le F(t) \le a_{21} + Q^u e^{-a_{22}t}. \tag{14} \]<br />

When $t \to \infty$ this becomes<br />

\[ \therefore\ 0 \le F(t) \le a_{21}. \tag{15} \]<br />


As for Nitrate, the Iron concentration will not exceed a maximum set by parameter 21, which is the Iron average deep concentration. Similarly, the maximum rate of decline for Iron is set by its maximum mixing rate. The behavior of the Iron and Nitrate concentrations is thus dependent on the accuracy of the parameter selection process.<br />

The next entity in our analysis is zooplankton, as it has a fairly simple equation structure. By the Comparison Argument in Lemma 2, and since $(1 - e^{-a_{14}P}) \le 1$, we can write<br />

\[ \frac{dZ}{dt} = \Big[a_{12}a_{13}\big(1 - e^{-a_{14}P}\big) - a_{11} - a_{16}\Big]Z \le \Big[a_{12}a_{13} - a_{11} - a_{16}\Big]Z. \]<br />

Thus, by Proposition 1,<br />

\[ Z(t) \le Z_0 e^{(a_{12}a_{13} - a_{11} - a_{16})t} = Z_0 e^{-0.2166944309\,t}. \]<br />

We know that by definition the Zooplankton concentration cannot be negative, which gives us a lower bound of zero. Then,<br />

\[ 0 \le Z(t) \le Z_0 e^{-\delta t}, \quad \text{where } \delta = 0.2166944309. \tag{16} \]<br />

We notice that as $t \to \infty$, $Z(t)$ goes to zero,<br />

\[ \lim_{t \to +\infty} Z(t) = 0, \tag{17} \]<br />

which implies that the Zooplankton population is driven to extinction. Since for a biologist this result goes against expectations, the validity of this model structure is questioned.<br />


For the Zooplankton concentration not to go to zero as $t$ goes to infinity, $(a_{12}a_{13} - a_{11} - a_{16})$ would have to be greater than zero. This may be a clue for refining the constraints on the parameter selection process: requiring this quantity to be strictly positive would ensure that the zooplankton concentration is not forced to zero for this model structure.<br />
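The two observations above can be sketched in a few lines: how quickly the bound (16) decays, and what sign condition a parameter set must meet for the bound not to decay at all. The value of $\delta$ is the one reported in (16). The function names and the sample parameter set in the last line are hypothetical and serve only as an illustration of a possible constraint check.<br />

```python
import math

DELTA = 0.2166944309  # decay rate delta reported in (16)


def zoo_upper_bound(z0, t, delta=DELTA):
    """Upper bound Z0 * exp(-delta * t) from (16)."""
    return z0 * math.exp(-delta * t)


def persistence_margin(assim_eff, gmax, death_rate, respiration_rate):
    """a12*a13 - a11 - a16; it must be positive for the bound not to decay."""
    return assim_eff * gmax - death_rate - respiration_rate


# Time after which the bound (16) has halved.
half_life = math.log(2.0) / DELTA

# Hypothetical parameter set that would satisfy the positivity condition.
margin_ok = persistence_margin(0.5, 0.6, 0.1, 0.05)
```

A check of this kind could be applied to candidate parameter sets before fitting, under the assumption that excluding sets with a non-positive margin is acceptable from a modeling standpoint.<br />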

This result is then used to further our analysis by looking at the Phytoplankton equation (1), since knowing $P(t)$ will help us find bounds for the other entities. The phytoplankton differential equation, like those of Nitrate and Iron, contains a minimum function $M(t)$, which is not often found in differential equations. In order to find bounds for $P(t)$ we must first find bounds for $M(t)$. Recall,<br />

\[ M(t) = \min\Big\{ \frac{F}{(F + a_5)},\ \frac{N}{(N + a_4)},\ \big(e^{-E_{PUR}(t)/a_2}\big)\Big(1 - e^{-E_{PUR}(t)\,\big(1 + a_3\,e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\big)/a_1}\Big) \Big\} \]<br />

Since $M(t)$ is a minimum function, it always takes the smallest value of the three functions stated above. Thus, using (15), we can safely estimate the range of $M(t)$ to be<br />

\[ 0 \le M(t) \le \frac{F_{upperbound}}{F_{upperbound} + a_5} = \frac{a_{21}}{a_{21} + a_5} = 0.53262. \tag{18} \]<br />

To find an upper bound for $P(t)$ we use the lower bound of (16): since the $Z(t)$ term is subtracted, we replace $Z(t)$ by its smallest value (i.e. its lower bound). By Lemma 2 we have<br />

\[ \frac{dP}{dt} = \Big[\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big](1 - a_6) - a_9 - a_{17}\Big]P - \Big(a_{13}\big(1 - e^{-a_{14}P}\big)Z\Big) \le \Big[\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big](1 - a_6) - a_9 - a_{17}\Big]P. \]<br />

Using the exogenous time-series at our disposal we find<br />

\[ 0.01406716 \le \big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\big] \le 0.8639959. \tag{19} \]<br />

Using (18) and (19), we may rewrite this as follows,<br />

\[ \frac{dP}{dt} \le \underbrace{\big[(0.8639959)(0.53262)(1 - a_6) - (a_9 + a_{17})\big]}_{\alpha_u}\,P = (0.402759)\,P, \]<br />

\[ \frac{dP}{dt} \le \alpha_u P, \quad \text{where } \alpha_u = 0.402759. \]<br />
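The arithmetic behind (18) and $\alpha_u$ can be reproduced from the Table 5 values and the upper end of (19). In the sketch below the variable and function names are chosen for illustration, and the rounding of the $M(t)$ bound to five decimals follows the value printed in (18).<br />

```python
import math

# Values transcribed from Table 5 (Model A).
a5, a6, a9, a17, a21 = 0.000394882, 0.0228636, 0.0311617, 0.015739, 0.00045
ENV_MAX = 0.8639959  # upper end of the range in (19)

# Upper bound of M(t) from (18), rounded as printed in the text.
m_upper = round(a21 / (a21 + a5), 5)

# Maximum net growth rate alpha_u of the phytoplankton upper bound.
alpha_u = ENV_MAX * m_upper * (1.0 - a6) - (a9 + a17)


def phyto_upper_bound(p0, t):
    """Exponential upper bound P0 * exp(alpha_u * t)."""
    return p0 * math.exp(alpha_u * t)
```

Because `alpha_u` is positive, the resulting upper bound grows without limit, which is why this bound alone says little about the long-term phytoplankton concentration.<br />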

For the lower bound of $P(t)$, using the upper bound of (16), (18) and Lemma 2, we get a manageable lower bound,<br />

\[ \frac{dP}{dt} \ge -\underbrace{(a_9 + a_{17})}_{\alpha_l}\,P - \underbrace{a_{13}\big(1 - e^{-a_{14}P}\big)}_{\le a_{13}}\ \underbrace{Z_0 e^{-\delta t}}_{(16)} \ge -\alpha_l P - a_{13}Z_0 e^{-\delta t}, \quad \text{where } \alpha_l = 0.0469007. \]<br />

Solving with an integrating factor as in Proposition 1, we get<br />

\[ P(t) \ge \Big(P_0 + \frac{a_{13}}{\alpha_l - \delta}Z_0\Big)e^{-\alpha_l t} - \frac{a_{13}}{\alpha_l - \delta}Z_0 e^{-\delta t}. \]<br />

Summarizing the bounds for $P(t)$, we get<br />

\[ \therefore\ \Big(P_0 + \frac{a_{13}}{\alpha_l - \delta}Z_0\Big)e^{-\alpha_l t} - \frac{a_{13}}{\alpha_l - \delta}Z_0 e^{-\delta t} \le P(t) \le P_0 e^{\alpha_u t}, \tag{20} \]<br />

\[ P(t) > 0 \ \text{ for } t \in (0, +\infty). \]<br />

From a biological standpoint, $\alpha_l$ is the maximum rate of decline and $\alpha_u$ is the maximum rate of growth. These bounds give us little information about the model, as they simply state that the Phytoplankton concentration is contained between zero and infinity.<br />

Next we look at Detritus:<br />

\[ \begin{aligned} \frac{dD}{dt} &= \big((1 - a_{10})\,a_{11}P\big) + \Big(a_{11} + (1 - a_{12})\,\overbrace{a_{13}\big(1 - e^{-a_{14}P}\big)}^{\le a_{13}}\Big)(1 - a_{10})Z - D(a_{15} + a_{18}) \\ &\le \big((1 - a_{10})\,a_{11}P\big) + \big(a_{11} + (1 - a_{12})\,a_{13}\big)(1 - a_{10})Z - D(a_{15} + a_{18}) \\ &\le (1 - a_{10})\,a_{11}\underbrace{P_0 e^{\alpha_u t}}_{(20)} + \big(a_{11} + (1 - a_{12})\,a_{13}\big)(1 - a_{10})\underbrace{Z_0 e^{-\delta t}}_{(16)} - D(a_{15} + a_{18}). \end{aligned} \]<br />

Using (16) and (20) and simplifying a bit, we thus get a more manageable upper bound. Solving for the upper bound $D^u(t)$,<br />

\[ \frac{dD^u}{dt} + (a_{15} + a_{18})D^u = (1 - a_{10})\,a_{11}P_0 e^{\alpha_u t} + \big(a_{11} + (1 - a_{12})\,a_{13}\big)(1 - a_{10})Z_0 e^{-\delta t}. \]<br />

Using the integrating factor $e^{(a_{15} + a_{18})t}$ and Proposition 1 we get<br />

\[ D^u(t) = \underbrace{\Big(\frac{(1 - a_{10})\,a_{11}}{\alpha_u + a_{15} + a_{18}}\Big)}_{\beta_{u1}} P_0 e^{\alpha_u t} + \underbrace{\Big(\frac{(a_{11} + (1 - a_{12})\,a_{13})(1 - a_{10})}{-\delta + a_{15} + a_{18}}\Big)}_{\beta_{u2}} Z_0 e^{-\delta t} + C_u e^{-(a_{15} + a_{18})t}. \]<br />

Let's rewrite to simplify the expression a bit,<br />

\[ D^u(t) = \beta_{u1}P_0 e^{\alpha_u t} + \beta_{u2}Z_0 e^{-\delta t} + C_u e^{-(a_{15} + a_{18})t}, \]<br />

where $\beta_{u1} = 0.224039$ and $\beta_{u2} = -3.754762$.<br />

Assuming $D(0) = D_0$ and still following Proposition 1, we solve for $C_u$,<br />

\[ C_u = D_0 - \beta_{u1}P_0 - \beta_{u2}Z_0. \]<br />
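The detritus bound is an instance of a linear equation $u' + ku = \sum_i c_i e^{r_i t}$, whose solution has one coefficient $c_i/(r_i + k)$ per forcing term plus a constant fixed by the initial condition. The sketch below implements that pattern and checks it with a finite difference. The function names and the sample numbers are hypothetical and are not the thesis parameters.<br />

```python
import math


def forced_linear_solution(k, terms, u0, t):
    """Solve u' + k*u = sum(c * exp(r*t)), u(0) = u0, assuming r + k != 0.

    `terms` is a list of (c, r) pairs. Each pair contributes
    c / (r + k) * exp(r * t); the factor in front of exp(-k * t)
    is fixed by the initial condition.
    """
    betas = [(c / (r + k), r) for c, r in terms]
    const = u0 - sum(b for b, _ in betas)
    return sum(b * math.exp(r * t) for b, r in betas) + const * math.exp(-k * t)


def forced_residual(k, terms, u0, t, h=1e-5):
    """Central-difference estimate of u' + k*u minus the forcing at time t."""
    du = (forced_linear_solution(k, terms, u0, t + h)
          - forced_linear_solution(k, terms, u0, t - h)) / (2.0 * h)
    forcing = sum(c * math.exp(r * t) for c, r in terms)
    return du + k * forced_linear_solution(k, terms, u0, t) - forcing


# Hypothetical example: one growing and one decaying forcing term.
example_terms = [(0.5, 0.4), (0.2, -0.3)]
example_value = forced_linear_solution(0.1, example_terms, 1.0, 2.0)
example_residual = forced_residual(0.1, example_terms, 1.0, 2.0)
```

In the notation of the text, $k = a_{15} + a_{18}$, the pair $(c, r)$ for the phytoplankton term is $((1 - a_{10})a_{11}P_0,\ \alpha_u)$, and the resulting coefficient $c/(r + k)$ is $\beta_{u1}P_0$; the zooplankton term works the same way with $r = -\delta$.<br />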

Similarly for the lower bound, using Proposition 1, (16) and (20) we get<br />

\[ D^l(t) = \underbrace{\Big(\frac{(1 - a_{10})\,a_{11}}{-\alpha_l + a_{15} + a_{18}}\Big)}_{\beta_l} P_0 e^{-\alpha_l t} + C_l e^{-(a_{15} + a_{18})t}. \]<br />

Let's rewrite it as<br />

\[ D^l(t) = \beta_l P_0 e^{-\alpha_l t} + C_l e^{-(a_{15} + a_{18})t}, \]<br />

where $\beta_l = 2.9784819$ and $C_l = D_0 - \beta_l P_0$.<br />

Summarizing the bounds,<br />

\[ \beta_l P_0 e^{-\alpha_l t} + C_l e^{-(a_{15} + a_{18})t} \le D(t) \le \beta_{u1}P_0 e^{\alpha_u t} + \beta_{u2}Z_0 e^{-\delta t} + C_u e^{-(a_{15} + a_{18})t}. \tag{21} \]<br />

This concludes our analysis of Model A; the results will be discussed further on.<br />


4.4 Model B<br />

Shifting our focus to Model B, we find different model structures. Indeed, (7) yields a Lotka-Volterra structure, which makes for an interesting analysis.<br />

Following the procedure used for Model A, we start our analysis with the Zooplankton equation (7), the simplest of all five. To find bounds for $Z(t)$ we first need to find those of $H(t)$.<br />

\[ H(t) = \max\Big\{0,\ \frac{a_{13}(P - a_{16} - a_{15})}{a_{14} + (P - a_{16} - a_{15})}\Big\}, \qquad \frac{a_{13}(P - a_{16} - a_{15})}{a_{14} + (P - a_{16} - a_{15})} \le a_{13}. \]<br />

Thus we get<br />

\[ 0 \le H(t) < a_{13}. \tag{22} \]<br />
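The behavior of the grazing term that leads to (22) can be seen by evaluating it for a few phytoplankton concentrations. The sketch below uses the Table 6 values; the function name and the sample concentrations are assumptions chosen for illustration.<br />

```python
# Values transcribed from Table 6 (Model B).
A13 = 0.350046    # zoo.gmax
A14 = 288.23      # zoo.gcap
A15 = 19.0002     # zoo.glim
A16 = 0.0201679   # phyto.biomin


def grazing_rate(P):
    """H for a phytoplankton concentration P >= 0, following the max form."""
    x = P - A16 - A15
    return max(0.0, A13 * x / (A14 + x))


# Hypothetical concentrations below and above the threshold a15 + a16.
samples = {P: grazing_rate(P) for P in (10.0, 50.0, 500.0, 5000.0)}
```

Below the threshold $a_{15} + a_{16}$ the rate is zero; above it the rate increases with $P$ and approaches, but never reaches, $a_{13}$.<br />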

Knowing (22) we can conclude<br />

\[ \frac{dZ}{dt} = \big(a_{12}H(t)Z\big) - a_{11}Z^2 - a_{18}Z \le a_{12}a_{13}Z - a_{11}Z^2 - a_{18}Z = \underbrace{Z\,[a_{12}a_{13} - a_{18} - a_{11}Z]}_{\text{Logistic Equation}}. \]<br />

Setting $a_{12}a_{13} - a_{18} - a_{11}Z = 0$ we can find the carrying capacity $K$:<br />

\[ K = \frac{a_{12}a_{13} - a_{18}}{a_{11}} = 42.3156486. \]<br />

Thus, an upper bound for $Z(t)$ will be<br />

\[ \lim_{t \to +\infty} Z(t) \le K = 42.3156486. \]<br />

This is significant, since the Zooplankton concentration will have a maximum of $K$, which is closer to the type of behavior an ecologist would expect to see in Zooplankton concentrations. The lower bound of this entity will be zero, by (22) and because we know from biology that the Zooplankton concentration cannot be negative. Hence,<br />

\[ \therefore\ 0 \le Z(t) \le 42.3156486. \tag{23} \]<br />

However, if $P(0) \le a_{16} + a_{15}$ then $Z(t)$ would go to zero as $t$ goes to infinity, because $H(t) = 0$, which changes the structure of the equation and drives the population to extinction.<br />
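The carrying capacity can be recomputed from the Table 6 values, and the comparison equation $Z' = Z\,[a_{12}a_{13} - a_{18} - a_{11}Z]$ has the standard logistic closed form. In the sketch below the names `r`, `K` and `logistic_upper` and the initial value are chosen for illustration; the closed form is the textbook solution of the logistic equation, not a formula taken from the thesis.<br />

```python
import math

# Values transcribed from Table 6 (Model B).
a11, a12, a13, a18 = 0.00199206, 0.307847, 0.350046, 0.0234653

r = a12 * a13 - a18   # intrinsic rate of the comparison equation
K = r / a11           # carrying capacity, the zero of r - a11 * Z


def logistic_upper(z0, t):
    """Solution of Z' = Z * (r - a11 * Z) with Z(0) = z0 > 0."""
    return K / (1.0 + (K / z0 - 1.0) * math.exp(-r * t))


Z0 = 1.0  # hypothetical initial zooplankton concentration
trajectory = [logistic_upper(Z0, t) for t in (0.0, 25.0, 50.0, 100.0, 1000.0)]
```

Starting below `K`, the comparison solution increases and levels off at `K`, which is the saturation behavior described above.<br />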

In order to proceed to the analysis of $P(t)$ we must first find the bounds for $M(t)$.<br />

\[ M(t) = \min\Big\{ \frac{F}{(F + a_5)},\ \big(1 - e^{-a_5 N}\big),\ \big(e^{-E_{PUR}(t)/a_2}\big)\Big(1 - e^{-E_{PUR}(t)\,\big(1 + a_3\,e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\big)/a_1}\Big) \Big\} \]<br />

Since $M(t)$ is a minimum function, its bounds are<br />

\[ 0 \le M(t) \le 1. \tag{24} \]<br />

We are now able to find bounds for $P(t)$,<br />

\[ \frac{dP}{dt} = \Big[\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]\,P\,(1 - a_6)\Big] - (a_9 P) - \big(H(t)Z\big) - (a_{19}P). \]<br />

Using the time-series of the exogenous variables we estimate:<br />

\[ 0.009868042 \le \big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\big] \le 0.6060888. \tag{25} \]<br />

Using (24), (25) and dropping the subtracted grazing term, we find the upper bound to be<br />

\[ \frac{dP}{dt} \le \underbrace{\big[(0.6060888)(1)(1 - a_6) - (a_9 + a_{19})\big]}_{\alpha_u}\,P. \]<br />

We may rewrite this as follows,<br />

\[ \frac{dP}{dt} \le \alpha_u P, \quad \text{where } \alpha_u = 0.4720906. \]<br />

For the lower bound, by (22), (23) and (24):<br />

\[ \frac{dP}{dt} \ge -\underbrace{(a_9 + a_{19})}_{\alpha_l}\,P - a_{13}K. \]<br />

Since $0 < a_{13} < 1$, we may rewrite this as follows,<br />

\[ \frac{dP}{dt} \ge -\alpha_l P - K, \quad \text{where } \alpha_l = 0.032101. \]<br />

Using Proposition 1 we get<br />

\[ \therefore\ \Big(P_0 + \frac{K}{\alpha_l}\Big)e^{-\alpha_l t} - \frac{K}{\alpha_l} \le P(t) \le P_0 e^{\alpha_u t}, \tag{26} \]<br />

\[ P(t) > 0 \ \text{ for } t \in (0, +\infty). \]<br />
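The rates $\alpha_u$ and $\alpha_l$ and the two sides of (26) can be recomputed from the Table 6 values, the upper end of (25) and the carrying capacity in (23). The sketch below does so; the function name and the initial concentration are hypothetical and chosen for illustration.<br />

```python
import math

# Values transcribed from Table 6 (Model B).
a6, a9, a19 = 0.168121, 0.0293637, 0.00273829
ENV_MAX = 0.6060888   # upper end of the range in (25)
K = 42.3156486        # carrying capacity from (23)

alpha_u = ENV_MAX * 1.0 * (1.0 - a6) - (a9 + a19)  # M(t) bounded by 1, see (24)
alpha_l = a9 + a19


def phyto_bounds(p0, t):
    """Lower and upper sides of (26) for initial concentration p0."""
    lower = (p0 + K / alpha_l) * math.exp(-alpha_l * t) - K / alpha_l
    upper = p0 * math.exp(alpha_u * t)
    return lower, upper


P0 = 1.0  # hypothetical initial phytoplankton concentration
lower_10, upper_10 = phyto_bounds(P0, 10.0)
```

For this choice of `P0` the lower side of (26) is already negative at `t = 10`, so in practice it is the separate positivity statement that supplies the usable lower bound.<br />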


From a biological standpoint, $\alpha_l$ is the maximum rate of decline and $\alpha_u$ is the maximum rate of growth. These bounds give us little information about the model, as they state only that the Phytoplankton concentration is contained between zero and infinity. We continue our analysis with Detritus:<br />

\[ \frac{dD}{dt} = \big((1 - a_{10})\,a_9 P\big) + \big((1 - a_{10})\,a_{11}Z^2\big) + \big((1 - a_{10})(1 - a_{12})\,H(t)Z\big) - D(a_{17} + a_{20}). \]<br />

Using (22), (23) and (26) we get<br />

\[ \frac{dD}{dt} \le \big((1 - a_{10})\,a_9 P_0 e^{\alpha_u t}\big) + \big((1 - a_{10})\,a_{11}K^2\big) + \big((1 - a_{10})(1 - a_{12})\,a_{13}K\big) - D(a_{17} + a_{20}). \]<br />

Then solving for the upper bound,<br />

\[ \frac{dD^u}{dt} + D^u(a_{17} + a_{20}) = \big((1 - a_{10})\,a_9 P_0 e^{\alpha_u t}\big) + \big(a_{11}K + (1 - a_{12})\,a_{13}\big)(1 - a_{10})K. \]<br />

Using Proposition 1,<br />

\[ D^u(t) = \underbrace{\frac{\big(a_{11}K + (1 - a_{12})\,a_{13}\big)(1 - a_{10})K}{(a_{17} + a_{20})}}_{\beta_{u1}} + \underbrace{\frac{(1 - a_{10})\,a_9}{\alpha_u + a_{17} + a_{20}}}_{\beta_{u2}}\,P_0 e^{\alpha_u t} + \big(D_0 - \beta_{u1} - \beta_{u2}P_0\big)e^{-(a_{17} + a_{20})t}, \]<br />

\[ \lim_{t \to +\infty} D^u(t) = \infty, \]<br />

where $\beta_{u1} = 220.00119$ and $\beta_{u2} = 0.03054$.<br />


Then finding a lower bound,<br />

\[ \frac{dD^l}{dt} + D^l(a_{17} + a_{20}) = 0, \]<br />

\[ D^l(t) = D_0 e^{-(a_{17} + a_{20})t}, \]<br />

\[ \lim_{t \to +\infty} D^l(t) = 0. \]<br />

Thus,<br />

\[ D_0 e^{-(a_{17} + a_{20})t} \le D(t) \le \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big(D_0 - \beta_{u1} - \beta_{u2}P_0\big)e^{-(a_{17} + a_{20})t}. \tag{27} \]<br />

The bounds (27) show a maximum rate of decline driven by parameters 17 and 20. As in Model A, the equation structures for Nitrate and Iron are very similar, differing only in their parameters and variables.<br />

\[ \frac{dN}{dt} = \Big[\frac{a_{17}D}{(a_7 \cdot 12.0107)}\Big] + \Big[(a_{21} - N)\,\overbrace{\Big(a_{22}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}\Big)}^{\le a_{22}}\Big] - \Big[\frac{P}{(a_7 \cdot 12.0107)}\,\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]\Big] \]<br />

Using (26) and (27) we find an upper bound; the $P$ term is dropped as its lower bound is zero. Thus,<br />

\[ \frac{dN}{dt} \le (a_{21} - N)\,a_{22} + \frac{a_{17}\Big(\beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big(D_0 - \beta_{u1} - \beta_{u2}P_0\big)e^{-(a_{17} + a_{20})t}\Big)}{(a_7 \cdot 12.0107)}. \]<br />


Setting up to solve for $N^u(t)$,<br />

\[ \frac{dN^u}{dt} = (a_{21} - N^u)\,a_{22} + \frac{a_{17}\Big(\beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big(D_0 - \beta_{u1} - \beta_{u2}P_0\big)e^{-(a_{17} + a_{20})t}\Big)}{(a_7 \cdot 12.0107)}, \]<br />

\[ \frac{dN^u}{dt} + N^u a_{22} = a_{21}a_{22} + \frac{a_{17}\Big(\beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big(D_0 - \beta_{u1} - \beta_{u2}P_0\big)e^{-(a_{17} + a_{20})t}\Big)}{(a_7 \cdot 12.0107)}. \]<br />

Solving for $N^u(t)$ using Proposition 1,<br />

\[ N^u(t) = \underbrace{\Bigg(a_{21} + \frac{a_{17}\Big(\beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big(D_0 - \beta_{u1} - \beta_{u2}P_0\big)e^{-(a_{17} + a_{20})t}\Big)}{(a_{22}\,a_7 \cdot 12.0107)}\Bigg)}_{\gamma_N} + \big(N_0 - \gamma_N\big)e^{-a_{22}t}, \]<br />

where<br />

\[ \lim_{t \to +\infty} \gamma_N(t) = \infty. \]<br />

Let's rewrite it as<br />

\[ N^u(t) = \gamma_N + \big(N_0 - \gamma_N\big)e^{-a_{22}t}, \qquad \lim_{t \to +\infty} N^u(t) = \infty. \]<br />

For the lower bound, the $D$ term is dropped as its lower bound goes to zero.<br />

\[ \frac{dN}{dt} \ge (a_{21} - N)\,a_{22} \]<br />


Thus, using Proposition 1 we get<br />

\[ \frac{dN^l}{dt} = (a_{21} - N^l)\,a_{22}, \]<br />

\[ \frac{dN^l}{dt} + N^l a_{22} = a_{21}a_{22}, \]<br />

\[ N^l(t) = a_{21} + (N_0 - a_{21})\,e^{-a_{22}t}, \]<br />

\[ \lim_{t \to +\infty} N^l(t) = a_{21}. \]<br />

To summarize the bounds,<br />

\[ a_{21} + (N_0 - a_{21})\,e^{-a_{22}t} \le N(t) \le \gamma_N + \big(N_0 - \gamma_N\big)e^{-a_{22}t}. \]<br />

When $t \to \infty$ we obtain<br />

\[ \therefore\ a_{21} = 31 \le N(t) \le \infty. \tag{28} \]<br />

Nitrate then has a constant lower bound in the limit, which implies that in the long run the concentration will not go below $a_{21}$ for this particular model structure. This leads us to reiterate that these models are very sensitive to the parameter selection process. As mentioned above, Iron has the same equation structure; using the same analysis, we find the following for Iron:<br />

\[ \frac{dF}{dt} = \Big[\frac{D\,a_{17}}{(a_8 \cdot 12.0107)}\Big] + \Big[(a_{23} - F)\,a_{24}\,\frac{E_{T_{H_2O},\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O},\max} - E_{T_{H_2O},\min}}\Big] - \Big[\frac{P}{(a_8 \cdot 12.0107)}\,\big[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\,M(t)\big]\Big] \]<br />

Thus,<br />

\[ F^u(t) = \underbrace{\Bigg(a_{23} + \frac{a_{17}\Big(\beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big(D_0 - \beta_{u1} - \beta_{u2}P_0\big)e^{-(a_{17} + a_{20})t}\Big)}{(a_{24}\,a_8 \cdot 12.0107)}\Bigg)}_{\gamma_F} + \big(F_0 - \gamma_F\big)e^{-a_{24}t}, \]<br />

where<br />

\[ \lim_{t \to +\infty} \gamma_F(t) = \infty. \]<br />

Let's rewrite it as<br />

\[ F^u(t) = \gamma_F + \big(F_0 - \gamma_F\big)e^{-a_{24}t}, \qquad \lim_{t \to +\infty} F^u(t) = \infty. \]<br />

For the lower bound we get<br />

\[ F^l(t) = a_{23} + (F_0 - a_{23})\,e^{-a_{24}t}, \qquad \lim_{t \to +\infty} F^l(t) = a_{23}. \]<br />

To summarize, we get<br />

\[ a_{23} + (F_0 - a_{23})\,e^{-a_{24}t} \le F(t) \le \gamma_F + \big(F_0 - \gamma_F\big)e^{-a_{24}t}, \]<br />

which, as $t \to \infty$, gives<br />

\[ \therefore\ a_{23} = 4.5 \cdot 10^{-4} \le F(t) \le \infty. \tag{29} \]<br />

As was the case for Nitrate, Iron is bounded below by parameter 23. This concludes the analysis of Model B.<br />

Models A and B are the two good-fit models with an reMSE under 0.5 that came up most frequently throughout the 31 experiments. Our analysis has shown that the structures of the equations for phytoplankton and detritus produce similar dynamics and bounds for both models. On the other hand, where iron and nitrate were bounded above by a parameter in Model A, they were bounded below by a parameter value in Model B. The structure and bounds for zooplankton varied much more. For instance, the bounds for Model A implied that the zooplankton population will go extinct, whereas the bounds for Model B indicated that the population has an upper bound at the carrying capacity K. This simple observation led me to look further into the zooplankton dynamics; to do so, I selected the model with the lowest reMSE from experiment 6. In this experiment HIPM was provided observations for both phytoplankton and zooplankton dynamics. This was not a random choice, since phytoplankton is the dynamic we are trying to model, and zooplankton is the state variable demonstrating the most variability in structure and potentially having the most restrictive power of all state variables, based on the computational results. This model will be presented as Model C.<br />


4.5 Model C<br />

Table 7: This table summarizes all the parameters that play a role in Model C<br />

Model C<br />

ID Name Value<br />

a 0 phyto.max growth 0.786225<br />

a 1 phyto.Ek max 1.69379<br />

a 2 arrigoetal1998 w photoinhibition coefficient 12.8247<br />

a 3 Nitrate monod lim coefficient 5.13429e-05<br />

a 4 Iron monod lim 0.0001252<br />

a 5 phyto.exude rate 0.0010004<br />

a 6 NO3.toCratio 6.68978<br />

a 7 Fe.toCratio 119659<br />

a 8 phyto.death rate 0.0273329<br />

a 9 environment.beta 0.993959<br />

a 10 zoo.death rate 0.00239853<br />

a 11 zoo.assim eff 0.393807<br />

a 12 zoo.attack 0.340717<br />

a 13 zoo grazing.ratio dependent 3 coefficient 1.86168<br />

a 14 detritus.remin rate 0.0374133<br />

a 15 zoo.respiration rate 0.0282409<br />

a 16 phyto.sinking rate 0.0143703<br />

a 17 detritus.sinking rate 0.0990947<br />

a 18 NO3.avg deep conc 31.2197<br />

a 19 NO3 linear temp control max mixing rate 0.753984<br />

a 20 Fe.avg deep conc 0.00045<br />

a 21 Fe linear temp control max mixing rate 0.0681006<br />



Model C<br />

\[
\frac{dP}{dt} = \Big[ \underbrace{\big[ (1 - E_{ice}(t))\, a_0\, e^{(0.06933\, E_{TH_2O}(t))}\, M(t) \big]}_{\text{P Growth Rate}} (1 - a_5) - a_8 - a_{16} \Big] P
 - \Big( \underbrace{\frac{(a_{12} P^2)}{Z^2 + a_{12} a_{13} P^2}}_{\text{Z Grazing Rate}} \Big) Z \tag{30}
\]
\[
\frac{dZ}{dt} = \Big( a_{11} \underbrace{\frac{(a_{12} P^2)}{Z^2 + a_{12} a_{13} P^2}}_{\text{Z Grazing Rate}} \Big) Z - (a_{10} Z + a_{15})\, Z \tag{31}
\]
\[
\frac{dD}{dt} = \Big( (1 - a_9)(a_8 P + a_{10} Z^2) \Big)
 + \Big( (1 - a_9)(1 - a_{11}) \underbrace{\frac{(a_{12} P^2)}{Z^2 + a_{12} a_{13} P^2}}_{\text{Z Grazing Rate}} \Big) Z - D(a_{14} + a_{17}) \tag{32}
\]
\[
\frac{dN}{dt} = \Big[ \frac{a_{14} D}{(a_6 \cdot 12.0107)} \Big]
 + (a_{18} - N) \underbrace{a_{19}\, \frac{E_{TH_2O}^{max} - E_{TH_2O}(t)}{E_{TH_2O}^{max} - E_{TH_2O}^{min}}}_{\text{N Mixing Rate}}
 - \Big[ \frac{P}{(a_6 \cdot 12.0107)} \Big] \underbrace{\big[ (1 - E_{ice}(t))\, a_0\, e^{(0.06933\, E_{TH_2O}(t))}\, M(t) \big]}_{\text{P Growth Rate}} \tag{33}
\]
\[
\frac{dF}{dt} = \Big[ \frac{a_{14} D}{(a_7 \cdot 12.0107)} \Big]
 + (a_{20} - F) \underbrace{a_{21}\, \frac{E_{TH_2O}^{max} - E_{TH_2O}(t)}{E_{TH_2O}^{max} - E_{TH_2O}^{min}}}_{\text{F Mixing Rate}}
 - \Big[ \frac{P}{(a_7 \cdot 12.0107)} \Big] \underbrace{\big[ (1 - E_{ice}(t))\, a_0\, e^{(0.06933\, E_{TH_2O}(t))}\, M(t) \big]}_{\text{P Growth Rate}} \tag{34}
\]
<br />
Where,<br />
\[
M(t) = \underbrace{\min\Big\{ \frac{F}{(F + a_4)},\ \frac{N}{(N + a_3)},\ \Big( 1 - e^{-E_{PUR}(t)\,\big(1 + a_2\, e^{(E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)})}\big) / a_1} \Big) \Big\}}_{\text{Phytoplankton Growth Limitation}}
\]
<br />
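The growth limitation M(t) can be evaluated numerically as a quick sanity check on its range. The sketch below is a minimal illustration, not part of HIPM: the function name `growth_limitation` is hypothetical, the parameter values are copied from Table 7, and the input values are assumed to correspond to the IRON_um, NITR and PURL columns of the first row of the sample CIAO data in Appendix A.<br />

```python
import math

# Parameter values copied from Table 7 (Model C).
A1_EK_MAX = 1.69379         # a_1, phyto.Ek max
A2_LIGHT_COEF = 12.8247     # a_2, light-limitation coefficient
A3_NITRATE_K = 5.13429e-05  # a_3, nitrate Monod coefficient
A4_IRON_K = 0.0001252       # a_4, iron Monod coefficient


def growth_limitation(iron, nitrate, pur):
    """Return M(t) as the minimum of the iron, nitrate and light terms.

    Hypothetical helper written for illustration; it mirrors the
    structure of M(t) as written for Model C.
    """
    iron_lim = iron / (iron + A4_IRON_K)
    nitrate_lim = nitrate / (nitrate + A3_NITRATE_K)
    inner = math.exp(pur * math.exp(1.089 - 2.12 * math.log10(A1_EK_MAX)))
    light_lim = 1.0 - math.exp(-pur * (1.0 + A2_LIGHT_COEF * inner) / A1_EK_MAX)
    return min(iron_lim, nitrate_lim, light_lim)


# Assumed inputs: first row of the sample CIAO data (IRON_um, NITR, PURL).
m = growth_limitation(iron=0.0005052, nitrate=31.0, pur=1.678)
print(m)
```

Each of the three terms lies between zero and one for non-negative inputs, which is consistent with the bound on M(t) used in the analysis that follows.<br />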



The analysis of Model C is very similar to that of Model B, especially for Nitrate<br />
and Iron, since the structures of their equations are identical except for the parameters and<br />
M(t). Because we approximate M(t) as bounded between zero and one for all the models,<br />
the bounds for both Nitrate and Iron are close in structure to those of Model B.<br />
Following the procedure used in both previous models, we start our analysis with the<br />
Zooplankton equation (31), which is the simplest of the five.<br />

\[
\frac{dZ}{dt} = \Big( a_{11} \underbrace{\frac{(a_{12} P^2)}{Z^2 + a_{12} a_{13} P^2}}_{\text{Z Grazing Rate}} \Big) Z - (a_{10} Z + a_{15})\, Z
 \le \underbrace{Z \Big[ \frac{a_{11}}{a_{13}} - a_{15} - a_{10} Z \Big]}_{\text{Logistic Equation}}
\]
<br />

Setting \(\frac{a_{11}}{a_{13}} - a_{15} - a_{10} Z = 0\) we can find the carrying capacity K.<br />
\[
K = \frac{\frac{a_{11}}{a_{13}} - a_{15}}{a_{10}} = 76.41856944
\]
Thus, an upper bound for Z(t) will be,<br />
\[
\lim_{t \to +\infty} Z(t) \le K = 76.41856944
\]
<br />

This is significant since the Zooplankton concentration will have a maximum of<br />

K. The lower bound of this entity will be zero, since we know from biology that<br />

Zooplankton concentration cannot be negative. Hence,<br />

∴ 0 ≤ Z(t) ≤ 76.41856944 (35)<br />
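The carrying capacity can be reproduced directly from the Table 7 values. The following is a minimal sketch under that assumption; the helper `grazing_rate` is a hypothetical name introduced only to show that the grazing term stays below 1/a_13, which is the step that turns equation (31) into the logistic bound.<br />

```python
# Parameter values copied from Table 7 (Model C).
a10 = 0.00239853  # zoo.death rate
a11 = 0.393807    # zoo.assim eff
a12 = 0.340717    # zoo.attack
a13 = 1.86168     # grazing coefficient
a15 = 0.0282409   # zoo.respiration rate


def grazing_rate(P, Z):
    """Z Grazing Rate term of equation (31) (hypothetical helper)."""
    return a12 * P**2 / (Z**2 + a12 * a13 * P**2)


# The grazing term is bounded above by 1/a13 for any P and Z > 0,
# so dZ/dt <= Z * (a11/a13 - a15 - a10*Z), a logistic equation.
K = (a11 / a13 - a15) / a10
print(K)
```

The printed value agrees with the carrying capacity K reported above to the precision of the tabulated parameters.<br />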



Next we turn our attention to P (t) for which we take M(t) to have the following<br />

bounds:<br />

0 ≤ M(t) ≤ 1, (36)<br />

and using the exogenous variables time-series we estimate:<br />

\[
0.0138249 \le \big[ (1 - E_{ice}(t))\, a_0\, e^{(0.06933\, E_{TH_2O}(t))} \big] \le 0.8491189 \tag{37}
\]
<br />

Then,<br />
\[
\frac{dP}{dt} = \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{(0.06933\, E_{TH_2O}(t))}\, M(t) \big] (1 - a_5) - a_8 - a_{16} \Big] P
 - \Big( \frac{(a_{12} P^2)}{Z^2 + a_{12} a_{13} P^2} \Big) Z
 \le \underbrace{\big[ (0.8491189)(1)(1 - a_5) - a_8 - a_{16} \big]}_{\alpha_u} P
\]
We may rewrite this as follows,<br />
\[
\frac{dP}{dt} \le \alpha_u P \quad \text{where } \alpha_u = 0.806566
\]
Using Proposition 1 we get,<br />
\[
\therefore\ 0 \le P(t) \le P_0 e^{\alpha_u t} \tag{38}
\]
\[
\lim_{t \to +\infty} P_0 e^{\alpha_u t} = \infty \tag{39}
\]
<br />
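The value of the growth exponent can be checked from the estimate in (37), the bound in (36) and the Table 7 parameters. This is a minimal sketch; the function `p_upper` and its initial value argument `p0` are hypothetical names used only to evaluate the exponential bound (38).<br />

```python
import math

# Parameter values copied from Table 7 (Model C).
a5 = 0.0010004   # phyto.exude rate
a8 = 0.0273329   # phyto.death rate
a16 = 0.0143703  # phyto.sinking rate

growth_max = 0.8491189  # upper estimate from (37)
m_max = 1.0             # upper bound on M(t) from (36)

# Dropping the (non-positive) grazing term and using the upper
# estimates gives dP/dt <= alpha_u * P.
alpha_u = growth_max * m_max * (1 - a5) - a8 - a16
print(alpha_u)


def p_upper(t, p0):
    """Upper bound P0 * exp(alpha_u * t) from (38); p0 is an assumed initial value."""
    return p0 * math.exp(alpha_u * t)
```

Because alpha_u is positive, the bound grows without limit, which is why it carries no information about the long-term behaviour of P(t).<br />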



That being said let’s continue the analysis of Model C with Detritus.<br />

\[
\frac{dD}{dt} = \Big( (1 - a_9)(a_8 P + a_{10} Z^2) \Big)
 + \Big( (1 - a_9)(1 - a_{11}) \underbrace{\frac{(a_{12} P^2)}{Z^2 + a_{12} a_{13} P^2}}_{\text{Z Grazing Rate}} \Big) Z - D(a_{14} + a_{17})
\]
Using (35) and (38) we get,<br />
\[
\frac{dD}{dt} \le \Big( (1 - a_9)(a_8 P_0 e^{\alpha_u t} + a_{10} K^2) \Big)
 + \Big( (1 - a_9)(1 - a_{11}) \frac{K}{a_{13}} \Big) - D(a_{14} + a_{17})
\]
Then solving for the upper bound,<br />
\[
\frac{dD_u}{dt} + D_u (a_{14} + a_{17}) = (1 - a_9)\, a_8 P_0 e^{\alpha_u t}
 + \Big( a_{10} K + \frac{(1 - a_{11})}{a_{13}} \Big)(1 - a_9) K
\]
<br />

Using Proposition 1,<br />
\[
D_u(t) = \underbrace{\frac{\Big( a_{10} K + \frac{(1 - a_{11})}{a_{13}} \Big)(1 - a_9) K}{(a_{14} + a_{17})}}_{\beta_{u1}}
 + \underbrace{\frac{(1 - a_9)\, a_8}{\alpha_u + (a_{14} + a_{17})}}_{\beta_{u2}} P_0 e^{\alpha_u t}
 + \big( D_0 - \beta_{u1} - \beta_{u2} P_0 \big) e^{-(a_{14} + a_{17}) t}
\]
where \(\beta_{u1} = 1.721033\) and \(\beta_{u2} = 1.7508 \times 10^{-4}\)<br />
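The two constants in the detritus bound follow from the Table 7 parameters together with K and the exponent 0.806566 obtained for phytoplankton. The sketch below recomputes them under that assumption; `d_upper`, `d0` and `p0` are hypothetical names for the bound and the assumed initial values, which are not given in this section.<br />

```python
import math

# Parameter values copied from Table 7 (Model C).
a8, a9 = 0.0273329, 0.993959
a10, a11 = 0.00239853, 0.393807
a13, a14 = 1.86168, 0.0374133
a15, a17 = 0.0282409, 0.0990947

alpha_u = 0.806566             # growth exponent from the bound on P(t)
K = (a11 / a13 - a15) / a10    # zooplankton carrying capacity
decay = a14 + a17              # total detritus loss rate

beta_u1 = (a10 * K + (1 - a11) / a13) * (1 - a9) * K / decay
beta_u2 = (1 - a9) * a8 / (alpha_u + decay)
print(beta_u1, beta_u2)


def d_upper(t, d0, p0):
    """Upper bound on D(t); d0 and p0 are assumed initial values."""
    transient = (d0 - beta_u1 - beta_u2 * p0) * math.exp(-decay * t)
    return beta_u1 + beta_u2 * p0 * math.exp(alpha_u * t) + transient
```

At t = 0 the three terms collapse to the initial value, and for large t the exponential term in alpha_u dominates, so the bound diverges just as the phytoplankton bound does.<br />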



We can rewrite \(D_u(t)\) as,<br />
\[
D_u(t) = \beta_{u1} + \beta_{u2} P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2} P_0 \big) e^{-(a_{14} + a_{17}) t}
\]
\[
\lim_{t \to +\infty} D_u(t) = \infty
\]
Then finding a lower bound,<br />
\[
\frac{dD_l}{dt} + D_l (a_{14} + a_{17}) = 0
\]
\[
D_l(t) = D_0 e^{-(a_{14} + a_{17}) t}
\]
\[
\lim_{t \to +\infty} D_l(t) = 0
\]
Thus,<br />
\[
D_0 e^{-(a_{14} + a_{17}) t} \le D(t) \le \beta_{u1} + \beta_{u2} P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2} P_0 \big) e^{-(a_{14} + a_{17}) t} \tag{40}
\]
When t → ∞ we get the following,<br />
\[
\therefore\ 0 \le D(t) \le \infty \tag{41}
\]
<br />

The approach for Nitrate and Iron is identical to that of Model B so we get, for<br />
Nitrate<br />
\[
a_{18} + N_0 e^{-a_{19} t} \le N(t) \le \gamma_N + \big( N_0 - \gamma_N \big) e^{-a_{19} t}
\]
where,<br />
\[
\gamma_N = a_{18} + \frac{a_{14} \Big( \beta_{u1} + \beta_{u2} P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2} P_0 \big) e^{-(a_{14} + a_{17}) t} \Big)}{(a_{19}\, a_6 \cdot 12.0107)}
\]
<br />



When t → ∞ we obtain,<br />

\[ \therefore\ a_{18} = 31.2197 \le N(t) \le \infty \tag{42} \]<br />

And for iron we get,<br />
\[
a_{20} + F_0 e^{-a_{21} t} \le F(t) \le \gamma_F + \big( F_0 - \gamma_F \big) e^{-a_{21} t}
\]
where,<br />
\[
\gamma_F = a_{20} + \frac{a_{14} \Big( \beta_{u1} + \beta_{u2} P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2} P_0 \big) e^{-(a_{14} + a_{17}) t} \Big)}{(a_{21}\, a_7 \cdot 12.0107)}
\]
<br />

which as t → ∞,<br />

\[ \therefore\ a_{20} = 4.5 \times 10^{-4} \le F(t) \le \infty \tag{43} \]<br />

The analysis of Model C yields some interesting results. Indeed, we obtained<br />
realistic bounds for all the state variables. The upper bounds for Phytoplankton,<br />
Nitrate and Iron are not informative as they go to infinity. That being said, with<br />
only two sets of time series (Phytoplankton and Zooplankton) HIPM produced five<br />
good fit models under 0.5 reMSE, one of which I have analyzed and which seemed to yield<br />
dynamics in accordance with what domain scientists would expect.<br />

4.6 Effects of increasing the number of constraints<br />

The computational results have established that increasing the number of constraints<br />
given to HIPM by adding further entities’ time series will in some cases<br />
reduce the number of good fit models selected by the software. A few cases were<br />
studied to look at the impact of additional constraints on the models selected. During<br />
his research on the Ross Sea phytoplankton dynamics, Borrett (unpublished data)<br />
worked with real-life measurements for Phytoplankton and Nitrate and used these<br />
data as input to HIPM; for that reason I decided to look at experiment number 8,<br />
which assumed data for both Nitrate and Phytoplankton (cf. Table 4). The number<br />
of good fit models, under a 0.5 reMSE, produced by HIPM is 25. When observations<br />
for Zooplankton are added, no models under the chosen reMSE cutoff are selected; on<br />
the other hand, when Iron or Detritus are added, a significant decrease in the number<br />
of good fit models can be observed. The addition of detritus (experiment 19) and<br />
iron (experiment 21) yielded 13 and 15 good fit models, respectively. However, out<br />
of all the models selected in experiment 8, only 2 were part of the set selected<br />
in experiment 19 (addition of detritus data) and 3 were part of the set selected<br />
in experiment 21 (addition of iron data). A comparison of the structures of these<br />
models (the models can be found in Appendices D and E) shows that they differ only<br />
by the type of Zooplankton grazing process used, by the parameter values and, in some<br />
cases, by the Phytoplankton growth limitation (i.e. M(t)). Otherwise the structures<br />
of these models are comparable, implying perhaps that the grazing processes used<br />
in these models have similar effects on the ecosystem, which is plausible given the<br />
number of grazing processes present in the process library. It is in fact this high<br />
number of grazing options that makes for a very large structural search space.<br />



5 CONCLUSION<br />

The hierarchal inductive process-modeling framework was effective in covering<br />
two very extensive search spaces in a short amount of time, and with the availability of<br />
the CIAO data I was able to investigate the usefulness of the software in our search<br />
for the best model representation.<br />

Some of the major observations that can be made throughout this paper concern<br />
the Zooplankton state variable. Not only did it tend to provide great restrictive<br />
power when its time series was given to HIPM, as shown by its<br />
median activation value and Table 4, but often the good fit models<br />
I selected differed only in the type of Zooplankton grazing process chosen. All<br />
this may suggest one of two things. Either Zooplankton indeed yields the<br />
most important discriminatory power out of all state variables and is the data that<br />
should be collected first and foremost for the Ross Sea ecosystem, or the way<br />
the Zooplankton entity is defined in the HIPM framework is inadequate for this<br />
type of ecosystem, which incidentally weeds out many of the models that are being<br />
searched through (i.e. all the zeros in Table 4). This makes for high variability in<br />
structure among the good fit models selected, which in turn creates an array of dynamics,<br />
some of which are at opposite ends of the spectrum (e.g. the zooplankton population<br />
going to extinction in some cases or to the carrying capacity in others), when<br />
HIPM is provided with time-series data other than that of Zooplankton. Another<br />
explanation for the issues encountered with Zooplankton could be found not in<br />
the way Zooplankton is defined in the process library but rather in an assumption<br />
made within the biological knowledge encoded into HIPM. Indeed, Phaeocystis are<br />
assumed to be resistant to zooplankton grazing, meaning that it is more difficult for<br />
zooplankton to graze upon Phaeocystis than upon diatoms, the latter not being<br />
included in the process library, as we will discuss later on. Hence, the inability for<br />
zooplankton to be properly fitted may lie in the way phytoplankton has been defined<br />
and not zooplankton.<br />

Even though phytoplankton did not seem to reduce the number of good fit models<br />
as effectively as zooplankton, it did show an unexpected dynamic in the good fit<br />
models studied. Indeed, in all models the upper bound of the phytoplankton state variable<br />
goes to infinity as t goes to infinity. As mentioned previously, the Ross Sea is the scene of one<br />
of the largest phytoplankton blooms in the Southern Ocean. The population of phytoplankton<br />
in the Ross Sea is primarily made up of two groups, Phaeocystis antarctica<br />
and diatoms. Since Phaeocystis dominates the phytoplankton bloom in the region,<br />
it was the only one taken into consideration in this experiment, which may have<br />
influenced the output of the system and the dynamics of the state variables, and<br />
more specifically those of zooplankton. The addition of another species of phytoplankton<br />
in HIPM could change the outputs significantly. There is a need here to<br />
modify the way phytoplankton are defined in HIPM and incorporate the multiplicity<br />
of species of phytoplankton in the system. This can be grounds for future research.<br />

Surprisingly, we found evidence contrary to the assumption that more information<br />
means fewer models being selected. As explained in the computational results,<br />
there were instances where giving HIPM two time series selected fewer good<br />
fit models than giving it three time series. Once again, as previously mentioned,<br />
the explanation for this observation is the way reMSE is defined within HIPM when<br />
dealing with multiple variables: the mean square errors of the state variables being<br />
fitted are averaged. For instance, if both phytoplankton and detritus have a mean<br />
error of 0.3, the reMSE would be 0.3; if we then add iron with a mean error of 0.5,<br />
the overall reMSE is now approximately 0.37, which is under the cutoff for good fit models. We<br />
can imagine cases for which models that were not in the good fit category with two<br />
time series then become good fit models with the addition of another time series.<br />
Hence, the set of good fit models selected with three time series will not necessarily<br />
be a subset of the set of good fit models selected with two of the three time series<br />
previously mentioned. This puts an important emphasis on which measurements<br />
and observations are added to the search. The decision to add an extra time series to<br />
the software should be highly influenced by which data have already been collected<br />
and used with HIPM. That being said, this could be an area where HIPM could be<br />
enhanced and a direction for further research. Indeed, it would be interesting to look<br />
at the output of HIPM if the reMSE were calculated by taking the maximum square<br />
error of all the variables being fitted, which would then make our assumption, that<br />
more time-series data equates to fewer models, valid.<br />
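The difference between averaging the per-variable errors and taking their maximum can be illustrated with the numbers used above. This is a simplified sketch, not HIPM's actual reMSE computation: the function `aggregate_error` is a hypothetical helper, and the error values and the 0.5 cutoff are the illustrative ones from the text.<br />

```python
def aggregate_error(errors, how="mean"):
    """Combine per-variable errors into one score (hypothetical helper)."""
    values = list(errors.values())
    if how == "mean":
        return sum(values) / len(values)
    if how == "max":
        return max(values)
    raise ValueError("how must be 'mean' or 'max'")


CUTOFF = 0.5  # good fit threshold used in the text

two_series = {"phytoplankton": 0.3, "detritus": 0.3}
three_series = dict(two_series, iron=0.5)

for how in ("mean", "max"):
    score = aggregate_error(three_series, how)
    print(how, round(score, 4), score < CUTOFF)
```

With the mean, the poorly fitted iron series is diluted by the two better fits and the model stays under the cutoff. With the maximum, a score can only stay the same or increase when a series is added, so the set of good fit models can only shrink.<br />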

The parameter selection process<br />
is a very important step of HIPM model selection; this statement was reinforced<br />
by the mathematical analysis, which made evident the sensitivity of the system to parameter<br />
values. Differences in parameter values could mean the difference between a<br />
population going to extinction or not. In the case of nitrate and iron I observed that<br />
specific parameters acted as bounds for these variables, which is useful information.<br />
Indeed, when looking at a model through mathematical analysis one can determine<br />
if a parameter will have a significant effect on the overall dynamics of the system.<br />
Coupling this information with expert knowledge, it would then be possible to redefine<br />
the ranges set within HIPM for the parameter selection, which would in turn<br />
refine the search process. Scientists could then run the software once again, take<br />
the result and see if parameter ranges could be further refined. This can almost be<br />
seen as a cycle: deriving information on the parameters from the mathematical analysis,<br />
which in turn helps improve the constraints given to HIPM, then repeating the process<br />
to see the improvement made to the type of model being selected. This in itself can<br />
be seen as a procedure that transcends the phytoplankton dynamic in the Ross<br />
Sea ecosystem and can be generalized to other systems.<br />

Automated modeling is a successful method: the LAGRAMGE framework has<br />
been successfully applied in a real-world domain and selected models that performed<br />
really well (Atanasova et al. 2008). While this framework can evaluate only one<br />
state variable at a time, HIPM is capable of evaluating multiple variables simultaneously;<br />
however, we are faced with an under-constrained optimization problem<br />
since we want to select models with data for only a couple of the variables. Using<br />
CIAO simulated data we were able to explore and investigate the response of HIPM<br />
for the phytoplankton dynamic in the Ross Sea. Even though we did not use real-life<br />
data, the results support two conclusions. First, more data is not synonymous with fewer<br />
models being selected. This conclusion must be tested to see if it can be generalized<br />
to other ecosystems or if it becomes obsolete with a refined and improved process library.<br />
Secondly, the result that zooplankton contains more restrictive power than<br />
the other state variables was attained only through multiple experiments using a<br />
full data set. There is room for further work in the area of exploratory statistics<br />
with the median activation value in order to develop a formal procedure that would<br />
assist scientists in their decision-making process for data collection.<br />

The number of processes taken into consideration for the Ross Sea ecosystem<br />
makes for an extensive process library, which creates models with very intricate and<br />
complex structures. A direction for improvement would be to look at some sort of<br />
measure of complexity for the models, somewhat motivated by the law of parsimony,<br />
which states that the simplest explanation is often the best. This could be coupled<br />
with a measure of distance between models; these two concepts could potentially be<br />
of great value when comparing different models. However, at this point the more<br />
plausible and logical step for future research would be first to incorporate diatoms<br />
in the way in which phytoplankton is defined in the process library and second to<br />
switch from a mean square error to a maximum square error, which in my opinion<br />
would yield very different results. In the long run, this thesis could be the premise<br />
of a protocol for decision making in the data collection process.<br />



REFERENCES<br />

[1] Arrigo, K. R., and C. R. McClain, “Spring phytoplankton production in the<br />
western Ross Sea”, Science, 266, 261-263, 1994.<br />

[2] Arrigo, K. R., A. M. Weiss, and W. O. Smith Jr., “Physical forcing of phytoplankton<br />
dynamics in the southwestern Ross Sea”, J. Geophys. Res., 103,<br />
1007-1021, 1998.<br />

[3] Arrigo, K. R., G. R. DiTullio, R. B. Dunbar, M. P. Lizotte, D. H. Robinson,<br />
M. VanWoert, and D. L. Worthen, “Phytoplankton taxonomic variability and<br />
nutrient utilization and primary production in the Ross Sea”, J. Geophys. Res.,<br />
105, 8827-8846, 2000.<br />

[4] Arrigo, K. R., D. Worthen & D. Robinson, “A coupled ocean-ecosystem model<br />
of the Ross Sea: 2. Iron regulation of phytoplankton taxonomic variability and<br />
primary production”, Journal of Geophysical Research, Vol. 108, No. C7, 3231,<br />
2003.<br />

[5] Atanasova, N., L. Todorovski, S. Dzeroski & B. Kompare, “Application of automated<br />
model discovery from data and expert knowledge to a real-world domain:<br />
Lake Glumsø”, Ecological Modelling, 212, 92-98, 2008.<br />

[6] Borrett, S. R., W. Bridewell, P. Langley & K. Arrigo, “A method for representing<br />
and developing process models”, Ecological Complexity, 4, 1-12, 2007.<br />

[7] Bridewell, W., P. Langley, S. Racunas & S. Borrett, “Learning process models<br />
with missing data”, Proceedings of the Seventeenth European Conference on<br />
Machine Learning, 557-565, 2006.<br />

[8] Bridewell, W., P. Langley, L. Todorovski & S. Dzeroski, “Inductive Process Modeling”,<br />
Stanford University, Stanford, CA, 2007.<br />



[9] Dzeroski, S. and Todorovski, L. “Discovering dynamics: from inductive logic<br />
programming to machine discovery”, Journal of Intelligent Information Systems,<br />
4, 89-108, 1995.<br />

[10] Dzeroski, S. and Todorovski, L. “Discovering dynamics”, Proceedings of the Tenth<br />
International Conference on Machine Learning, Morgan Kaufmann, San Mateo,<br />
CA, pp. 97-103, 1993.<br />

[11] Fayyad, U., Haussler, D. & Stolorz, P. “KDD for science data analysis: Issues<br />
and examples”, Proceedings of the Second International Conference on Knowledge<br />
Discovery and Data Mining (pp. 50-56). Portland, OR: AAAI Press, 1996.<br />

[12] Feng, W., X. Lu & R. Donovan, “Population Dynamics in a Model Territory<br />
Acquisition”, Discrete and Continuous Dynamical Systems, Added Volume, 156-<br />
165, 2001.<br />

[13] Langley, P., J. Shrager, N. Asgharbeygi & S. Bay, “Inducing Explanatory Process<br />
Models from Biological Time Series”, Stanford University, Stanford, CA.<br />

[14] Langley, P. Elements of machine learning. San Mateo, CA: Morgan Kaufmann, 1995.<br />

[15] Langley, P., Shiran, O., Shrager, J., Todorovski, L. & Pohorille, A. “Constructing<br />
explanatory process models from biological data and knowledge”, AI<br />
in Medicine, 37, 191-201, 2006.<br />

[16] Ljung, L. “Modelling of industrial systems”, Proceedings of Seventh International<br />
Symposium on Methodologies for Intelligent Systems (pp. 338-349). Berlin:<br />
Springer, 1993.<br />

[17] Mitchell, T. M. Machine learning. New York, NY: McGraw Hill, 1997.<br />



[18] Oreskes, N., K. Shrader-Frechette & K. Belitz. “Verification, validation, and<br />
confirmation of numerical models in the earth sciences”, Science, vol. 263, pp.<br />
641-646. [Reprinted in Transactions of the Computer Measurement Group, vol.<br />
84, pp. 85-92]. 1994.<br />

[19] Oreskes, N. “Why believe a computer? Models, measures, and meaning in the<br />
natural world”, in The Earth Around Us: Maintaining a Livable Planet, edited<br />
by Jill S. Schneiderman (San Francisco: W.H. Freeman and Co.), pp. 70-82,<br />
2000.<br />

[20] Tagliabue, A. & K. R. Arrigo. “Anomalously Low Zooplankton Abundance in<br />
the Ross Sea: An Alternative Explanation”, Limnology and Oceanography, Vol.<br />
48, No. 2, pp. 686-699, 2003.<br />

[21] Todorovski, L., Dzeroski, S. & Kompare, B. “Modelling and prediction of phytoplankton<br />
growth with equation discovery”, Ecological Modelling, 113, 71-81,<br />
1998.<br />

[22] Todorovski, L. “Using domain knowledge for automated modeling of dynamic<br />
systems with equation discovery”, Doctoral dissertation, Faculty of Computer<br />
and Information Science, University of Ljubljana, Ljubljana, Slovenia, 2003.<br />

[23] Todorovski, L., W. Bridewell, O. Shiran & P. Langley. “Inducing hierarchical<br />
process models in dynamic domains”, Proceedings of the Twentieth National<br />
Conference on Artificial Intelligence, 892-897, 2005.<br />



APPENDIX<br />

A. Sample CIAO data - 1997<br />

JDAY TEMP DPML AI NITR PHOS SILC IRON_nm IRON_um PARL PHA PHA_c DIA DIA_c ZOO DET PURL TP<br />
229 -1.842 68.86 0.87 31 2.1 75.97 0.5052 0.0005052 3.204 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 1.678 4.0288<br />
230 -1.842 79 0.84 31 2.1 75.86 0.505 0.000505 7.401 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 3.891 4.0288<br />
231 -1.844 83.75 0.82 31 2.1 75.73 0.5054 0.0005054 5.875 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 3.166 4.0288<br />
232 -1.848 84.19 0.84 31 2.1 75.7 0.5062 0.0005062 8.494 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 4.425 4.0288<br />
233 -1.838 112.1 0.83 31 2.1 75.7 0.5069 0.0005069 10.76 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 5.595 4.0288<br />
234 -1.832 114.4 0.86 31 2.1 75.64 0.5073 0.0005073 16.48 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 8.723 4.0288<br />
235 -1.84 103 0.9 31 2.1 75.54 0.5083 0.0005083 19.51 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 10.33 4.0288<br />
236 -1.844 92.1 0.88 31 2.1 75.44 0.5087 0.0005087<br />



B. Full entity Specification File<br />

#!/usr/bin/python<br />

"""<br />

This is the revised file for entity specification<br />

Stuart Borrett<br />

April 26, 2007<br />

"""<br />

from ross_lib import *;<br />

# import library<br />

# observed primary producer<br />

p1 = entity_instance(pe, "phyto",<br />

{"conc": ("system", "PHA_c", (0,600)), # ugC/L<br />

"growth_rate": ("system", 0, (0,1)),<br />

"growth_lim": ("system", 1, (0,1))},<br />

{"max_growth":0.59,<br />

"exude_rate":0.19,<br />

"death_rate":0.025,<br />

"Ek_max":30,<br />

"biomin":0.025,<br />

"PhotoInhib":200}<br />

);<br />

# unobserved grazer with initial value from [0,1] default 0.1<br />

Z1 = entity_instance(ze, "zoo",<br />

{"conc": ("system", 0.1 , (0.10,510)),<br />

"growth_rate": ("system", 0.1, (0, 1))},<br />

{"assim_eff":0.75,<br />



"death_rate":0.02,<br />

"respiration_rate":0.019,<br />

"gmax":0.4,<br />

"gcap":200}<br />

);<br />

# observed nitrate<br />

no3 = entity_instance(no3, "NO3",<br />

{"conc": ("system", "NITR", (0,32)),<br />

"mixing_rate": ("system", 0, (0,1))}, None);<br />

# unobserved iron<br />

fe = entity_instance(fe, "Fe",<br />

{"conc": ("system",.00042920, (0,0.001)),<br />

"mixing_rate": ("system", 0, (0,1))}, None);<br />

# observed/exogenous ENVIRONMENT<br />

e1 = entity_instance(ee, "environment",<br />

{"PUR": ("exogenous", "PURL", None),<br />

"TH2O": ("exogenous", "TEMP", None),<br />

"ice":("exogenous", "AI", None) },<br />

{"beta":0.7}<br />

);<br />

# unobservable detritus with initial value from [0,1] default 0.1<br />

D1 = entity_instance(de, "detritus",<br />

{"conc": ("system", 0.1, (0.001, 210))}, None);<br />



C. Full Ross Sea generic model library<br />

#!/usr/bin/python<br />

"""<br />

This generic model library supports the construction<br />

of an ecosystem model of the Ross Sea.<br />

It is hierarchical in processes, but the entities are flat.<br />

This version is updated and corrected.<br />

It is designed for use with the sensitivity analysis experiments<br />

"""<br />

from library import *;<br />

from entities import *;<br />

from processes import *;<br />

lib = library("aquatic_ecosystem");<br />

# -----------------------------------------------------------------------<br />

# -----------------------------------------------------------------------<br />

# GENERIC ENTITIES<br />

# id, variables, constant parameters<br />

# -----------------------------------------------------------------------<br />

# --- PHYTOPLANKTON ---<br />

pe = lib.add_generic_entity("P",<br />

{"conc":"sum",<br />

"growth_rate":"prod",<br />

"growth_lim":"min"},<br />

{"max_growth": (0.4,0.8),<br />

"exude_rate": (0.001,0.2),<br />



"death_rate": (0.02,0.04),<br />

"Ek_max":(1,100),<br />

"sinking_rate":(0.0001,0.25),<br />

"biomin":(0.02,0.04),<br />

"PhotoInhib":(200,1500),<br />

}<br />

);<br />

# --- ZOOPLANKTON ---<br />

ze = lib.add_generic_entity("Z",<br />

{"conc": "sum",<br />

"grazing_rate": "prod"},<br />

{"assim_eff":(0.05,0.4),<br />

"death_rate": (0.001,0.3),<br />

"respiration_rate":(0.01,0.04),<br />

"sinking_rate":(0.001,0.25),<br />

"gmax":(0.3,0.5),<br />

"glim":(19,21),<br />

"gcap":(199,301)}<br />

);<br />

# --- NUTRIENTs ---<br />

# nitrate<br />

no3 = lib.add_generic_entity("Nitrate",<br />

{"conc":"sum",<br />

"mixing_rate":"sum"},<br />

{"toCratio": (6.6,6.7),<br />

"avg_deep_conc": (31,32)}<br />

);<br />



# iron<br />

fe = lib.add_generic_entity("Iron",<br />

{"conc":"sum",<br />

"mixing_rate":"sum"},<br />

{"toCratio": (3000,450000),<br />

"avg_deep_conc": (0.00035,0.00045)}<br />

);<br />

# --- DETRITUS ---<br />

de = lib.add_generic_entity("D",<br />

{"conc": "sum"},<br />

{"remin_rate": (0.03,0.04),<br />

"sinking_rate":(0.00001,0.1)}<br />

);<br />

# --- ENVIRONMENT ---<br />

ee = lib.add_generic_entity("E",<br />

{"TH2O":"sum",<br />

"PUR":"sum",<br />

"ice":"sum"},<br />

{"beta":(0.001,1),<br />

}<br />

);<br />

# -----------------------------------------------------------------------<br />

# -----------------------------------------------------------------------<br />

# GENERIC PROCESSES:<br />

# id, type, entities related, list of subprocesses,<br />

# constant parameters, equations<br />

# -----------------------------------------------------------------------<br />



# --- GROWTH ---<br />

lib.add_generic_process(<br />

"growth", "",<br />

[("P",[pe],1,1), ("N",[no3,fe],1,100), ("D",[de],1,1), ("E",[ee],1,1)],<br />

[("limited_growth", ["P","N","E"], 0),<br />

("exudation",["P"],1),<br />

("nutrient_uptake",["P","N"],0)],<br />

{},<br />

{},<br />

{"P.conc": "P.growth_rate * P.conc"}<br />

);<br />

lib.add_generic_process(<br />

"exudation", "exudation",<br />

[("P",[pe],1,1)],<br />

[],<br />

{},<br />

{},<br />

{"P.conc": "-1 * P.exude_rate * P.growth_rate * P.conc"}<br />

);<br />

lib.add_generic_process(<br />

"nutrient_uptake", "nutrient_uptake",<br />

[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />

[],<br />

{},<br />

{},<br />

{"N.conc": "-1 * 1/( N.toCratio * 12.0107)"<br />
" * P.growth_rate * P.conc"}<br />

);<br />

lib.add_generic_process(<br />

"limited_growth", "limited_growth",<br />

[("P",[pe],1,1), ("N",[no3,fe],1,100), ("E",[ee],1,1)],<br />

[("light_lim", ["P","E"], 0), ("nutrient_lim",["P","N"], 0)],<br />

{},<br />
{"P.growth_rate": "(1-E.ice) * P.max_growth"<br />
" * exp(0.06933 * E.TH2O) * P.growth_lim"},<br />

{}<br />

);<br />

# ------ P.growth_lim --<br />

# there are multiple factors (and formulations of factors)<br />

# that might limit growth.<br />

# In this library nutrient and light limitations are combined<br />

# into P.growth_lim using a minimum function<br />

# so that only one operates at a time (i.e., they are substitutable).<br />

# The disadvantage<br />

# of this encoding is that it will not be possible to determine<br />

# which factor is operating at a given time. Temperature<br />

# is a multiplicative control factor encoded in the P.growth_rate<br />

# equation, and in the present library we do not consider<br />

# alternative temperature effect functions.<br />

# --light lim --<br />

lib.add_generic_process(<br />



"arrigoetal1998", "light_lim",<br />

[("P",[pe],1,1), ("E",[ee],1,1)],<br />

[],<br />
{"a":(5,15)},<br />
{"P.growth_lim": "(1 - exp(-E.PUR / (P.Ek_max / (1 + a"<br />
" * exp(E.PUR * exp(1.089 - 2.12 * log10(P.Ek_max)))))))"},<br />

{}<br />

);<br />

lib.add_generic_process(<br />

"arrigoetal1998_w_photoinhibition", "light_lim",<br />

[("P",[pe],1,1), ("E",[ee],1,1)],<br />

[],<br />
{"a":(5,15)},<br />
{"P.growth_lim": "(1 - exp(-E.PUR / (P.Ek_max / (1 + a"<br />
" * exp(E.PUR * exp(1.089 - 2.12 * log10(P.Ek_max)))))))"<br />
" * exp(-1 * E.PUR /P.PhotoInhib)"},<br />

{}<br />

);<br />

# -- nutrient lim --<br />

lib.add_generic_process(<br />

"monod_lim", "nutrient_lim",<br />

[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />

[],<br />

{"k":(0.000001,0.001)},<br />

{"P.growth_lim": "N.conc / (N.conc + k)"},<br />

{}<br />

);<br />



'''<br />
lib.add_generic_process(<br />
"ratio_lim", "nutrient_lim",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />
[],<br />
{"k":(0.000001,1)},<br />
{"P.growth_lim": "N.conc / (N.conc + k * P.conc)"},<br />
{}<br />
);<br />
'''<br />

lib.add_generic_process(<br />

"monod_2nd", "nutrient_lim",<br />

[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />

[],<br />

{"k":(0.000001,0.001)},<br />

{"P.growth_lim": "pow(N.conc,2) / (pow(N.conc,2) + k)"},<br />

{}<br />

);<br />

lib.add_generic_process(<br />

"nut_lim_exp", "nutrient_lim",<br />

[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />

[],<br />

{"k":(0.000001,1)},<br />

{"P.growth_lim": "1-exp(-1* k * N.conc)"},<br />

{}<br />

);<br />

# --- DEATH ---<br />



lib.add_generic_process(<br />

"death_exp", "",<br />

[("S",[pe,ze],1,1), ("D",[de],1,1), ("E",[ee],1,1)],<br />

[],<br />

{},<br />

{},<br />

{"S.conc": "-1 * S.death_rate * S.conc",<br />

"D.conc": "(1-E.beta) * S.death_rate * S.conc"},<br />

);<br />

# --- REMINERALIZATION ---<br />

lib.add_generic_process(<br />

"remineralization", "",<br />

[("D",[de],1,1), ("N",[fe,no3],1,3)],<br />

[("nutrient_remineralization",["D","N"], 0)],<br />

{},<br />

{},<br />

{"D.conc": "-1 * D.remin_rate * D.conc"}<br />

);<br />

lib.add_generic_process(<br />

"nutrient_remineralization", "",<br />

[("D", [de], 1,1), ("N", [fe], 1, 1)],<br />

[],<br />

{},<br />

{},<br />

{ "N.conc": "1/(N.toCratio * 12.0107) * D.remin_rate * D.conc" }<br />

);<br />

# --- RESPIRATION ---<br />



lib.add_generic_process(<br />

"respiration", "",<br />

[("Z",[ze],1,1)],<br />

[],<br />

{},<br />

{},<br />

{"Z.conc":"-1 * Z.respiration_rate * Z.conc"}<br />

);<br />

# --- SINKING ---<br />

lib.add_generic_process(<br />

"sinking", "",<br />

[("V",[pe,ze,de],1,1)],<br />

[],<br />

{},<br />

{},<br />

{"V.conc": "-1 * V.sinking_rate * V.conc"}<br />

);<br />

# --- GRAZING ---<br />

lib.add_generic_process(<br />

"holling_type_1", "graze_rate",<br />

[("Z",[ze],1,1), ("P",[pe],1,1)],<br />

[],<br />

{},<br />

{"Z.grazing_rate": "Z.gmax * P.conc"},<br />

{}<br />

);<br />



lib.add_generic_process(<br />

"holling_type_2", "graze_rate",<br />

[("Z",[ze],1,1), ("P",[pe],1,1)],<br />

[],<br />

{},<br />

{"Z.grazing_rate": "max(0,Z.gmax * P.conc / (Z.gcap + P.conc))"},<br />

{}<br />

);<br />

lib.add_generic_process(<br />

"holling_type_2_mod", "graze_rate",<br />

[("Z",[ze],1,1), ("P",[pe],1,1)],<br />

[],<br />

{},<br />

{"Z.grazing_rate": "max(0,(Z.gmax * (P.conc - P.biomin - Z.glim)"<br />

" / (Z.gcap + (P.conc - P.biomin - Z.glim))))"},<br />

{}<br />

);<br />

lib.add_generic_process(<br />

"ivlev", "graze_rate",<br />

[("Z", [ze],1,1), ("P", [pe],1,1)],<br />

[],<br />

{"delta":(0.01,0.5)},<br />

{"Z.grazing_rate": "max(0,Z.gmax * (1 - exp(-1 * delta * P.conc)))"<br />

},<br />



{}<br />

);<br />

lib.add_generic_process(<br />

"grazing", "grazing",<br />

[("Z",[ze],1,1), ("P",[pe],0,1), ("D",[de],0,1), ("E",[ee],0,1)],<br />

[("graze_rate", ["Z","P"], 0)],<br />

{},<br />

{},<br />

{"Z.conc": "Z.assim_eff * Z.grazing_rate * Z.conc",<br />

"P.conc": "-1 * Z.grazing_rate * Z.conc",<br />

"D.conc": "(1-E.beta) * (1-Z.assim_eff) * Z.grazing_rate * Z.conc"}<br />

);<br />

# --- Nutrient Mixing ------------------------------------------<br />

# this process represents an input of nutrients (nitrate)<br />

# due to mixing or upwelling.<br />

lib.add_generic_process(<br />

"nutrient_mixing", "",<br />

[("N",[no3,fe],1,1),("E",[ee],1,1)],<br />

[("mixing_rate", ["N","E"],0)],<br />

{},<br />

{},<br />

{"N.conc": "(N.avg_deep_conc - N.conc) * N.mixing_rate"}<br />

);<br />

lib.add_generic_process(<br />

"linear_temp_control", "mixing_rate",<br />

[("N",[no3,fe],1,1),("E",[ee],1,1)],<br />



[],<br />

{"max_mixing_rate":(0.000001,1)},<br />

{"N.mixing_rate": "max_mixing_rate"<br />

" * (datamax(E.TH2O)-E.TH2O)/(datamax(E.TH2O)-datamin(E.TH2O))"},<br />

{},<br />

);<br />

# --- ROOT ---<br />

lib.add_generic_process("root", "",<br />

[("Z",[ze],0,1), ("P",[pe],1,2),<br />

("N",[no3,fe],2,2), ("D",[de],1,1), ("E",[ee],1,1)],<br />

[("growth", ["P","N","D","E"], 0),<br />

("death_exp", ["P","D","E"],1),<br />

("death_exp", ["Z","D","E"],1),<br />

("grazing", ["Z","P","D","E"], 0),<br />

("remineralization", ["D","N"], 0),<br />

("respiration", ["Z"], 1),<br />

("sinking", ["P"],1),<br />

("sinking", ["D"],1),<br />

("nutrient_mixing", ["N","E"],1),<br />

],<br />

{}, {}, {}<br />

);<br />
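As a rough illustration of how the `nutrient_mixing` and `linear_temp_control` processes in the listing above combine, the following Python sketch evaluates the mixing rate and the resulting nutrient flux for a short temperature series. The function names, the sample temperatures, and the parameter values are assumptions made for this example only. `datamax` and `datamin` are taken here to mean the maximum and minimum of the observed series, which the listing does not state explicitly.

```python
def linear_temp_mixing_rate(temp, temp_series, max_mixing_rate):
    """Mixing rate of the linear_temp_control form: equal to max_mixing_rate
    at the coldest observed temperature and zero at the warmest."""
    t_max = max(temp_series)  # assumed meaning of datamax(E.TH2O)
    t_min = min(temp_series)  # assumed meaning of datamin(E.TH2O)
    return max_mixing_rate * (t_max - temp) / (t_max - t_min)


def nutrient_mixing_flux(conc, avg_deep_conc, mixing_rate):
    """Nutrient input from mixing: pulls conc toward the deep-water value."""
    return (avg_deep_conc - conc) * mixing_rate


# Hypothetical water temperatures and parameter values, not taken from the thesis.
temps = [-1.8, -1.0, 0.0, 1.0]
rates = [linear_temp_mixing_rate(t, temps, 0.5) for t in temps]
fluxes = [nutrient_mixing_flux(10.0, 30.0, r) for r in rates]

for t, r, f in zip(temps, rates, fluxes):
    print(f"T={t:5.1f}  mixing_rate={r:.3f}  flux={f:.3f}")
```

Under these assumptions the nutrient input is largest when the water is coldest and vanishes at the warmest observed temperature.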



D. Models selected in both experiment 8 and 19<br />

Model D<br />

\[
\begin{aligned}
\frac{dP}{dt} &= \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t)(1 - a_6) - a_9 - a_{17} \Big] P - \Big( \underbrace{a_{13} P^{a_{14}}}_{\text{Z Grazing Rate}} \Big) Z \\[6pt]
\frac{dZ}{dt} &= a_{12} \Big( \underbrace{a_{13} P^{a_{14}}}_{\text{Z Grazing Rate}} \Big) Z - (a_{11} Z + a_{16})\, Z \\[6pt]
\frac{dD}{dt} &= (1 - a_{10})(a_9 P + a_{11} Z^2) + (1 - a_{10})(1 - a_{12}) \Big( \underbrace{a_{13} P^{a_{14}}}_{\text{Z Grazing Rate}} \Big) Z - D(a_{15} + a_{18}) \\[6pt]
\frac{dN}{dt} &= \left[ (a_{19} - N)\, a_{20}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_7 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
\frac{dF}{dt} &= \left[ (a_{21} - F)\, a_{22}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_8 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
M(t) &= \min \left\{ \frac{F}{F + a_5},\; 1 - e^{-a_4 N},\; e^{-E_{PUR}(t)/a_2} \left( 1 - e^{-E_{PUR}(t) \left( 1 + a_3\, e^{E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}} \right) / a_1} \right) \right\}
\end{aligned}
\]
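The limitation term M(t) of Model D takes the minimum of an iron term, a nitrate term, and a light term with photoinhibition. The following Python sketch evaluates it for a single time point. The function names and the numerical values are assumptions for illustration. Reading a_1 as `P.Ek_max`, a_2 as `P.PhotoInhib`, and a_3 as the parameter `a` is an inference from comparing the equation with the `arrigoetal1998_w_photoinhibition` entry in the library listing, not something the thesis states.

```python
import math


def light_limitation(pur, ek_max, a, photo_inhib=None):
    """Light limitation in the form of the arrigoetal1998 library entry.
    If photo_inhib is given, the result is multiplied by exp(-pur / photo_inhib)."""
    ek = ek_max / (1.0 + a * math.exp(pur * math.exp(1.089 - 2.12 * math.log10(ek_max))))
    lim = 1.0 - math.exp(-pur / ek)
    if photo_inhib is not None:
        lim *= math.exp(-pur / photo_inhib)
    return lim


def model_d_limitation(F, N, pur, a1, a2, a3, a4, a5):
    """M(t) for Model D: the most limiting of iron, nitrate, and light."""
    iron = F / (F + a5)                  # Monod form
    nitrate = 1.0 - math.exp(-a4 * N)    # exponential saturation form
    light = light_limitation(pur, ek_max=a1, a=a3, photo_inhib=a2)
    return min(iron, nitrate, light)


# Hypothetical state and parameter values, chosen only to exercise the function.
m = model_d_limitation(F=0.002, N=20.0, pur=5.0,
                       a1=50.0, a2=200.0, a3=10.0, a4=0.5, a5=0.001)
print(f"M = {m:.4f}")
```

Because M(t) is a minimum, only one resource controls growth at any instant, and M(t) is zero whenever any one of iron, nitrate, or usable light is zero.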



Model E<br />

\[
\begin{aligned}
\frac{dP}{dt} &= \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t)(1 - a_5) - a_8 - a_{18} \Big] P - \Bigg( \underbrace{\max \left\{ 0,\; \frac{a_{12}(P - a_{15} - a_{14})}{a_{13} + P - a_{15} - a_{14}} \right\}}_{\text{Z Grazing Rate}} \Bigg) Z \\[6pt]
\frac{dZ}{dt} &= a_{11} \Bigg( \underbrace{\max \left\{ 0,\; \frac{a_{12}(P - a_{15} - a_{14})}{a_{13} + P - a_{15} - a_{14}} \right\}}_{\text{Z Grazing Rate}} \Bigg) Z - (a_{10} Z + a_{17})\, Z \\[6pt]
\frac{dD}{dt} &= (1 - a_9)(a_8 P + a_{10} Z^2) + (1 - a_9)(1 - a_{11}) \Bigg( \underbrace{\max \left\{ 0,\; \frac{a_{12}(P - a_{15} - a_{14})}{a_{13} + P - a_{15} - a_{14}} \right\}}_{\text{Z Grazing Rate}} \Bigg) Z - D(a_{16} + a_{19}) \\[6pt]
\frac{dN}{dt} &= \left[ (a_{20} - N)\, a_{21}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_6 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{16} D}{a_6 \cdot 12.0107} \right] \\[6pt]
\frac{dF}{dt} &= \left[ (a_{22} - F)\, a_{23}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_7 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{16} D}{a_7 \cdot 12.0107} \right] \\[6pt]
M(t) &= \min \left\{ \frac{F}{F + a_4},\; 1 - e^{-a_3 N},\; 1 - e^{-E_{PUR}(t) \left( 1 + a_2\, e^{E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}} \right) / a_1} \right\}
\end{aligned}
\]
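The grazing rate in Model E has the form of the `holling_type_2_mod` library entry, and the `grazing` process splits the grazed biomass between zooplankton growth and detritus. The sketch below shows that bookkeeping in Python. The function names and all numerical values are assumptions for illustration. The explicit guard for non-positive available prey is an addition of this sketch: it avoids a division by zero when the available prey equals `-gcap`, and it agrees with the `max(0, ...)` expression whenever the available prey is greater than `-gcap`.

```python
def holling_type_2_mod(P, gmax, gcap, biomin, glim):
    """Thresholded Holling type 2 grazing rate: no grazing until prey
    exceeds the combined threshold biomin + glim."""
    available = P - biomin - glim
    if available <= 0.0:  # guard added in this sketch, see the note above
        return 0.0
    return gmax * available / (gcap + available)


def grazing_fluxes(P, Z, assim_eff, beta, gmax, gcap, biomin, glim):
    """Contributions of grazing to dP/dt, dZ/dt and dD/dt, following the
    structure of the `grazing` process in the library listing."""
    rate = holling_type_2_mod(P, gmax, gcap, biomin, glim)
    grazed = rate * Z
    return {
        "P": -grazed,                                    # prey removed
        "Z": assim_eff * grazed,                         # assimilated by grazers
        "D": (1.0 - beta) * (1.0 - assim_eff) * grazed,  # unassimilated share to detritus
    }


# Hypothetical values, not taken from the thesis.
fluxes = grazing_fluxes(P=3.0, Z=2.0, assim_eff=0.7, beta=0.1,
                        gmax=1.0, gcap=2.0, biomin=0.5, glim=0.5)
print(fluxes)
```

With `beta` greater than zero the three fluxes do not sum to zero, so a fraction of the unassimilated material leaves the modeled pools.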



E. Models selected in both experiment 8 and 21<br />

Model F<br />

\[
\begin{aligned}
\frac{dP}{dt} &= \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t)(1 - a_6) - a_9 - a_{17} \Big] P - \Bigg( \underbrace{\frac{a_{13} P}{1 + a_{13} a_{14} P}}_{\text{Z Grazing Rate}} \Bigg) Z \\[6pt]
\frac{dZ}{dt} &= a_{12} \Bigg( \underbrace{\frac{a_{13} P}{1 + a_{13} a_{14} P}}_{\text{Z Grazing Rate}} \Bigg) Z - (a_{11} + a_{16})\, Z \\[6pt]
\frac{dD}{dt} &= (1 - a_{10})(a_9 P + a_{11} Z) + (1 - a_{10})(1 - a_{12}) \Bigg( \underbrace{\frac{a_{13} P}{1 + a_{13} a_{14} P}}_{\text{Z Grazing Rate}} \Bigg) Z - D(a_{15} + a_{18}) \\[6pt]
\frac{dN}{dt} &= \left[ (a_{19} - N)\, a_{20}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_7 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
\frac{dF}{dt} &= \left[ (a_{21} - F)\, a_{22}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_8 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
M(t) &= \min \left\{ \frac{F}{F + a_5},\; \frac{N}{N + a_4},\; e^{-E_{PUR}(t)/a_2} \left( 1 - e^{-E_{PUR}(t) \left( 1 + a_3\, e^{E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}} \right) / a_1} \right) \right\}
\end{aligned}
\]



Model G<br />

\[
\begin{aligned}
\frac{dP}{dt} &= \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t)(1 - a_6) - a_9 - a_{17} \Big] P - \Bigg( \underbrace{\frac{a_{13} P^2}{1 + a_{13} a_{14} P^2}}_{\text{Z Grazing Rate}} \Bigg) Z \\[6pt]
\frac{dZ}{dt} &= a_{12} \Bigg( \underbrace{\frac{a_{13} P^2}{1 + a_{13} a_{14} P^2}}_{\text{Z Grazing Rate}} \Bigg) Z - (a_{11} + a_{16})\, Z \\[6pt]
\frac{dD}{dt} &= (1 - a_{10})(a_9 P + a_{11} Z) + (1 - a_{10})(1 - a_{12}) \Bigg( \underbrace{\frac{a_{13} P^2}{1 + a_{13} a_{14} P^2}}_{\text{Z Grazing Rate}} \Bigg) Z - D(a_{15} + a_{18}) \\[6pt]
\frac{dN}{dt} &= \left[ (a_{19} - N)\, a_{20}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_7 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
\frac{dF}{dt} &= \left[ (a_{21} - F)\, a_{22}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_8 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
M(t) &= \min \left\{ \frac{F}{F + a_5},\; \frac{N}{N + a_4},\; e^{-E_{PUR}(t)/a_2} \left( 1 - e^{-E_{PUR}(t) \left( 1 + a_3\, e^{E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}} \right) / a_1} \right) \right\}
\end{aligned}
\]



Model H<br />

\[
\begin{aligned}
\frac{dP}{dt} &= \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t)(1 - a_6) - a_9 - a_{17} \Big] P - \Big( \underbrace{a_{13} \big( 1 - e^{-a_{14} P} \big)}_{\text{Z Grazing Rate}} \Big) Z \\[6pt]
\frac{dZ}{dt} &= a_{12} \Big( \underbrace{a_{13} \big( 1 - e^{-a_{14} P} \big)}_{\text{Z Grazing Rate}} \Big) Z - (a_{11} + a_{16})\, Z \\[6pt]
\frac{dD}{dt} &= (1 - a_{10})(a_9 P + a_{11} Z) + (1 - a_{10})(1 - a_{12}) \Big( \underbrace{a_{13} \big( 1 - e^{-a_{14} P} \big)}_{\text{Z Grazing Rate}} \Big) Z - D(a_{15} + a_{18}) \\[6pt]
\frac{dN}{dt} &= \left[ (a_{19} - N)\, a_{20}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_7 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
\frac{dF}{dt} &= \left[ (a_{21} - F)\, a_{22}\, \frac{E_{TH_2O}^{\max} - E_{TH_2O}(t)}{E_{TH_2O}^{\max} - E_{TH_2O}^{\min}} \right] - \left[ \frac{P}{a_8 \cdot 12.0107} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)} \big] M(t) \right] + \left[ \frac{a_{15} D}{a_7 \cdot 12.0107} \right] \\[6pt]
M(t) &= \min \left\{ \frac{F}{F + a_5},\; \frac{N}{N + a_4},\; e^{-E_{PUR}(t)/a_2} \left( 1 - e^{-E_{PUR}(t) \left( 1 + a_3\, e^{E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}} \right) / a_1} \right) \right\}
\end{aligned}
\]
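Models F, G, and H share the same structure and differ only in the grazing rate term. The Python sketch below places the three forms side by side so their shapes can be compared. The function names and the parameter values are assumptions for illustration. Describing the forms as Holling type 2 (disc equation), Holling type 3, and Ivlev responses is a reading of the equations rather than a label taken from the thesis.

```python
import math


def grazing_model_f(P, a13, a14):
    """Saturating response a13*P / (1 + a13*a14*P); tends to 1/a14 for large P."""
    return a13 * P / (1.0 + a13 * a14 * P)


def grazing_model_g(P, a13, a14):
    """Sigmoidal response a13*P^2 / (1 + a13*a14*P^2); tends to 1/a14 for large P."""
    return a13 * P ** 2 / (1.0 + a13 * a14 * P ** 2)


def grazing_model_h(P, a13, a14):
    """Ivlev-type response a13 * (1 - exp(-a14*P)); tends to a13 for large P."""
    return a13 * (1.0 - math.exp(-a14 * P))


# Hypothetical parameter values, not fitted values from the thesis.
a13, a14 = 2.0, 0.5
for P in (0.0, 0.5, 1.0, 5.0, 50.0):
    print(f"P={P:5.1f}  F={grazing_model_f(P, a13, a14):.4f}  "
          f"G={grazing_model_g(P, a13, a14):.4f}  "
          f"H={grazing_model_h(P, a13, a14):.4f}")
```

Note that the same pair of symbols plays different roles across the three forms: in the first two the saturation level is set by 1/a_14, while in the third it is set by a_13.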



BIOGRAPHICAL SKETCH<br />

I was born and raised in France and came to the United States in 2006 to further<br />

my education. I saw there an incredible opportunity not only to explore my father’s<br />

origins but also to set out on a journey that promised to be full of learning experiences.<br />

I used to be terrible at math. If you had told me in high school that I<br />

would study math later in life, I probably would have laughed. But sure enough,<br />

I completed my undergraduate degree in Applied Mathematics at the University of<br />

North Carolina Wilmington in 2010. For the past year and a half I have conducted<br />

research under Dr. Borrett on Inductive Process Modeling. I am now looking at the<br />

possibility of traveling and working for a non-profit Christian organization that<br />

works with orphanages around the world. I have a heart for service and helping<br />

others. I trust that God will use the skills that I have acquired during my master's<br />

where he sees fit.<br />
