HIERARCHAL INDUCTIVE PROCESS MODELING AND ANALYSIS
Youri Noël Nelson
A Thesis Submitted to the
University of North Carolina Wilmington in Partial Fulfillment
of the Requirements for the Degree of
Master of Science
Department of Mathematics and Statistics
University of North Carolina Wilmington
2011
Approved by
Advisory Committee
Michael Freeze
Xin Lu
Wei Feng
Chair
Stuart Borrett
Co-Chair
Accepted by
Dean, Graduate School
TABLE OF CONTENTS
ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
1 INTRODUCTION
2 METHOD
  2.1 HIPM Description
    2.1.1 Measure of Fit
    2.1.2 Entities specification and model library
  2.2 Experiment Design
3 COMPUTATIONAL RESULTS
  3.1 Increase in number of time-series input
  3.2 Value of Information
  3.3 Summary
4 ANALYTICAL ANALYSIS
  4.1 Most recurrent models
  4.2 Preliminaries
  4.3 Model A
  4.4 Model B
  4.5 Model C
  4.6 Effects of increasing the number of constraints
5 CONCLUSION
REFERENCES
APPENDIX
  A. Sample CIAO data - 1997
  B. Full entity specification file
  C. Full Ross Sea generic model library
  D. Models selected in both experiment 8 and 19
  E. Models selected in both experiment 8 and 21
ABSTRACT
Understanding the phytoplankton dynamics in the Ross Sea polynya may yield useful knowledge in the search for solutions to the world's rising carbon dioxide levels. Modeling such dynamics is a lengthy and tedious process that can be helped by computational tools like HIPM. This system relies on knowledge that is already available, in the form of time-series data and a process library, to construct and then evaluate these models. In this research, models were ranked by sum of squared error from lowest to highest, the lowest being the best-fit model. Some of the questions that arise from the use of HIPM concern the amount and value of the time series provided to the software, from which we formulated two hypotheses. Will having more time series improve the output of the system? Will time series for different variables provide output of different quality? Through 31 experiments and mathematical analysis, we began to answer these questions. The computational results showed that our first hypothesis does not always hold true, which is thought to be because of the way the fit is measured. The mathematical analysis, on the other hand, showed many variations, across all the experiments, in the structure of the zooplankton equation, which may indicate that the process library needs to be better defined and that the system needs to take into consideration not only the phytoplankton species Phaeocystis antarctica but also diatoms. This thesis provides the start of an answer to these hypotheses, but further research is still needed.
DEDICATION
This thesis is dedicated to all my friends and family who have supported me in this incredible journey I started 5 years ago. More importantly, I want to dedicate it to our Lord and Savior, as I certainly would not be here today without his help, support and comfort.
“I can do anything through God who strengthens me.” (Philippians 4:13)
I also want to dedicate this to my nephew Noah Nelson and my niece Sarah Nelson for always putting a smile on my face during the tough times, for their unconditional love and for making me want to persevere always. I love you beyond words.
Thank you, Christel & Douglas Nelson, Lara Nelson, Celio & Elise Nelson, Sven Diebold, Andrew & Robin Nelson, Ed & Pat Nelson, Joann Nelson, Philip Varvaris, Luke Brown, Taylor Jackson and Bud Edwards (for always being there at the right place at the right time) and all my other friends and family members who are not named here but are present in my heart and to whom I am so grateful for all the words of encouragement and support throughout the years.
ACKNOWLEDGMENTS
I would like to thank Dr. Feng, Dr. Borrett, Dr. Simmons, Dr. Freeze and Dr. Lu for all their help and support in this endeavor and process, as well as my friend Brevin Rock for his advice on completing a Master's thesis.
LIST OF TABLES
1 Example of entity definition and instantiation (P)
2 Example of process definition (Growth)
3 Data contained in CIAO set
4 Cutoff Value Chart
5 Model A Parameter Values
6 Model B Parameter Values
7 Model C Parameter Values
LIST OF FIGURES
1 Initial Conceptual Model
2 Tree diagram representing the process library
3 Map of the Ross Sea
4 reMSE summary - Part 1
5 reMSE summary - Part 2
6 reMSE summary - Part 3
7 Good fit Models VS. Number of inputted time-series
8 Mean Activation Values Graph
LIST OF SYMBOLS
$P$ = Amount of phytoplankton present in the system (mg Chla/m$^3$),
$D$ = Detritus concentration (mg C/m$^3$),
$F$ = Iron concentration (µM),
$Z$ = Zooplankton concentration (mg C/m$^3$),
$N$ = Nitrate concentration (µM),
$E_{\mathrm{ice}}(t)$ = Sea ice concentration
$E_{T_{H_2O}}(t)$ = Temperature of the water (°C)
$E_{PUR}(t)$ = Photosynthetically usable radiation (µmol photons m$^{-2}$ s$^{-1}$)
$E_{T_{H_2O}\,\max}$ = Maximum water temperature
$E_{T_{H_2O}\,\min}$ = Minimum water temperature
$a_i$ = Optimal parameters of the system selected by HIPM software
1 INTRODUCTION
Whether one talks about biology, mathematics, physics, ecology, or any other science, all share a common objective: to explain and describe the world that surrounds us. All of these fields build upon collections of observations to explain recurring phenomena. To explain and depict some of these phenomena, scientists make use of models, which can take a variety of forms including conceptual, formal, physical and diagrammatic (Haefner, 2005).
Models are widely used in science, and researchers continue to look for tools or techniques that will enhance and optimize their ability to construct new models or improve existing ones. The type of modeling technique differs with the task. For instance, in his book Haefner (2005) uses a Forrester diagram to model a hypothetical agro-ecosystem, which is a qualitative model formulation. Another example comes from biology: when describing predator-prey interactions, one can use differential equation models like those formulated by Lotka and Volterra (Berryman 1992). Models are useful for studying systems because they let researchers conduct experiments and test theories that would otherwise be unethical or impossible to perform on the system, and because they make it possible to predict the behavior of varying components of an ecosystem.
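For reference, the classical Lotka-Volterra predator-prey model mentioned above is usually written as the pair of equations below. The notation is the common textbook one and is an assumption here, not taken from this thesis: $x$ is the prey density, $y$ the predator density, and $\alpha, \beta, \gamma, \delta$ are positive rate constants.

$$\frac{dx}{dt} = \alpha x - \beta x y, \qquad \frac{dy}{dt} = \delta x y - \gamma y$$

Each term corresponds to one process (prey growth, predation, predator growth from feeding, predator death), which is the kind of process-by-process decomposition that the modeling approach in this thesis builds on.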
Model construction is a difficult and lengthy endeavor. For a given system there may be many different combinations of processes (e.g. grazing, decay, growth) that could provide a plausible explanation for the behavior being studied. Exploring and evaluating all these possibilities is therefore a tedious task. In the past, limitations in computational power restricted scientists' ability to investigate more complex models, and certain known or suspected processes would be left out to simplify calculations; as computational power increased, so did our capacity to evaluate more intricate models (Oreskes 2000). In addition, numerical models of natural systems are non-unique: there are multiple ways to represent the same dynamics. Creating computational tools that quickly and automatically evaluate multiple models therefore seemed a promising way to search through the extensive model space. The success of machine learning and data mining in commercial domains led scientists to investigate the field of automated modeling for that purpose (Fayyad et al., 1996).
The act of gathering small pieces of information and combining them with prior knowledge to formulate a complex overview of the object or process studied is called induction. Induction avoids searching the entire space of possible equations by piecing together only the meaningful terms; for instance, a predator-prey model will need terms specifying growth and death (Todorovski et al. 2005). Inductive modeling methods (e.g. LAGRAMGE, HIPM, ARIMA, FUSE) use the principles of induction to construct models of the studied system. Methods used for commercial applications, such as the Knowledge Discovery in Databases (KDD) process, were insufficient for scientific purposes because they only described, and did not explain, the observed system behavior (Langley et al. 2006). A simple example is the modeling of water consumption in a city: a water company could easily create a numerical model based on previous years that gives a good estimate of projected water consumption over time, but the model may not explain why consumption fluctuates the way it does. In other words, the commercial methods produce models that are useful for making accurate predictions about a system but become very limited when trying to explain which processes drive the system's behavior; these methods did not explore the realm of all possible models. Thus, induction methods had to be enhanced to automate the task of building and evaluating multiple models (Dzeroski et al. 1995).
In this thesis, I used the hierarchical inductive process modeling technique, which is encoded as a computer algorithm called HIPM (Langley et al. 2006; Bridewell et al. 2005; Dzeroski et al. 1995; Borrett et al. 2007). Inductive process modeling methods such as HIPM (Bridewell et al. 2008; Borrett et al. 2007; Langley et al. 2006; Todorovski et al. 2005) search through two spaces: the first is made up of mathematical formulations and alternative model structures, which consist of entities, processes and the connections binding the two; the second is made up of parameter values (Borrett et al. 2007). The system takes as input a hierarchy of generic processes (a process being a certain action on the system, defined by means of fragments of mathematical equations and rules on how to combine these fragments with the rest of the equations), a set of entities (an entity being an object grouping the properties of an organism or nutrient by means of variables and parameters) and a set of observed time series of the entities' variables (Todorovski et al. 2005). HIPM performs one of two searches for the model structure: a heuristic search or an exhaustive search. With the search option selected, HIPM creates all the possible model structures permitted by the given background knowledge and selects the best set of parameters for each model structure. Finally, the system ranks the models based on their sum of squared error (Todorovski et al. 2005).
This system allows for the representation of complex system dynamics. For example, in the study of photosynthesis regulation it generated a model that reproduced both the qualitative shape and the quantitative details of the time-series data while incorporating processes that made biological sense (Langley et al. 2006). In our case we studied the phytoplankton dynamics in the aquatic ecosystem of the Ross Sea.
In this thesis I used the HIPM tool combined with the appropriate process library to study the phytoplankton dynamics in the Ross Sea ecosystem. Here the term process library is defined as the collection of processes (e.g. grazing, decay, growth) and entities (e.g. phytoplankton, zooplankton, nitrate), together with their relations to one another. It is best represented by Figure 2.
Figure 1: This schematic represents the interactions between the entities and the exogenous variables driving the model. Here, P, Z, D, NO3 and Fe are the state variables. PUR, T and Ice are the exogenous variables acting on the system and influencing the state variables. The arrows represent the action of one variable on another (Borrett, unpublished research).
Arrigo, Borrett, Bridewell and Langley used HIPM and the Ross Sea process library to create and search a space of over 1120 possible model structures to explain the temporal dynamics of phytoplankton and nitrogen in the Ross Sea ecosystem. All models contained five state variables: phytoplankton, zooplankton, detritus, nitrogen and iron. Time series for both phytoplankton and nitrogen were available and given to HIPM along with the process library. Their initial research found 200 model structures of good fit, where good fit was defined as a sum of squared error less than or equal to 0.2. From a computer scientist's standpoint, reducing the search space from 1120 model structures to 200 is a great accomplishment; for a biologist, however, the solution is not specific enough and offers few insights into the ecosystem dynamics. There is a need for ways to constrain the search further, bringing down the number of good-fit models and making the output useful to biologists.
Figure 2: A tree diagram representing the process library constructed for the Ross Sea ecosystem problem. The interaction between processes and entities is defined in the library as explained in Section 2.1.2 (Borrett et al. 2007).
Superficially, HIPM appears related to equation discovery methods, a subfield of machine learning (Langley, 1995; Mitchell, 1997) that investigates collections of measurements and observations, using different computational methods, in search of quantitative laws (Todorovski, 2003). For example, the LAGRAMGE system takes as input a dependent variable and background knowledge encoded as a grammar specifying the space of possible equations, and outputs the best equation for that variable; it can only perform the search for one variable at a time (Dzeroski et al. 1993, Todorovski 2003). Equation discovery is in turn related to the methods used in Ljung's (1993) work on system identification, which is further removed from inductive process modeling.
The main assumption behind system identification is that the model structure is known and that the primary concern is finding adequate parameter values; equation discovery focuses on both the structure and the parameter values (Todorovski et al. 1998). Both of these approaches produce descriptive models that summarize and predict the data, but they fail to search through the space of alternative explanations: they do not take into account models with theoretical variables or consider alternate processes to explain certain dynamics (Bridewell et al. 2005).
The Southern Ocean covers an area equivalent to about 10% of the global ocean and is a key element of the global ocean system, as it links all major ocean basins and facilitates the global distribution of its deep water; it is considered to play an important part in the global carbon (C) cycle (Arrigo et al. 2003). The Ross Sea polynya (an area of open water surrounded by sea ice) is one of the most productive ecosystems in the Southern Ocean, as it experiences some of the largest phytoplankton blooms in the region (Arrigo et al. 1994, 1998, 2000, 2003). Phytoplankton productivity is important to the carbon cycle because photosynthesis removes carbon dioxide (CO2) from surface water, part of which is then exported to deep ocean water. What makes the Ross Sea polynya so interesting to ecologists, compared to other locations such as Terra Nova Bay, is the type of phytoplankton dominating the ecosystem. In the Ross Sea polynya, Phaeocystis antarctica dominates, as opposed to diatoms (species such as Fragilariopsis spp.) in Terra Nova Bay. Phaeocystis antarctica is thought to resist grazing more than other phytoplankton species, which could imply that more carbon is taken from shallow water into the depths as the uneaten phytoplankton, full of CO2, sinks to the bottom (Tagliabue and Arrigo 2003). Deep ocean water has a longer residence time than shallow water, meaning that carbon trapped in deep ocean water is effectively removed from atmospheric circulation for much longer than the carbon contained in surface water.
Figure 3: Map of the southwestern Ross Sea showing the Ross Sea polynya, located north of the Ross Sea Ice Shelf, and the Terra Nova Bay polynya, located on the western continental shelf (Arrigo et al. 2003).
Thus, there is an incentive to understand the ecological processes that control phytoplankton productivity and community composition (which species dominates) in the Ross Sea. Fluctuations in the phytoplankton population could potentially affect CO2 levels in the atmosphere (Carlson et al. 1998). If we can determine why Phaeocystis antarctica is predominant, that would be useful information for scientists as they entertain the idea of altering phytoplankton populations around the world to create carbon sinks, providing a temporary solution to our CO2 problem. All these elements initiated the search for the best process explanation of the phytoplankton dynamics in the Ross Sea: by determining which processes act upon the system and which entities are most important, scientists will accumulate knowledge that may prove valuable in the fight against rising CO2 levels.
As mentioned, the tool that I have chosen for model search relies on measurements and observations of one or more variables of a system to make inferences about the remaining variables, for which no data are available, and about the processes at work in the system. In Borrett's study, the only state variables for which measurements and observations were available were phytoplankton and nitrate. Ultimately the goal is to select model structures that are good approximations of the natural system and give good insight into the processes at work in it. However, here I was faced with an under-constrained optimization problem: no data were available for three of the state variables. Indeed, one of the big challenges of using HIPM for this particular ecosystem is that the data used to conduct the search are very expensive to collect, and collection becomes especially complicated for iron (Fe), which is difficult to measure. From this last statement arise two questions. Does having data for more than one state variable significantly narrow down the number of possible good-fit models? Does knowledge about certain variables have better optimization power than knowledge about others? For example, if we could only afford to collect data for one of the five variables in the system, would phytoplankton give us better model output (fewer good-fit models) in HIPM than zooplankton, or would it be detritus?
These are important questions because, as scientists try to advance their knowledge of the Ross Sea, there is a need to make educated decisions about what information to collect in an effort to optimize the use of resources.
This thesis is structured in five parts. First, I describe the method used to gather the data for my analysis, including the HIPM software and an overview of the data sets. I then turn to the quantitative analysis, looking strictly at the results generated by the HIPM software and discussing what they tell us from an ecological standpoint. In Section 4, I enter the analytical part of the analysis, selecting and studying some of the best-fit models identified during the quantitative analysis. I then discuss these analytical results and, in the final section, tie them back to the biology in an effort to link the qualitative and quantitative research. Through this analysis we see how we can help HIPM's model selection method as well as assist scientists in finding the model that most accurately explains the processes at work in the observed ecosystem.
2 METHOD
The method employed in this paper involves constructing process models from continuous data. To assist in this task we used a piece of software named HIPM. It is the output and model selection efficiency of this software that we are investigating. To better understand the task at hand, it is important to define what HIPM does, as well as the steps we took to test its efficiency.
2.1 HIPM Description
Ecologists rely heavily on system modeling to build ecological theory and to guide environmental assessment and management (Borrett et al. 2007). Typically scientists will build and study a couple of models, basing the model structure on previous research or making a judgement call on which entities and processes should or should not be included. One of the aspirations, and problems, of modeling natural systems is to capture the essence of the system necessary for the model's purpose by figuring out what can be left out. In that regard, which entities and processes should be included, and what the best mathematical formulation and parameter values are for a given structure, become essential parts of this search. Choosing among the possible model structures presents an intricate and time-consuming challenge for ecologists who want to navigate this space (Borrett et al. 2007). In searching through this space of possible models, we are guided by the claim made by Langley et al. (1987), which we support, that we must look for models that fit real-life observations. In summary, we are faced with the problem of constructing models anchored in domain theory, conducting a time-consuming search and linking the models to empirical data (Borrett et al. 2007). This is where the HIPM software comes into play to remedy these issues; HIPM stands for Hierarchical Inductive Process Modeling. This scientific approach (Langley et al. 2005) assumes the following:
• Given: Time-series data for continuous variables.
• Given: Background knowledge about the entities of the system; in other words, constraints on the variables and other parameters driving these entities.
• Given: Background knowledge about the types of processes that may be involved in driving the ecosystem, as well as the constraints that may exist for those processes.
The task for the software is then to search through the structure and parameter space defined by the process-entity library to find the models that best fit the data. HIPM operates in four phases.
1. In an exhaustive search, it first finds all the possible instantiations of the generic processes for all variables. This means that the system finds all the possible combinations of processes that can affect a given variable (an example is given in Section 2.1.2). For our purposes we used the exhaustive search option programmed into the software, but a heuristic search option is also available.
2. The system then assembles each model. In other words, it puts together, into a generic model, one instantiation of the generic processes for each variable present in the system. It uses the constraints given by the user to determine which instantiations can be linked together into a generic model; the program goes through an exhaustive search to find all the possible models. In our study it produces 1120 model structures, due mainly to the large number of different grazing processes that are potentially present in the ecosystem.
3. It searches for the parameter values of each model using the constraints defined by the user. To infer these parameters, the system picks a random set of values that respect the constraints and, using the Levenberg-Marquardt gradient descent method, finds a local optimum. To avoid entrapment in local minima, the system restarts the parameter estimation from multiple random points, retaining only the parameters that produce the lowest error. In our experiments we set the number of restarts to 128. This technique has been found to produce reasonable matches to time series in multiple systems (Langley et al. 2007).
4. It evaluates the performance of the produced model structures (predicted values) against the data series (observed values) by calculating the relative mean squared error (reMSE); models with the lowest reMSE are considered the best-fit models.
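The four phases can be illustrated with a toy sketch in Python. This is not HIPM's code: the two-slot `LIBRARY`, the shared parameter `BOUNDS`, the forward-Euler `simulate` function and the synthetic observations are all hypothetical, the ranking uses plain SSE for brevity, and a crude random local search stands in for the Levenberg-Marquardt step. Only the overall shape (enumerate structures, fit each from several random restarts, keep the lowest error, rank) follows the description above.

```python
import itertools
import random

# Hypothetical two-slot library: each slot lists alternative formulations
# as (name, rate function of the state x and one parameter a).
LIBRARY = {
    "growth": [("exponential", lambda x, a: a * x),
               ("logistic", lambda x, a: a * x * (1.0 - x / 10.0))],
    "loss": [("none", lambda x, a: 0.0),
             ("linear", lambda x, a: a * x)],
}
BOUNDS = (0.0, 2.0)  # assumed constraint shared by both parameters


def simulate(structure, params, x0, steps, dt=0.1):
    """Forward-Euler integration of dx/dt = growth(x) - loss(x)."""
    (_, growth), (_, loss) = structure
    x, series = x0, [x0]
    for _ in range(steps):
        x = x + dt * (growth(x, params[0]) - loss(x, params[1]))
        series.append(x)
    return series


def sse(predicted, observed):
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def error(structure, params, observed):
    predicted = simulate(structure, params, observed[0], len(observed) - 1)
    return sse(predicted, observed)


def fit(structure, observed, restarts=8, iters=200, seed=0):
    """Phase 3: random restarts, keeping the lowest-error parameters.

    The inner loop is a crude random local search standing in for
    Levenberg-Marquardt.
    """
    rng = random.Random(seed)
    low, high = BOUNDS
    best_err, best_params = float("inf"), None
    for _ in range(restarts):
        params = [rng.uniform(low, high), rng.uniform(low, high)]
        err, step = error(structure, params, observed), 0.5
        for _ in range(iters):
            trial = [min(high, max(low, p + rng.uniform(-step, step)))
                     for p in params]
            trial_err = error(structure, trial, observed)
            if trial_err < err:
                params, err = trial, trial_err
            else:
                step *= 0.97  # shrink the neighbourhood after a failed move
        if err < best_err:
            best_err, best_params = err, params
    return best_err, best_params


def search(observed):
    """Phases 1, 2 and 4: enumerate structures, fit each, rank by error."""
    structures = list(itertools.product(LIBRARY["growth"], LIBRARY["loss"]))
    results = []
    for structure in structures:
        err, params = fit(structure, observed)
        names = tuple(name for name, _ in structure)
        results.append((err, names, params))
    return sorted(results, key=lambda r: r[0])


# Synthetic "observations" generated from a known structure and parameters.
true_structure = (LIBRARY["growth"][1], LIBRARY["loss"][1])
observed = simulate(true_structure, [1.0, 0.3], 1.0, 30)
ranked = search(observed)
```

With two alternatives in each of two slots the toy library yields four structures; the 1120 structures reported in this study arise in the same combinatorial way from a much larger library.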
2.1.1 Measure of Fit
As mentioned above, HIPM evaluates and selects the best model structure and set of parameters according to a fitness measure. The system currently uses the sum of squared errors (SSE) to evaluate fitness (Bridewell et al. 2007), which is defined as follows:
$$\sum_{i=1}^{n} \mathrm{SSE}(x_i, x_i^{\mathrm{obs}}) = \sum_{i=1}^{n} \sum_{k=1}^{m} \left(x_{i,k} - x_{i,k}^{\mathrm{obs}}\right)^2$$
where $x_1, \ldots, x_n$ are the variables being fitted, with $m$ observed values for each. To account for variables of varying scale, the system uses a relative mean squared error, which we define as follows:
$$\mathrm{reMSE} = \frac{\displaystyle\sum_{i=1}^{n} \frac{\mathrm{SSE}(x_i, x_i^{\mathrm{obs}})}{s^2(x_i^{\mathrm{obs}})}}{nm}$$
Here $s^2(x_i^{\mathrm{obs}})$ is the sample variance of the observations of $x_i$. Throughout this paper we refer to the relative mean squared error as reMSE. The biggest asset of this rescaling is the ability to compare values across data sets. Typically, an reMSE of 1.0 or above signifies that the model performs poorly; conversely, the lower the reMSE, the better the fit.
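As a minimal sketch of the two measures, the Python functions below compute the SSE and the reMSE for a set of fitted variables. The function names, the dictionary layout and the data are hypothetical. The example also shows where the 1.0 rule of thumb comes from: a "model" that simply predicts the mean of each observed series has SSE equal to $(m-1)\,s^2$ for every variable, so its reMSE is $(m-1)/m$, just under 1.

```python
from statistics import variance


def sse(predicted, observed):
    """Sum of squared errors between one predicted and one observed series."""
    return sum((p - o) ** 2 for p, o in zip(predicted, observed))


def remse(predicted_by_var, observed_by_var):
    """Relative mean squared error over n variables with m observations each.

    Both arguments map a variable name to a list of m values.
    """
    n = len(observed_by_var)
    m = len(next(iter(observed_by_var.values())))
    total = 0.0
    for name, obs in observed_by_var.items():
        # Dividing by the sample variance puts variables of different
        # scale on a common footing before they are averaged.
        total += sse(predicted_by_var[name], obs) / variance(obs)
    return total / (n * m)


# Hypothetical data: two variables on very different scales.
observed = {"P": [1.0, 2.0, 3.0, 4.0], "N": [10.0, 30.0, 20.0, 40.0]}

# A perfect model reproduces the observations exactly.
perfect = {name: list(values) for name, values in observed.items()}

# A trivial model predicts the mean of each observed series.
mean_only = {name: [sum(v) / len(v)] * len(v) for name, v in observed.items()}

perfect_score = remse(perfect, observed)
mean_score = remse(mean_only, observed)
```

Because each variable's error is divided by its own variance, the two series contribute equally to `mean_score` even though their raw squared errors differ by a factor of one hundred.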
2.1.2 Entities specification and model library
Each entity of a system is defined by a combination of variables and parameters, which makes it both an actor and a receiver of action in the model. A distinction is to be made between a generic entity and an instantiated entity. A formal generic entity has a name and a set of properties, which can include both variables and parameters. In a given model the parameters of an instantiated entity do not change, whereas the variables do. Every variable in the entity has a name and a rule that determines how multiple processes and their subprocesses are combined (e.g. summed, minimum, product). Every parameter has a name and a range that constrains its possible values. Instantiated entities, on the other hand, have their variables either associated with time series or given initial values, and their parameters have been assigned real values. A field is also included to indicate the parent generic entity (Borrett et al. 2007). A given generic entity can be instantiated multiple times; the generic entity can be thought of as a blueprint for the instantiated entities.
For example, in our system we defined the entity phytoplankton as presented in Table 1. Here the entity's name is “P”; it contains the variables “conc”, “growth rate” and “growth lim”, with the rules determining how they will be aggregated with other processes. The next part of the entity definition is the list of parameters that concern this entity, such as “max growth”, with possible values in the (0, 600) range. Following the definition of the generic entity in Table 1 is an instantiated entity, “pe”, which refers to the parent generic entity. Each variable is then given either the name of a time series to which the model will be fitted, as for “conc”, where “PHA c” refers to the phytoplankton column of the CIAO data set, or an initial value, such as 0 for “growth rate”, indicating that this particular state variable will not be fitted to a time series. The mention “system”, as opposed to “exogenous”, simply states that the variable depends on the system rather than being independent, like solar radiation or water temperature. The full instantiated entity library can be found in Appendix B and the generic entity library in Appendix C.
For HIPM to be fully functional there needs to be a library of processes. Processes<br />
are the physical, chemical, or biological actions that drive change in dynamic models.<br />
Just as we distinguished between generic and instantiated entities, we<br />
distinguish between generic and instantiated processes. Every generic<br />
process is defined by a name, the entities that can take a role in the process, the<br />
subprocesses tied to that process, and one or more equations. The<br />
generic process can also include a set of Boolean conditions that determine whether the<br />
process is active, making the process dynamic by turning it on and off<br />
depending on whether the conditions are satisfied (Borrett et al. 2007). For instance,<br />
we could set the photosynthetic process to occur only if an environmental light<br />
variable is greater than zero. Table 2 gives an example of a generic process. It is<br />
named “growth”, and any of the entities “P”, “N”, “D” and “E” can take a role in the<br />
process. Next comes the list of subprocesses linked to this process, each with the entities that can take a role<br />
in it, and finally the equation that defines<br />
the process; this equation calls on the “conc” and “growth_rate” variables that all<br />
entities must have. The instantiated process takes on a specific name and is<br />
bound to a specific instantiated entity, one of P, N, D or E. The instantiated entity<br />
takes its role in the equation of the instantiated process. All the instantiated<br />
processes are aggregated according to the rule defined in the generic entity. It<br />
is this organization in terms of entities and processes that drives inductive process<br />
modeling. It makes for an easier construction of systems of equations by building in<br />
fragments.<br />
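The aggregation rules and Boolean activation conditions described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `AGGREGATORS`, `combine` and `photosynthesis_active` are hypothetical and are not part of HIPM's interface, and the numerical contributions are made up. Only the rule names "sum", "prod" and "min" are taken from the entity definition in Table 1.

```python
import math

# Hypothetical mapping from an entity variable's rule name to a combiner.
# The rule names mirror those used in Table 1; the mapping itself is an assumption.
AGGREGATORS = {
    "sum": sum,
    "prod": math.prod,
    "min": min,
}


def combine(rule, contributions):
    """Combine the contributions of several processes to one variable."""
    return AGGREGATORS[rule](contributions)


def photosynthesis_active(light):
    """Boolean condition: the process is switched on only when light > 0."""
    return light > 0


# Illustrative (made-up) contributions from three processes.
contributions = [0.5, 0.25, 2.0]
print(combine("sum", contributions))   # additive effects, e.g. on a concentration
print(combine("prod", contributions))  # multiplicative rate modifiers
print(combine("min", contributions))   # the most limiting factor
print(photosynthesis_active(0.0))      # process is off in the dark
```

Keeping the rule on the entity variable rather than on each process means a new process can be added to the library without changing how existing contributions are combined.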
Table 1: Example of a generic entity definition, with its variables and parameters,<br />
followed by an example of an instantiated entity (here Phytoplankton, P),<br />
in which the variable “conc” is given a time series<br />
and the other variables are given initial values.<br />
pe = lib.add_generic_entity("P",<br />
{ "conc":"sum",<br />
"growth_rate":"prod",<br />
"growth_lim":"min"},<br />
{ "max_growth": (0.4,0.8),<br />
"exude_rate": (0.001,0.2),<br />
"death_rate": (0.02,0.04),<br />
"Ek_max":(1,100),<br />
"sinking_rate":(0.0001,0.25),<br />
"biomin":(0.02,0.04),<br />
"PhotoInhib":(200,1500),});<br />
p1 = entity_instance (pe,<br />
"phyto",<br />
{ "conc": ("system", "PHA_c", (0,600)),<br />
"growth_rate": ("system", 0, (0,1)),<br />
"growth_lim": ("system", 1, (0,1))},<br />
{ "max_growth":0.59,<br />
"exude_rate":0.19,<br />
"death_rate":0.025,<br />
"Ek_max":30,<br />
"biomin":0.025,<br />
"PhotoInhib":200 } );<br />
Table 2: Defining a process - Growth<br />
lib.add_generic_process(<br />
"growth", "",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,100),<br />
("D",[de],1,1), ("E",[ee],1,1)],<br />
[("limited_growth", ["P","N","E"], 0),<br />
("exudation",["P"],1),<br />
("nutrient_uptake",["P","N"],0)],<br />
{},<br />
{},<br />
{"P.conc": "P.growth_rate * P.conc"} );<br />
To sum it up, HIPM’s power resides in its knowledge of the modeled domain as<br />
well as its ability to estimate parameters (Bridewell et al. 2007).<br />
2.2 Experiment Design<br />
Having established how HIPM works, let us consider the problem at hand.<br />
Though in theory HIPM is an extremely powerful tool that permits a search<br />
through a wide structure and parameter space, previous research has demonstrated<br />
that a more thorough investigation of HIPM’s output is necessary to evaluate its<br />
potential and usefulness to biologists.<br />
In our example of the Ross Sea ecosystem,<br />
with the process-entity library set up as described, the search space comprises 1120<br />
possible models; each model can take on a wide variety of parameter sets depending<br />
on the constraints given to the software. The phytoplankton dynamic models of the<br />
Ross Sea have five variables: Phytoplankton (P), Zooplankton (Z), Detritus (D),<br />
Nitrate (N) and Iron (F). In previous research, real-life time series for Phytoplankton<br />
and Nitrate were available for this particular ecosystem, and these<br />
data were fed to HIPM. HIPM then returned about 200 possible models<br />
with an reMSE less than or equal to 0.2, which from a computer science<br />
standpoint is a good improvement: the search space is reduced from 1120 possible<br />
models to 200. However, for a biologist that is still a large number of<br />
models approximating the ecosystem studied; going through and testing every<br />
one of these 200 models would be extremely time-consuming. It is therefore clear<br />
that we need to lower the number of candidate models to a level deemed<br />
reasonable and useful to biologists. We assume that increasing the number of<br />
constraints (i.e. adding real-life time series for a variable for which we had no previous<br />
empirical data) would help model discrimination in HIPM. But this would require<br />
the scientist to go into the field and collect time series for one of<br />
the variables in the system. Since that process is very expensive, can HIPM be used<br />
to make an informed decision about which variable would yield the most discriminatory<br />
power, if there is any difference between variables at all? This is what we are<br />
investigating, and in light of these elements we have formulated two hypotheses:<br />
• Hypothesis 1: Increasing the number of constraints: increasing the number of<br />
time-series for which we have data in HIPM for model selection will induce<br />
better fits. In other words, the increase in number of known time-series of<br />
system variables leads to better model discrimination and therefore better<br />
model selection.<br />
• Hypothesis 2: Variables yield different values of information: some variables<br />
will have more discriminatory power and restrict the best fit models more than<br />
others.<br />
To test our two hypotheses it was imperative to employ a full data set including<br />
time-series for all variables of the system, in order to compare the results depending<br />
upon whether certain time-series are included or not as constraints for HIPM. Since<br />
no full data set with real-life data was available, we turned to a simulated data set<br />
called the “Coupled Ice and Ocean model” data set, otherwise referred to as the CIAO<br />
data set. This data set is generated from a three-dimensional ecosystem model that<br />
spans the entire water column and multiple stations across the Ross Sea. However,<br />
for our purposes only a portion of this data, the top 5 meters at the Ross Sea Polynya<br />
station 01, is used. The type of information contained in the CIAO data set is listed<br />
in Table 3.<br />
Table 3: Information included in the CIAO data set.<br />
NOTE: A sample of the CIAO 1997 data can be found as Appendix A.<br />
Symbol | Units | Description<br />
JDAY | Day | Day of the measurements<br />
TEMP | °C | Temperature of the water<br />
DPML | m | Mixed layer depth<br />
AI | - | Sea ice concentration<br />
NITR | µM | Nitrate concentration<br />
PHOS | mg Chla/m³ | Phosphate concentration<br />
SILC | µM | Silicate concentration<br />
IRON | nM or µM | Iron concentration<br />
PARL | µmol photons m⁻² s⁻¹ | Solar radiation used by organism in photosynthesis<br />
PHA | mg Chla/m³ | Phaeo chlorophyll concentration<br />
DIAT | mg Chla/m³ | Diatom chlorophyll concentration<br />
ZOO | mg C/m³ | Zooplankton concentration<br />
DET | mg C/m³ | Detritus concentration<br />
PURL | µmol photons m⁻² s⁻¹ | Photosynthetically usable radiation<br />
In addition to a full data set, it is necessary to have a working library that, as<br />
stated in Section 2.1.2, defines both entities and processes for HIPM. The process-entity<br />
library that we used is available in Appendices B and C; it was previously<br />
put together by Bridewell, Borrett, Langley and Arrigo.<br />
All the processes and<br />
subprocesses in which the instantiated entities can take a role in our study are<br />
represented in Figure 2.<br />
With the background knowledge necessary for HIPM to conduct successful runs in place,<br />
we designed thirty-one experiments; each experiment represents one possible combination<br />
of time-series constraints that could be entered into the software.<br />
For example, if we had time-series for Iron and Nitrate and fed the information into<br />
HIPM, they would act as additional constraints in the model selection process. To<br />
be selected, models have to exhibit behavior close to the given time-series. All the<br />
experiments are summarized in Table 4.<br />
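The count of thirty-one follows from the five state variables: every non-empty subset of {P, Z, D, N, F} is one experiment, and there are 2^5 - 1 = 31 of them. The sketch below enumerates them in Python. The function name `enumerate_experiments` is hypothetical, and the numbering (by subset size, then in the order P, Z, D, N, F) is an assumption chosen because it reproduces the IDs listed in Table 4.

```python
from itertools import combinations

VARIABLES = ["P", "Z", "D", "N", "F"]


def enumerate_experiments(variables):
    """Return every non-empty subset of variables, smallest subsets first."""
    experiments = []
    for size in range(1, len(variables) + 1):
        experiments.extend(combinations(variables, size))
    return experiments


experiments = enumerate_experiments(VARIABLES)
for exp_id, constrained in enumerate(experiments, start=1):
    # Each tuple lists the variables that receive a time-series constraint.
    print(exp_id, list(constrained))
```

For instance, under this numbering experiment 20 is the combination P, D and F, as in Table 4.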
3 COMPUTATIONAL RESULTS<br />
The main aim of this work is to determine how to optimize our use of<br />
HIPM to assist scientists in their decision-making process when it comes to selecting<br />
a model that most accurately represents an ecosystem. The first need is to narrow<br />
down the number of good fit models capable of describing the system. We<br />
attempted this by feeding additional time series for the state variables into HIPM,<br />
thus providing more constraints; did the assumption that this narrows the selection hold true?<br />
Secondly, if<br />
adding more constraints to HIPM does reduce that number, do observations for a<br />
specific state variable hold more reducing power than those for the other state variables?<br />
The data collected helped us answer these questions as well as discuss the efficiency<br />
of HIPM in its current state.<br />
Thirty-one different experiments were performed, each returning a measure of<br />
fit value (reMSE) for every one of the 1120 models tested. This<br />
makes for a large amount of data to analyze. To get a better idea of what these data<br />
look like, the measure of fit values of models with an reMSE between 0 and<br />
2 were ranked from lowest to highest and graphed (see Figures 4, 5<br />
and 6). We did not look at reMSE values higher than 2.0 since, as stated previously,<br />
models with an reMSE higher than 1.0 are typically classified as poorly performing<br />
models, as this indicates a very large difference between observed and expected values.<br />
We estimated that the (0,2) range would be sufficient for our purpose, as it would<br />
encompass most models. Based on these initial results we chose an reMSE<br />
of 0.5 as our good fit model cutoff; any model under that cutoff is considered a good<br />
fit. This cutoff was chosen because several of the graphs seemed to exhibit a<br />
turning point or slight step pattern around this reMSE value, as can be seen in<br />
the graphs for experiments 1, 5 and 20.<br />
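The ranking and counting step just described can be sketched as follows. The reMSE values in `example_remse` are made up purely for illustration and are not HIPM output, and the function name `rank_and_count` is hypothetical. The cutoff is applied as a strict inequality, following the statement that any model under the cutoff counts as a good fit.

```python
def rank_and_count(remse_values, cutoff=0.5, plot_max=2.0):
    """Rank reMSE values and count the good fit models.

    Returns the values at or below plot_max sorted from lowest to highest
    (the points that would be graphed) and the number strictly below cutoff.
    """
    ranked = sorted(v for v in remse_values if v <= plot_max)
    n_good = sum(1 for v in ranked if v < cutoff)
    return ranked, n_good


# Illustrative values only; a real run would supply one reMSE per model.
example_remse = [1.8, 0.12, 0.47, 0.95, 0.31, 2.4, 0.5]
ranked, n_good = rank_and_count(example_remse)
print(ranked)
print(n_good)
```

Applying the same function with cutoffs from 0.1 to 1 would give, for each experiment, one row of counts of the kind reported in Table 4.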
[Figure 4: twelve panels, one per experiment, plotting reMSE (vertical axis, 0.0 to 2.0) for the models ranked from lowest to highest reMSE (horizontal axis). Panels: 1 [P], 2 [Z], 3 [D], 4 [N], 5 [F], 6 [P,Z], 7 [P,D], 8 [P,N], 9 [P,F], 10 [Z,D], 11 [Z,N], 12 [Z,F]. Each panel is annotated with its number of good fit models.]<br />
Figure 4: reMSE values are ranked from lowest to highest. The reMSE = 0.5 signifies<br />
the good fit model cutoff, any models under that value are considered good fit models.<br />
The experimental setup for each run as well as the ID number is indicated in the<br />
top right corner.<br />
[Figure 5: twelve panels, one per experiment, plotting reMSE (vertical axis, 0.0 to 2.0) for the models ranked from lowest to highest reMSE (horizontal axis). Panels: 13 [D,N], 14 [D,F], 15 [N,F], 16 [P,Z,D], 17 [P,Z,N], 18 [P,Z,F], 19 [P,D,N], 20 [P,D,F], 21 [P,N,F], 22 [Z,D,N], 23 [Z,D,F], 24 [Z,N,F]. Each panel is annotated with its number of good fit models.]<br />
Figure 5: reMSE values are ranked from lowest to highest. The reMSE = 0.5 signifies<br />
the good fit model cutoff, any models under that value are considered good fit models.<br />
The experimental setup for each run as well as the ID number is indicated in the<br />
top right corner.<br />
[Figure 6: seven panels, one per experiment, plotting reMSE (vertical axis, 0.0 to 2.0) for the models ranked from lowest to highest reMSE (horizontal axis). Panels: 25 [D,N,F], 26 [P,Z,D,N], 27 [P,Z,D,F], 28 [P,Z,N,F], 29 [P,D,N,F], 30 [Z,D,N,F], 31 [P,Z,D,N,F]. Each panel is annotated with its number of good fit models.]<br />
Figure 6: reMSE values are ranked from lowest to highest. The reMSE = 0.5 signifies<br />
the good fit model cutoff, any models under that value are considered good fit models.<br />
The experimental setup for each run as well as the ID number is indicated in the<br />
top right corner.<br />
3.1 Increase in number of time-series input<br />
One of the first observations made when looking at the data set is that, as a<br />
general trend, the more time-series were used in HIPM, the smaller the number<br />
of good fit models, as represented in Figure 7.<br />
Number of Good Fit Models (reMSE
dramatically, with very small or non existent variance, and get very close or equal<br />
to zero. This suggests there may be some issues in the selection process, which could<br />
originate from over-constraining the system or from a need to improve the process-entity<br />
library. Furthermore, looking at Figure 6 we observe that there are no models<br />
with an reMSE lower than 1.5, which means that all models have performed poorly<br />
given the constraints. At first glance, and momentarily putting aside the observed<br />
behavior for four and five time-series constraints, we can conclude that adding up<br />
to three time-series constraints produces the desired effect and reduces the<br />
number of good fit models. But the conclusion of this initial examination does not<br />
always hold true, as a closer look at the data reveals.<br />
At this point my research entered the field of exploratory statistics as opposed to<br />
hypothesis-testing statistics; conventional statistical tools such as p-values or confidence<br />
intervals were not suitable for evaluating the hypotheses. The data have been reformatted<br />
into the more reader-friendly Table 4, which represents each experiment in a binary format:<br />
the instantiated entities given time-series for a run received a 1 and the ones with<br />
only initial values received a 0. In addition, the number of good fit models<br />
under each reMSE cutoff value from 0.1 to 1 was counted in order to analyze the<br />
individual effect, on the model selection process, of adding a time-series constraint for<br />
each entity. To do so, we select the subset of Table 4 for which a given entity<br />
has the value 1. For example, for P we selected the subset of rows where P had a<br />
value of 1. By doing so we look only at the runs in which the constraint on<br />
P played a role, excluding the experiments where P was not constrained.<br />
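The binary encoding and the subset selection can be illustrated with a short Python sketch. The helper names `binary_rows` and `experiments_with` are hypothetical, and the experiment IDs are generated under the assumption that they follow subset size and then the order P, Z, D, N, F, which matches the ordering of the rows in Table 4.

```python
from itertools import combinations

VARIABLES = ["P", "Z", "D", "N", "F"]


def binary_rows(variables):
    """Map each experiment ID to its 0/1 constraint row."""
    rows = {}
    exp_id = 0
    for size in range(1, len(variables) + 1):
        for subset in combinations(variables, size):
            exp_id += 1
            # 1 if the entity was given a time-series, 0 if only an initial value.
            rows[exp_id] = {v: int(v in subset) for v in variables}
    return rows


def experiments_with(rows, variable):
    """Select the IDs of the experiments in which `variable` was constrained."""
    return [exp_id for exp_id, row in rows.items() if row[variable] == 1]


rows = binary_rows(VARIABLES)
print(rows[20])                     # constraint row of experiment 20
print(experiments_with(rows, "P"))  # subset used to study the effect of P
```

Each state variable appears in 16 of the 31 experiments, so every subset has the same size.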
Looking carefully at Table 4, we notice that in experiment 1, when given<br />
observations only for phytoplankton, the number of good fit models under 0.5 reMSE<br />
is 197. In experiments 7 and 9, this number dropped to 61 and 79 respectively, with<br />
the addition of observations for detritus in one case and iron in the other. However,<br />
notice that in experiment 20, where observations for phytoplankton, detritus and<br />
Table 4: This table represents each experiment in binary form, 1 signifying that a<br />
time-series was given for this entity and 0 that no time-series was given for this run.<br />
We counted the number of models present under each reMSE cutoff value.<br />
Columns: experiment ID; data constraints (P, Z, D, N, F); number of models under each reMSE cutoff (.1 to 1).<br />
ID P Z D N F .1 .2 .3 .4 .5 .6 .7 .8 .9 1<br />
1 1 0 0 0 0 11 122 161 183 197 213 233 253 284 336<br />
2 0 1 0 0 0 14 38 46 72 101 141 186 236 296 487<br />
3 0 0 1 0 0 95 184 248 331 366 404 441 501 552 602<br />
4 0 0 0 1 0 67 188 301 376 439 482 517 531 547 605<br />
5 0 0 0 0 1 167 361 414 452 509 537 563 594 628 1094<br />
6 1 1 0 0 0 0 0 1 4 5 9 14 19 20 22<br />
7 1 0 1 0 0 0 18 35 45 61 75 88 94 100 102<br />
8 1 0 0 1 0 0 0 15 18 25 30 35 38 45 48<br />
9 1 0 0 0 1 0 37 60 73 79 91 110 123 143 158<br />
10 0 1 1 0 0 0 1 5 5 8 10 11 14 18 18<br />
11 0 1 0 1 0 0 0 0 1 1 1 1 1 1 5<br />
12 0 1 0 0 1 0 0 0 0 0 3 10 14 16 23<br />
13 0 0 1 1 0 2 8 13 48 67 93 120 142 167 178<br />
14 0 0 1 0 1 59 94 140 156 190 232 255 276 295 311<br />
15 0 0 0 1 1 23 40 57 89 128 151 179 218 252 290<br />
16 1 1 1 0 0 0 0 0 0 0 1 4 5 7 7<br />
17 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0<br />
18 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0<br />
19 1 0 1 1 0 0 0 0 9 13 19 22 24 24 25<br />
20 1 0 1 0 1 44 91 132 149 177 226 253 280 295 312<br />
21 1 0 0 1 1 0 0 0 11 15 17 25 25 27 31<br />
22 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0<br />
23 0 1 1 0 1 0 0 0 0 0 3 4 7 7 7<br />
24 0 1 0 1 1 0 0 1 1 3 4 5 6 7 8<br />
25 0 0 1 1 1 3 13 21 27 39 51 65 87 100 114<br />
26 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0<br />
27 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0<br />
28 1 1 0 1 1 0 0 1 1 2 2 3 4 5 6<br />
29 1 0 1 1 1 0 0 0 2 5 10 13 17 17 18<br />
30 0 1 1 1 1 0 0 0 0 0 1 2 2 2 2<br />
31 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0<br />
iron were used, the number of good fit models is 177, which is higher than<br />
that of experiments 7 and 9. This counterexample demonstrates that more<br />
observations are not always synonymous with fewer models. Yet, according to Table<br />
4, there are some experiments for which the assumption did hold true. Indeed, in<br />
experiment 8 both phytoplankton and nitrate are known and HIPM output 25 good<br />
fit models under 0.5 reMSE; when iron observations were added in experiment 21 we<br />
counted 15 models selected, and when detritus was also added in experiment 29 this<br />
number dropped to 5. Thus, in some cases more information does further<br />
restrict the number of models selected.<br />
If observations and measurements are available for one or two state variables and we want to<br />
collect data about one of the remaining state variables, our choice of<br />
which state variable to measure should be strongly influenced by the restriction power of<br />
each variable, which can be calculated from data already known. To place this back<br />
into context, in experiment 9 data for both phytoplankton and iron were used in<br />
the selection process and the output was 79 good fit models under 0.5 reMSE; if<br />
we wanted to decrease this number, adding detritus would be an unwise choice, as<br />
experiment 20 then outputs 177 good fit models. However, this is not always the case. For instance,<br />
in experiment 8, where phytoplankton and nitrate data are known, the output was<br />
25 good fit models; with the addition of detritus this number went down to 13<br />
in experiment 19. The latter case agrees with our assumption that more time-series data would<br />
constrain the model selection process further.<br />
However, the way the reMSE is calculated may explain why some results<br />
contradict our assumptions. The reMSE is the average of the fit for each of the<br />
variables being fitted. For example, if a model were fitted to both phytoplankton<br />
and iron and each had an reMSE of 0.6, the average would be 0.6 and the model<br />
would not be selected as a good fit model; but if in the next experiment we added<br />
detritus and it had a fit of 0.1, the overall average would then be (0.6 + 0.6 + 0.1)/3 ≈ 0.43, which would<br />
put the model into the good fit model category. This explains why more time-series<br />
does not always mean fewer models being selected. Hence, different entities will yield<br />
different restriction powers depending on the pre-existing knowledge. It seems that the<br />
value of the information varies according to the data previously available; but is<br />
there a specific state variable that tends to provide more restriction than the others<br />
regardless of the case or, conversely, is there a state variable that tends to provide very<br />
little additional information in the model selection process?<br />
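The averaging effect in the example above can be checked numerically. The sketch below assumes, as stated in the text, that the overall reMSE is the plain mean of the per-variable fits; the function name `overall_remse` and the per-variable values are the illustrative ones from the example, not HIPM output. With fits of 0.6, 0.6 and 0.1 the three-variable mean is about 0.43, which falls under the 0.5 cutoff.

```python
from statistics import mean

CUTOFF = 0.5


def overall_remse(per_variable_remse):
    """Overall fit, assumed to be the plain mean of the per-variable reMSE values."""
    return mean(per_variable_remse.values())


two_series = {"P": 0.6, "F": 0.6}
three_series = {"P": 0.6, "F": 0.6, "D": 0.1}

for label, fits in [("P, F", two_series), ("P, D, F", three_series)]:
    score = overall_remse(fits)
    # A model counts as a good fit only when its mean reMSE is under the cutoff.
    print(label, round(score, 3), score < CUTOFF)
```

A well-fitted additional variable can therefore pull a poorly fitting model under the cutoff, which is one way the number of good fit models can rise when a constraint is added.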
3.2 Value of Information<br />
It seems realistic to think that each state variable could yield a different restrictive<br />
power when it comes to model selection through HIPM. To verify this assumption<br />
we used subsets of Table 4 to create Figure 8: five subsets were created, one for each<br />
state variable, using the experiments for which the observations and measurements<br />
for that variable were known. I evaluated the overall impact of each state variable<br />
over all experiments by first looking at the mean number of good fit models for each<br />
reMSE cutoff. Realizing that the variation in these subsets was great because of<br />
the presence of so many zeros, we decided to look instead at the median number of models<br />
for each of the reMSE cutoff values in order to quantify the discriminatory<br />
power of each entity. We will refer to these values as Median Activation Values, in<br />
reference to the activation probabilities of Bayesian statistics that inspired this approach.<br />
The lower an entity’s Median Activation Value, the more restrictive power it holds.<br />
Indeed, the activation value refers to the median number of models selected across<br />
all the runs that included that entity. In our case we are looking for the state variable<br />
that would reduce the number of good fit models the most: the lower the activation value, the more discriminatory power<br />
the variable holds at that particular cutoff. Figure 8 shows that zooplankton<br />
has the most discriminatory power for cutoffs between 0.4 and 1. Nitrate comes in<br />
second, overlapping with iron for reMSE cutoffs between 0.4 and 0.6, but for<br />
higher cutoffs nitrate has a lower Median Activation Value than iron. As far as<br />
phytoplankton and detritus are concerned, they seem to behave similarly, with<br />
high activation values. Note that for cutoffs less than 0.4 no single entity<br />
seems to have greater discriminatory power. Based on this graph alone it would<br />
seem that zooplankton is the entity that yields the most information when it<br />
comes to model selection and would therefore be the entity worth measuring in the<br />
field.<br />
[Figure 8: Median Activation Values (vertical axis, 0 to 25) against reMSE cutoff (horizontal axis, 0.2 to 1.0), with one series for each of P, Z, D, N and F.]<br />
Figure 8: Median Activation Values by different cutoffs. The lower the median activation<br />
value, the more discriminatory power that entity holds at that particular cutoff.<br />
Between 0.4 and 1, Zooplankton consistently has the lowest median activation value.<br />
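The Median Activation Values at a single cutoff can be recomputed directly from Table 4. The sketch below does so for the 0.5 cutoff, using the ".5" column of the table; the function names are hypothetical, and the experiment ordering is again assumed to follow subset size and then the order P, Z, D, N, F, as in Table 4.

```python
from itertools import combinations
from statistics import median

VARIABLES = ["P", "Z", "D", "N", "F"]

# Good fit models under the 0.5 cutoff for experiments 1..31 (".5" column of Table 4).
GOOD_FITS_AT_05 = [197, 101, 366, 439, 509, 5, 61, 25, 79, 8, 1, 0, 67, 190, 128,
                   0, 0, 0, 13, 177, 15, 0, 0, 3, 39, 0, 0, 2, 5, 0, 0]


def constraint_sets(variables):
    """Non-empty subsets of the variables, in the ID order used by Table 4."""
    return [subset
            for size in range(1, len(variables) + 1)
            for subset in combinations(variables, size)]


def median_activation(variable, counts, subsets):
    """Median number of good fit models over the runs that constrained `variable`."""
    selected = [count for count, subset in zip(counts, subsets) if variable in subset]
    return median(selected)


subsets = constraint_sets(VARIABLES)
activation = {v: median_activation(v, GOOD_FITS_AT_05, subsets) for v in VARIABLES}
for variable, value in activation.items():
    print(variable, value)
```

At this cutoff zooplankton has the lowest value and nitrate and iron coincide, in line with the description of Figure 8.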
That said, Table 4 also reveals a worrisome number of data constraint combinations<br />
that yield no models with an reMSE less than one. An example is<br />
experiment 22, which yields no good fit models under the reMSE cutoff of 1. Another case<br />
that is cause for worry is the one where time-series are given for all 5 entities, for which<br />
we would expect at least one model to be selected within the 0.1 to 1 range<br />
of reMSE cutoffs. This observation suggests that there may be an underlying<br />
issue with the model selection process. Incidentally, all of the combinations<br />
that yield no models under the reMSE cutoff of 1 are experiments in which we gave<br />
Z a time-series constraint, which may say something about the processes that drive<br />
zooplankton; the library may be in need of improvement.<br />
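The observation about zooplankton can be verified from the last column of Table 4. The sketch below lists the experiments with no model under the reMSE cutoff of 1 and checks whether each of them includes a Z constraint. The names `constraint_sets` and `empty_runs` are hypothetical, and the experiment ordering is assumed to be the one used in Table 4.

```python
from itertools import combinations

VARIABLES = ["P", "Z", "D", "N", "F"]

# Models under the reMSE cutoff of 1 for experiments 1..31 (last column of Table 4).
MODELS_UNDER_1 = [336, 487, 602, 605, 1094, 22, 102, 48, 158, 18, 5, 23, 178,
                  311, 290, 7, 0, 0, 25, 312, 31, 0, 7, 8, 114, 0, 0, 6, 18, 2, 0]


def constraint_sets(variables):
    """Non-empty subsets of the variables, in the ID order used by Table 4."""
    return [subset
            for size in range(1, len(variables) + 1)
            for subset in combinations(variables, size)]


subsets = constraint_sets(VARIABLES)

# Experiments for which no model at all falls under the cutoff of 1.
empty_runs = {exp_id: subsets[exp_id - 1]
              for exp_id, count in enumerate(MODELS_UNDER_1, start=1)
              if count == 0}

for exp_id, subset in empty_runs.items():
    print(exp_id, subset)
print(all("Z" in subset for subset in empty_runs.values()))
```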
3.3 Summary<br />
This analysis of the results allows us to make the following observations:<br />
• In most cases, increasing the number of time-series constraints up to 3 seemed<br />
to reduce the number of good fit models under a 0.5 cutoff. If we consider<br />
experiments 1 through 25, there was only one case (experiment 20) for which<br />
the number of good fit models increased and six cases for which the number of<br />
good fit models went to zero (experiments 12, 16, 17, 18, 22 and 23), which can be<br />
interpreted as a deficiency in the library. Overall, this result is due to the fact<br />
that the reMSE is an average of the fit of all the state variables for which we<br />
have time-series.<br />
• The decision as to which data to collect next should take into consideration the<br />
previously acquired time-series. Recommendations may differ based on the state<br />
variables previously measured and used with HIPM in the model selection<br />
process. The reason for this is once again the way the reMSE is calculated:<br />
how well the previously included time-series fit a<br />
particular model determines whether the addition of another time-series<br />
moves that model into or out of the pool of good fit models.<br />
• For reMSE cutoffs of less than 0.4, the Median Activation Values for all 5<br />
entities blend together and are not useful. However, for reMSE cutoffs of<br />
0.4 or greater, the Median Activation Values seem to indicate that zooplankton<br />
yields the most discriminatory power, which could be due to the numerous<br />
experiments for which we obtain zero good fit models. These zeros could be<br />
the result of one of two things: either the discriminatory power of zooplankton is<br />
superior or, the most plausible answer at this time, zooplankton<br />
are not appropriately defined in the process library.<br />
That being said, a couple of elements raise questions regarding<br />
the accuracy of the model selection process or the process-entity library. These<br />
elements are:<br />
• The behavior observed in Figure 7 for time-series given to four or five of the entities,<br />
namely an average number of good fit models very close or equal to zero together with<br />
a spike in reMSE, which casts doubt on the accuracy of the selection process<br />
when it is provided with too many constraints.<br />
• The lack of good fit models, under reMSE cutoffs ranging from 0.1 to 1, for<br />
many of the experiments that included Zooplankton time-series constraints.<br />
This leads us to conjecture that when HIPM is given more than 3 data-series the<br />
system becomes overconstrained, preventing it from accurately selecting models.<br />
Another conjecture is that the entity Zooplankton as defined in the process-entity<br />
library needs to be reviewed; this element alone could be at the origin of<br />
the issue in model selection. More specifically, one of the assumptions of the<br />
system is that zooplankton feed very little, if at all, on Phaeocystis antarctica, as it<br />
is more resistant to grazing than diatoms, which are more typically grazed<br />
upon by zooplankton. The way the process library is currently set up, diatoms are<br />
not taken into consideration, which could in turn affect how well zooplankton<br />
performs when fitting models.<br />
4 ANALYTICAL ANALYSIS<br />
The quantitative analysis of HIPM’s results enabled us to make some useful observations.<br />
Since the main purpose of this software for a biologist is to approximate<br />
the natural system observed in order to use this model to perform experiments, we<br />
decided to choose two of the models with an reMSE of less than 0.5 that came up<br />
most frequently over all 31 runs.<br />
4.1 Most recurrent models<br />
The initial concept driving the models is represented in Figure 1. Phytoplankton<br />
plays a role in both Zooplankton and Detritus concentrations; it is acted upon by both<br />
Nitrate and Iron, which are in turn acted upon by Detritus. The environmental<br />
factors act on the Phytoplankton concentration as well as on Nitrate and Iron. Model<br />
A came up 13 times and Model B came up 11 times over all runs. We analyzed these<br />
models to determine whether their behavior makes sense from an ecological standpoint and whether<br />
they could give us information on how to improve the HIPM selection process. The<br />
models are composed of five differential equations, one for each of<br />
the five principal concentrations: Phytoplankton (P), Zooplankton (Z), Detritus (D),<br />
Nitrate (N) and Iron (F). All these entities are acted upon by the sets of parameters<br />
listed in Tables 5 and 6.<br />
There is also a set of exogenous variables acting on<br />
the system, defined as follows: \(E_{PUR}(t)\) is the photosynthetically usable radiation,<br />
\(E_{T_{H_2O}}(t)\) is the temperature of the water and \(E_{ice}(t)\) is the sea ice concentration.<br />
Table 5: Parameters that play a role in Model A<br />
Model A<br />
ID | Name | Value<br />
\(a_{0}\) | phyto.max growth | 0.8<br />
\(a_{1}\) | phyto.Ek max | 12.033<br />
\(a_{2}\) | phyto.PhotoInhib | 771.158<br />
\(a_{3}\) | arrigoetal1998 w photoinhibition coefficient | 13.2302<br />
\(a_{4}\) | NO3 monod lim coefficient | 0.00099718<br />
\(a_{5}\) | Fe monod lim coefficient | 0.000394882<br />
\(a_{6}\) | phyto.exude rate | 0.0228636<br />
\(a_{7}\) | NO3.toCratio | 6.6<br />
\(a_{8}\) | Fe.toCratio | 308026<br />
\(a_{9}\) | phyto.death rate | 0.0311617<br />
\(a_{10}\) | environment.beta | 0.327204<br />
\(a_{11}\) | zoo.death rate | 0.270568<br />
\(a_{12}\) | zoo.assim eff | 0.167516<br />
\(a_{13}\) | zoo.gmax | 0.403535<br />
\(a_{14}\) | grazing ivlev delta coefficient | 0.997648<br />
\(a_{15}\) | detritus.remin rate | 0.0335311<br />
\(a_{16}\) | zoo.respiration rate | 0.0103725<br />
\(a_{17}\) | phyto.sinking rate | 0.015739<br />
\(a_{18}\) | detritus.sinking rate | 0.074487<br />
\(a_{19}\) | NO3.avg deep conc | 31<br />
\(a_{20}\) | NO3 linear temp control max mixing rate | 0.729376<br />
\(a_{21}\) | Fe.avg deep conc | 0.00045<br />
\(a_{22}\) | Fe linear temp control max mixing rate | 0.00794959<br />
Model A<br />
\[ \frac{dP}{dt} = \Big[ \underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} (1 - a_6) - a_9 - a_{17} \Big] P - \Big( \underbrace{a_{13}\left(1 - e^{-a_{14}P}\right)}_{\text{Z Grazing Rate}} \Big) Z \tag{1} \]<br />
\[ \frac{dZ}{dt} = \Big[ a_{12}\,\underbrace{a_{13}\left(1 - e^{-a_{14}P}\right)}_{\text{Z Grazing Rate}} - a_{11} - a_{16} \Big] Z \tag{2} \]<br />
\[ \frac{dD}{dt} = \big( (1 - a_{10})a_{11}P \big) + \big( (1 - a_{10})a_{11}Z \big) + \Big( (1 - a_{10})(1 - a_{12})\,\underbrace{a_{13}\left(1 - e^{-a_{14}P}\right)}_{\text{Z Grazing Rate}} Z \Big) - D(a_{15} + a_{18}) \tag{3} \]<br />
\[ \frac{dN}{dt} = \Big[ (a_{19} - N)\,\underbrace{a_{20}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}_{\text{N Mixing Rate}} \Big] - \Big[ \frac{P}{(a_7 \cdot 12.0107)}\,\underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} \Big] + \frac{a_{15}D}{(a_7 \cdot 12.0107)} \tag{4} \]<br />
\[ \frac{dF}{dt} = \Big[ (a_{21} - F)\,\underbrace{a_{22}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}_{\text{F Mixing Rate}} \Big] - \Big[ \frac{P}{(a_8 \cdot 12.0107)}\,\underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} \Big] + \frac{a_{15}D}{(a_8 \cdot 12.0107)} \tag{5} \]<br />
Where,<br />
\[ M(t) = \underbrace{\min\left\{ \frac{F}{(F + a_5)},\ \frac{N}{(N + a_4)},\ \left(e^{-E_{PUR}(t)/a_2}\right)\left(1 - e^{-E_{PUR}(t)\left(1 + a_3 e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\right)/a_1}\right) \right\}}_{\text{Phytoplankton Growth Limitation}} \]<br />
Table 6: Parameters that play a role in Model B<br />
Model B<br />
ID | Name | Value<br />
\(a_{0}\) | phyto.max growth | 0.561196<br />
\(a_{1}\) | phyto.Ek max | 37.5096<br />
\(a_{2}\) | phyto.PhotoInhib | 394.809<br />
\(a_{3}\) | arrigoetal1998 w photoinhibition coefficient | 10.7433<br />
\(a_{4}\) | nut lim exp coefficient | 0.784127<br />
\(a_{5}\) | monod lim coefficient | 0.000722964<br />
\(a_{6}\) | phyto.exude rate | 0.168121<br />
\(a_{7}\) | NO3.toCratio | 6.6<br />
\(a_{8}\) | Fe.toCratio | 335345<br />
\(a_{9}\) | phyto.death rate | 0.0293637<br />
\(a_{10}\) | environment.beta | 0.473748<br />
\(a_{11}\) | zoo.death rate | 0.00199206<br />
\(a_{12}\) | zoo.assim eff | 0.307847<br />
\(a_{13}\) | zoo.gmax | 0.350046<br />
\(a_{14}\) | zoo.gcap | 288.23<br />
\(a_{15}\) | zoo.glim | 19.0002<br />
\(a_{16}\) | phyto.biomin | 0.0201679<br />
\(a_{17}\) | detritus.remin rate | 0.03<br />
\(a_{18}\) | zoo.respiration rate | 0.0234653<br />
\(a_{19}\) | phyto.sinking rate | 0.00273829<br />
\(a_{20}\) | detritus.sinking rate | 0.00390565<br />
\(a_{21}\) | NO3.avg deep conc | 31<br />
\(a_{22}\) | NO3 linear temp control max mixing rate | 0.00307192<br />
\(a_{23}\) | Fe.avg deep conc | 0.00045<br />
\(a_{24}\) | Fe linear temp control max mixing rate | 0.00444523<br />
Model B<br />
\[ \frac{dP}{dt} = \Big[ \underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}}\, P (1 - a_6) \Big] - (a_9 P) - H(t)Z - (a_{19}P) \tag{6} \]<br />
\[ \frac{dZ}{dt} = \big( a_{12}H(t)Z \big) - a_{11}Z^2 - a_{18}Z \tag{7} \]<br />
\[ \frac{dD}{dt} = \big( (1 - a_{10})a_9 P \big) + \big( (1 - a_{10})a_{11}Z^2 \big) + \big( (1 - a_{10})(1 - a_{12})H(t)Z \big) - D(a_{17} + a_{20}) \tag{8} \]<br />
\[ \frac{dN}{dt} = \Big[ \frac{a_{17}D}{(a_7 \cdot 12.0107)} \Big] + \Big[ (a_{21} - N)\,\underbrace{a_{22}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}_{\text{N Mixing Rate}} \Big] - \Big[ \frac{P}{(a_7 \cdot 12.0107)}\,\underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} \Big] \tag{9} \]<br />
\[ \frac{dF}{dt} = \Big[ \frac{D a_{17}}{(a_8 \cdot 12.0107)} \Big] + \Big[ (a_{23} - F)\,\underbrace{a_{24}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}_{\text{F Mixing Rate}} \Big] - \Big[ \frac{P}{(a_8 \cdot 12.0107)}\,\underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} \Big] \tag{10} \]<br />
Where,<br />
\[ M(t) = \underbrace{\min\left\{ \frac{F}{(F + a_5)},\ \left(1 - e^{-a_5 N}\right),\ \left(e^{-E_{PUR}(t)/a_2}\right)\left(1 - e^{-E_{PUR}(t)\left(1 + a_3 e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\right)/a_1}\right) \right\}}_{\text{Phytoplankton Growth Limitation}} \]<br />
\[ H(t) = \underbrace{\max\left\{ 0,\ \frac{a_{13}(P - a_{16} - a_{15})}{a_{14} + (P - a_{16} - a_{15})} \right\}}_{\text{Zooplankton Grazing Rate}} \]<br />
4.2 Preliminaries<br />
Both Models A and B have complex structures that differ from the more theoretical<br />
models studied by mathematicians. Since solving these differential equations directly<br />
is extremely difficult, we decided to take a more indirect approach by looking at<br />
bounds on the solutions, using the positivity lemma and comparison argument which<br />
follow.<br />
Lemma 1 A Positivity Lemma. Let \(W(t)\) be a smooth function over a domain<br />
\([0, T]\), \(T \in \mathbb{R}\). If \(W\) satisfies \(W'(t) + M(t)W(t) \ge 0\) in \((0, T]\) and \(W(0) \ge 0\), where<br />
\(M(t)\) is a bounded function in \([0, T]\), then \(W(t) \ge 0\) on \([0, T]\).<br />
Proof: We prove this lemma by contradiction. Assume that the statement \(W(t) \ge 0\)<br />
in \([0, T]\) were not true; then there would exist a point \(t_0 \in [0, T]\) such that \(W(t_0)\) is<br />
a negative minimum of \(W\) on \([0, T]\). Since \(W(0) \ge 0\), we have \(t_0 \in (0, T]\), which means<br />
that<br />
\[ W'(t_0) + M(t_0)W(t_0) \ge 0. \]<br />
Since \(W\) reaches its minimum value at \(t_0\), we have \(W'(t_0) = 0\) if \(t_0 \ne T\) and<br />
\(W'(t_0) \le 0\) if \(t_0 = T\). This ensures that<br />
\[ M(t_0)W(t_0) \ge 0 \]<br />
which contradicts our assumption that \(W(t_0) < 0\) when \(M(t_0) > 0\).<br />
For the case of \(M(t_0) \le 0\), we let \(V(t) = e^{-\gamma t}W(t)\) for some constant \(\gamma\) with<br />
\(\gamma > -M(t)\) in \((0, T]\); then \(V\) will satisfy the relation \(V'(t) + (\gamma + M)V(t) \ge 0\) in<br />
\((0, T]\) and \(V(0) \ge 0\), where \(\gamma + M(t) > 0\) for all \(t \in (0, T]\). From the above arguments<br />
we have \(V(t) \ge 0\) in \([0, T]\). It follows from \(W(t) = e^{\gamma t}V(t)\) that \(W(t) \ge 0\) on \([0, T]\).<br />
□<br />
As an application of Lemma 1, we have the following comparison argument for<br />
the respective solutions \(u_1\) and \(u_2\) of the initial-value problem<br />
\[ u_i' = f_i(t, u_i) \ \text{in } (0, T], \qquad u_i(0) = u_{i,0}, \tag{11} \]<br />
where \(i = 1, 2\), and \(f_1\) and \(f_2\) are continuous functions in \([0, T] \times \mathbb{R}\).<br />
Lemma 2 The Comparison Argument.<br />
Assume that both \(\frac{\partial f_1}{\partial u}\) and \(\frac{\partial f_2}{\partial u}\) are continuous in \([0, T] \times \mathbb{R}\). If \(f_1(t, u) \le f_2(t, u)\)<br />
in \((0, T] \times \mathbb{R}\) and \(u_{1,0} \le u_{2,0}\), then the respective solutions \(u_1\) and \(u_2\) of (11) satisfy<br />
\(u_1(t) \le u_2(t)\) on \([0, T]\).<br />
Proof: Let \(W = u_2 - u_1\), and let \(M = M(t)\) be any bounded function in \([0, T]\).<br />
Then by (11), \(W\) satisfies<br />
\[ W'(t) + M(t)W(t) = M(t)[u_2(t) - u_1(t)] + f_2(t, u_2(t)) - f_1(t, u_1(t)) \ \text{in } (0, T], \qquad W(0) = u_{2,0} - u_{1,0} \ge 0. \]<br />
Since \(\frac{\partial f_1}{\partial u}\) is continuous in \(u\), by the mean value theorem [2],<br />
\[ f_2(t, u_2) - f_1(t, u_1) = [f_2(t, u_2) - f_1(t, u_2)] + [f_1(t, u_2) - f_1(t, u_1)] \ge \frac{\partial f_1}{\partial u}(t, \hat\eta)(u_2 - u_1) \]<br />
where \(\hat\eta = \hat\eta(t)\) is an intermediate value between \(u_1\) and \(u_2\). Hence, for the bounded<br />
function \(M(t) = -\frac{\partial f_1}{\partial u}(t, \hat\eta(t))\), \(W\) satisfies \(W'(t) + M(t)W(t) \ge 0\) in \((0, T]\). It is<br />
known from Lemma 1 that \(W \ge 0\), i.e. \(u_2(t) \ge u_1(t)\) on \([0, T]\). This proves Lemma 2.<br />
□
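<br />
The comparison argument can be illustrated numerically. The sketch below is not part of the thesis: the right-hand sides `f1` and `f2`, the initial values and the explicit Euler integrator `euler_path` are all hypothetical choices, picked only so that \(f_1 \le f_2\) everywhere and \(u_{1,0} \le u_{2,0}\).<br />

```python
def euler_path(f, u0, t_end, steps=4000):
    """Explicit Euler approximation of u' = f(t, u), u(0) = u0, returned as a list."""
    h = t_end / steps
    t, u = 0.0, u0
    path = [u]
    for _ in range(steps):
        u = u + h * f(t, u)
        t += h
        path.append(u)
    return path


# Illustrative right-hand sides (assumptions, not from the thesis):
# f1(t, u) <= f2(t, u) for every u because u*u >= 0.
def f1(t, u):
    return -u - u * u


def f2(t, u):
    return -u


u1 = euler_path(f1, 1.0, 5.0)
u2 = euler_path(f2, 1.0, 5.0)

# Lemma 2 predicts u1(t) <= u2(t) on the whole interval.
ordered = all(a <= b for a, b in zip(u1, u2))
print(ordered)
```

A numerical check of this kind does not replace the proof; it only shows the ordering that the lemma guarantees for one pair of equations.<br />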
In addition to these two lemmas, let us introduce a method for solving a first-order<br />
linear differential equation.<br />
Proposition 1 Suppose \(u\) is a function that satisfies<br />
\[ \frac{du}{dt} = \alpha u + \beta, \qquad u(0) = u_0, \]<br />
for constants \(\alpha \ne 0\) and \(\beta\). Then<br />
\[ u(t) = \left(u_0 + \frac{\beta}{\alpha}\right)e^{\alpha t} - \frac{\beta}{\alpha}. \]<br />
Proof: \(\frac{du}{dt} = \alpha u + \beta\) is a first-order linear differential equation and can be solved<br />
using the method of integrating factors:<br />
\[ \left(e^{-\alpha t}u\right)' = \beta e^{-\alpha t}, \qquad e^{-\alpha t}u = u_0 + \int_0^t \beta e^{-\alpha s}\,ds, \qquad u = -\frac{\beta}{\alpha} + Ce^{\alpha t}. \]<br />
Using the initial condition \(u(0) = u_0\) we find \(C = u_0 + \frac{\beta}{\alpha}\) and the following solution:<br />
\[ u = \left(u_0 + \frac{\beta}{\alpha}\right)e^{\alpha t} - \frac{\beta}{\alpha} \]<br />
Hence Proposition 1 is proved. □
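<br />
As a sanity check on Proposition 1, the closed form can be compared with a direct numerical integration. This is a minimal sketch: the helper names `closed_form` and `rk4`, and the values of `u0`, `alpha` and `beta`, are illustrative assumptions and do not come from the models.<br />

```python
import math


def closed_form(u0, alpha, beta, t):
    """Proposition 1: solution of u' = alpha*u + beta, u(0) = u0, for alpha != 0."""
    return (u0 + beta / alpha) * math.exp(alpha * t) - beta / alpha


def rk4(u0, alpha, beta, t_end, steps=1000):
    """Integrate u' = alpha*u + beta with the classical fourth-order Runge-Kutta scheme."""
    def f(u):
        return alpha * u + beta

    h = t_end / steps
    u = u0
    for _ in range(steps):
        k1 = f(u)
        k2 = f(u + 0.5 * h * k1)
        k3 = f(u + 0.5 * h * k2)
        k4 = f(u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return u


# Illustrative values only (not taken from the thesis).
u0, alpha, beta, t_end = 2.0, -0.5, 1.5, 4.0
exact = closed_form(u0, alpha, beta, t_end)
numeric = rk4(u0, alpha, beta, t_end)
print(exact, numeric)
```

The two printed numbers should agree to many digits; the same closed form is what the bounds in the following sections rely on.<br />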
4.3 Model A<br />
Our analysis of Model A begins with two entities, iron and nitrate, whose equations have very similar structure<br />
and differ only in variables and parameters.<br />
\[ \frac{dN}{dt} = \Big[ (a_{19} - N)\,\overbrace{a_{20}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}^{\le a_{20}} \Big] - \Big[ \frac{P}{(a_7 \cdot 12.0107)} \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t) \Big] + \frac{a_{15}D}{(a_7 \cdot 12.0107)} \]<br />
We opt for a very wide upper bound, trying to keep the bounds simple<br />
yet still informative. Thus, for the upper bound we decided to drop the<br />
subtracted term.<br />
\[ \frac{dN}{dt} \le (a_{19} - N)a_{20} = a_{19}a_{20} - Na_{20} \]<br />
Then, solving for \(N_u(t)\),<br />
\[ \frac{dN_u}{dt} + N_u a_{20} = a_{19}a_{20} \]<br />
Using the integrating factor \(e^{a_{20}t}\) and Proposition 1 we get,<br />
\[ N_u(t) = a_{19} + R_u e^{-a_{20}t} \]<br />
where \(R_u = N_0 - a_{19}\). As far as the lower bound is concerned, we chose it to be zero.<br />
Thus, summarizing the bounds, we get<br />
\[ 0 \le N(t) \le a_{19} + R_u e^{-a_{20}t} \tag{12} \]<br />
When \(t \to \infty\) we get the following,<br />
\[ 0 \le N(t) \le a_{19} \tag{13} \]<br />
This result tells us that the Nitrate concentration in this model will not exceed<br />
the value of parameter 19, which is the Nitrate average deep concentration. (12) also<br />
tells us that the maximum rate of decline of Nitrate will be that of parameter 20,<br />
which represents the Nitrate maximum mixing rate. This means that the accuracy<br />
of the Nitrate concentration is extremely dependent on how well the parameters are<br />
selected. Since Iron has the same equation structure, the same analysis applies:<br />
\[ \frac{dF}{dt} = \Big[ (a_{21} - F)\,\overbrace{a_{22}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}^{\le a_{22}} \Big] - \Big[ \frac{P}{(a_8 \cdot 12.0107)} \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t) \Big] + \frac{a_{15}D}{(a_8 \cdot 12.0107)} \]<br />
Using the same method as for Nitrate, the upper bound for Iron is<br />
\[ F_u(t) = a_{21} + Q_u e^{-a_{22}t} \]<br />
where \(Q_u = F_0 - a_{21}\).<br />
Summarizing the bounds we get,<br />
\[ 0 \le F(t) \le a_{21} + Q_u e^{-a_{22}t} \tag{14} \]<br />
When \(t \to \infty\) this becomes,<br />
\[ \therefore\ 0 \le F(t) \le a_{21} \tag{15} \]<br />
As for Nitrate, the Iron concentration will not exceed a maximum set by parameter<br />
21, which is the Iron average deep concentration. Similarly, the maximum rate<br />
of decline for Iron is set by its maximum mixing rate. The behavior of the Iron and Nitrate<br />
concentrations is thus dependent on the accuracy of the parameter selection<br />
process.<br />
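The envelopes in (12) and (14) are easy to tabulate. The sketch below uses the Table 5 values for the deep concentrations and maximum mixing rates; the initial concentrations `N0` and `F0` and the function name `upper_envelope` are hypothetical and chosen only for illustration.<br />

```python
import math


def upper_envelope(x0, deep_conc, mix_rate, t):
    """Upper bound x_u(t) = deep_conc + (x0 - deep_conc) * exp(-mix_rate * t), as in (12) and (14)."""
    return deep_conc + (x0 - deep_conc) * math.exp(-mix_rate * t)


# Table 5 values for Model A.
a19, a20 = 31.0, 0.729376        # NO3.avg deep conc, NO3 max mixing rate
a21, a22 = 0.00045, 0.00794959   # Fe.avg deep conc, Fe max mixing rate

# Hypothetical initial concentrations (assumptions, not from the thesis).
N0, F0 = 20.0, 0.0001

for t in (0, 1, 10, 100, 1000):
    print(t, upper_envelope(N0, a19, a20, t), upper_envelope(F0, a21, a22, t))
```

With these rates the Nitrate envelope relaxes to \(a_{19}\) within a few time units, while the Iron envelope approaches \(a_{21}\) much more slowly because \(a_{22}\) is two orders of magnitude smaller.<br />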
The next entity in our analysis is zooplankton, as it has a fairly simple equation<br />
structure. By the Comparison Argument in Lemma 2 and since \(\left(1 - e^{-a_{14}P}\right) \le 1\),<br />
we can write,<br />
\[ \frac{dZ}{dt} = \left[ a_{12}a_{13}\left(1 - e^{-a_{14}P}\right) - a_{11} - a_{16} \right] Z \le \left[ a_{12}a_{13} - a_{11} - a_{16} \right] Z \]<br />
Thus, by Proposition 1,<br />
\[ Z(t) \le Z_0 e^{(a_{12}a_{13} - a_{11} - a_{16})t} = Z_0 e^{-0.2166944309\,t} \]<br />
We know that by definition the Zooplankton concentration is non-negative, which gives<br />
us a lower bound of zero. Then,<br />
\[ 0 \le Z(t) \le Z_0 e^{-\delta t} \quad \text{where } \delta = 0.2166944309 \tag{16} \]<br />
We notice that as \(t \to \infty\), \(Z(t)\) goes to zero, which implies that the Zooplankton<br />
population is driven to extinction:<br />
\[ \lim_{t \to +\infty} Z(t) = 0 \tag{17} \]<br />
Since for a biologist this result goes against<br />
expectations, the validity of this model structure is questioned.<br />
For the Zooplankton not to go to zero as t goes to infinity, \((a_{12}a_{13} - a_{11} - a_{16})\)<br />
would have to be greater than zero. This may be a clue for refining the constraints on<br />
the parameter selection process, so that this quantity is strictly positive, ensuring that the zooplankton<br />
concentration is not driven to zero by this model structure.<br />
This result is then used to further our analysis by looking at the Phytoplankton<br />
equation (1), as knowing P(t) will help us find bounds for the other entities. The<br />
phytoplankton differential equation, like those of Nitrate and Iron, contains a<br />
minimum function M(t), which is not often found in differential equations. In order to find<br />
bounds for P(t) we must first find bounds for M(t). Recall,<br />
\[ M(t) = \min\left\{ \frac{F}{(F + a_5)},\ \frac{N}{(N + a_4)},\ \left(e^{-E_{PUR}(t)/a_2}\right)\left(1 - e^{-E_{PUR}(t)\left(1 + a_3 e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\right)/a_1}\right) \right\} \]<br />
Since M(t) is a minimum function, it always picks the smallest of the three<br />
functions stated above; thus, using (15), we can safely estimate the range of M(t) to<br />
be:<br />
\[ 0 \le M(t) \le \frac{F_{upperbound}}{F_{upperbound} + a_5} = \frac{a_{21}}{a_{21} + a_5} = 0.53262. \tag{18} \]<br />
To find an upper bound for P(t), we use the lower bound of (16): since the<br />
Z(t) term is subtracted, we replace Z by its smallest value (zero). By Lemma 2 we have,<br />
\[ \frac{dP}{dt} = \Big[ \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)(1 - a_6) - a_9 - a_{17} \Big] P - \Big( a_{13}\left(1 - e^{-a_{14}P}\right) Z \Big) \le \Big[ \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)(1 - a_6) - a_9 - a_{17} \Big] P \]<br />
Using the exogenous time-series at our disposal we find,<br />
\[ 0.01406716 \le \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] \le 0.8639959 \tag{19} \]<br />
Using (18) and (19),<br />
\[ \frac{dP}{dt} \le \underbrace{\left[ (0.8639959)(0.53262)(1 - a_6) - (a_9 + a_{17}) \right]}_{\alpha_u} P \le (0.402759)\,P \]<br />
We may rewrite this as follows,<br />
\[ \frac{dP}{dt} \le \alpha_u P \quad \text{where } \alpha_u = 0.402759 \]<br />
For the lower bound of P(t), using the upper bound of (16), (18) and Lemma 2, we<br />
get a manageable lower bound,<br />
\[ \frac{dP}{dt} \ge -\underbrace{(a_9 + a_{17})}_{\alpha_l} P - \underbrace{a_{13}\left(1 - e^{-a_{14}P}\right)}_{\le a_{13}}\ \underbrace{Z_0 e^{-\delta t}}_{(16)} \ge -\alpha_l P - a_{13}Z_0 e^{-\delta t} \quad \text{where } \alpha_l = 0.0469007 \]<br />
By the integrating-factor method of Proposition 1 we get,<br />
\[ P(t) \ge \left( P_0 + \frac{a_{13}}{\alpha_l - \delta} Z_0 \right) e^{-\alpha_l t} - \frac{a_{13}}{\alpha_l - \delta} Z_0 e^{-\delta t} \]<br />
Summarizing the bounds for P(t) we get,<br />
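\[ \therefore\ \left( P_0 + \frac{a_{13}}{\alpha_l - \delta} Z_0 \right) e^{-\alpha_l t} - \frac{a_{13}}{\alpha_l - \delta} Z_0 e^{-\delta t} \le P(t) \le P_0 e^{\alpha_u t} \tag{20} \]<br />
\[ P(t) > 0 \ \text{for } t \in (0, +\infty) \]<br />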
From a biological standpoint, \(\alpha_l\) is the maximum rate of decline and \(\alpha_u\) is the maximum<br />
rate of growth. These bounds give us little information about the model, as<br />
they simply state that the Phytoplankton concentration is contained between zero and<br />
infinity.<br />
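The two rates can be reproduced from Table 5 together with the bounds (18) and (19). This is a minimal arithmetic sketch; the variable names are illustrative, and `growth_max` is simply the upper bound quoted in (19).<br />

```python
# Rounded Table 5 values for Model A.
a5, a6, a9, a17, a21 = 0.000394882, 0.0228636, 0.0311617, 0.015739, 0.00045

growth_max = 0.8639959        # upper bound of the temperature/ice growth factor, from (19)
m_max = a21 / (a21 + a5)      # upper bound of M(t), from (18)

# Maximum net growth rate and maximum rate of decline of phytoplankton.
alpha_u = growth_max * m_max * (1 - a6) - (a9 + a17)
alpha_l = a9 + a17

print(m_max, alpha_u, alpha_l)
```

Both values agree with the figures quoted above up to rounding of the table entries.<br />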
Next we look at Detritus:<br />
\[ \frac{dD}{dt} = \big( (1 - a_{10})a_{11}P \big) + \Big( a_{11} + (1 - a_{12})\,\overbrace{a_{13}\left(1 - e^{-a_{14}P}\right)}^{\le a_{13}} \Big)(1 - a_{10})Z - D(a_{15} + a_{18}) \]<br />
\[ \le \big( (1 - a_{10})a_{11}P \big) + \big( a_{11} + (1 - a_{12})a_{13} \big)(1 - a_{10})Z - D(a_{15} + a_{18}) \]<br />
\[ \le (1 - a_{10})a_{11}\,\underbrace{P_0 e^{\alpha_u t}}_{(20)} + \big( a_{11} + (1 - a_{12})a_{13} \big)(1 - a_{10})\,\underbrace{Z_0 e^{-\delta t}}_{(16)} - D(a_{15} + a_{18}) \]<br />
Using (16) and (20) and simplifying a bit, we get a more manageable upper bound.<br />
Solving for the upper bound \(D_u(t)\),<br />
\[ \frac{dD_u}{dt} + (a_{15} + a_{18})D_u = (1 - a_{10})a_{11}P_0 e^{\alpha_u t} + \big( a_{11} + (1 - a_{12})a_{13} \big)(1 - a_{10})Z_0 e^{-\delta t} \]<br />
Using the integrating factor \(e^{(a_{15} + a_{18})t}\) and Proposition 1 we get,<br />
\[ D_u(t) = \underbrace{\left( \frac{(1 - a_{10})a_{11}}{\alpha_u + a_{15} + a_{18}} \right)}_{\beta_{u1}} P_0 e^{\alpha_u t} + \underbrace{\left( \frac{(a_{11} + (1 - a_{12})a_{13})(1 - a_{10})}{-\delta + a_{15} + a_{18}} \right)}_{\beta_{u2}} Z_0 e^{-\delta t} + C_u e^{-(a_{15} + a_{18})t} \]<br />
Let’s rewrite to simplify the expression a bit,<br />
\[ D_u(t) = \beta_{u1}P_0 e^{\alpha_u t} + \beta_{u2}Z_0 e^{-\delta t} + C_u e^{-(a_{15} + a_{18})t}, \]<br />
where \(\beta_{u1} = 0.224039\) and \(\beta_{u2} = -3.754762\).<br />
Assuming \(D(0) = D_0\) and still following Proposition 1 we solve for \(C_u\),<br />
\[ C_u = D_0 - \beta_{u1}P_0 - \beta_{u2}Z_0 \]<br />
Similarly for the lower bound, using Proposition 1, (16) and (20) we get,<br />
\[ D_l(t) = \underbrace{\left( \frac{(1 - a_{10})a_{11}}{-\alpha_l + a_{15} + a_{18}} \right)}_{\beta_l} P_0 e^{-\alpha_l t} + C_l e^{-(a_{15} + a_{18})t} \]<br />
Let’s rewrite it as,<br />
\[ D_l(t) = \beta_l P_0 e^{-\alpha_l t} + C_l e^{-(a_{15} + a_{18})t}, \]<br />
where \(\beta_l = 2.9784819\) and \(C_l = D_0 - \beta_l P_0\).<br />
Summarizing the bounds,<br />
\[ \beta_l P_0 e^{-\alpha_l t} + C_l e^{-(a_{15} + a_{18})t} \le D(t) \le \beta_{u1}P_0 e^{\alpha_u t} + \beta_{u2}Z_0 e^{-\delta t} + C_u e^{-(a_{15} + a_{18})t} \tag{21} \]<br />
This concludes our analysis of Model A; the results will be discussed further on.<br />
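The Detritus bounds all have the form of a linear equation \(y' + ky = Ae^{\alpha t} + Be^{-\delta t}\), whose integrating-factor solution has the same three-term shape as \(D_u(t)\). The sketch below checks that shape numerically; the function name `forced_linear_solution` and all coefficient values are illustrative assumptions, not the thesis's parameters.<br />

```python
import math


def forced_linear_solution(k, A, alpha, B, delta, y0, t):
    """Solution of y' + k*y = A*exp(alpha*t) + B*exp(-delta*t) with y(0) = y0.

    Assumes alpha + k != 0 and k - delta != 0.
    """
    b1 = A / (alpha + k)      # plays the role of beta_u1 * P0
    b2 = B / (k - delta)      # plays the role of beta_u2 * Z0
    c = y0 - b1 - b2          # plays the role of C_u
    return b1 * math.exp(alpha * t) + b2 * math.exp(-delta * t) + c * math.exp(-k * t)


# Illustrative coefficients (assumptions, not taken from the tables).
k, A, alpha, B, delta, y0 = 0.108, 0.18, 0.4, 0.41, 0.2167, 1.0

# Check the ODE at one time point with a central finite difference.
t, h = 1.3, 1e-5
dy = (forced_linear_solution(k, A, alpha, B, delta, y0, t + h)
      - forced_linear_solution(k, A, alpha, B, delta, y0, t - h)) / (2 * h)
y = forced_linear_solution(k, A, alpha, B, delta, y0, t)
residual = dy + k * y - (A * math.exp(alpha * t) + B * math.exp(-delta * t))
print(residual)
```

A residual close to zero confirms that the three-term expression solves the forced equation and matches the initial condition.<br />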
4.4 Model B<br />
Shifting our focus to Model B we find different model structures. Indeed, (7) yields<br />
a Lotka-Volterra structure, which will make for an interesting analysis.<br />
Following the procedure used for Model A, we start our analysis with the Zooplankton<br />
equation (7), the simplest of all five. To find bounds for Z(t) we first need<br />
to find those of H(t).<br />
\[ H(t) = \max\left\{ 0,\ \frac{a_{13}(P - a_{16} - a_{15})}{a_{14} + (P - a_{16} - a_{15})} \right\}, \qquad \frac{a_{13}(P - a_{16} - a_{15})}{a_{14} + (P - a_{16} - a_{15})} \le a_{13} \]<br />
Thus we get,<br />
\[ 0 \le H(t) < a_{13} \tag{22} \]<br />
Knowing (22) we can conclude,<br />
\[ \frac{dZ}{dt} = \big( a_{12}H(t)Z \big) - a_{11}Z^2 - a_{18}Z \le a_{12}a_{13}Z - a_{11}Z^2 - a_{18}Z = \underbrace{Z\left[ a_{12}a_{13} - a_{18} - a_{11}Z \right]}_{\text{Logistic Equation}} \]<br />
Setting \(a_{12}a_{13} - a_{18} - a_{11}Z = 0\), we can find the carrying capacity K.<br />
\[ K = \frac{a_{12}a_{13} - a_{18}}{a_{11}} = 42.3156486 \]<br />
Thus, an upper bound for Z(t) will be,<br />
\[ \lim_{t \to +\infty} Z(t) \le K = 42.3156486 \]<br />
This is significant, since the Zooplankton concentration will have a maximum of K,<br />
which is closer to the type of behavior an ecologist would expect to see in Zooplankton<br />
concentrations. The lower bound of this entity will be zero, by (22) and because we know<br />
from biology that the Zooplankton concentration cannot be negative. Hence,<br />
\[ \therefore\ 0 \le Z(t) \le 42.3156486 \tag{23} \]<br />
However, if \(P(0) \le a_{16} + a_{15}\) then Z(t) would go to zero as t goes to infinity,<br />
because H(t) = 0, hence changing the structure of the equation and driving the<br />
population to extinction.<br />
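The grazing term and the carrying capacity can be evaluated directly from Table 6. This is a sketch only: the function name `grazing_rate` is hypothetical, it assumes \(P \ge 0\) so that the denominator stays positive, and the sample phytoplankton values are arbitrary.<br />

```python
def grazing_rate(P, gmax, gcap, glim, biomin):
    """H = max(0, gmax*(P - biomin - glim) / (gcap + (P - biomin - glim))), for P >= 0."""
    x = P - biomin - glim
    return max(0.0, gmax * x / (gcap + x))


# Rounded Table 6 values for Model B.
a11, a12, a13 = 0.00199206, 0.307847, 0.350046   # zoo.death rate, zoo.assim eff, zoo.gmax
a14, a15, a16 = 288.23, 19.0002, 0.0201679       # zoo.gcap, zoo.glim, phyto.biomin
a18 = 0.0234653                                  # zoo.respiration rate

# Carrying capacity of the logistic upper bound on Z(t).
K = (a12 * a13 - a18) / a11

# H stays in [0, a13) and vanishes while P is below the threshold a16 + a15.
samples = [0.0, 10.0, 19.0, 25.0, 100.0, 1e6]    # arbitrary illustrative values of P
rates = [grazing_rate(P, a13, a14, a15, a16) for P in samples]
print(K, rates)
```

The zero entries in `rates` correspond to the extinction scenario described above, where phytoplankton stays below the grazing threshold.<br />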
In order to proceed to the analysis of P(t) we must first find the bounds for M(t).<br />
\[ M(t) = \min\left\{ \frac{F}{(F + a_5)},\ \left(1 - e^{-a_5 N}\right),\ \left(e^{-E_{PUR}(t)/a_2}\right)\left(1 - e^{-E_{PUR}(t)\left(1 + a_3 e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\right)/a_1}\right) \right\} \]<br />
Since M(t) is a minimum function its bounds are,<br />
\[ 0 \le M(t) \le 1. \tag{24} \]<br />
We are now able to find bounds for P(t),<br />
\[ \frac{dP}{dt} = \Big[ \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)\,P(1 - a_6) \Big] - (a_9 P) - H(t)Z - (a_{19}P) \]<br />
Using the exogenous variables time-series we estimate:<br />
\[ 0.009868042 \le \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] \le 0.6060888 \tag{25} \]<br />
Using (24), (25) and dropping the subtracted elements we find the upper bound to<br />
be,<br />
\[ \frac{dP}{dt} \le \underbrace{\left[ (0.6060888)(1)(1 - a_6) - (a_9 + a_{19}) \right]}_{\alpha_u} P \]<br />
We may rewrite this as follows,<br />
\[ \frac{dP}{dt} \le \alpha_u P \quad \text{where } \alpha_u = 0.4720906 \]<br />
For the lower bound, by (24) and (23):<br />
\[ \frac{dP}{dt} \ge -\underbrace{(a_9 + a_{19})}_{\alpha_l} P - a_{13}K \]<br />
We may rewrite this as follows,<br />
\[ \frac{dP}{dt} \ge -\alpha_l P - K \quad \text{where } \alpha_l = 0.032101 \]<br />
Using Proposition 1 we get,<br />
\[ \therefore\ \left( P_0 + \frac{K}{\alpha_l} \right) e^{-\alpha_l t} - \frac{K}{\alpha_l} \le P(t) \le P_0 e^{\alpha_u t} \tag{26} \]<br />
\[ P(t) > 0 \ \text{for } t \in (0, +\infty) \]<br />
From a biological standpoint, \(\alpha_l\) is the maximum rate of decline and \(\alpha_u\) is the maximum<br />
rate of growth. These bounds give us little information about the model, as they state<br />
only that the Phytoplankton concentration is contained between zero and infinity. We continue<br />
our analysis with Detritus:<br />
\[ \frac{dD}{dt} = \big( (1 - a_{10})a_9 P \big) + \big( (1 - a_{10})a_{11}Z^2 \big) + \big( (1 - a_{10})(1 - a_{12})H(t)Z \big) - D(a_{17} + a_{20}) \]<br />
Using (22), (23) and (26) we get,<br />
\[ \frac{dD}{dt} \le \big( (1 - a_{10})a_9 P_0 e^{\alpha_u t} \big) + \big( (1 - a_{10})a_{11}K^2 \big) + \big( (1 - a_{10})(1 - a_{12})a_{13}K \big) - D(a_{17} + a_{20}) \]<br />
Then solving for the upper bound,<br />
\[ \frac{dD_u}{dt} + D_u(a_{17} + a_{20}) = \big( (1 - a_{10})a_9 P_0 e^{\alpha_u t} \big) + \big( a_{11}K + (1 - a_{12})a_{13} \big)(1 - a_{10})K \]<br />
Using Proposition 1,<br />
\[ D_u(t) = \underbrace{\frac{\big( a_{11}K + (1 - a_{12})a_{13} \big)(1 - a_{10})K}{(a_{17} + a_{20})}}_{\beta_{u1}} + \underbrace{\frac{(1 - a_{10})a_9}{\alpha_u + a_{17} + a_{20}}}_{\beta_{u2}} P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2}P_0 \big) e^{-(a_{17} + a_{20})t} \]<br />
\[ \lim_{t \to +\infty} D_u(t) = \infty \]<br />
where \(\beta_{u1} = 220.00119\) and \(\beta_{u2} = 0.03054\).<br />
Then, finding a lower bound,<br />
\[ \frac{dD_l}{dt} + D_l(a_{17} + a_{20}) = 0 \]<br />
\[ D_l(t) = D_0 e^{-(a_{17} + a_{20})t}, \qquad \lim_{t \to +\infty} D_l(t) = 0 \]<br />
Thus,<br />
\[ D_0 e^{-(a_{17} + a_{20})t} \le D(t) \le \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2}P_0 \big) e^{-(a_{17} + a_{20})t} \tag{27} \]<br />
These bounds (27) show a maximum rate of decline driven by parameters 17 and<br />
20. As in Model A, the equation structures for Nitrate and Iron are very<br />
similar, differing only in parameters and variables.<br />
\[ \frac{dN}{dt} = \Big[ \frac{a_{17}D}{(a_7 \cdot 12.0107)} \Big] + \Big[ (a_{21} - N)\,\overbrace{a_{22}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}^{\le a_{22}} \Big] - \Big[ \frac{P}{(a_7 \cdot 12.0107)} \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t) \Big] \]<br />
Using (26) and (27) we find an upper bound; the P term is dropped as its lower<br />
bound is zero. Thus,<br />
\[ \frac{dN}{dt} \le (a_{21} - N)a_{22} + \frac{a_{17}}{(a_7 \cdot 12.0107)} \Big( \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2}P_0 \big) e^{-(a_{17} + a_{20})t} \Big) \]<br />
Setting up to solve for \(N_u(t)\),<br />
\[ \frac{dN_u}{dt} = (a_{21} - N_u)a_{22} + \frac{a_{17}}{(a_7 \cdot 12.0107)} \Big( \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2}P_0 \big) e^{-(a_{17} + a_{20})t} \Big) \]<br />
\[ \frac{dN_u}{dt} + N_u a_{22} = a_{21}a_{22} + \frac{a_{17}}{(a_7 \cdot 12.0107)} \Big( \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2}P_0 \big) e^{-(a_{17} + a_{20})t} \Big) \]<br />
Solving for \(N_u(t)\) using Proposition 1,<br />
\[ N_u(t) = \underbrace{a_{21} + \frac{a_{17}\Big( \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2}P_0 \big) e^{-(a_{17} + a_{20})t} \Big)}{(a_{22}\,a_7 \cdot 12.0107)}}_{\gamma_N} + \big( N_0 - \gamma_N \big) e^{-a_{22}t} \]<br />
where,<br />
\[ \lim_{t \to +\infty} \gamma_N(t) = \infty \]<br />
Let’s rewrite it as,<br />
\[ N_u(t) = \gamma_N + \big( N_0 - \gamma_N \big) e^{-a_{22}t}, \qquad \lim_{t \to +\infty} N_u(t) = \infty \]<br />
For the lower bound, the D term is dropped as its lower bound goes to zero.<br />
\[ \frac{dN}{dt} \ge (a_{21} - N)a_{22} \]<br />
Thus, using Proposition 1 we get,<br />
\[ \frac{dN_l}{dt} = (a_{21} - N_l)a_{22} \]<br />
\[ \frac{dN_l}{dt} + N_l a_{22} = a_{21}a_{22} \]<br />
\[ N_l(t) = a_{21} + (N_0 - a_{21})e^{-a_{22}t}, \qquad \lim_{t \to +\infty} N_l(t) = a_{21} \]<br />
To summarize the bounds,<br />
\[ a_{21} + (N_0 - a_{21})e^{-a_{22}t} \le N(t) \le \gamma_N + \big( N_0 - \gamma_N \big) e^{-a_{22}t} \]<br />
When \(t \to \infty\) we obtain,<br />
\[ \therefore\ a_{21} = 31 \le N(t) \le \infty \tag{28} \]<br />
Nitrate then has a constant lower bound, which implies that the concentration will<br />
never go below \(a_{21}\) for this particular model structure. This leads us to reiterate that<br />
these models are very sensitive to the parameter selection process. As mentioned above,<br />
Iron has the same equation structure; thus, using the same analysis, we found the bounds for Iron as<br />
follows:<br />
\[ \frac{dF}{dt} = \Big[ \frac{D a_{17}}{(a_8 \cdot 12.0107)} \Big] + \Big[ (a_{23} - F)\,a_{24}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}} \Big] - \Big[ \frac{P}{(a_8 \cdot 12.0107)} \left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t) \Big] \]<br />
Thus,<br />
\[ F_u(t) = \underbrace{a_{23} + \frac{a_{17}\Big( \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \big( D_0 - \beta_{u1} - \beta_{u2}P_0 \big) e^{-(a_{17} + a_{20})t} \Big)}{(a_{24}\,a_8 \cdot 12.0107)}}_{\gamma_F} + \big( F_0 - \gamma_F \big) e^{-a_{24}t} \]<br />
where,<br />
\[ \lim_{t \to +\infty} \gamma_F(t) = \infty \]<br />
Let’s rewrite it as,<br />
\[ F_u(t) = \gamma_F + \big( F_0 - \gamma_F \big) e^{-a_{24}t}, \qquad \lim_{t \to +\infty} F_u(t) = \infty \]<br />
For the lower bound we get,<br />
\[ F_l(t) = a_{23} + (F_0 - a_{23})e^{-a_{24}t}, \qquad \lim_{t \to +\infty} F_l(t) = a_{23} \]<br />
To summarize we get,<br />
\[ a_{23} + (F_0 - a_{23})e^{-a_{24}t} \le F(t) \le \gamma_F + \big( F_0 - \gamma_F \big) e^{-a_{24}t} \]<br />
which as \(t \to \infty\) gives,<br />
\[ \therefore\ a_{23} = 4.5 \times 10^{-4} \le F(t) \le \infty \tag{29} \]<br />
As was the case for Nitrate, Iron is bounded below by parameter 23.<br />
This concludes the analysis of Model B.<br />
Models A and B are the two good-fit models with an reMSE under 0.5 that came<br />
up most frequently throughout the 31 experiments.<br />
Our analysis has shown<br />
that the structures of the equations for phytoplankton and detritus produce similar<br />
dynamics and bounds for both models; on the other hand, where iron and nitrate<br />
were bounded above by a parameter in Model A, they were bounded below by<br />
a parameter value in Model B. Also, the structure and bounds for zooplankton<br />
varied much more. For instance, the bounds for Model A implied that the<br />
zooplankton population will go extinct, whereas the bounds for Model B indicated<br />
that the population has an upper bound at the carrying capacity K. This simple<br />
observation led me to look further into the zooplankton dynamics. To do so, I chose to<br />
select the model with the lowest reMSE from experiment 6. In this experiment HIPM<br />
was provided observations for both phytoplankton and zooplankton dynamics. This<br />
was not a random choice, since phytoplankton is the dynamic we are trying to model<br />
and zooplankton is the state variable demonstrating the most variability in structure<br />
and having potentially the most restrictive power of all state variables, based on the<br />
computational results. This model will be presented as Model C.<br />
4.5 Model C<br />
Table 7: Parameters that play a role in Model C<br />
Model C<br />
ID | Name | Value<br />
\(a_{0}\) | phyto.max growth | 0.786225<br />
\(a_{1}\) | phyto.Ek max | 1.69379<br />
\(a_{2}\) | arrigoetal1998 w photoinhibition coefficient | 12.8247<br />
\(a_{3}\) | Nitrate monod lim coefficient | 5.13429e-05<br />
\(a_{4}\) | Iron monod lim | 0.0001252<br />
\(a_{5}\) | phyto.exude rate | 0.0010004<br />
\(a_{6}\) | NO3.toCratio | 6.68978<br />
\(a_{7}\) | Fe.toCratio | 119659<br />
\(a_{8}\) | phyto.death rate | 0.0273329<br />
\(a_{9}\) | environment.beta | 0.993959<br />
\(a_{10}\) | zoo.death rate | 0.00239853<br />
\(a_{11}\) | zoo.assim eff | 0.393807<br />
\(a_{12}\) | zoo.attack | 0.340717<br />
\(a_{13}\) | zoo grazing.ratio dependent 3 coefficient | 1.86168<br />
\(a_{14}\) | detritus.remin rate | 0.0374133<br />
\(a_{15}\) | zoo.respiration rate | 0.0282409<br />
\(a_{16}\) | phyto.sinking rate | 0.0143703<br />
\(a_{17}\) | detritus.sinking rate | 0.0990947<br />
\(a_{18}\) | NO3.avg deep conc | 31.2197<br />
\(a_{19}\) | NO3 linear temp control max mixing rate | 0.753984<br />
\(a_{20}\) | Fe.avg deep conc | 0.00045<br />
\(a_{21}\) | Fe linear temp control max mixing rate | 0.0681006<br />
Model C<br />
\[ \frac{dP}{dt} = \Big[ \underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} (1 - a_5) - a_8 - a_{16} \Big] P - \Big( \underbrace{\frac{(a_{12}P^2)}{Z^2 + a_{12}a_{13}P^2}}_{\text{Z Grazing Rate}} \Big) Z \tag{30} \]<br />
\[ \frac{dZ}{dt} = \Big( a_{11}\,\underbrace{\frac{(a_{12}P^2)}{Z^2 + a_{12}a_{13}P^2}}_{\text{Z Grazing Rate}}\, Z \Big) - \big( a_{10}Z^2 + a_{15}Z \big) \tag{31} \]<br />
\[ \frac{dD}{dt} = \big( (1 - a_9)(a_8 P + a_{10}Z^2) \big) + \Big( (1 - a_9)(1 - a_{11})\,\underbrace{\frac{(a_{12}P^2)}{Z^2 + a_{12}a_{13}P^2}}_{\text{Z Grazing Rate}}\, Z \Big) - D(a_{14} + a_{17}) \tag{32} \]<br />
\[ \frac{dN}{dt} = \Big[ \frac{a_{14}D}{(a_6 \cdot 12.0107)} \Big] + \Big[ (a_{18} - N)\,\underbrace{a_{19}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}_{\text{N Mixing Rate}} \Big] - \Big[ \frac{P}{(a_6 \cdot 12.0107)}\,\underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} \Big] \tag{33} \]<br />
\[ \frac{dF}{dt} = \Big[ \frac{a_{14}D}{(a_7 \cdot 12.0107)} \Big] + \Big[ (a_{20} - F)\,\underbrace{a_{21}\,\frac{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}(t)}{E_{T_{H_2O}}^{\max} - E_{T_{H_2O}}^{\min}}}_{\text{F Mixing Rate}} \Big] - \Big[ \frac{P}{(a_7 \cdot 12.0107)}\,\underbrace{\left[(1 - E_{ice}(t))\,a_0\,e^{0.06933\,E_{T_{H_2O}}(t)}\right] M(t)}_{\text{P Growth Rate}} \Big] \tag{34} \]<br />
Where,<br />
\[ M(t) = \underbrace{\min\left\{ \frac{F}{(F + a_4)},\ \frac{N}{(N + a_3)},\ \left(1 - e^{-E_{PUR}(t)\left(1 + a_2 e^{E_{PUR}(t)\,e^{1.089 - 2.12\log_{10}(a_1)}}\right)/a_1}\right) \right\}}_{\text{Phytoplankton Growth Limitation}} \]<br />
The analysis of Model C is very similar to that of Model B, especially for Nitrate<br />
and Iron, since their equation structures are identical except for the parameters and<br />
M(t). In this case, since we approximate the bounds of M(t) to<br />
be between zero and one for all the models, the bounds for both Nitrate and Iron<br />
are going to be close in structure to those of Model B. Following the procedure used<br />
in both previous models, we start our analysis with the Zooplankton equation (31),<br />
which is the simplest of the five.<br />
\[ \frac{dZ}{dt} = \Big( a_{11}\,\underbrace{\frac{(a_{12}P^2)}{Z^2 + a_{12}a_{13}P^2}}_{\text{Z Grazing Rate}} \Big) Z - \big( a_{10}Z^2 + a_{15}Z \big) \le \underbrace{Z\left[ \frac{a_{11}}{a_{13}} - a_{15} - a_{10}Z \right]}_{\text{Logistic Equation}} \]<br />
Setting \(\frac{a_{11}}{a_{13}} - a_{15} - a_{10}Z = 0\), we can find the carrying capacity K.<br />
\[ K = \frac{\frac{a_{11}}{a_{13}} - a_{15}}{a_{10}} = 76.41856944 \]<br />
Thus, an upper bound for Z(t) will be,<br />
\[ \lim_{t \to +\infty} Z(t) \le K = 76.41856944 \]<br />
This is significant since the Zooplankton concentration will have a maximum of<br />
K. The lower bound of this entity will be zero, since we know from biology that<br />
the Zooplankton concentration cannot be negative. Hence,<br />
\[ \therefore\ 0 \le Z(t) \le 76.41856944 \tag{35} \]<br />
59
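The logistic comparison above can be checked numerically. The sketch below is an illustration only: the parameter values are hypothetical (the fitted values of Model C are not reproduced here), P is held constant for simplicity, and forward Euler is used as a stand-in integrator. It computes K from the formula above and confirms that a simulated Z trajectory stays below it.

```python
def carrying_capacity(a10, a11, a13, a15):
    """K = (a11/a13 - a15) / a10, the root of a11/a13 - a15 - a10*Z = 0."""
    return (a11 / a13 - a15) / a10


def simulate_z(z0, p, a10, a11, a12, a13, a15, dt=0.01, steps=20000):
    """Forward-Euler integration of the Z equation with P fixed at p.

    Returns the final value of Z and the largest value reached.
    """
    z = z0
    peak = z0
    for _ in range(steps):
        grazing = a12 * p ** 2 / (z ** 2 + a12 * a13 * p ** 2)
        z += dt * (a11 * grazing * z - (a10 * z + a15) * z)
        peak = max(peak, z)
    return z, peak


# Hypothetical parameter values, chosen only for illustration.
params = dict(a10=0.01, a11=0.3, a13=0.25, a15=0.02)
k = carrying_capacity(**params)
final_z, peak = simulate_z(z0=1.0, p=50.0, a12=1.0, **params)
print(k, final_z, peak)
```

With these values the trajectory settles well below K, which is consistent with K being an upper bound rather than the actual equilibrium.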
Next we turn our attention to P(t), for which we take M(t) to have the following<br />
bounds:<br />
\[
0 \le M(t) \le 1, \tag{36}
\]<br />
and using the exogenous variables' time-series we estimate:<br />
\[
0.0138249 \le \left[(1 - E_{ice}(t))\,a_0\,e^{(0.06933\,E_{TH_2O}(t))}\right] \le 0.8491189 \tag{37}
\]<br />
Then,<br />
\[
\frac{dP}{dt} = \Big[\left[(1 - E_{ice}(t))\,a_0\,e^{(0.06933\,E_{TH_2O}(t))}\right]M(t)(1 - a_5) - a_8 - a_{16}\Big]P
- \left(\frac{(a_{12}P^2)}{Z^2 + a_{12}a_{13}P^2}\right)Z
\;\le\; \underbrace{\left[(0.8491189)(1)(1 - a_5) - a_8 - a_{16}\right]}_{\alpha_u}P
\]<br />
We may rewrite this as follows,<br />
\[
\frac{dP}{dt} \le \alpha_u P \quad\text{where } \alpha_u = 0.806566
\]<br />
Using Proposition 1 we get,<br />
\[
\therefore\ 0 \le P(t) \le P_0 e^{\alpha_u t} \tag{38}
\]<br />
so the upper bound grows without limit:<br />
\[
\lim_{t\to+\infty} P_0 e^{\alpha_u t} = \infty \tag{39}
\]<br />
That being said, let's continue the analysis of Model C with Detritus.<br />
\[
\frac{dD}{dt} = \Big((1 - a_9)(a_8 P + a_{10}Z^2)\Big)
+ \Bigg((1 - a_9)(1 - a_{11})\underbrace{\frac{(a_{12}P^2)}{Z^2 + a_{12}a_{13}P^2}}_{\text{Z Grazing Rate}}Z\Bigg)
- D(a_{14} + a_{17})
\]<br />
Using (35) and (38) we get,<br />
\[
\frac{dD}{dt} \le \Big((1 - a_9)(a_8 P_0 e^{\alpha_u t} + a_{10}K^2)\Big)
+ \left((1 - a_9)(1 - a_{11})\frac{K}{a_{13}}\right)
- D(a_{14} + a_{17})
\]<br />
Then solving for the upper bound,<br />
\[
\frac{dD^u}{dt} + D^u(a_{14} + a_{17}) = (1 - a_9)\,a_8 P_0 e^{\alpha_u t}
+ \left(a_{10}K + \frac{(1 - a_{11})}{a_{13}}\right)(1 - a_9)K
\]<br />
Using Proposition 1,<br />
\[
D^u(t) = \underbrace{\frac{\left(a_{10}K + \frac{(1 - a_{11})}{a_{13}}\right)(1 - a_9)K}{(a_{14} + a_{17})}}_{\beta_{u1}}
+ \underbrace{\frac{(1 - a_9)\,a_8}{\alpha_u + (a_{14} + a_{17})}}_{\beta_{u2}}P_0 e^{\alpha_u t}
+ \Big(D_0 - \beta_{u1} - \beta_{u2}P_0\Big)e^{-(a_{14} + a_{17})t}
\]<br />
where $\beta_{u1} = 1.721033$ and $\beta_{u2} = 1.7508 \times 10^{-4}$.<br />
We can rewrite $D^u(t)$ as,<br />
\[
D^u(t) = \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \Big(D_0 - \beta_{u1} - \beta_{u2}P_0\Big)e^{-(a_{14} + a_{17})t}
\]<br />
\[
\lim_{t\to+\infty} D^u(t) = \infty
\]<br />
Then finding a lower bound,<br />
\[
\frac{dD^l}{dt} + D^l(a_{14} + a_{17}) = 0
\]<br />
\[
D^l(t) = D_0 e^{-(a_{14} + a_{17})t}
\]<br />
\[
\lim_{t\to+\infty} D^l(t) = 0
\]<br />
Thus,<br />
\[
D_0 e^{-(a_{14} + a_{17})t} \le D(t) \le \beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \Big(D_0 - \beta_{u1} - \beta_{u2}P_0\Big)e^{-(a_{14} + a_{17})t} \tag{40}
\]<br />
When t → ∞ we get the following,<br />
\[
\therefore\ 0 \le D(t) \le \infty \tag{41}
\]<br />
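The closed form for the Detritus upper bound can be sanity-checked by substituting it back into the linear equation it is meant to solve. The sketch below is an assumption-laden illustration: the function names are hypothetical, `loss_rate` stands for a_14 + a_17, `exp_forcing` for (1 − a_9)a_8, and `const_forcing` for the constant term on the right-hand side; apart from α_u, the numeric values are invented for the check.

```python
import math


def beta_coefficients(const_forcing, exp_forcing, alpha_u, loss_rate):
    """beta_u1 and beta_u2 as defined under the braces in the text."""
    beta_u1 = const_forcing / loss_rate
    beta_u2 = exp_forcing / (alpha_u + loss_rate)
    return beta_u1, beta_u2


def d_upper(t, d0, p0, alpha_u, loss_rate, beta_u1, beta_u2):
    """Closed-form upper bound D^u(t)."""
    transient = (d0 - beta_u1 - beta_u2 * p0) * math.exp(-loss_rate * t)
    return beta_u1 + beta_u2 * p0 * math.exp(alpha_u * t) + transient


def ode_residual(t, d0, p0, alpha_u, loss_rate, const_forcing, exp_forcing, h=1e-5):
    """dD/dt + loss_rate*D - forcing, using a central difference for dD/dt."""
    b1, b2 = beta_coefficients(const_forcing, exp_forcing, alpha_u, loss_rate)

    def f(s):
        return d_upper(s, d0, p0, alpha_u, loss_rate, b1, b2)

    derivative = (f(t + h) - f(t - h)) / (2.0 * h)
    forcing = exp_forcing * p0 * math.exp(alpha_u * t) + const_forcing
    return derivative + loss_rate * f(t) - forcing


# alpha_u is taken from the text; every other value is hypothetical.
settings = dict(d0=0.1, p0=2.0, alpha_u=0.806566, loss_rate=0.05,
                const_forcing=0.08, exp_forcing=1.5e-4)
residuals = [ode_residual(t, **settings) for t in (0.0, 1.0, 5.0)]
print(residuals)
```

Residuals near zero indicate that the expression satisfies the differential equation, and evaluating `d_upper` at t = 0 returns the initial value D_0.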
The approach for Nitrate and Iron is identical to that of Model B, so we get, for<br />
Nitrate<br />
\[
a_{18} + N_0 e^{-a_{19}t} \le N(t) \le \gamma_N + \Big(N_0 - \gamma_N\Big)e^{-a_{19}t}
\]<br />
where,<br />
\[
\gamma_N = a_{18} + \frac{a_{14}\left(\beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \Big(D_0 - \beta_{u1} - \beta_{u2}P_0\Big)e^{-(a_{14} + a_{17})t}\right)}{(a_{19}\,a_6\,12.0107)}
\]<br />
When t → ∞ we obtain,<br />
\[
\therefore\ a_{18} = 31.2197 \le N(t) \le \infty \tag{42}
\]<br />
And for iron we get,<br />
\[
a_{20} + F_0 e^{-a_{21}t} \le F(t) \le \gamma_F + \Big(F_0 - \gamma_F\Big)e^{-a_{21}t}
\]<br />
where,<br />
\[
\gamma_F = a_{20} + \frac{a_{17}\left(\beta_{u1} + \beta_{u2}P_0 e^{\alpha_u t} + \Big(D_0 - \beta_{u1} - \beta_{u2}P_0\Big)e^{-(a_{14} + a_{17})t}\right)}{(a_{21}\,a_7\,12.0107)}
\]<br />
which as t → ∞ gives,<br />
\[
\therefore\ a_{20} = 4.5 \times 10^{-4} \le F(t) \le \infty \tag{43}
\]<br />
The analysis of Model C yields some interesting results.<br />
Indeed, we obtained<br />
realistic bounds for all the state variables, although the upper bounds for Phytoplankton,<br />
Nitrate and Iron are not informative as they go to infinity. That being said, with<br />
only two sets of time-series (Phytoplankton and Zooplankton) HIPM produced five<br />
good fit models under .5 reMSE, one of which I have analyzed and which seemed to yield<br />
dynamics in accordance with what domain scientists would expect.<br />
4.6 Effects of increasing the number of constraints<br />
The computational results have established that increasing the number of constraints<br />
inputted into HIPM by adding additional entities’ time-series will in some cases<br />
reduce the number of good fit models selected by the software. A few cases were<br />
studied to look at the impact of an increase in constraints on the models selected. During<br />
his research on the Ross Sea Phytoplankton dynamics, Borrett (unpublished data)<br />
worked with real-life measurements for Phytoplankton and Nitrate and inputted this<br />
data into HIPM; for that reason I decided to look at experiment number 8,<br />
which assumed data for both Nitrate and Phytoplankton (cf. Table 4). The number<br />
of good fit models, under a .5 reMSE, produced by HIPM is 25. When observations<br />
for Zooplankton are added, no models under the chosen reMSE cutoff are selected; on<br />
the other hand, when Iron or Detritus are added, a significant decrease in the number<br />
of good fit models can be observed. The addition of detritus (experiment 19) and<br />
iron (experiment 21) yielded respectively 13 and 15 good fit models. However, out<br />
of all the models selected in experiment 8, only 2 models were part of the set selected<br />
in experiment 19 (addition of detritus data) and 3 models were part of the set selected<br />
in experiment 21 (addition of iron data). A comparison of the structures of these<br />
models (the models can be found in appendix D and E) shows that they differ only<br />
by the type of Zooplankton grazing process used, by the parameter values and, in some<br />
cases, by the Phytoplankton growth limitation (aka M(t)). Otherwise the structures<br />
of these models are comparable, implying perhaps that the grazing processes used<br />
in these models have similar effects on the ecosystem, which is plausible given the<br />
number of grazing processes present in the process library. It is in fact this high<br />
number of grazing options that makes for a very large structural search space.<br />
5 CONCLUSION<br />
The hierarchal inductive process-modeling framework was effective in its role of covering<br />
two very extensive search spaces in a short amount of time, and with the availability of<br />
the CIAO data I was able to investigate the usefulness of the software in our search<br />
for the best model representation.<br />
Some of the major observations that can be made throughout this paper are<br />
about the Zooplankton state variable.<br />
Not only did it tend to provide great restrictive<br />
power when its time series was inputted into HIPM, as portrayed with its<br />
median activation value and Table 4, but most often some of the good fit models<br />
I selected only differed in the type of Zooplankton grazing process chosen. All<br />
this may suggest one of two things.<br />
Either Zooplankton indeed yields the<br />
most important discriminatory power out of all state variables and is the data that<br />
should be collected first and foremost for the Ross Sea ecosystem, or the way<br />
the Zooplankton entity is defined in the HIPM framework is inadequate for this<br />
type of ecosystem, which incidentally weeds out many of the models that are being<br />
searched through (i.e. all the zeros in Table 4). This makes for high variability in<br />
structure for the good fit models selected, which in turn creates an array of dynamics,<br />
some of which are at opposite ends of the spectrum (i.e. the zooplankton population<br />
going to extinction in some cases or going to the carrying capacity in others), when<br />
HIPM is provided with time-series data that are not those of Zooplankton. The other<br />
explanation for the issues encountered with Zooplankton could be found not in<br />
the way Zooplankton is defined in the process library but rather in an assumption<br />
made within the biological knowledge encoded into HIPM. Indeed, Phaeocystis are<br />
assumed to be grazing resistant to zooplankton, meaning that it is more difficult for<br />
zooplankton to graze upon Phaeocystis than it is on diatoms, the latter not being<br />
included in the process library, as we will discuss later on. Hence, the inability for<br />
zooplankton to be properly fitted may lie in the way phytoplankton has been defined<br />
and not zooplankton.<br />
Even though phytoplankton did not seem to reduce the number of good fit models<br />
as effectively as zooplankton, it did have an unexpected dynamic for the good fit<br />
models studied. Indeed, in all models the phytoplankton state variable's upper bound<br />
is infinity as t goes to infinity. As mentioned previously, the Ross Sea is the scene of one of<br />
the largest phytoplankton blooms in the Southern Ocean. The population of phytoplankton<br />
in the Ross Sea is primarily made up of two groups, Phaeocystis antarctica<br />
and diatoms. Since Phaeocystis dominates the phytoplankton bloom in the region,<br />
it was the only species taken into consideration in this experiment, which may have<br />
influenced the output of the system and the dynamics of the state variables, and<br />
more specifically those of zooplankton. The addition of another species of phytoplankton<br />
in HIPM could change the outputs significantly. There is a need here to<br />
modify the way phytoplankton are defined in HIPM and incorporate the multiplicity<br />
of species of phytoplankton in the system. This can be grounds for future research.<br />
Surprisingly, we found evidence contrary to the assumption that more information<br />
meant fewer models being selected. As explained in the computational results,<br />
there were instances where inputting two time-series into HIPM selected fewer good<br />
fit models than inputting three time-series. Once again, as previously mentioned,<br />
the explanation for this observation is the way reMSE is defined within HIPM when<br />
dealing with multiple variables: the mean square errors of the state variables being<br />
fitted are averaged. For instance, if both phytoplankton and detritus have a mean<br />
error of 0.3 the reMSE would be 0.3; if we then add iron with a mean error of 0.5<br />
the overall reMSE is now approximately 0.37, which is under the cutoff for good fit models. We<br />
can imagine cases for which models that were not in the good fit category with two<br />
time-series then become good fit models with the addition of another time-series.<br />
Hence, the set of good fit models selected with three time-series entries will not<br />
be a subset of the set of good fit models selected with two of the three time-series<br />
previously mentioned. This puts an important emphasis on which measurements<br />
and observations are added to the search. The decision to add an extra time-series to<br />
the software should be highly influenced by which data has already been collected<br />
and used with HIPM. That being said, this could be an area where HIPM could be<br />
enhanced and a direction for further research. Indeed, it would be interesting to look<br />
at the output of HIPM if the reMSE was calculated by taking the maximum square<br />
error of all the variables being fitted, which would then make our assumption, that<br />
more time-series data equates to fewer models, valid.<br />
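The effect of averaging versus taking the maximum can be illustrated with a few lines of arithmetic. This is a minimal sketch with hypothetical function names and invented error values; it only mirrors the description of reMSE given in the text and is not HIPM's actual implementation.

```python
CUTOFF = 0.5  # good-fit threshold on reMSE used throughout the text


def aggregate_mean(errors):
    """Average the per-variable errors (the behaviour described in the text)."""
    return sum(errors) / len(errors)


def aggregate_max(errors):
    """Take the worst per-variable error (the alternative suggested above)."""
    return max(errors)


def is_good_fit(errors, aggregate):
    """A model is a good fit when its aggregated error is under the cutoff."""
    return aggregate(errors) < CUTOFF


# Invented errors for one model: poor on two variables, good on a third.
two_series = [0.6, 0.6]
three_series = [0.6, 0.6, 0.2]

for name, aggregate in (("mean", aggregate_mean), ("max", aggregate_max)):
    print(name,
          is_good_fit(two_series, aggregate),
          is_good_fit(three_series, aggregate))
```

Under averaging, this model fails with two time-series (0.6) but passes once the third is added (about 0.47), so the three-series set is not a subset of the two-series set. Under the maximum, adding a variable can never lower the aggregated error, so a model rejected with two series stays rejected with three.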
The parameter selection process<br />
is a very important step of HIPM model selection; this statement was reinforced<br />
by the mathematical analysis, which made evident the sensitivity of the system to parameter<br />
values. Differences in parameter values could mean the difference between a<br />
population going to extinction or not. In the case of nitrate and iron I observed that<br />
specific parameters acted as bounds for these variables, which is useful information.<br />
Indeed, when looking at a model through mathematical analysis one can determine<br />
if a parameter will have a significant effect on the overall dynamics of the system.<br />
Coupling this information with expert knowledge, it would then be possible to redefine<br />
the ranges set within HIPM for the parameter selection, which would in turn<br />
refine the search process. Scientists could then run the software once again, take<br />
the result and see if parameter ranges could be further refined. This can almost be<br />
seen as a cycle: deriving information on the parameters from the mathematical analysis,<br />
which in turn helps improve the constraints given to HIPM, then repeating the process<br />
to see the improvement made to the type of model being selected. This in itself can<br />
be seen as a procedure that transcends the phytoplankton dynamics in the Ross<br />
Sea ecosystem and can be generalized to other systems.<br />
Automated modeling is a successful method: the LAGRAMGE framework has<br />
been successfully applied in a real-world domain and selected models that performed<br />
really well (Atanasova et al. 2008). While this framework can evaluate only one<br />
state variable at a time, HIPM is capable of evaluating multiple variables simultaneously;<br />
however, we are faced with an under-constrained optimization problem<br />
since we want to select models with data for only a couple of the variables. Using<br />
CIAO simulated data we were able to explore and investigate the response of HIPM<br />
for the phytoplankton dynamics in the Ross Sea. Even though we did not use real-life<br />
data, the results support two conclusions. First, more data is not synonymous with fewer<br />
models being selected. This conclusion must be tested to see if it can be generalized<br />
to other ecosystems or if it becomes obsolete with a refined and improved process library.<br />
Secondly, the result that zooplankton contains more restrictive power than<br />
the other state variables was attained only through multiple experiments using a<br />
full data set. There is room for further work in the area of exploratory statistics<br />
with the median activation value in order to develop a formal procedure that would<br />
assist scientists in their decision making process for data collection.<br />
The number of processes taken into consideration for the Ross Sea ecosystem<br />
makes for an extensive process library, which creates models with very intricate and<br />
complex structures. A direction for improvement would be to look at some sort of<br />
measure of complexity for the models, somewhat motivated by the law of parsimony<br />
that states that the simplest explanation is often the best. This could be coupled<br />
with a measure of distance between models; these two concepts could potentially be<br />
of great value when comparing different models. However, at this point the more<br />
plausible and logical step for future research would be first to incorporate diatoms<br />
in the way in which phytoplankton is defined in the process library and second to<br />
switch from a mean square error to a maximum square error, which in my opinion<br />
would yield very different results. In the long run, this thesis could be the premise<br />
for a protocol towards decision making in the data collection process.<br />
REFERENCES<br />
[1] Arrigo, K. R., and C. R. McClain, “Spring phytoplankton production in the<br />
western Ross Sea”, Science, 266, 261–263, 1994.<br />
[2] Arrigo, K. R., A. M. Weiss, and W. O. Smith Jr., “Physical forcing of phytoplankton<br />
dynamics in the southwestern Ross Sea”, J. Geophys. Res., 103,<br />
1007–1021, 1998.<br />
[3] Arrigo, K. R., G. R. DiTullio, R. B. Dunbar, M. P. Lizotte, D. H. Robinson,<br />
M. VanWoert, and D. L. Worthen, “Phytoplankton taxonomic variability and<br />
nutrient utilization and primary production in the Ross Sea”, J. Geophys. Res.,<br />
105, 8827–8846, 2000.<br />
[4] Arrigo, K. R., D. Worthen & D. Robinson, “A coupled ocean-ecosystem model<br />
of the Ross Sea: 2. Iron regulation of phytoplankton taxonomic variability and<br />
primary production”, Journal of Geophysical Research, Vol. 108, No. C7, 3231,<br />
2003.<br />
[5] Atanasova, N., L. Todorovski, S. Dzeroski & B. Kompare, “Application of automated<br />
model discovery from data and expert knowledge to a real-world domain:<br />
Lake Glumsø”, Ecological Modelling, 212, 92–98, 2008.<br />
[6] Borrett, S. R., W. Bridewell, P. Langley & K. Arrigo, “A method for representing<br />
and developing process models”, Ecological Complexity, 4, 1–12, 2007.<br />
[7] Bridewell, W., P. Langley, S. Racunas & S. Borrett, “Learning process models<br />
with missing data”, Proceedings of the Seventeenth European Conference on<br />
Machine Learning, 557–565, 2006.<br />
[8] Bridewell, W., P. Langley, L. Todorovski & S. Dzeroski, “Inductive Process Modeling”,<br />
Stanford University, Stanford, CA, 2007.<br />
[9] Dzeroski, S. and Todorovski, L. “Discovering dynamics: from inductive logic<br />
programming to machine discovery.” Journal of Intelligent Information Systems,<br />
4: 89–108. 1995.<br />
[10] Dzeroski, S., Todorovski, L. “Discovering dynamics.” Proceedings of the Tenth<br />
International Conference on Machine Learning, Morgan Kaufmann, San Mateo,<br />
CA, pp. 97–103. 1993.<br />
[11] Fayyad, U., Haussler, D., & Stolorz, P. “KDD for science data analysis: Issues<br />
and examples.” Proceedings of the Second International Conference on Knowledge<br />
Discovery and Data Mining (pp. 50–56). Portland, OR: AAAI Press. 1996.<br />
[12] Feng, W., X. Lu & R. Donovan, “Population Dynamics in a Model Territory<br />
Acquisition”, Discrete and Continuous Dynamical Systems, Added Volume, 156–165,<br />
2001.<br />
[13] Langley, P., J. Shrager, N. Asgharbeygi & S. Bay, “Inducing Explanatory Process<br />
Models from Biological Time Series”, Stanford University, Stanford, CA.<br />
[14] Langley, P. Elements of machine learning. San Mateo, CA: Morgan Kaufmann. 1995.<br />
[15] Langley, P., Shiran, O., Shrager, J., Todorovski, L., & Pohorille, A. “Constructing<br />
explanatory process models from biological data and knowledge.” AI<br />
in Medicine, 37, 191–201. 2006.<br />
[16] Ljung, L. “Modelling of industrial systems.” Proceedings of Seventh International<br />
Symposium on Methodologies for Intelligent Systems (pp. 338–349). Berlin:<br />
Springer. 1993.<br />
[17] Mitchell, T. M. Machine learning. New York, NY: McGraw Hill. 1997.<br />
[18] Oreskes, N., K. Shrader-Frechette & K. Belitz. “Verification, validation, and<br />
confirmation of numerical models in the earth sciences.” Science, vol. 263, pp.<br />
641–646. [Reprinted in Transactions of the Computer Measurement Group, vol.<br />
84, pp. 85–92]. 1994.<br />
[19] Oreskes, N. “Why believe a computer? Models, measures, and meaning in the<br />
natural world”, in The Earth Around Us: Maintaining a Livable Planet, edited<br />
by Jill S. Schneiderman (San Francisco: W.H. Freeman and Co.), pp. 70–82.<br />
2000.<br />
[20] Tagliabue, A. & K. R. Arrigo. “Anomalously Low Zooplankton Abundance in<br />
the Ross Sea: An Alternative Explanation”, Limnology and Oceanography, Vol.<br />
48, No. 2, pp. 686–699. 2003.<br />
[21] Todorovski, L., Dzeroski, S., Kompare, B. “Modelling and prediction of phytoplankton<br />
growth with equation discovery.” Ecological Modelling, 113, 71–81.<br />
1998.<br />
[22] Todorovski, L. “Using domain knowledge for automated modeling of dynamic<br />
systems with equation discovery”. Doctoral dissertation, Faculty of Computer<br />
and Information Science, University of Ljubljana. Ljubljana, Slovenia. 2003.<br />
[23] Todorovski, L., W. Bridewell, O. Shiran, & P. Langley. “Inducing hierarchical<br />
process models in dynamic domains”. Proceedings of the Twentieth National<br />
Conference on Artificial Intelligence, 892–897. 2005.<br />
APPENDIX<br />
A. Sample CIAO data - 1997<br />
JDAY TEMP DPML AI NITR PHOS SILC IRON_nm IRON_um PARL PHA PHA_c DIA DIA_c ZOO DET PURL TP<br />
229 -1.842 68.86 0.87 31 2.1 75.97 0.5052 0.0005052 3.204 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 1.678 4.0288<br />
230 -1.842 79 0.84 31 2.1 75.86 0.505 0.000505 7.401 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 3.891 4.0288<br />
231 -1.844 83.75 0.82 31 2.1 75.73 0.5054 0.0005054 5.875 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 3.166 4.0288<br />
232 -1.848 84.19 0.84 31 2.1 75.7 0.5062 0.0005062 8.494 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 4.425 4.0288<br />
233 -1.838 112.1 0.83 31 2.1 75.7 0.5069 0.0005069 10.76 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 5.595 4.0288<br />
234 -1.832 114.4 0.86 31 2.1 75.64 0.5073 0.0005073 16.48 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 8.723 4.0288<br />
235 -1.84 103 0.9 31 2.1 75.54 0.5083 0.0005083 19.51 0.02518 2.2662 0.02518 1.7626 1.999 0.02289 10.33 4.0288<br />
236 -1.844 92.1 0.88 31 2.1 75.44 0.5087 0.0005087<br />
B. Full entity Specification File<br />
#!/usr/bin/python<br />
"""<br />
This is the revised file for entity specification<br />
Stuart Borrett<br />
April 26, 2007<br />
"""<br />
from ross_lib import *;<br />
# import library<br />
# observed primary producer<br />
p1 = entity_instance(pe, "phyto",<br />
{"conc": ("system", "PHA_c", (0,600)), # ugC/L<br />
"growth_rate": ("system", 0, (0,1)),<br />
"growth_lim": ("system", 1, (0,1))},<br />
{"max_growth":0.59,<br />
"exude_rate":0.19,<br />
"death_rate":0.025,<br />
"Ek_max":30,<br />
"biomin":0.025,<br />
"PhotoInhib":200}<br />
);<br />
# unobserved grazer with initial value from [0,1] default 0.1<br />
Z1 = entity_instance(ze, "zoo",<br />
{"conc": ("system", 0.1 , (0.10,510)),<br />
"growth_rate": ("system", 0.1, (0, 1))},<br />
{"assim_eff":0.75,<br />
"death_rate":0.02,<br />
"respiration_rate":0.019,<br />
"gmax":0.4,<br />
"gcap":200}<br />
);<br />
# observed nitrate<br />
no3 = entity_instance(no3, "NO3",<br />
{"conc": ("system", "NITR", (0,32)),<br />
"mixing_rate": ("system", 0, (0,1))}, None);<br />
# unobserved iron<br />
fe = entity_instance(fe, "Fe",<br />
{"conc": ("system",.00042920, (0,0.001)),<br />
"mixing_rate": ("system", 0, (0,1))}, None);<br />
# observed/exogenous ENVIRONMENT<br />
e1 = entity_instance(ee, "environment",<br />
{"PUR": ("exogenous", "PURL", None),<br />
"TH2O": ("exogenous", "TEMP", None),<br />
"ice":("exogenous", "AI", None) },<br />
{"beta":0.7}<br />
);<br />
# unobservable detritus with initial value from [0,1] default 0.1<br />
D1 = entity_instance(de, "detritus",<br />
{"conc": ("system", 0.1, (0.001, 210))}, None);<br />
C. Full Ross Sea generic model library<br />
#!/usr/bin/python<br />
"""<br />
This generic model library supports the construction<br />
of an ecosystem model of the Ross Sea.<br />
It is hierarchical in processes, but the entities are flat.<br />
This version is updated and corrected.<br />
It is designed for use with the sensitivity analysis experiments<br />
"""<br />
from library import *;<br />
from entities import *;<br />
from processes import *;<br />
lib = library("aquatic_ecosystem");<br />
# -----------------------------------------------------------------------<br />
# -----------------------------------------------------------------------<br />
# GENERIC ENTITIES<br />
# id, variables, constant parameters<br />
# -----------------------------------------------------------------------<br />
# --- PHYTOPLANKTON ---<br />
pe = lib.add_generic_entity("P",<br />
{"conc":"sum",<br />
"growth_rate":"prod",<br />
"growth_lim":"min"},<br />
{"max_growth": (0.4,0.8),<br />
"exude_rate": (0.001,0.2),<br />
"death_rate": (0.02,0.04),<br />
"Ek_max":(1,100),<br />
"sinking_rate":(0.0001,0.25),<br />
"biomin":(0.02,0.04),<br />
"PhotoInhib":(200,1500),<br />
}<br />
);<br />
# --- ZOOPLANKTON ---<br />
ze = lib.add_generic_entity("Z",<br />
{"conc": "sum",<br />
"grazing_rate": "prod"},<br />
{"assim_eff":(0.05,0.4),<br />
"death_rate": (0.001,0.3),<br />
"respiration_rate":(0.01,0.04),<br />
"sinking_rate":(0.001,0.25),<br />
"gmax":(0.3,0.5),<br />
"glim":(19,21),<br />
"gcap":(199,301)}<br />
);<br />
# --- NUTRIENTs ---<br />
# nitrate<br />
no3 = lib.add_generic_entity("Nitrate",<br />
{"conc":"sum",<br />
"mixing_rate":"sum"},<br />
{"toCratio": (6.6,6.7),<br />
"avg_deep_conc": (31,32)}<br />
);<br />
# iron<br />
fe = lib.add_generic_entity("Iron",<br />
{"conc":"sum",<br />
"mixing_rate":"sum"},<br />
{"toCratio": (3000,450000),<br />
"avg_deep_conc": (0.00035,0.00045)}<br />
);<br />
# --- DETRITUS ---<br />
de = lib.add_generic_entity("D",<br />
{"conc": "sum"},<br />
{"remin_rate": (0.03,0.04),<br />
"sinking_rate":(0.00001,0.1)}<br />
);<br />
# --- ENVIRONMENT ---<br />
ee = lib.add_generic_entity("E",<br />
{"TH2O":"sum",<br />
"PUR":"sum",<br />
"ice":"sum"},<br />
{"beta":(0.001,1),<br />
}<br />
);<br />
# -----------------------------------------------------------------------<br />
# -----------------------------------------------------------------------<br />
# GENERIC PROCESSES:<br />
# id, type, entities related, list of subprocesses,<br />
# constant parameters, equations<br />
# -----------------------------------------------------------------------<br />
# --- GROWTH ---<br />
lib.add_generic_process(<br />
"growth", "",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,100), ("D",[de],1,1), ("E",[ee],1,1)],<br />
[("limited_growth", ["P","N","E"], 0),<br />
("exudation",["P"],1),<br />
("nutrient_uptake",["P","N"],0)],<br />
{},<br />
{},<br />
{"P.conc": "P.growth_rate * P.conc"}<br />
);<br />
lib.add_generic_process(<br />
"exudation", "exudation",<br />
[("P",[pe],1,1)],<br />
[],<br />
{},<br />
{},<br />
{"P.conc": "-1 * P.exude_rate * P.growth_rate * P.conc"}<br />
);<br />
lib.add_generic_process(<br />
"nutrient_uptake", "nutrient_uptake",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />
[],<br />
{},<br />
{},<br />
{"N.conc": "-1 * 1/( N.toCratio * 12.0107)"<br />
" * P.growth_rate * P.conc"}<br />
);<br />
lib.add_generic_process(<br />
"limited_growth", "limited_growth",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,100), ("E",[ee],1,1)],<br />
[("light_lim", ["P","E"], 0), ("nutrient_lim",["P","N"], 0)],<br />
{},<br />
{"P.growth_rate": "(1-E.ice) * P.max_growth"<br />
" * exp(0.06933 * E.TH2O) * P.growth_lim"},<br />
{}<br />
);<br />
# ------ P.growth_lim --<br />
# there are multiple factors (and formulations of factors)<br />
# that might limit growth.<br />
# In this library nutrient and light limitations are combined<br />
# into P.growth_lim using a minimum function<br />
# so that only one operates at a time (i.e., they are substitutable).<br />
# The disadvantage<br />
# of this encoding is that it will not be possible to determine<br />
# which factor is operating at a given time. Temperature<br />
# is a multiplicative control factor encoded in the P.growth_rate<br />
# equation, and in the present library we do not consider<br />
# alternative temperature effect functions.<br />
# --light lim --<br />
lib.add_generic_process(<br />
"arrigoetal1998", "light_lim",<br />
[("P",[pe],1,1), ("E",[ee],1,1)],<br />
[],<br />
{"a":(5,15)},<br />
{"P.growth_lim": "(1 - exp(-E.PUR / (P.Ek_max / (1 + a"<br />
" * exp(E.PUR * exp(1.089 - 2.12 * log10(P.Ek_max)))))))"},<br />
{}<br />
);<br />
lib.add_generic_process(<br />
"arrigoetal1998_w_photoinhibition", "light_lim",<br />
[("P",[pe],1,1), ("E",[ee],1,1)],<br />
[],<br />
{"a":(5,15)},<br />
{"P.growth_lim": "(1 - exp(-E.PUR / (P.Ek_max / (1 + a"<br />
" * exp(E.PUR * exp(1.089 - 2.12 * log10(P.Ek_max)))))))"<br />
" * exp(-1 * E.PUR /P.PhotoInhib)"},<br />
{}<br />
);<br />
# -- nutrient lim --<br />
lib.add_generic_process(<br />
"monod_lim", "nutrient_lim",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />
[],<br />
{"k":(0.000001,0.001)},<br />
{"P.growth_lim": "N.conc / (N.conc + k)"},<br />
{}<br />
);<br />
'''<br />
lib.add_generic_process(<br />
"ratio_lim", "nutrient_lim",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />
[],<br />
{"k":(0.000001,1)},<br />
{"P.growth_lim": "N.conc / (N.conc + k * P.conc)"},<br />
{}<br />
);<br />
'''<br />
lib.add_generic_process(<br />
"monod_2nd", "nutrient_lim",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />
[],<br />
{"k":(0.000001,0.001)},<br />
{"P.growth_lim": "pow(N.conc,2) / (pow(N.conc,2) + k)"},<br />
{}<br />
);<br />
lib.add_generic_process(<br />
"nut_lim_exp", "nutrient_lim",<br />
[("P",[pe],1,1), ("N",[no3,fe],1,1)],<br />
[],<br />
{"k":(0.000001,1)},<br />
{"P.growth_lim": "1-exp(-1* k * N.conc)"},<br />
{}<br />
);<br />
# --- DEATH ---<br />
lib.add_generic_process(<br />
"death_exp", "",<br />
[("S",[pe,ze],1,1), ("D",[de],1,1), ("E",[ee],1,1)],<br />
[],<br />
{},<br />
{},<br />
{"S.conc": "-1 * S.death_rate * S.conc",<br />
"D.conc": "(1-E.beta) * S.death_rate * S.conc"},<br />
);<br />
# --- REMINERALIZATION ---<br />
lib.add_generic_process(<br />
"remineralization", "",<br />
[("D",[de],1,1), ("N",[fe,no3],1,3)],<br />
[("nutrient_remineralization",["D","N"], 0)],<br />
{},<br />
{},<br />
{"D.conc": "-1 * D.remin_rate * D.conc"}<br />
);<br />
lib.add_generic_process(<br />
"nutrient_remineralization", "",<br />
[("D", [de], 1,1), ("N", [fe], 1, 1)],<br />
[],<br />
{},<br />
{},<br />
{ "N.conc": "1/(N.toCratio * 12.0107) * D.remin_rate * D.conc" }<br />
);<br />
# --- RESPIRATION ---<br />
lib.add_generic_process(<br />
"respiration", "",<br />
[("Z",[ze],1,1)],<br />
[],<br />
{},<br />
{},<br />
{"Z.conc":"-1 * Z.respiration_rate * Z.conc"}<br />
);<br />
# --- SINKING ---<br />
lib.add_generic_process(<br />
"sinking", "",<br />
[("V",[pe,ze,de],1,1)],<br />
[],<br />
{},<br />
{},<br />
{"V.conc": "-1 * V.sinking_rate * V.conc"}<br />
);<br />
# --- GRAZING ---<br />
lib.add_generic_process(<br />
"holling_type_1", "graze_rate",<br />
[("Z",[ze],1,1), ("P",[pe],1,1)],<br />
[],<br />
{},<br />
{"Z.grazing_rate": "Z.gmax * P.conc"},<br />
{}<br />
);<br />
lib.add_generic_process(<br />
"holling_type_2", "graze_rate",<br />
[("Z",[ze],1,1), ("P",[pe],1,1)],<br />
[],<br />
{},<br />
{"Z.grazing_rate": "max(0,Z.gmax * P.conc / (Z.gcap + P.conc))"},<br />
{}<br />
);<br />
lib.add_generic_process(<br />
"holling_type_2_mod", "graze_rate",<br />
[("Z",[ze],1,1), ("P",[pe],1,1)],<br />
[],<br />
{},<br />
{"Z.grazing_rate": "max(0,(Z.gmax * (P.conc - P.biomin - Z.glim)"<br />
" / (Z.gcap + (P.conc - P.biomin - Z.glim))))"},<br />
{}<br />
);<br />
lib.add_generic_process(<br />
"ivlev", "graze_rate",<br />
[("Z", [ze],1,1), ("P", [pe],1,1)],<br />
[],<br />
{"delta":(0.01,0.5)},<br />
{"Z.grazing_rate": "max(0,Z.gmax * (1 - exp(-1 * delta * P.conc)))"<br />
},<br />
{}<br />
);<br />
lib.add_generic_process(<br />
"grazing", "grazing",<br />
[("Z",[ze],1,1), ("P",[pe],0,1), ("D",[de],0,1), ("E",[ee],0,1)],<br />
[("graze_rate", ["Z","P"], 0)],<br />
{},<br />
{},<br />
{"Z.conc": "Z.assim_eff * Z.grazing_rate * Z.conc",<br />
"P.conc": "-1 * Z.grazing_rate * Z.conc",<br />
"D.conc": "(1-E.beta) * (1-Z.assim_eff) * Z.grazing_rate * Z.conc"}<br />
);<br />
# --- Nutrient Mixing ------------------------------------------<br />
# this process represents an input of nutrients (nitrate)<br />
# due to mixing or upwelling.<br />
lib.add_generic_process(<br />
"nutrient_mixing", "",<br />
[("N",[no3,fe],1,1),("E",[ee],1,1)],<br />
[("mixing_rate", ["N","E"],0)],<br />
{},<br />
{},<br />
{"N.conc": "(N.avg_deep_conc - N.conc) * N.mixing_rate"}<br />
);<br />
lib.add_generic_process(<br />
"linear_temp_control", "mixing_rate",<br />
[("N",[no3,fe],1,1),("E",[ee],1,1)],<br />
[],<br />
{"max_mixing_rate":(0.000001,1)},<br />
{"N.mixing_rate": "max_mixing_rate"<br />
" *(datamax(E.TH2O)-E.TH2O)/(datamax(E.TH2O)-datamin(E.TH2O))"},<br />
{},<br />
);<br />
# --- ROOT ---<br />
lib.add_generic_process("root", "",<br />
[("Z",[ze],0,1), ("P",[pe],1,2),<br />
("N",[no3,fe],2,2), ("D",[de],1,1), ("E",[ee],1,1)],<br />
[("growth", ["P","N","D","E"], 0),<br />
("death_exp", ["P","D","E"],1),<br />
("death_exp", ["Z","D","E"],1),<br />
("grazing", ["Z","P","D","E"], 0),<br />
("remineralization", ["D","N"], 0),<br />
("respiration", ["Z"], 1),<br />
("sinking", ["P"],1),<br />
("sinking", ["D"],1),<br />
("nutrient_mixing", ["N","E"],1),<br />
],<br />
{}, {}, {}<br />
);<br />
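As a rough illustration of how the `linear_temp_control` and `nutrient_mixing` entries in the library combine, the sketch below evaluates the mixing rate and the resulting nutrient flux in plain Python. The function names and the sample temperatures are hypothetical, and `datamax`/`datamin` are assumed to return the largest and smallest values of the observed water temperature series; this is not the HIPM implementation itself.

```python
def linear_temp_mixing_rate(temp, temp_series, max_mixing_rate):
    """Mixing rate that falls linearly from max_mixing_rate at the coldest
    observed temperature to zero at the warmest one.

    Assumption: datamax/datamin in the library are the extremes of the
    observed temperature series passed here as temp_series.
    """
    t_max = max(temp_series)
    t_min = min(temp_series)
    if t_max == t_min:
        raise ValueError("temperature series needs at least two distinct values")
    return max_mixing_rate * (t_max - temp) / (t_max - t_min)


def nutrient_mixing_flux(n_conc, avg_deep_conc, mixing_rate):
    """Right-hand side contribution (N.avg_deep_conc - N.conc) * N.mixing_rate."""
    return (avg_deep_conc - n_conc) * mixing_rate


if __name__ == "__main__":
    temps = [-1.8, -1.0, 0.0, 1.2]  # hypothetical water temperatures
    for t in temps:
        rate = linear_temp_mixing_rate(t, temps, max_mixing_rate=0.5)
        print(t, rate, nutrient_mixing_flux(10.0, 30.0, rate))
```

With this reading, mixing is strongest when the water is coldest and vanishes at the warmest observed temperature, so the nutrient concentration relaxes toward the deep-water average mainly during cold periods.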
D. Models selected in both experiment 8 and 19<br />
Model D<br />
\[ \frac{dP}{dt} = \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] (1 - a_6) - a_9 - a_{17} \Big] P - \underbrace{\big( a_{13} P^{a_{14}} \big)}_{Z\ \text{Grazing Rate}} Z \]<br />
\[ \frac{dZ}{dt} = a_{12} \underbrace{\big( a_{13} P^{a_{14}} \big)}_{Z\ \text{Grazing Rate}} Z - (a_{11} Z + a_{16}) Z \]<br />
\[ \frac{dD}{dt} = (1 - a_{10})(a_9 P + a_{11} Z^2) + (1 - a_{10})(1 - a_{12}) \underbrace{\big( a_{13} P^{a_{14}} \big)}_{Z\ \text{Grazing Rate}} Z - D(a_{15} + a_{18}) \]<br />
\[ \frac{dN}{dt} = (a_{19} - N)\, a_{20} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_7 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ \frac{dF}{dt} = (a_{21} - F)\, a_{22} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_8 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ M(t) = \min\left\{ \frac{F}{F + a_5},\ \left(1 - e^{-a_4 N}\right),\ \left(e^{-\frac{E_{PUR}(t)}{a_2}}\right)\left(1 - e^{-\frac{E_{PUR}(t)\left(1 + a_3 e^{\left(E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}\right)}\right)}{a_1}}\right) \right\} \]<br />
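The limitation factor M(t) of Model D takes the minimum of three terms, so growth is controlled by whichever resource is scarcest at time t. The following sketch shows that structure for the two nutrient terms, a saturating ratio in F and an exponential saturation in N, and treats the light term as a precomputed input. F and N are read here as the iron and nitrate concentrations, an assumption based on the `fe` and `no3` entities in the library; the function names and sample values are hypothetical.

```python
import math


def saturating_ratio(conc, half_saturation):
    """Monod-type term conc / (conc + half_saturation), e.g. F / (F + a5)."""
    return conc / (conc + half_saturation)


def exponential_saturation(conc, rate):
    """Term of the form 1 - exp(-rate * conc), e.g. 1 - exp(-a4 * N)."""
    return 1.0 - math.exp(-rate * conc)


def limitation_factor(f_conc, n_conc, light_term, a5, a4):
    """Minimum of the F term, the N term and a precomputed light term.

    light_term is passed in rather than computed, so this sketch does not
    depend on the exact form of the light expression.
    """
    return min(
        saturating_ratio(f_conc, a5),
        exponential_saturation(n_conc, a4),
        light_term,
    )


if __name__ == "__main__":
    # Hypothetical values: F is scarce, N and light are plentiful.
    print(limitation_factor(f_conc=1.0, n_conc=100.0, light_term=0.9, a5=1.0, a4=1.0))
```

Because only the smallest term matters, a change in an abundant resource leaves M(t) unchanged until that resource becomes the limiting one.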
Model E<br />
\[ \frac{dP}{dt} = \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] (1 - a_5) - a_8 - a_{18} \Big] P - \underbrace{\left( \max\left\{ 0,\ \frac{a_{12}(P - a_{15} - a_{14})}{a_{13} + P - a_{15} - a_{14}} \right\} \right)}_{Z\ \text{Grazing Rate}} Z \]<br />
\[ \frac{dZ}{dt} = a_{11} \underbrace{\left( \max\left\{ 0,\ \frac{a_{12}(P - a_{15} - a_{14})}{a_{13} + P - a_{15} - a_{14}} \right\} \right)}_{Z\ \text{Grazing Rate}} Z - (a_{10} Z + a_{17}) Z \]<br />
\[ \frac{dD}{dt} = (1 - a_9)(a_8 P + a_{10} Z^2) + (1 - a_9)(1 - a_{11}) \underbrace{\left( \max\left\{ 0,\ \frac{a_{12}(P - a_{15} - a_{14})}{a_{13} + P - a_{15} - a_{14}} \right\} \right)}_{Z\ \text{Grazing Rate}} Z - D(a_{16} + a_{19}) \]<br />
\[ \frac{dN}{dt} = (a_{20} - N)\, a_{21} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_6 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{16} D}{(a_6 \cdot 12.0107)} \right] \]<br />
\[ \frac{dF}{dt} = (a_{22} - F)\, a_{23} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_7 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{16} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ M(t) = \min\left\{ \frac{F}{F + a_4},\ \left(1 - e^{-a_3 N}\right),\ \left(1 - e^{-\frac{E_{PUR}(t)\left(1 + a_2 e^{\left(E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}\right)}\right)}{a_1}}\right) \right\} \]<br />
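The grazing rate in Model E corresponds to the `holling_type_2_mod` entry of the library: grazing only starts once the phytoplankton concentration exceeds the sum of two thresholds. A minimal sketch of that rate is given below. The function name and sample values are hypothetical, and the explicit check on the available prey is an added safeguard rather than part of the library expression.

```python
def threshold_grazing_rate(p_conc, a12, a13, a14, a15):
    """Threshold form of the Holling type 2 response used in Model E:
    max(0, a12 * (P - a15 - a14) / (a13 + P - a15 - a14)).

    Added safeguard (not in the library expression): return 0 whenever the
    available prey P - a15 - a14 is not positive. Without it, the raw ratio
    becomes positive again when the available prey drops below -a13, and it
    is undefined when the available prey equals -a13.
    """
    available = p_conc - a15 - a14
    if available <= 0.0:
        return 0.0
    return a12 * available / (a13 + available)


if __name__ == "__main__":
    # Hypothetical parameters: maximum rate 2, half saturation 2, thresholds 0.5 each.
    for p in (0.5, 1.0, 3.0, 50.0):
        print(p, threshold_grazing_rate(p, a12=2.0, a13=2.0, a14=0.5, a15=0.5))
```

For positive available prey the function reproduces the formula exactly, so the safeguard only changes behavior in the region where the thresholds already imply no grazing.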
E. Models selected in both experiment 8 and 21<br />
Model F<br />
\[ \frac{dP}{dt} = \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] (1 - a_6) - a_9 - a_{17} \Big] P - \underbrace{\left( \frac{a_{13} P}{1 + a_{13} a_{14} P} \right)}_{Z\ \text{Grazing Rate}} Z \]<br />
\[ \frac{dZ}{dt} = a_{12} \underbrace{\left( \frac{a_{13} P}{1 + a_{13} a_{14} P} \right)}_{Z\ \text{Grazing Rate}} Z - (a_{11} + a_{16}) Z \]<br />
\[ \frac{dD}{dt} = (1 - a_{10})(a_9 P + a_{11} Z) + (1 - a_{10})(1 - a_{12}) \underbrace{\left( \frac{a_{13} P}{1 + a_{13} a_{14} P} \right)}_{Z\ \text{Grazing Rate}} Z - D(a_{15} + a_{18}) \]<br />
\[ \frac{dN}{dt} = (a_{19} - N)\, a_{20} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_7 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ \frac{dF}{dt} = (a_{21} - F)\, a_{22} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_8 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ M(t) = \min\left\{ \frac{F}{F + a_5},\ \frac{N}{N + a_4},\ \left(e^{-\frac{E_{PUR}(t)}{a_2}}\right)\left(1 - e^{-\frac{E_{PUR}(t)\left(1 + a_3 e^{\left(E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}\right)}\right)}{a_1}}\right) \right\} \]<br />
Model G<br />
\[ \frac{dP}{dt} = \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] (1 - a_6) - a_9 - a_{17} \Big] P - \underbrace{\left( \frac{a_{13} P^2}{1 + a_{13} a_{14} P^2} \right)}_{Z\ \text{Grazing Rate}} Z \]<br />
\[ \frac{dZ}{dt} = a_{12} \underbrace{\left( \frac{a_{13} P^2}{1 + a_{13} a_{14} P^2} \right)}_{Z\ \text{Grazing Rate}} Z - (a_{11} + a_{16}) Z \]<br />
\[ \frac{dD}{dt} = (1 - a_{10})(a_9 P + a_{11} Z) + (1 - a_{10})(1 - a_{12}) \underbrace{\left( \frac{a_{13} P^2}{1 + a_{13} a_{14} P^2} \right)}_{Z\ \text{Grazing Rate}} Z - D(a_{15} + a_{18}) \]<br />
\[ \frac{dN}{dt} = (a_{19} - N)\, a_{20} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_7 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ \frac{dF}{dt} = (a_{21} - F)\, a_{22} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_8 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ M(t) = \min\left\{ \frac{F}{F + a_5},\ \frac{N}{N + a_4},\ \left(e^{-\frac{E_{PUR}(t)}{a_2}}\right)\left(1 - e^{-\frac{E_{PUR}(t)\left(1 + a_3 e^{\left(E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}\right)}\right)}{a_1}}\right) \right\} \]<br />
Model H<br />
\[ \frac{dP}{dt} = \Big[ \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] (1 - a_6) - a_9 - a_{17} \Big] P - \underbrace{\left( a_{13} (1 - e^{-a_{14} P}) \right)}_{Z\ \text{Grazing Rate}} Z \]<br />
\[ \frac{dZ}{dt} = a_{12} \underbrace{\left( a_{13} (1 - e^{-a_{14} P}) \right)}_{Z\ \text{Grazing Rate}} Z - (a_{11} + a_{16}) Z \]<br />
\[ \frac{dD}{dt} = (1 - a_{10})(a_9 P + a_{11} Z) + (1 - a_{10})(1 - a_{12}) \underbrace{\left( a_{13} (1 - e^{-a_{14} P}) \right)}_{Z\ \text{Grazing Rate}} Z - D(a_{15} + a_{18}) \]<br />
\[ \frac{dN}{dt} = (a_{19} - N)\, a_{20} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_7 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ \frac{dF}{dt} = (a_{21} - F)\, a_{22} \left[ \frac{E_{TH_2O_{\max}} - E_{TH_2O}(t)}{E_{TH_2O_{\max}} - E_{TH_2O_{\min}}} \right] + \left[ -\frac{P}{(a_8 \cdot 12.0107)} \big[ (1 - E_{ice}(t))\, a_0\, e^{0.06933\, E_{TH_2O}(t)}\, M(t) \big] + \frac{a_{15} D}{(a_7 \cdot 12.0107)} \right] \]<br />
\[ M(t) = \min\left\{ \frac{F}{F + a_5},\ \frac{N}{N + a_4},\ \left(e^{-\frac{E_{PUR}(t)}{a_2}}\right)\left(1 - e^{-\frac{E_{PUR}(t)\left(1 + a_3 e^{\left(E_{PUR}(t)\, e^{1.089 - 2.12 \log_{10}(a_1)}\right)}\right)}{a_1}}\right) \right\} \]<br />
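Models F, G and H share the same structure and differ only in the zooplankton grazing rate: a Holling type 2 form in Model F, a Holling type 3 form in Model G, and an Ivlev form in Model H. The sketch below evaluates the three forms side by side so their shapes can be compared. The function names and parameter values are hypothetical, and reusing one pair (a13, a14) for all three forms is only for illustration, since each model is fitted separately.

```python
import math


def grazing_type_2(p_conc, a13, a14):
    """Model F form: a13 * P / (1 + a13 * a14 * P); saturates at 1 / a14."""
    return a13 * p_conc / (1.0 + a13 * a14 * p_conc)


def grazing_type_3(p_conc, a13, a14):
    """Model G form: a13 * P**2 / (1 + a13 * a14 * P**2); saturates at 1 / a14."""
    return a13 * p_conc ** 2 / (1.0 + a13 * a14 * p_conc ** 2)


def grazing_ivlev(p_conc, a13, a14):
    """Model H form: a13 * (1 - exp(-a14 * P)); saturates at a13."""
    return a13 * (1.0 - math.exp(-a14 * p_conc))


if __name__ == "__main__":
    a13, a14 = 2.0, 0.5  # hypothetical parameter values
    for p in (0.1, 0.5, 1.0, 5.0, 50.0):
        print(
            p,
            grazing_type_2(p, a13, a14),
            grazing_type_3(p, a13, a14),
            grazing_ivlev(p, a13, a14),
        )
```

At low prey concentration the type 3 form rises quadratically while the other two rise linearly, which is the main qualitative difference the structural search can pick up from data.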
BIOGRAPHICAL SKETCH<br />
I was born and raised in France and came to the United States in 2006 to further<br />
my education. I saw there an incredible opportunity not only to explore my father’s<br />
origins but also to set out on a journey that promised to be full of learning experiences.<br />
I used to be terrible at math. If you had told me in high school that I<br />
would study math later in life, I probably would have laughed. But sure enough,<br />
I completed my undergraduate degree in Applied Mathematics at the University of<br />
North Carolina Wilmington in 2010. For the past year and a half I have conducted<br />
research under Dr. Borrett on Inductive Process Modeling. I am now looking at the<br />
possibility of traveling and working for a non-profit Christian organization that<br />
works with orphanages around the world. I have a heart for service and helping<br />
others. I trust that God will use the skills that I have acquired during my master's<br />
studies where he sees fit.<br />