screens to identify direct or indirect interactions based on phenotypes or gene<br />
expression patterns. It is thus important for FlyBase to recognize and support data<br />
representations and reports based on relationships among gene products in addition<br />
to those relationships based on chromosomal location. Some parts of this need can be<br />
addressed now; others present substantial technical hurdles.<br />
The FlyBase architecture supports the curation of different versions of a gene<br />
product (RNAs, polypeptides, or molecular complexes) as different data objects,<br />
so that annotations can be attached to the appropriate objects. This is an essential<br />
part of an organism-specific data model, since much of the regulation of cellular<br />
function boils down to gene products that can be toggled between alternative states<br />
based on allosteric interactions, subunit modifications, or differential subunit<br />
interactions.<br />
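
As a minimal sketch of this idea, and not the actual FlyBase schema, each version of a gene product can be modelled as its own object that carries its own annotations. The class name `GeneProduct`, its fields, the helper `products_of`, and the identifiers and gene name used below are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class GeneProduct:
    """One version of a gene product, curated as its own data object.

    Hypothetical illustration only; not the FlyBase data model.
    """
    identifier: str   # made-up ID, not a real FlyBase accession
    gene: str         # gene that encodes this product
    kind: str         # "RNA", "polypeptide", or "complex"
    state: str = "unmodified"   # e.g. a subunit modification
    annotations: List[str] = field(default_factory=list)

    def annotate(self, note: str) -> None:
        """Attach an annotation to this specific version only."""
        self.annotations.append(note)


def products_of(gene: str, products: List[GeneProduct]) -> List[GeneProduct]:
    """Return every curated version of the products of one gene."""
    return [p for p in products if p.gene == gene]


# Two alternative states of the same polypeptide are separate objects.
inactive = GeneProduct("P1", "geneA", "polypeptide", state="unphosphorylated")
active = GeneProduct("P2", "geneA", "polypeptide", state="phosphorylated")

# The annotation lands on the phosphorylated form, not on the gene as a whole.
active.annotate("binds partner subunit")

versions = products_of("geneA", [inactive, active])
```

Keeping the two states as separate objects means that a statement about the phosphorylated form does not leak onto the unphosphorylated form, which is the property the paragraph above describes.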
Describing the interactions and the pathways is an even larger and more difficult<br />
task. Much of the available physical interaction data involves in vitro assays, usually<br />
in heterologous systems. These data are often hints or suggestions of possible<br />
interactions rather than readily verifiable ones. Genetic interaction data have their<br />
own set of pitfalls. While the individual observations can be represented, our ability<br />
to compile them into computed pathways is impaired by the inherent limitations of<br />
the current data sets. Thus, we need to capture and represent data in a manner that<br />
reflects the current state of knowledge, but that will be of value once better standards<br />
and methods are available. This represents a considerable challenge at the strategic<br />
and computational levels.<br />
Another aspect of the problem is that of spatial pattern: descriptions of<br />
anatomical phenotypes and gene expression patterns. Were rigorous representations<br />
of spatial pattern possible, these could be used in combination with interaction data<br />
to distinguish among possible interactions. (For example, two proteins that are shown<br />
to physically interact but which are never expressed in the same tissues are unlikely<br />
to interact in a biologically meaningful way.) FlyBase has developed an extensive<br />
ontology of anatomical parts, and phenotypes and expression patterns are captured<br />
using this vocabulary. Either the authors or the curators, however, end up throwing<br />
away a great deal of data in turning two- or three-dimensional spatial information into<br />
text. Similarly, dependence on text terms to support user queries places inherent<br />
limitations on the depth of questions that can be answered. Ultimately, it will be<br />
important for tools to be developed that can effectively capture quantitative spatial<br />
information. Only in this way can these data be directly queried without<br />
imposing a strong filter on the data set through its conversion into much coarser<br />
textual objects. This is obviously a major long-term issue that is already receiving<br />
attention, and we can expect that it will continue to be an important area for<br />
computational research.
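
The use of expression patterns to distinguish among possible interactions, described earlier in this section, can be sketched as a simple filter. This is an illustration under stated assumptions, not a FlyBase tool: expression is represented as sets of anatomy terms, the protein names and tissue terms are invented, and the function `plausible_interactions` is hypothetical. Because it relies on text terms, it inherits exactly the coarseness discussed above.

```python
from typing import Dict, List, Set, Tuple


def plausible_interactions(
    interactions: List[Tuple[str, str]],
    expression: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Keep reported physical interactions whose partners may co-occur.

    A pair is dropped only when both partners have recorded expression
    and the two sets of anatomy terms share nothing. Pairs with missing
    expression data are kept, since there is no evidence against them.
    """
    kept = []
    for a, b in interactions:
        if a not in expression or b not in expression:
            kept.append((a, b))   # unknown expression: cannot rule out
        elif expression[a] & expression[b]:
            kept.append((a, b))   # at least one shared tissue term
    return kept


# Invented example data: anatomy terms per protein.
expression = {
    "protA": {"wing disc", "eye disc"},
    "protB": {"eye disc"},
    "protC": {"fat body"},
}

# Invented physical interaction reports, e.g. from in vitro assays.
interactions = [("protA", "protB"), ("protA", "protC"), ("protA", "protD")]

plausible = plausible_interactions(interactions, expression)
```

In this sketch the pair `protA`/`protC` is set aside because the two proteins share no tissue term, while `protA`/`protD` is retained because nothing is recorded for `protD`. A quantitative spatial representation would allow a finer test than set overlap of terms.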