21.11.2014 Views

Layout 1 - EMBL Grenoble


Structural and Computational Biology Unit

Data integration and knowledge management

Previous and current research

Today it is widely recognised that a comprehensive integration of data can be one of the key factors to improve productivity and efficiency in the biological research process. Successful data integration helps researchers to discover relationships that enable them to make better and faster decisions, thus considerably saving time and money.

Over the last 20 years, biological research has seen a very strong proliferation of data sources. Each research group and new experimental technique generates a source of valuable data. The creation, use, integration and warehousing of biological data are central to large-scale efforts in understanding biological systems. These tasks pose significant challenges from the standpoint of data storage, indexing, retrieval and system scalability over disparate types of data.

Examples of the graphical features of Arena3D. Heterogeneous data types can be visualised in a 3D environment and a range of layout and cluster algorithms can be applied.

Reinhard Schneider

PhD 199, University of Heidelberg.
Postdoctoral research at EMBL.
Co-founder and Chief Information Officer at LION bioscience AG.
Chief Executive Officer at LION bioscience Research Inc., Cambridge, MA.
Team leader at EMBL since 200.

The current systems biology approaches are generating data sets with rapidly growing complexity and dynamics. One major challenge is to provide the mechanism for accessing the heterogeneous data and to detect the important information. We develop interactive visual data analysis techniques using automatic data analysis pipelines. The combination of techniques allows us to analyse otherwise unmanageable amounts of complex data.

The principal aim of the group is to capture and centralise the knowledge generated by the scientists in the several divisions, and to organise that knowledge such that it can be easily mined, browsed and navigated. By providing access to all scientists in the organisation, it will foster collaborations between researchers in different cross-functional groups.

The group is involved in the following areas:

• Data schema design and technical implementation;
• Metadata annotation with respect to experimental data;
• Design and implementation of scientific data portals;
• Providing access to, and developing further, data-mining tools (e.g. text-mining);
• Visualisation environment for systems biology data.

Future projects and goals<br />

Our goal is to develop a comprehensive knowledge platform for the life sciences. We will first focus on the biology-driven research areas, but will extend into chemistry-related fields, initially by collaborating with groups inside EMBL. Other research areas will include advanced data-mining and visualisation techniques.

OnTheFly and Reflect server. Figure (A,B,C) shows an annotated table (A) of a PDF full-text article, the generated popup window with information about the protein YGL227W (B), and an automatically generated protein-protein interaction network (C) of associated entities for the proteins shown in part (A). Part (D) shows the architecture and functionality.

Selected references<br />

Pavlopoulos, G.A., O’Donoghue, S.I., Satagopam, V.P., Soldatos, T.G., Pafilis, E. & Schneider, R. (2008). Arena3D: visualization of biological networks in 3D. BMC Syst. Biol., 2, 10

Erhardt, R.A., Schneider, R. & Blaschke, C. (2006). Status of text-mining techniques applied to biomedical text. Drug Discov. Today, 11, 315-325

Kremer, A., Schneider, R. & Terstappen, G.C. (2005). A bioinformatics perspective on proteomics: data storage, analysis, and integration. Biosci. Rep., 25, 95-106

Ofran, Y., Punta, M., Schneider, R. & Rost, B. (2005). Beyond annotation transfer by homology: novel protein-function prediction methods to assist drug discovery. Drug Discov. Today, 10, 175-182
