NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...

More documents

Recommendations

Info

Improving discovery in Life Sciences and Health Care with Semantic Web Technologies and Linked Data Helena F. Deus and Jonas S. Almeida Digital Enterprise Research Institute, National University of Ireland, Galway Email: helena.deus@deri.org Abstract One of the most enticing outcomes of biological exploration for the quantitative-minded researcher is to identify the “mathematics of biology” through the study of the patterns and interrelatedness of biological entities. To move forward in that direction, many pieces need to be set in place, from the availability of biological data in forms that can be computationally manipulated to the automated discovery of patterns that would derive from integration of data in many and diverse endpoints. A computational system where researchers could securely deposit any type of data and have it immediately analyzed, traversed, annotated and merged with data deposited elsewhere is a dream not yet achieved but one which could revolutionize scientific discovery and, ultimately, help cure disease. 1. Introduction “Science is organized knowledge”, said Immanuel Kant. Science in the Information Age has indeed been characterized by increasingly complex approaches to the organization and sharing of scientific knowledge to improve its discovery. A critical bottleneck in applying knowledge engineering to inform and improve biomedical discovery is the ability to align experimental data acquisition with knowledge representation models. Recent technological advances in genomics, proteomics and other ‘omics’ sciences, have flooded microbiology labs with unprecedented amounts of experimental data and were at the heart of a paradigm shift for researchers interested in its computational representation and analysis. 2. Knowledge Continuum for Life Sciences The deluge of data in biology marked the beginning of the “industrialization of data production in Life Sciences beyond a craft-based cottage industry” [1]. The successful exploit of this new wave of data acquisition technologies created the need for multiple, overlapping, sub-disciplines of biology to cope with the increasing complexity of biological results. However, this became an obstacle to discovery because the answers to biological problems often span multiple domain boundaries and rely on the existence of knowledge continuums. Starting to address these challenges requires agreement on optimal strategies for publishing experimental biomedical data in interoperable formats even before they are made available to the community through peer-reviewed publication. 129 3. Architecture The work presented here corresponds to the identification of the key requirements for supporting publication of experimental biomedical results in the Web of Data. With that purpose, the prototypical application S3DB (Simple Sloppy Semantic Database) was developed to fit the requirements of a Linked Data Knowledge Organization System (KOS) for the Life Sciences [3]. Reflecting our data-driven approach, the technological requirements and advancements in the S3DB prototype were iteratively devised through direct interaction and validation by biomedical domain experts, who interact with S3DB using graphical user interfaces (GUI). Application developers access and consume data in S3DB through a SPARQL application programming interface (API) whereas the interaction between multiple S3DB systems is ensured through a domain specific language, S3QL, which operates on S3DB’s indexing schema. 4. References [1] C. Goble and R. Stevens, “State of the nation in data integration for bioinformatics.,” Journal of biomedical informatics, vol. 41, Oct. 2008, pp. 687-93. [2] C. Brooksbank and J. Quackenbush, “Data standards: a call to action.,” Omics : a journal of integrative biology, vol. 10, Jan. 2006, pp. 94-9. [3] H.F. Deus, R. Stanislaus, D.F. Veiga, C. Behrens, I.I. Wistuba, J.D. Minna, H.R. Garner, S.G. Swisher, J.A. Roth, A.M. Correa, B. Broom, K. Coombes, A. Chang, L.H. Vogel, and J.S. Almeida, “A Semantic Web management model for integrative biomedical informatics.,” PloS one, vol. 3, Jan. 2008, p. e2946.
Why Reading Papers when EEYORE will do that for you!? Behrang QasemiZadeh, Paul Buitelaar Unit for Natural Language Processing, DERI, NUI Galway firstname.lastname@deri.org Abstract This abstract introduces EEYORE web service. EEYORE is designed to help researchers exploring their domain’s scientific publications by extracting key technical concepts and relations between them from an arbitrary set of input publications. The proposed system uses linguistic analysis and Machine learning techniques to perform its task. 1. Introduction There is a renewed interest in developing systems for exploring scientific publications. In this abstract we introduce EEYORE. EEYORE is based on generic linguistic analysis tools and machine learning techniques and extracts key technical terms and relation between them from an arbitrary input set of documents. In the proposed scenario, a researcher uploads a set of publications into the system and provides the system with a few examples of the topic of his/her own interest. EEYORE then analyzes the input set of publications and provides facts about technical terms similar to the ones provided by the user. Figure 1. An Overview of the proposed research: user provides the system with a set of publications in a domain of expertise in addition to a few examples of concepts that are interesting for him/her. The system then uses the provided examples for developing a set of machine learning models that can extract concepts and facts similar to the one proposed by the user. In this way, the user may explore publications in a scientific domain in a more effective way. 130 2. Methodology The proposed research is built upon two major technologies: generic human language technology [1] and machine learning techniques [2]. Detailed steps to fulfill the task are as follows: 1. Text Extraction and Segmentation: In this process step, scientific publications in digital format such as PDF are converted to linguistically well defined units. In other words, the input PDF files will be converted to sections, paragraphs, sentences and words. 2. Linguistic Analysis: The system then linguistically analyzes the input publications. The analysis includes part of speech tagging [3] i.e. the process of marking up the words in a text as corresponding to a particular part of speech, based on both its definition, as well as its context, and dependency parsing [4] i.e. a form of syntactic parsing and denotes grammatical relations between words in a sentence. 3. Machine Learning based Classifiers: The generated information in the previous steps of the analysis, in addition to the user provided examples/information then will be used for developing machine learning based classifiers namely a classifier for concept identification and a classifier for relation discovery. 4. Applying Classifiers to the provided set of publications: In this step, the system will employ the classifiers to extract new concepts and relations from the user provided publications. In addition, the system may use user feedback to refine its model and to adapt itself to user information needs. 3. References [1] Handbook of Natural Language Processing, Second Edition, Editor(s): Nitin Indurkhya; Fred J. Damerau, Goshen, Connecticut, USA, CRC Press, 2010. [2] Ian H. Witten and Eibe Frank. 2002. Data mining: practical machine learning tools and techniques with Java implementations. SIGMOD Rec. 31, 1 (March 2002), 76-77. DOI=10.1145/507338.507355 [3] Kristina Toutanova and Christopher D. Manning. 2000. Enriching the knowledge sources used in a maximum entropy part-ofspeech tagger. In In EMNLP/VLC 2000, pages 63–70. [4] Joakim Nivre. 2003. An efficient algorithm for projective dependency parsing. In Proceedings of the 8th International Workshop on Parsing Technologies (IWPT), pages 149–160
Page 1 and 2:
NUI Galway - UL Alliance First Annu
Page 4 and 5:
FULL TABLE OF CONTENTS 1 GAMES, VIS
Page 6 and 7:
4 MECHANICAL AND BIOMEDICAL ENGINEE
Page 8 and 9:
5.21 Detecting Topics and Events in
Page 10 and 11:
8.7 Modelling Extreme Flood Events
Page 12 and 13:
GAMES, VISUALISATION & EDUCATION 1.
Page 14 and 15:
Generation and Analysis of Graph St
Page 16 and 17:
Evolution and Analysis of Strategie
Page 18 and 19:
Abstract The delivery of multimedia
Page 20 and 21:
Applications of Reinforcement Learn
Page 22 and 23:
Assessing the effects of interactiv
Page 24 and 25:
Real-time depth map generation usin
Page 26 and 27:
An analysis of the capability of pr
Page 28 and 29:
Building Information Modelling duri
Page 30 and 31:
Dwelling Energy Measurement Procedu
Page 32 and 33:
Numerical Modelling of Tidal Turbin
Page 34 and 35:
Energy Storage using Microencapsula
Page 36 and 37:
Data Centre Energy Efficiency Mark
Page 38 and 39:
An embodied energy and carbon asses
Page 40 and 41:
SmartOp - Smart Buildings Operation
Page 42 and 43:
Ocean Wave Energy Exploitation in D
Page 44 and 45:
Future Smart Grid Synchronization C
Page 46 and 47:
Web-Based Building Energy Usage Vis
Page 48 and 49:
Image Recognition and Classificatio
Page 50 and 51:
Android Based Multi-Feature Elderly
Page 52 and 53:
Determining Subjects’ Activities
Page 54 and 55:
New Analysis Techniques for ICU Dat
Page 56 and 57:
National E-Prescribing Systems in I
Page 58 and 59:
Using Mashups to Satisfy Personalis
Page 60 and 61:
3D Computational Modeling of Blood
Page 62 and 63:
Experimental and Computational Inve
Page 64 and 65:
Experimental Analysis of the Therma
Page 66 and 67:
Simulating Actin Cytoskeleton Remod
Page 68 and 69:
Computational Analysis of Transcath
Page 70 and 71:
An In vitro Shear Stress System for
Page 72 and 73:
Development of a Micropipette Aspir
Page 74 and 75:
A Computational Test-Bed to Examine
Page 76 and 77:
Computational Modeling of Ceramic-b
Page 78 and 79:
Multi-Scale Computational Modelling
Page 80 and 81:
Development of a mixed-mode cohesiv
Page 82 and 83:
Active Computational Modelling of C
Page 84 and 85:
Modelling the Management of Medical
Page 86 and 87:
SOCIAL MEDIA, SEARCH & RECOMMENDATI
Page 88 and 89:
Improving Twitter Search by Removin
Page 90 and 91: Abstract The goal of this research
Page 92 and 93: Generalized Blockmodeling Samantha
Page 94 and 95: Life-Cycles and Mutual Effects of S
Page 96 and 97: dcat: Searching Public Sector Infor
Page 98 and 99: The Effect of User Features on Chur
Page 100 and 101: User Similarity and Interaction in
Page 102 and 103: Improving Categorisation in Social
Page 104 and 105: Natural Language Queries on Enterpr
Page 106 and 107: Studying Forum Dynamics from a User
Page 108 and 109: Provenance in the Web of Data: a bu
Page 110 and 111: Towards Social Descriptions of Serv
Page 112 and 113: ENVIRONMENTAL ENGINEERING 6.1 Asses
Page 114 and 115: Novel Agri-engineering solutions fo
Page 116 and 117: Evaluation of amendments to control
Page 118 and 119: Determination of optimal applicatio
Page 120 and 121: Treatment of Piggery Wastewaters us
Page 122 and 123: NEXT GENERATION INTERNET 7.1 Extens
Page 124 and 125: Enabling Federation of Government M
Page 126 and 127: Curated Entities for Enterprise Uma
Page 128 and 129: Mobile Web + Social Web + Semantic
Page 130 and 131: Engaging Citizens in the Policy-Mak
Page 132 and 133: Preference-based Discovery of Dynam
Page 134 and 135: RDF On the Go: An RDF Storage and Q
Page 136 and 137: Policy Modeling meets Linked Open D
Page 138 and 139: A Contextualized Perspective for Li
Page 142 and 143: The Semantic Public Service Portal
Page 144 and 145: Personalized Content Delivery on Mo
Page 146 and 147: A Framework to Describe Localisatio
Page 148 and 149: The influence of secondary settleme
Page 150 and 151: Analysis of Shear Transfer in Void-
Page 152 and 153: Cost-Effective Sustainable Construc
Page 154 and 155: Modelling Extreme Flood Events due
Page 156 and 157: Axial Load Capacity of a Driven Cas
Page 158 and 159: Chemical amendment of dairy cattle
Page 160 and 161: Seismic Design of Concentrically Br
Page 162 and 163: MODELLING, ALGORITHMS & CONTROL 9.1
Page 164 and 165: Eigen-based Approach for Leverage P
Page 166 and 167: Evolutionary Modelling of Industria
Page 168 and 169: Abstract: Graphical Semantic Wiki f
Page 170 and 171: Low Coverage Genome Assembly Using
Page 172 and 173: Evolving a Robust Open-Ended Langua
Page 174 and 175: Context Stamp - A Topic-based Conte
Page 176 and 177: DSP-Based Control of Multi-Rail DC-
Page 178 and 179: Topographical Cues - Controlling Ce
Page 180 and 181: Creep Relaxation and Crack Growth P
Page 182 and 183: Finite Element Modelling of Failure
Page 184 and 185: Influence of Fluorine and Nitrogen
Page 186 and 187: Phase Decompositions of Bioceramic
Page 188 and 189: High Resolution Microscopical Analy
Page 190 and 191:
An Experimental and Numerical Analy
Page 192 and 193:
Thermomechanical characterisation o
Page 194 and 195:
A multiaxial damage mechanics metho
Page 196:
The effect of citrate ester plastic
show all

NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?