NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...

More documents

Recommendations

Info

Natural Language Queries on Enterprise Linked Dataspaces: A Vocabulary Independent Approach André Freitas, João Gabriel Oliveira, Edward Curry, Seán O’Riain Digital Enterprise Research Institute (DERI) National University of Ireland, <strong>Galway</strong> {firstname.lastname@deri.org} Abstract This work describes Treo, a natural language query mechanism for Linked Data which focuses on the provision of a precise and scalable semantic matching approach between natural language queries and distributed heterogeneous Linked Datasets. Treo's semantic matching approach combines three key elements: entity search, a Wikipedia-based semantic relatedness measure and spreading activation search. While entity search allows Treo to cope with queries over high volume and distributed data, the combination of entity search and spreading activation search using a Wikipedia-based semantic relatedness measure provides a flexible approach for handling the semantic match between natural language queries and Linked Data. 1. Introduction Linked Data brings the promise of incorporating a new dimension to the Web where the availability of Webscale data can determine a paradigmatic transformation of the Web and its applications. However, together with its opportunities, Linked Data brings inherent challenges in the way users and applications consume the available data. End-users consuming Linked Data on the Web or on corporate intranets should be able to query data spread over potentially a large number of heterogeneous, complex and distributed datasets. The freedom and universality provided by search engines in the Web of Documents were fundamental in the process of maximizing the value of the information available on the Web. Linked Data consumers, however, need previous understanding of the available vocabularies in order to execute expressive queries over Linked Datasets. This constraint strongly limits the visibility and value of Linked Data. Ideally a query mechanism for Linked Data should abstract users from the representation of data. This work focuses on the investigation of a query mechanism that could address this challenge providing a vocabulary independent natural language query approach for Linked Data. 2. Description of the Approach In order to address the problem, an approach based on the combination of entity search, a Wikipedia-based semantic relatedness measure and spreading activation is proposed. The combination of these three elements in a query mechanism for Linked Data is a new contribution in the space. The center of the approach 93 relies on the use of a Wikipedia-based semantic relatedness measure as a key element for matching query terms to vocabulary terms, addressing an existing gap in the literature. Wikipedia-based relatedness measures address limitations of existing works which are based on similarity measures/term expansion based on WordNet. The final query processing approach provides an opportunity to revisit cognitive inspired spreading activation models over semantic networks under contemporary lenses. The recent availability of Linked Data, large Web corpora, hardware resources and a better understanding of the principles behind information retrieval can provide the necessary resources to enable practical applications over cognitive inspired architectures. 3. Evaluation A prototype, Treo, was developed and evaluated in terms quality of results using the QALD Workshop DBpedia training query set [1] containing 50 natural language queries over DBPedia. Examples of queries present in the dataset include: “who was the wife of president Lincoln?” and “which capitals in Europe were host cities of the summer Olympic games?”. Mean reciprocal rank, precision and recall measures were collected. The proposed approach was able to answer 70% of the queries and the final values for the collected measures are mrr=0.492, precision=0.395 and recall=0.451. The relatedness measure was able to cope with nontaxonomic variations between query and vocabulary terms, showing high average discrimination in the node selection process (average difference between the relatedness value of answer nodes and the relatedness mean is 2.81 σ). The results for each query were analyzed and queries with errors were categorized into 5 different error classes. The removal of the queries with errors that are considered addressable in the short term (Partial Ordered Dependency Structure Error, Pivot Error, Relatedness Error) defines estimated projected measurements of precision=0.64, recall=0.75 and mrr=0.82. References 1st Workshop on Question Answering over Linked Data (QALD-1), ://www.sc.cit-ec.uni-bielefeld.de/qald-1, (2011).
A Citation-based approach to automatic topical indexing of scientific literature Arash Joorabchi and Abdulhussain E. Mahdi Department of Electronic and Computer Engineering, University of Limerick, Republic of Ireland {Arash.Joorabchi, Hussain.Mahdi}@ul.ie Abstract Topical indexing of documents with keyphrases is a common method used for revealing the subject of scientific and research documents to both human readers and information retrieval tools, such as search engines. However, scientific documents that are manually indexed with keyphrases are still in the minority. This work proposed a new unsupervised method for automatic keyphrase extraction from scientific documents which yields a performance on a par with human indexers. The method is based on identifying references cited in the document to be indexed and, using the keyphrases assigned to those references for generating a set of high-likelihood keyphrases for the document. We have evaluated the performance of the proposed method by using it to automatically index a third-party testset of research documents. Reported experimental results show that the performance of our method, measured in terms of consistency with human indexers, is competitive with that achieved by state-of-the-art supervised methods. The results of this work is published in [1]. 1. Citation-based keyphrase extraction A significant portion of electronic documents published on the Internet become part of a large chain of networks via some form of linkage that they have to other documents. In relation to scientific literature which is the subject of our work, the citation networks among scientific documents have been successfully used to improve the search and retrieval methods for scholarly publications, e.g., see [2]. These studies have successfully shown that citation networks among scientific documents can be utilized to improve the performance of three major information retrieval tasks; namely, clustering, classification, and full-text indexing. In our opinion, the results of these studies indirectly suggest that the content of cited documents could also potentially be used to improve the performance of keyphrase indexing of scientific documents. In this work, we have investigated this hypothesis as a new application of citation networks by developing a new Citation-based Keyphrase Extraction (CKE) method for scientific literature and evaluating its performance. The proposed method can be outlined in three main steps: 1. Reference extraction: this comprises the process of identifying and extracting reference strings in the bibliography section of a given document and parsing them into their logical components. 2. Data mining: this is a three-fold process. In the first stage, we query the Google Book Search (GBS) to 94 retrieve a list of publications which cite either the given document or one of its references. Then, in the second stage, we retrieve the metadata records of these citing publications from the GBS database. Among other metadata elements, these records contain a list of key terms extracted from the content of the citing publications. In the final stage of the data mining process, we extract these key terms along with their numerically represented degree of importance from the metadata records of the citing publications to be used as primary clues for keyphrase indexing of the given document. 3. Term weighting and selection: this process starts by searching the content of the given document for the set of key terms collected in the data mining process (step 2 above). Each matching term would be assigned a keyphraseness score which is the product function of seven statistical properties of the given term, namely: frequency among the extracted key terms, frequency inside the document, number of words, average degree of importance, first occurrence position inside the document, frequency inside reference strings, and length measured in terms of the number of characters. After computing the keyphraseness scores for all the candidate key terms, a simple selection algorithm is applied to index the document with a set of most probable keyphrases. 2. Experimental Results Our CKE algorithm clearly outperforms its unsupervised rival, Grineva et al. algorithm [3]. In comparison to its supervised rivals, the CKE algorithm significantly outperforms KEA [4] under all conditions. However, it yields a slightly lower averaged interconsistency score (≤1.1%) compared to Maui [5]. 3. References [1] Mahdi A. E. and Joorabchi A., A Citation-based approach to automatic topical indexing of scientific literature, Journal of Information Science 2010; 36, 6. [2] Aljaber B., et al., Document clustering of scientific texts using citation contexts, Information Retrieval 2009; 13, 2: 101-131. [3] Grineva M., et al, Extracting key terms from noisy and multi-theme documents. In: 18th international conference on World wide web; 2009; Madrid, Spain [4] Witten I. H., et al., KEA: practical automatic keyphrase extraction. In: fourth ACM conference on Digital libraries; 1999 [5] Medelyan O., Human-competitive automatic topic indexing (Ph.D Thesis, University of Waikato, 2009).
Page 1 and 2:
NUI Galway - UL Alliance First Annu
Page 4 and 5:
FULL TABLE OF CONTENTS 1 GAMES, VIS
Page 6 and 7:
4 MECHANICAL AND BIOMEDICAL ENGINEE
Page 8 and 9:
5.21 Detecting Topics and Events in
Page 10 and 11:
8.7 Modelling Extreme Flood Events
Page 12 and 13:
GAMES, VISUALISATION & EDUCATION 1.
Page 14 and 15:
Generation and Analysis of Graph St
Page 16 and 17:
Evolution and Analysis of Strategie
Page 18 and 19:
Abstract The delivery of multimedia
Page 20 and 21:
Applications of Reinforcement Learn
Page 22 and 23:
Assessing the effects of interactiv
Page 24 and 25:
Real-time depth map generation usin
Page 26 and 27:
An analysis of the capability of pr
Page 28 and 29:
Building Information Modelling duri
Page 30 and 31:
Dwelling Energy Measurement Procedu
Page 32 and 33:
Numerical Modelling of Tidal Turbin
Page 34 and 35:
Energy Storage using Microencapsula
Page 36 and 37:
Data Centre Energy Efficiency Mark
Page 38 and 39:
An embodied energy and carbon asses
Page 40 and 41:
SmartOp - Smart Buildings Operation
Page 42 and 43:
Ocean Wave Energy Exploitation in D
Page 44 and 45:
Future Smart Grid Synchronization C
Page 46 and 47:
Web-Based Building Energy Usage Vis
Page 48 and 49:
Image Recognition and Classificatio
Page 50 and 51:
Android Based Multi-Feature Elderly
Page 52 and 53:
Determining Subjects’ Activities
Page 54 and 55: New Analysis Techniques for ICU Dat
Page 56 and 57: National E-Prescribing Systems in I
Page 58 and 59: Using Mashups to Satisfy Personalis
Page 60 and 61: 3D Computational Modeling of Blood
Page 62 and 63: Experimental and Computational Inve
Page 64 and 65: Experimental Analysis of the Therma
Page 66 and 67: Simulating Actin Cytoskeleton Remod
Page 68 and 69: Computational Analysis of Transcath
Page 70 and 71: An In vitro Shear Stress System for
Page 72 and 73: Development of a Micropipette Aspir
Page 74 and 75: A Computational Test-Bed to Examine
Page 76 and 77: Computational Modeling of Ceramic-b
Page 78 and 79: Multi-Scale Computational Modelling
Page 80 and 81: Development of a mixed-mode cohesiv
Page 82 and 83: Active Computational Modelling of C
Page 84 and 85: Modelling the Management of Medical
Page 86 and 87: SOCIAL MEDIA, SEARCH & RECOMMENDATI
Page 88 and 89: Improving Twitter Search by Removin
Page 90 and 91: Abstract The goal of this research
Page 92 and 93: Generalized Blockmodeling Samantha
Page 94 and 95: Life-Cycles and Mutual Effects of S
Page 96 and 97: dcat: Searching Public Sector Infor
Page 98 and 99: The Effect of User Features on Chur
Page 100 and 101: User Similarity and Interaction in
Page 102 and 103: Improving Categorisation in Social
Page 106 and 107: Studying Forum Dynamics from a User
Page 108 and 109: Provenance in the Web of Data: a bu
Page 110 and 111: Towards Social Descriptions of Serv
Page 112 and 113: ENVIRONMENTAL ENGINEERING 6.1 Asses
Page 114 and 115: Novel Agri-engineering solutions fo
Page 116 and 117: Evaluation of amendments to control
Page 118 and 119: Determination of optimal applicatio
Page 120 and 121: Treatment of Piggery Wastewaters us
Page 122 and 123: NEXT GENERATION INTERNET 7.1 Extens
Page 124 and 125: Enabling Federation of Government M
Page 126 and 127: Curated Entities for Enterprise Uma
Page 128 and 129: Mobile Web + Social Web + Semantic
Page 130 and 131: Engaging Citizens in the Policy-Mak
Page 132 and 133: Preference-based Discovery of Dynam
Page 134 and 135: RDF On the Go: An RDF Storage and Q
Page 136 and 137: Policy Modeling meets Linked Open D
Page 138 and 139: A Contextualized Perspective for Li
Page 140 and 141: Improving discovery in Life Science
Page 142 and 143: The Semantic Public Service Portal
Page 144 and 145: Personalized Content Delivery on Mo
Page 146 and 147: A Framework to Describe Localisatio
Page 148 and 149: The influence of secondary settleme
Page 150 and 151: Analysis of Shear Transfer in Void-
Page 152 and 153: Cost-Effective Sustainable Construc
Page 154 and 155:
Modelling Extreme Flood Events due
Page 156 and 157:
Axial Load Capacity of a Driven Cas
Page 158 and 159:
Chemical amendment of dairy cattle
Page 160 and 161:
Seismic Design of Concentrically Br
Page 162 and 163:
MODELLING, ALGORITHMS & CONTROL 9.1
Page 164 and 165:
Eigen-based Approach for Leverage P
Page 166 and 167:
Evolutionary Modelling of Industria
Page 168 and 169:
Abstract: Graphical Semantic Wiki f
Page 170 and 171:
Low Coverage Genome Assembly Using
Page 172 and 173:
Evolving a Robust Open-Ended Langua
Page 174 and 175:
Context Stamp - A Topic-based Conte
Page 176 and 177:
DSP-Based Control of Multi-Rail DC-
Page 178 and 179:
Topographical Cues - Controlling Ce
Page 180 and 181:
Creep Relaxation and Crack Growth P
Page 182 and 183:
Finite Element Modelling of Failure
Page 184 and 185:
Influence of Fluorine and Nitrogen
Page 186 and 187:
Phase Decompositions of Bioceramic
Page 188 and 189:
High Resolution Microscopical Analy
Page 190 and 191:
An Experimental and Numerical Analy
Page 192 and 193:
Thermomechanical characterisation o
Page 194 and 195:
A multiaxial damage mechanics metho
Page 196:
The effect of citrate ester plastic
show all

NUI Galway – UL Alliance First Annual ENGINEERING AND - ARAN ...

You also want an ePaper? Increase the reach of your titles

Delete template?

Save as template?