12.07.2015 Views

Corpus-Based Thesaurus Construction for Image Retrieval in ...

Corpus-Based Thesaurus Construction for Image Retrieval in ...

Corpus-Based Thesaurus Construction for Image Retrieval in ...

SHOW MORE
SHOW LESS

Create successful ePaper yourself

Turn your PDF publications into a flip-book with our unique Google optimized e-Paper software.

that can be used dur<strong>in</strong>g the <strong>in</strong>dexation phase and latterly <strong>in</strong> the query expansionphase. We have attempted to outl<strong>in</strong>e a corpus-based method <strong>for</strong> build<strong>in</strong>g a thesaurus.We have demonstrated, through the use of frequency metrices <strong>for</strong> s<strong>in</strong>gle tokens and<strong>for</strong> compound tokens, that a text corpus, randomly selected and systematicallyorganized, can perhaps be used to <strong>in</strong>itiate the development of such a thesaurus. Wehave developed a series of programs (<strong>in</strong> the JAVA programm<strong>in</strong>g language) tocompute frequency distribution of s<strong>in</strong>gle and compound tokens and to automatically<strong>in</strong>dex an image. The lexical-syntactic pattern analysis carried out has shown thatbroader and narrow relations can be explicitly extracted which makes it easier toconstruct a hierarchy and improve the query expansion <strong>in</strong> perhaps limit<strong>in</strong>g theexpansion to narrower terms. From our comparative analysis of progeny and mothercorpora it has been shown that <strong>in</strong>ter-variation of terms between sub-doma<strong>in</strong>s can beused to study differences <strong>in</strong> language usage and perhaps one method of ensur<strong>in</strong>gproper doma<strong>in</strong> coverage is <strong>for</strong> the construction of a doma<strong>in</strong>-specific thesaurus bycomb<strong>in</strong><strong>in</strong>g the sub-doma<strong>in</strong>-specific thesauri built from various progeny corpora.The current method used is based on certa<strong>in</strong> lexical syntactic patterns depictive ofhyponymic and meronymic relationships. This work can be cont<strong>in</strong>ued to discoversynonyms as well as other doma<strong>in</strong>-specific relationships such as attempt<strong>in</strong>g toidentify roles, attributes of entities, and events. In a multi-modal doma<strong>in</strong> it would be<strong>in</strong>terest<strong>in</strong>g to consider a visual representation as well as l<strong>in</strong>guistic labels to represent aconcept <strong>in</strong> the thesaurus, which could act as a l<strong>in</strong>k between the two. There has beenwork done on creat<strong>in</strong>g multimedia thesauri but it would be <strong>in</strong>terest<strong>in</strong>g to <strong>in</strong>vestigatehow images could be automatically analyzed to discover relationships between thembased on Picards work [6] and then the visual representation of these imagesautomatically l<strong>in</strong>ked to the concept with the correspond<strong>in</strong>g l<strong>in</strong>guistic label. This couldperhaps provide even more effective multi-model query expansion.References1. Veltkamp, R.C., Tanase M.: Content-<strong>Based</strong> <strong>Image</strong> <strong>Retrieval</strong> Systems: A Survey.(Technical Report UU-CS-2000-34). Institute of In<strong>for</strong>mation and Comput<strong>in</strong>g Sciences,Univ. of Utrecht, The Netherlands (2000)2. Marr, D.: Vision. W.H. Freeman, San Francisco (1982)3. Squire, McG.D., Muller, W., Muller, H., Pun, T.: Content-<strong>Based</strong> Query of <strong>Image</strong>databases: Inspirations from Text <strong>Retrieval</strong>. Pattern Recognition Letters 21. ElsevierScience B.V. (2000) 1193-11984. Eak<strong>in</strong>s, J.P., Graham, M.E.: Content-based <strong>Image</strong> <strong>Retrieval</strong>: A Report to the JISCTechnology Applications Programme. <strong>Image</strong> Data Research Institute Newcastle,Northumbria. (1999). (http://www.unn.ac.uk/iidr/report.html, visited 19/06/02)5. Ogle, V.E., Stonebraker, M.: Chabot: retrieval from a relational database of images. IEEEComputer Magaz<strong>in</strong>e, Vol. 28(9). IEEE (1995) 40-486. Picard, R.W.: Towards a Visual <strong>Thesaurus</strong>. In: Ian Ruthven (ed): Spr<strong>in</strong>ger VerlagWorkshops <strong>in</strong> Comput<strong>in</strong>g, MIRO 95, Glasgow, Scotland (1995)7. Srihari, R.K.: Use of Collateral Text <strong>in</strong> Understand<strong>in</strong>g Photos. Artificial IntelligenceReview, Special Issue on Integrat<strong>in</strong>g Language and Vision, Vol. 8 (1995) 409-4308. Paek, S., Sable C.L., Hatzivassiloglou, V., Jaimes, A., Schiffman, B.H., Chang, S.F.,McKeown, K.R.: Integration of Visual and Text-<strong>Based</strong> Approaches <strong>for</strong> the Content

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!