7 - Indira Gandhi Centre for Atomic Research

efficiently handle terabytes of data, database engines with fast I/O speeds and advanced query engines that can access geographically distributed data are required.

4. New Concepts and New Ideas

The typical way to analyze a large dataset is first to generate an abstraction of it in the form of features or a summarization. Scientists and technocrats then use this information to guide them to the regions of interest. In this exploration phase, they often wish to "drill down" into the dataset. This is done by specifying a subset of the dataset, either directly or through the summarization features, and repeatedly refining the focus. Once something interesting is found, a long computer run may follow for pattern matching or other data mining, to find other places in the dataset that "look like that". Finally, the results of the search can be used to create new knowledge. The following diagram indicates this approach to generating new concepts:

[Figure: flow diagram with the boxes "Generate abstraction of data", "Finding the region of interest", "Data mining", "Found interesting things", "Look for more pattern matching", "Refine the focus" and "New concept & new ideas"]

Fig – 2. Generation of new concepts and new ideas
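The exploration loop described in this section can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not a method given in the source: the dataset is assumed to be a plain list of numbers, the "abstraction" is assumed to be a per-block mean, and the function names `summarize`, `drill_down` and `find_similar` are hypothetical.

```python
from statistics import mean

def summarize(data, block_size):
    """Abstraction step: one (start_offset, mean) feature per block (assumed feature)."""
    return [(i, mean(data[i:i + block_size]))
            for i in range(0, len(data), block_size)]

def drill_down(data, block_size, min_block=2):
    """Refine the focus repeatedly on the block with the highest mean."""
    start, end = 0, len(data)
    while block_size >= min_block and end - start > block_size:
        features = summarize(data[start:end], block_size)
        offset, _ = max(features, key=lambda f: f[1])  # region of interest
        start, end = start + offset, min(start + offset + block_size, end)
        block_size //= 2                               # narrow the focus
    return start, end

def find_similar(data, template, tolerance):
    """Pattern matching: start indices of windows that 'look like' the template,
    judged by mean absolute difference (an assumed similarity measure)."""
    n = len(template)
    hits = []
    for i in range(len(data) - n + 1):
        dist = mean(abs(a - b) for a, b in zip(data[i:i + n], template))
        if dist <= tolerance:
            hits.append(i)
    return hits
```

A typical use would call `drill_down` to locate a region, slice an interesting pattern out of it, and pass that pattern to `find_similar` to search the rest of the dataset. On real terabyte-scale data the summaries would be precomputed and stored rather than recalculated on every pass.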

5. Collection Based Research

Collection Based Research (research based on large scientific and technical databases) is a promising and emerging field in which scientists and technologists in remote locations can participate in research programmes across the country or the globe, making full use of the available infrastructure and internet facilities. These remote scientists and technologists, with a mixture of computing equipment on their desks, need catalogues and indexes of the data archive, the ability to select data objects and to define complex processing to be done on them, and a choice of how the results are returned to them. In addition, adequate authentication systems are needed to control access to the data across multiple security domains.
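The requirements listed above (a catalogue with an index, selection of data objects, user-defined processing, a choice of result format, and access control per security domain) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the classes `DataObject`, `User` and `Catalogue`, the function `run_request`, and the domain names are hypothetical, and the access check is a simple set membership test standing in for a real authentication system.

```python
from dataclasses import dataclass

@dataclass
class DataObject:
    name: str
    domain: str      # security domain that owns the object (hypothetical)
    keywords: set    # terms under which the object is indexed
    values: list     # the stored data itself

@dataclass
class User:
    name: str
    domains: set     # domains this user is assumed to be authenticated for

class Catalogue:
    """Keyword index over the data archive."""
    def __init__(self):
        self._index = {}  # keyword -> list of DataObject

    def register(self, obj):
        for kw in obj.keywords:
            self._index.setdefault(kw, []).append(obj)

    def select(self, user, keyword):
        # Return only objects in domains the user may access.
        return [o for o in self._index.get(keyword, [])
                if o.domain in user.domains]

def run_request(catalogue, user, keyword, process, deliver=dict):
    """Select objects, apply the user's processing, and return the
    results in the form chosen by the user (dict by default)."""
    selected = catalogue.select(user, keyword)
    results = [(o.name, process(o.values)) for o in selected]
    return deliver(results)
```

A remote user would register nothing themselves; they would call `run_request` with a keyword, a processing function such as `sum` or `max`, and optionally `deliver=list` to receive the results as a list of pairs instead of a dictionary.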
