12.07.2015 Views

Module: Weka for Information Retrieval 1 Module Name 2 Scope 3 ...



The Header of the ARFF file looks like the following:

Figure 1: An example of ARFF file

9.2 Introduction to Weka [7, 5]

Weka was developed by the University of Waikato in New Zealand. The name stands for Waikato Environment for Knowledge Analysis and is pronounced to rhyme with Mecca. Weka is written in Java and distributed under the terms of the GNU General Public License. It is platform independent and has been tested under Linux, Windows, and Macintosh operating systems. It implements many different learning algorithms, along with methods for pre- and postprocessing and for evaluating the results of learning schemes on any given dataset. You can preprocess a dataset, feed it into a learning scheme, and analyze the resulting classifier and its performance.

Before we start the tour of the Weka application, we first look at the file format used by Weka, which is called ARFF. ARFF files [6] have two distinct sections. The first section is the Header information, which is followed by the Data information. The Header of the ARFF file contains the name of the relation, a list of the attributes, and their types. An example of the Header and the Data of an ARFF file is shown in Figure 1.

Weka provides a graphical user interface with three main sub-interfaces.

• Explorer gives access to all of its facilities using menu selection and form filling. For example, you can build a decision tree from an ARFF file, or you can compare different text classification models by analyzing their output on the same file. Explorer allows you to complete every task by choosing the related options with the mouse and adjusting them quickly until they suit your data.

• Experimenter enables Weka users to compare a variety of learning techniques automatically with different parameter settings on a corpus of
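The two-section ARFF layout described above (a Header naming the relation and its attributes, followed by the Data) can be illustrated with a small sketch. This is not Weka's own loader: the relation name, attributes, and rows are made up for illustration, and the hypothetical parse_arff helper handles only a simplified subset of the format (no quoted values, sparse data, or date attributes).

```python
# Minimal sketch of the two-section ARFF layout: a header, then data.
# The relation, attribute names, and rows below are illustrative only.
ARFF_TEXT = """\
@relation weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute play {yes, no}

@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def parse_arff(text):
    """Parse a simplified ARFF subset (no quoting, sparse data, or dates)."""
    relation = None
    attributes = []  # (name, type) pairs, in declaration order
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            # Every line after @data is one comma-separated instance.
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword, _, rest = line.partition(" ")
        keyword = keyword.lower()
        if keyword == "@relation":
            relation = rest.strip()
        elif keyword == "@attribute":
            name, _, attr_type = rest.strip().partition(" ")
            attributes.append((name, attr_type.strip()))
        elif keyword == "@data":
            in_data = True  # the header ends here
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation)
for name, attr_type in attributes:
    print(name, "->", attr_type)
print(len(rows), "data rows")
```

The header lines carry everything a learning scheme needs to interpret the comma-separated rows that follow, which is why the attribute order in the header must match the column order in the data.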


• C: Run weka.classifiers.SMO on the iris data and explain the results.
• D: Make a table to record all the results, and describe your conclusions about the three classifiers.

11 Evaluation of Learning Outcomes

In the exercise report, students need to show their understanding of the basic concepts of the classification algorithms mentioned previously, and demonstrate their ability to run Weka with a proper classifier and explain the meaning of the results from Weka.

12 Resources

Please refer to the Reference section at the end of this report.

13 Glossary

i. Classification: The process of assigning a given object to one or more classes.
ii. Supervised Learning: A machine learning method that learns from a training set of instances that have been manually labeled with the correct classes.
iii. Unsupervised Learning: A machine learning method that groups data into classes based on some measure of inherent similarity.
iv. Training Set: A set of labeled data.
v. Classifier: An implementation of a specific classification algorithm.
vi. Rules: A group of features used for classification.
vii. Vector Space Classification: Classification that works on a dataset represented as vectors.
viii. Decision hyperplanes: Boundaries used by vector space classification methods to classify a test document.

14 Contributors

Authors: Team 5 at CS 5604 Information Storage and Retrieval
• Bhanu Peddi
• Huijun Xiong
• Noha Elsherbiny

Reviewer: Dr. Edward Fox
