
July 2006
Volume 9 Number 3


Educational Technology & Society
An International Journal

Aims and Scope

Educational Technology & Society is a quarterly journal published in January, April, July and October. Educational Technology & Society seeks academic articles on the issues affecting the developers of educational systems and the educators who implement and manage such systems. The articles should discuss the perspectives of both communities and their relation to each other:

• Educators aim to use technology to enhance individual learning as well as to achieve widespread education, and expect the technology to blend with their individual approach to instruction. However, most educators are not fully aware of the benefits that may be obtained by proactively harnessing the available technologies, or of how they might influence further developments through systematic feedback and suggestions.

• Educational system developers and artificial intelligence (AI) researchers are sometimes unaware of the needs and requirements of typical teachers, with the possible exception of those in the computer science domain. In transferring the notion of a 'user' from human-computer interaction studies and assigning it to the 'student', the educator's role as the 'implementer/manager/user' of the technology has been forgotten.

The aim of the journal is to help these communities better understand each other's role in the overall process of education and how they may support each other. Articles should be original, unpublished, and not under consideration for publication elsewhere at the time of submission to Educational Technology & Society, nor for three months thereafter.

The scope of the journal is broad. The following topics are considered to be within its scope:

Architectures for Educational Technology Systems, Computer-Mediated Communication, Cooperative/Collaborative Learning and Environments, Cultural Issues in Educational System Development, Didactic/Pedagogical Issues and Teaching/Learning Strategies, Distance Education/Learning, Distance Learning Systems, Distributed Learning Environments, Educational Multimedia, Evaluation, Human-Computer Interface (HCI) Issues, Hypermedia Systems/Applications, Intelligent Learning/Tutoring Environments, Interactive Learning Environments, Learning by Doing, Methodologies for Development of Educational Technology Systems, Multimedia Systems/Applications, Network-Based Learning Environments, Online Education, Simulations for Learning, Web-Based Instruction/Training

Editors
Kinshuk, Massey University, New Zealand; Demetrios G Sampson, University of Piraeus & ITI-CERTH, Greece; Ashok Patel, CAL Research & Software Engineering Centre, UK; Reinhard Oppermann, Fraunhofer Institut Angewandte Informationstechnik, Germany.

Associate editors
Alexandra I. Cristea, Technical University Eindhoven, The Netherlands; John Eklund, Access Australia Co-operative Multimedia Centre, Australia; Vladimir A Fomichov, K. E. Tsiolkovsky Russian State Tech Univ, Russia; Olga S Fomichova, Studio "Culture, Ecology, and Foreign Languages", Russia; Piet Kommers, University of Twente, The Netherlands; Chul-Hwan Lee, Inchon National University of Education, Korea; Brent Muirhead, University of Phoenix Online, USA; Erkki Sutinen, University of Joensuu, Finland; Vladimir Uskov, Bradley University, USA.

Advisory board
Ignacio Aedo, Universidad Carlos III de Madrid, Spain; Sherman Alpert, IBM T.J. Watson Research Center, USA; Alfred Bork, University of California, Irvine, USA; Rosa Maria Bottino, Consiglio Nazionale delle Ricerche, Italy; Mark Bullen, University of British Columbia, Canada; Tak-Wai Chan, National Central University, Taiwan; Nian-Shing Chen, National Sun Yat-sen University, Taiwan; Darina Dicheva, Winston-Salem State University, USA; Brian Garner, Deakin University, Australia; Roger Hartley, Leeds University, UK; Harald Haugen, Høgskolen Stord/Haugesund, Norway; J R Isaac, National Institute of Information Technology, India; Paul Kirschner, Open University of the Netherlands, The Netherlands; William Klemm, Texas A&M University, USA; Rob Koper, Open University of the Netherlands, The Netherlands; Ruddy Lelouche, Universite Laval, Canada; Rory McGreal, Athabasca University, Canada; David Merrill, Brigham Young University - Hawaii, USA; Marcelo Milrad, Växjö University, Sweden; Riichiro Mizoguchi, Osaka University, Japan; Hiroaki Ogata, Tokushima University, Japan; Toshio Okamoto, The University of Electro-Communications, Japan; Gilly Salmon, University of Leicester, United Kingdom; Timothy K. Shih, Tamkang University, Taiwan; Yoshiaki Shindo, Nippon Institute of Technology, Japan; Brian K. Smith, Pennsylvania State University, USA; J. Michael Spector, Florida State University, USA.

Assistant Editors
Sheng-Wen Hsieh, National Sun Yat-sen University, Taiwan; Taiyu Lin, Massey University, New Zealand; Kathleen Luchini, University of Michigan, USA; Dorota Mularczyk, Independent Researcher & Web Designer; Carmen Padrón Nápoles, Universidad Carlos III de Madrid, Spain; Ali Fawaz Shareef, Massey University, New Zealand; Jarkko Suhonen, University of Joensuu, Finland.

Executive peer-reviewers
http://www.ifets.info/

Subscription Prices and Ordering Information
Institutions: NZ$ 120 (~ US$ 75) per year (four issues) including postage and handling.
Individuals (not schools or libraries): NZ$ 100 (~ US$ 50) per year (four issues) including postage and handling.
Single issues (individuals only): NZ$ 35 (~ US$ 18) including postage and handling.
Subscription orders should be sent to The International Forum of Educational Technology & Society (IFETS), c/o Prof. Kinshuk, Information Systems Department, Massey University, Private Bag 11-222, Palmerston North, New Zealand. Tel: +64 6 350 5799 ext 2090. Fax: +64 6 350 5725. E-mail: kinshuk@ieee.org.

Advertisements
Educational Technology & Society accepts advertisements of products and services of direct interest and usefulness to the readers of the journal, i.e., those involved in education and educational technology. Contact the editors at kinshuk@ieee.org.

Abstracting and Indexing
Educational Technology & Society is abstracted/indexed in Social Science Citation Index, Current Contents/Social & Behavioral Sciences, ISI Alerting Services, Social Scisearch, ACM Guide to Computing Literature, Australian DEST Register of Refereed Journals, Computing Reviews, DBLP, Educational Administration Abstracts, Educational Research Abstracts, Educational Technology Abstracts, Elsevier Bibliographic Databases, ERIC, Inspec, Technical Education & Training Abstracts, and VOCED.

ISSN 1436-4522 (online) and 1176-3647 (print). © International Forum of Educational Technology & Society (IFETS). The authors and the forum jointly retain the copyright of the articles. Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear the full citation on the first page. Copyrights for components of this work owned by others than IFETS must be honoured. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from the editors at kinshuk@ieee.org.



Guidelines for authors

Submissions are invited in the following categories:
• Peer reviewed publications: a) Full length articles (4000-7000 words), b) Short articles, critiques and case studies (up to 3000 words)
• Book reviews
• Software reviews
• Website reviews

All peer review publications will be refereed in a double-blind review process by at least two international reviewers with expertise in the relevant subject area. Book, software and website reviews will not be reviewed, but the editors reserve the right to refuse or edit reviews.

• Each peer review submission should contain at least the following items: title (up to 10 words), complete communication details of ALL authors, an informative abstract (75-200 words) presenting the main points of the paper and the author's conclusions, four to five descriptive keywords, main body of paper (in 10 point font), conclusion, references.
• Submissions should be single spaced.
• Footnotes and endnotes are not accepted; all such information should be included in the main text.
• Paragraphs should not be indented. There should be one blank line between consecutive paragraphs.
• There should be a single space between the full stop of the previous sentence and the first word of the next sentence in a paragraph.
• The keywords (just after the abstract) should be separated by commas, and each keyword phrase should have initial caps (for example, Internet based system, Distance learning).
• Do not use 'underline' to highlight text. Use 'italic' instead.

Headings
Articles should be subdivided into unnumbered sections, using short, meaningful sub-headings. Please use only two heading levels as far as possible. Use the 'Heading 1' and 'Heading 2' styles of your word processor's template to indicate them. If that is not possible, use 12 point bold for first level headings and 10 point bold for second level headings. If you must use third level headings, use 10 point italic. There should be one blank line after each heading and two blank lines before each heading (except between two consecutive headings, where there should be one blank line).

Tables
Tables should be included in the text at appropriate places and centered horizontally. Captions (maximum 6 to 8 words each) must be provided for every table (below the table) and must be referenced in the text.

Figures
Figures should be included in the text at appropriate places and centered horizontally. Captions (maximum 6 to 8 words each) must be provided for every figure (below the figure) and must be referenced in the text. Figures must NOT be larger than 500 pixels in width. Please also provide all figures separately (besides embedding them in the text).

References
• All references should be listed in alphabetical order at the end of the article under the heading 'References'.
• All references must be cited in the article using the "author (year)" style, e.g. Merrill & Twitchell (1994), or the "(author1, year1; author2, year2)" style, e.g. (Merrill, 1999; Kommers et al., 1997).
• Do not use a numbering style to cite references in the text, e.g. "this was done in this way and was found successful [23]."
• It is important to provide complete information in references. Please follow the patterns below:

Journal article
Laszlo, A. & Castro, K. (1995). Technology and values: Interactive learning environments for future generations. Educational Technology, 35 (2), 7-13.

Newspaper article
Blunkett, D. (1998). Cash for Competence. Times Educational Supplement, July 24, 1998, 15.
Or
Clark, E. (1999). There'll never be enough bandwidth. Personal Computer World, July 26, 1999, retrieved July 7, 2004, from http://www.vnunet.co.uk/News/88174.

Book (authored or edited)
Brown, S. & McIntyre, D. (1993). Making sense of Teaching, Buckingham: Open University.

Chapter in book/proceedings
Malone, T. W. (1984). Toward a theory of intrinsically motivating instruction. In Walker, D. F. & Hess, R. D. (Eds.), Instructional software: principles and perspectives for design and use, California: Wadsworth Publishing Company, 68-95.

Internet reference
Fulton, J. C. (1996). Writing assignment as windows, not walls: enlivening unboundedness through boundaries, retrieved July 7, 2004, from http://leahi.kcc.hawaii.edu/org/tcc-conf96/fulton.html.

Submission procedure
Authors submitting articles for a particular special issue should send their submissions directly to the appropriate Guest Editor. Guest Editors will advise the authors regarding the submission procedure for the final version.

All submissions should be in electronic form. The editors will acknowledge receipt of submissions as soon as possible.

The preferred formats for submission are Word document and RTF, but the editors will try their best with other formats too. For figures, GIF and JPEG (JPG) are the preferred formats. Authors must supply separate figures in one of these formats besides embedding them in the text.

Please provide the following details with each submission: author(s) full name(s) including title(s), name of corresponding author, job title(s), organisation(s), full contact details of ALL authors including email address, postal address, telephone and fax numbers.

Submissions should be uploaded at http://www.ifets.info/ets_journal/upload.php. In case of difficulties, they can also be sent via email (subject: Submission for Educational Technology & Society journal) to kinshuk@ieee.org. In the email, please state clearly that the manuscript is original material that has not been published and is not being considered for publication elsewhere.




Journal of Educational Technology & Society
Volume 9 Number 3 2006

Table of contents

Special issue articles
Theme: Next Generation e-Learning Systems: Intelligent Applications and Smart Design

Editorial: Next Generation e-Learning Systems: Intelligent Applications and Smart Design (pp. 1-2)
Demetrios G. Sampson and Peter Goodyear

A Particle Swarm Optimization Approach to Composing Serial Test Sheets for Multiple Assessment Criteria (pp. 3-15)
Peng-Yeng Yin, Kuang-Cheng Chang, Gwo-Jen Hwang, Gwo-Haur Hwang and Ying Chan

Modeling Peer Assessment as Agent Negotiation in a Computer Supported Collaborative Learning Environment (pp. 16-26)
K. Robert Lai and Chung Hsien Lan

Ontology Mapping and Merging through OntoDNA for Learning Object Reusability (pp. 27-42)
Ching-Chieh Kiu and Chien-Sing Lee

FODEM: developing digital learning environments in widely dispersed learning communities (pp. 43-55)
Jarkko Suhonen and Erkki Sutinen

From Research Resources to Learning Objects: Process Model and Virtualization Experiences (pp. 56-68)
José Luis Sierra, Alfredo Fernández-Valmayor, Mercedes Guinea and Héctor Hernanz

Mining Formative Evaluation Rules Using Web-based Learning Portfolios for Web-based Learning Systems (pp. 69-87)
Chih-Ming Chen, Chin-Ming Hong, Shyuan-Yi Chen and Chao-Yu Liu

Formal Method of Description Supporting Portfolio Assessment (pp. 88-99)
Yasuhiko Morimoto, Maomi Ueno, Isao Kikukawa, Setsuo Yokoyama and Youzou Miyadera

A Systemic Activity based Approach for Holistic Learning & Training Systems (pp. 100-111)
Hansjörg von Brevern and Kateryna Synytsya

Full length articles

Campus Laptops: What Logistical and Technological Factors are Perceived Critical? (pp. 112-121)
Robert Cutshall, Chuleeporn Changchit and Susan Elwood

Students' Preferences on Web-Based Instruction: linear or non-linear (pp. 122-136)
Nergiz Ercil Cagiltay, Soner Yildirim and Meral Aksu

Student-Generated Visualization as a Study Strategy for Science Concept Learning (pp. 137-148)
Yi-Chuan Jane Hsieh and Lauren Cifuentes

On Improving Spatial Ability Through Computer-Mediated Engineering Drawing Instruction (pp. 149-159)
Ahmad Rafi, Khairul Anuar Samsudin and Azniah Ismail

Massively multiplayer online games (MMOs) in the new media classroom (pp. 160-172)
Aaron Delwiche

A Web environment to encourage students to do exercises outside the classroom: A case study (pp. 173-181)
Laurence Capus, Frédéric Curvat, Olivier Leclair and Nicole Tourigny

The effect of LEGO Training on Pupils' School Performance in Mathematics, Problem Solving Ability and Attitude: Swedish Data (pp. 182-194)
Shakir Hussain, Jörgen Lindh and Ghazi Shukur

The Effects of a Computer-Assisted Interview Tool on Data Quality (pp. 195-205)
Justus J. Randolph, Marjo Virnes, Ilkka Jormanainen and Pasi J. Eronen

Data for School Improvement: Factors for designing effective information systems to support decision-making in schools (pp. 206-217)
Andreas Breiter and Daniel Light


Component Exchange Community: A model of utilizing research components to foster international collaboration (pp. 218-231)
Yi-Chan Deng, Taiyu Lin, Kinshuk and Tak-Wai Chan

A Visualization Tool for Managing and Studying Online Communications (pp. 232-243)
William J. Gibbs, Vladimir Olexa and Ronan S. Bernas

An ICT-mediated Constructivist Approach for increasing academic support and teaching critical thinking skills (pp. 244-253)
Dick Ng'ambi and Kevin Johnston



Sampson, D. G., & Goodyear, P. (2006). Next Generation e-Learning Systems: Intelligent Applications and Smart Design (Guest Editorial). Educational Technology & Society, 9 (3), 1-2.

Next Generation e-Learning Systems: Intelligent Applications and Smart Design (Guest Editorial)

Demetrios G. Sampson
Department of Technology Education and Digital Systems, University of Piraeus
and
Centre for Research and Technology - Hellas, 150 Androutsou Street, Piraeus, 18235, Greece
sampson@unipi.gr

Peter Goodyear
CoCo Research Centre, Education Building, A35, University of Sydney, NSW 2006, Australia
p.goodyear@edfac.usyd.edu.au

This special issue of Educational Technology & Society presents a selection of papers from the 5th IEEE International Conference on Advanced Learning Technologies (ICALT2005), held in Kaohsiung, Taiwan, in July 2005. There were 409 submissions to that conference. The acceptance rate for full papers was 23% and for short papers 30%. There were 205 papers published in the proceedings, along with descriptions of 55 posters and 32 workshop papers. We had the great honour of co-chairing the Programme Committee for that conference, and it has been a difficult but rewarding task to select from the great number of good papers presented there.

The conference theme represents a confluence of two combinations of technology and intelligence and their application to the design of (web-based) educational systems. On the one hand, we have the increasingly sophisticated embedding of intelligent technologies in educational computing applications. On the other, we see a need for more 'savvy' approaches to learning technology design, so that emerging technology can better serve the real needs of its users, rather than their "anticipated" needs.

Advanced Learning Technologies (ALT) or Technology-enhanced Learning (TeL), at its best, is a field propelled by a creative tension: coupling an open-minded exploration of the educational affordances of each new technology with a rigorous demand for evidence to back up claims about potential benefit. Sometimes technological innovation seems to be in the driving seat, and some sceptics complain about "technological solutions in search of educational problems". At other times, demands for evidence and the tight constraints of established evaluation methods can make the field appear as if it is moving nowhere. In the short term, this can be frustrating and somewhat demotivating. But one thing we have learned is that ALT progresses over the longer term. While it may appear to meander back and forth, on a long view it generally seems to flow in the right direction. Thus, each of the papers selected here should be seen as contributing to this general flow. As in any healthy field, the papers differ in their central concerns and may even seem to be heading in different directions, but these are eddies making up the stream, rather than signs of a field in disarray. We present them to you as worthwhile contributions in their own right, but also for what they say about the general flow of ideas.

The first paper in the collection, entitled "A Particle Swarm Optimization Approach to Composing Serial Test Sheets for Multiple Assessment Criteria", is by Yin, Chang, Hwang, Hwang and Chan. Their topic, the assessment of student learning, is central to education and forms a dominant theme within this special issue. Assessment is a technically challenging area, hugely important for the student and potentially very time consuming for the teacher. Consequently there is strong interest in automating or partially automating aspects of assessment work. Problems arise in computer-based test construction if very large item banks are in use and there are multiple constraints to be satisfied (such as the length of the test, restrictions on multiple use of items, gauging item difficulty to suit the level of the learner being assessed, etc.). This paper explores the use of a particle swarm optimisation approach to test sheet construction. The authors draw on recent modelling techniques from statistical biology, in particular swarm modelling algorithms, and demonstrate aspects of the efficacy and usability of a test-construction methodology that builds on such algorithms.

Lai & Lan, in their paper entitled "Modelling Peer Assessment as Agent Negotiation in a Computer Supported Collaborative Learning Environment", are also interested in assessment and the use of technology to assist in the assessment process, but their context is radically different. They are concerned with facilitating peer assessment, rather than automating assessment per se. Peer assessment has some demonstrable value, especially as a way of getting students to engage more closely with ideas about how we make judgements about what is known and worth knowing. But it can also be fraught with difficulty. Lai & Lan describe a novel approach to helping students negotiate in the peer-assessment process, through the use of mediating agents. Their paper is exemplary in combining technical innovation with demonstration of both learning benefits and user acceptance of the approach.

The context for the work reported by Kiu and Lee in their paper entitled "Ontology Mapping and Merging through OntoDNA for Learning Object Reusability" is the rapid evolution of interest in repositories of reusable learning objects. This is an area of great technical, economic and pedagogical interest, though one would have to remark that much more attention is being paid to supply-side than to demand-side issues. That said, the potential of learning object repositories is such that serious progress is needed in methods for enhancing the interoperability of repositories. Kiu and Lee's work on ontologies is an impressive case in point. They present a framework for automated ontology mapping (the OntoDNA framework) and demonstrate its significance for interoperability questions.

Suhonen & Sutinen, in their paper entitled "FODEM: developing digital learning environments in widely dispersed learning communities", shift our attention to distance learning provision, and in particular to the needs of people in sparsely populated areas. Their focus is firmly on 'smart design': they introduce and illustrate a methodology for designing digital learning environments that takes into account the needs of widely distributed learner groups. Their approach, the Formative Development Method (FODEM), focuses attention on needs analysis, implementation through rapid prototyping, and formative evaluation; it is capable of providing timely and usable information about learner needs and about how well those needs are being understood and addressed.

Sierra, Fernández-Valmayor, Guinea & Hernanz, in their paper entitled "From Research Resources to Learning Objects: Process Model and Virtualization Experiences", once more address the issue of reusable learning objects. In this case, their concern is with enabling wider educational access to materials that exist in museum collections. An important part of the problem is virtualization: in a sense, shifting the artefacts from the material to the digital world. The paper describes a process model for virtualization, building on the practical experience of the authors in working with domain experts in museums. An interesting aspect of this approach is that it is realistically conservative in its assumptions about how much effort domain experts can contribute to such work. In this way, the authors avoid a pitfall of the field: quite a lot of activity in virtualization and/or re-purposing has made unrealistic assumptions about the skills and time available to domain experts (e.g. for producing metadata) and has consequently failed.

Chen, Hong, Chen & Liu, in their paper entitled "Mining Formative Evaluation Rules Using Web-based Learning Portfolios for Web-based Learning Systems", bring us to assessment once more. This time the focus is on formative rather than summative assessment. They take a data mining approach to extracting evidence about learning from students' online portfolios. Their method combines a neuro-fuzzy network with the K-means algorithm for logically determining membership functions, and a feature reduction scheme to discover a manageably small set of simplified fuzzy rules for evaluating learning performance based on the material gathered in student learning portfolios. Their goal is to allow teachers to redistribute their time, concentrating on tasks where they can uniquely add value to the educational process.

Morimoto, Ueno, Kikukawa, Yokoyama and Miyadera, in their paper entitled "Formal Method of Description Supporting Portfolio Assessment", stay with the topic of portfolio assessment. This time, the point is to provide appropriate support to teachers and learners who can have problems understanding how best to engage with the portfolio assessment process. In particular, teachers need support in designing for portfolio assessment, e.g. determining the type of assessment portfolios that are needed. The contribution of this paper is to provide a way of formally mapping between lesson forms and portfolios.

Finally, von Brevern & Synytsya, in their paper entitled "A Systemic Activity based Approach for Holistic Learning & Training Systems", look at the connections between work activity and learning activity in corporate settings. Their contribution is rooted in a Systemic-Structural Theory of Activity, which supports a more holistic conceptualisation of learning and working, such that (among other things) technical systems can be designed to support these in an integrated rather than a fragmenting way.

This special issue of Educational Technology & Society collects eight papers from the 5th IEEE International Conference on Advanced Learning Technologies (ICALT2005) in a single volume. Technology-enhanced assessment on the one hand, and reusable learning resources on the other, have been its two areas of focus, following a current international trend towards deeper investigation of these topics. In our capacity as Guest Editors of this volume, we hope that the readers of ET&S will appreciate the contributions of this collection towards research on next generation e-learning systems.



Yin, P.-Y., Chang, K.-C., Hwang, G.-J., Hwang, G.-H., & Chan, Y. (2006). A Particle Swarm Optimization Approach to Composing Serial Test Sheets for Multiple Assessment Criteria. Educational Technology & Society, 9 (3), 3-15.

A Particle Swarm Optimization Approach to Composing Serial Test Sheets for Multiple Assessment Criteria

Peng-Yeng Yin and Kuang-Cheng Chang
Department of Information Management, National Chi Nan University, Pu-Li, Nan-Tou, Taiwan 545, R.O.C.

Gwo-Jen Hwang
Department of Information and Learning Technology, National University of Tainan, 33, Sec. 2, Shulin St., Tainan City 70005, Taiwan, R.O.C.
gjhwang@mail.nutn.edu.tw
Tel: 886-915396558
Fax: 886-6-3017001

Gwo-Haur Hwang
Information Management Department, Ling Tung University, Taichung, Taiwan 40852, R.O.C.

Ying Chan
Graduate Institute of Educational Policy and Leadership, Tamkang University, Tamsui, Taipei County, Taiwan 251, R.O.C.

ABSTRACT
To accurately analyze the problems of students in learning, the composed test sheets must meet multiple assessment criteria, such as the ratio of relevant concepts to be evaluated, the average discrimination degree, the difficulty degree, and the estimated testing time. Furthermore, to precisely evaluate the improvement of a student's learning performance over a period of time, a series of relevant test sheets needs to be composed. In this paper, a particle swarm optimization-based approach is proposed to improve the efficiency of composing near-optimal serial test sheets from very large item banks to meet multiple assessment criteria. From the experimental results, we conclude that our novel approach is desirable for composing near-optimal serial test sheets from large item banks and hence can support the need to evaluate student learning status.

Keywords
Computer-assisted testing, Serial test-sheet composing, Particle swarm optimization, Computer-assisted assessment

1. Introduction

As the efficiency and efficacy of the deployment of computer-based tests have been confirmed by many early studies, many researchers in both technical and educational fields have engaged in the development of computerized testing systems (Fan et al., 1996; Olsen et al., 1986). Some researchers have even proposed computerized adaptive testing, which uses prediction methodologies to shorten the length of the test sheets without sacrificing their precision (Wainer, 1990).

A well-scrutinized test helps teachers verify whether students have digested the relevant knowledge and skills, and helps in recognizing students' learning bottlenecks (Hwang et al., 2003a). In a computerized learning environment, which provides students with greater flexibility during the learning process, information concerning the student's learning status is even more important (Hwang, 2003a). The key to a good test depends not only on the subjective appropriateness of test items, but also on the way the test sheet is constructed. To continuously evaluate the learning performance of a student, it is usually more desirable to compose a series of relevant test sheets to meet a predefined set of assessment criteria, such that test sheets in the same series do not contain identical test items (or contain only an acceptable percentage of overlapping test items). Because the number of test items in an item bank is usually large, and the number of feasible combinations to form test sheets thus grows exponentially, an optimal test sheet takes enormous time to build (Garey & Johnson, 1979). Previous investigation has even shown that a near-optimal solution is difficult to find when the number of candidate test items is larger than five thousand (Hwang et al., 2003b), not to mention the composition of a series of relevant test sheets from larger item banks for evaluating the improvement of a student's learning performance over a period of time.

To cope with the problem of composing optimal serial test sheets from large item banks, a particle swarm optimization (PSO)-based algorithm (Kennedy and Eberhart, 1995) is proposed to optimize the selection of test items for composing serial test sheets. By employing this novel approach, the allocation of test items in each of the serial test sheets approximates the optimal allocation while meeting multiple criteria, including the expected testing time, the degree of difficulty, the expected ratio of unit concepts, and the acceptable percentage of overlapping test items among test sheets. Based on this approach, an Intelligent Tutoring, Testing and Diagnostic (ITED III) system has been developed. Experimental results indicate that the proposed approach is efficient and effective in generating near-optimal compositions of serial test sheets that satisfy the specified requirements.

2. Background and Relevant Research

In recent years, researchers have developed various computer-assisted testing systems to more precisely evaluate students' learning status. For example, Feldman and Jones (1997) attempted to perform semi-automatic testing of student software using Unix systems; Rasmussen et al. (1997) proposed a system to evaluate student learning status on computer networks while taking Feldman and Jones' progress into consideration. Additionally, Chou (2000) proposed the CATES system, an interactive testing system developed in a collective and collaborative project with theoretical and practical research on complex technology-dependent learning environments. Unfortunately, although many computer-assisted testing systems have been proposed, few of them have addressed the problem of finding a systematic approach for composing test sheets that satisfy multiple assessment requirements. Most of the existing systems construct a test sheet by manually or randomly selecting test items from their item banks. Such manual or random test item selection strategies are inefficient and are unable to meet multiple assessment requirements simultaneously.

Some previous investigations showed that a well-constructed test sheet not only helps in the evaluation of students' learning status, but also facilitates the diagnosis of the problems embedded in the learning process (Hwang, 2003a; Hwang et al., 2003a; Hwang, 2005). Selecting proper test items is critical to constituting a test sheet that meets multiple assessment criteria, including the expected time needed for answering the test sheet, the number of test items, the specified distribution of course concepts to be learned, and, most importantly, the maximization of the average degree of discrimination (Hwang et al., 2005).

Since satisfying multiple requirements (or constraints) when selecting test items is difficult, most computerized testing systems generate test sheets in a random fashion. Hwang et al. (2003b) formulated the multiple-criteria test-sheet-composing problem as a dynamic programming model (Hillier and Lieberman, 2001) that minimizes the distance between the parameters (e.g., discrimination, difficulty, etc.) of the generated test sheets and the objective values, subject to the distribution of concept weights. A critical issue arising from the use of a dynamic programming approach is the exceedingly long execution time required for producing optimal solutions. As the time-complexity of the dynamic programming algorithm is exponential in terms of the input data, the execution time becomes unacceptably long if the number of candidate test items is large. Consequently, Hwang et al. (2005) attempted to solve the test sheet-composing problem by optimizing the discrimination degree of the generated test sheets within a specified range of assessment time and some other constraints. Nevertheless, in developing an e-learning system, it is necessary to conduct a long-term assessment of each student; that is, optimizing a single test sheet is not enough for such long-term observation of the student. Therefore, a series of relevant test sheets meeting multiple assessment criteria needs to be composed for such continuous learning performance evaluation. As this problem is much more difficult than composing a single test sheet, a more efficient and effective approach is needed. In this paper, a particle swarm optimization-based algorithm is proposed to find quality approximate solutions in an acceptable time. A series of experiments will also be presented to show the performance of the novel approach.

3. Problem Description

In this section, a mixed integer programming model (Linderoth and Savelsbergh, 1999) is presented to formulate the underlying problem. In order to conduct a long-term observation of the student's learning status, a series of K relevant test sheets will be composed. The model aims at minimizing the differences between the average difficulty of each test sheet and the specified difficulty target, within a specified range of assessment time and some other constraints. Assume K serial test sheets with a specific difficulty degree will be composed out of a test bank consisting of N items, $Q_1, Q_2, \ldots, Q_N$. To compose test sheet k, $1 \le k \le K$, a subset of $n_k$ candidate test items will be selected. Assume that in total M concepts will be involved in the K tests. With the specified course concepts to be learned, say $C_j$, $1 \le j \le M$, each test item is relevant to one or more of them. For example, to test the multimedia knowledge of students, $C_j$ might be "MPEG", "Video Streaming" or "Video-on-Demand". We shall call this problem the STSC (Serial Test Sheets Composition) problem. In the STSC problem, we need to confine the similarity between each pair of tests. Such a constraint, imposed upon each pair of tests k and l, $1 \le k, l \le K$, is specified by the parameter f, which indicates that any two tests can have at most f items in common. The variables used in this model are given as follows:

Decision variables $x_{ik}$, $1 \le i \le N$, $1 \le k \le K$: $x_{ik}$ is 1 if test item $Q_i$ is included in test sheet k; 0 otherwise.
Coefficient $d_i$, $1 \le i \le N$: degree of difficulty of $Q_i$.
Coefficient D: target difficulty level for each of the serial test sheets generated.
Coefficient $r_{ij}$, $1 \le i \le N$, $1 \le j \le M$: degree of association between $Q_i$ and concept $C_j$.
Coefficient $t_i$, $1 \le i \le N$: expected time needed for answering $Q_i$.
Right hand side $h_j$, $1 \le j \le M$: lower bound on the expected relevance of $C_j$ for each of the K test sheets.
Right hand side l: lower bound on the expected time needed for answering each of the K test sheets.
Right hand side u: upper bound on the expected time needed for answering each of the K test sheets.
Right hand side f: the maximum number of identical test items between two composed test sheets.

Formal definition of the STSC model:

$$\text{Minimize} \quad Z = \left( \sum_{1 \le k \le K} \left| \frac{\sum_{i=1}^{N} d_i x_{ik}}{\sum_{i=1}^{N} x_{ik}} - D \right|^{p} \right)^{1/p}$$

subject to:

$$\sum_{i=1}^{N} r_{ij} x_{ik} \ge h_j, \quad 1 \le j \le M,\ 1 \le k \le K; \qquad (1)$$

$$\sum_{i=1}^{N} t_i x_{ik} \ge l, \quad 1 \le k \le K; \qquad (2)$$

$$\sum_{i=1}^{N} t_i x_{ik} \le u, \quad 1 \le k \le K; \qquad (3)$$

$$\sum_{i=1}^{N} x_{ij} x_{ik} \le f, \quad 1 \le j \ne k \le K. \qquad (4)$$

In the above formulation, constraint set (1) indicates that the selected test items in each generated test sheet must have a total relevance no less than the expected relevance for each concept assumed to be covered. Constraint sets (2) and (3) indicate that the total expected test time of each generated test sheet must lie within its specified range. Constraint set (4) indicates that no pair of test sheets can contain more than f identical test items.

In the objective function, $Z = \left( \sum_{1 \le k \le K} \left| \frac{\sum_{i=1}^{N} d_i x_{ik}}{\sum_{i=1}^{N} x_{ik}} - D \right|^{p} \right)^{1/p}$ is the p-norm distance between the average difficulty degree of each test sheet and the target difficulty degree specified by the teacher. In particular, the objective function gives the absolute distance when p = 1, and the root squared distance when p = 2. Therefore, the objective of this model is to select test items such that the average difficulty of each generated test sheet is closest to the target difficulty value D. Without loss of generality, we let p = 2 for the simulation.
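
To make the objective concrete, here is a minimal sketch (our illustration, not code from the paper) that evaluates the p = 2 objective for a candidate composition; it uses 0-based item indices and the difficulty degrees later tabulated in Table 2:

```python
# Sketch of evaluating the objective Z for a candidate composition.
# d[i] is the difficulty degree of item Q_{i+1}; each element of `sheets`
# is the set of item indices selected for one test sheet; D is the target.

def objective_Z(sheets, d, D, p=2):
    """p-norm of the deviations between sheet-average difficulty and D."""
    total = 0.0
    for sheet in sheets:
        avg = sum(d[i] for i in sheet) / len(sheet)
        total += abs(avg - D) ** p
    return total ** (1.0 / p)

# Toy usage with the Table 2 difficulty degrees (Q1..Q10):
d = [0.5, 0.9, 0.1, 0.7, 0.4, 0.5, 0.2, 0.6, 0.3, 0.5]
print(objective_Z([{0, 5, 7}, {0, 3, 4, 7, 9}], d, D=0.5))  # ~0.052
```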

The computational complexity of obtaining the optimal solution to the STSC problem is analyzed as follows. The number of possible combinations of test items for composing a single test sheet is $\sum_{i \in \Omega} \binom{N}{i}$, where $\Omega$ is the range of the number of test items that could be answered within the specified time frame $[l, u]$, while the parameters $h_j$ and f affect the number of feasible solutions among those combinations. Composing K serial test sheets therefore requires a computational complexity of $O\!\left( \left( \sum_{i \in \Omega} \binom{N}{i} \right)^{K} \right)$, which is extremely high. Hence, seeking the optimal solution to the STSC problem is computationally prohibitive.
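
As a back-of-the-envelope illustration of this growth (our arithmetic, using the small instance of Section C with N = 10, Ω = {2, ..., 8} and K = 2), even a 10-item bank already yields roughly a million candidate sheet pairs:

```python
# Counting candidate compositions for the toy instance of Section C.
from math import comb

N, K = 10, 2
single_sheet = sum(comb(N, i) for i in range(2, 9))  # sheet sizes 2..8
print(single_sheet)       # 1002 candidate single sheets
print(single_sheet ** K)  # 1004004 candidate pairs, before feasibility checks
```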

4. PSO-based Algorithm for Serial Test Sheet Composition

Linderoth and Savelsbergh (1999) conducted a comprehensive computational study manifesting that mixed integer programming problems are NP-hard, which implies that composing optimal serial test sheets from a large item bank is computationally prohibitive. To cope with this difficulty, a particle swarm optimization (PSO)-based algorithm is proposed to find quality approximate solutions in reasonable time.

A. STSCPSO (Serial Test Sheets Composition with PSO) Algorithm

The PSO algorithm was developed by Kennedy and Eberhart (1995). It is a biologically inspired algorithm which models the social dynamics of bird flocking and fish schooling. Ethologists find that a swarm of birds/fishes flocks synchronously, changes direction suddenly, scatters and regroups iteratively, and finally stops on a common target. The collective intelligence of the individuals not only increases the success rate of food foraging but also expedites the process. The PSO algorithm embodies simple rules simulating bird flocking and fish schooling and can serve as an optimizer for nonlinear functions. Kennedy and Eberhart (1997) further presented a discrete binary version of PSO for combinatorial optimization, where the particles are represented by binary vectors of length d and the velocity represents the probability that a decision variable will take the value 1. PSO has delivered many successful applications (Eberhart and Shi, 1998; Yoshida et al., 1999; Shigenori et al., 2003). The convergence and parameterization aspects of PSO have also been discussed thoroughly (Clerc and Kennedy, 2002; Trelea, 2003).

In the following, a PSO-based algorithm, STSCPSO (Serial Test Sheets Composition with PSO approach), is proposed to find quality approximate solutions for the STSC problem.

Input: N test items $Q_1, Q_2, \ldots, Q_N$, M concepts $C_1, C_2, \ldots, C_M$, the target difficulty level D, and the number of required test sheets, K.

Step 1. Generate initial swarm

Since all decision variables of the STSC problem take binary values (either 0 or 1), a particle in the STSCPSO algorithm can be represented by $x = [\, x_{11} x_{21} \cdots x_{N1}\; x_{12} x_{22} \cdots x_{N2}\; \cdots\; x_{1K} x_{2K} \cdots x_{NK} \,]$, which is a vector of NK binary bits where $x_{ik}$ equals 1 if test item $Q_i$ is included in test sheet k, and 0 otherwise. Due to constraints (2) and (3) on test time, the number of selected items in any test sheet is bounded within the interval $\left[\, l / \max_{1 \le i \le N}\{t_i\},\; u / \min_{1 \le i \le N}\{t_i\} \,\right]$. Hence, we should enforce the integrity rule

$$\frac{l}{\max_{1 \le i \le N}\{t_i\}} \;\le\; \sum_{i=1}^{N} x_{ik} \;\le\; \frac{u}{\min_{1 \le i \le N}\{t_i\}}, \qquad \forall k = 1, 2, \ldots, K,$$

during every step of our algorithm. To generate the initial swarm, we randomly determine the number of items for each test sheet according to the integrity rule. The selection probability of each item is based on the selection rule, which gives higher selection probability to items whose difficulty level is closer to the target. In particular, the selection probability of item $Q_i$ is defined as $(S - |D - d_i|)/S$, where S is a constant. As such, the initial swarm contains solutions that have good objective values but may violate the constraint sets. The particle swarm then evolves towards quality solutions that not only optimize the objective function but also meet all of the constraint sets.
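
The following is a minimal sketch of Step 1 under our own assumptions (the paper does not fix the constant S, so S = 1 is a placeholder here; S must exceed $\max_i |D - d_i|$ for all weights to stay positive):

```python
# Sketch of initial particle generation: draw a feasible sheet size from the
# integrity rule, then sample items with probability proportional to
# (S - |D - d_i|) / S.  Not the authors' code.
import math
import random

def initial_particle(d, t, D, K, l, u, S=1.0):
    N = len(d)
    lo = math.ceil(l / max(t))            # l / max{t_i} <= items per sheet
    hi = math.floor(u / min(t))           # items per sheet <= u / min{t_i}
    weights = [(S - abs(D - di)) / S for di in d]
    x = [[0] * N for _ in range(K)]       # K x N binary decision variables
    for k in range(K):
        n_k = random.randint(lo, hi)      # feasible number of items
        chosen = set()
        while len(chosen) < n_k:          # weighted draw without replacement
            chosen.add(random.choices(range(N), weights=weights)[0])
        for i in chosen:
            x[k][i] = 1
    return x
```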

Step 2. Fitness evaluation of particles

The original objective function of the STSC problem measures the quality of a candidate solution that meets all the constraints (1)-(4). However, the particles generated by the PSO-based algorithm may violate one or more of these constraints. To cope with this problem, the merit of a particle is evaluated by incorporating penalty terms into the objective function if any constraint is violated. The penalty terms corresponding to the separate constraints are described as follows.

α: penalty for violating the concept relevance bound constraint,

$$\alpha = \sum_{k=1}^{K} \sum_{j=1}^{M} \left( h_j - \sum_{i=1}^{N} r_{ij} x_{ik} \right).$$

This term sums up the relevance deficit of the selected test items with respect to the specified relevance lower bound of each concept, over all test sheets.

β: penalty for violating the test time bound constraint,

$$\beta = \sum_{k=1}^{K} \left( \max\!\left( l - \sum_{i=1}^{N} t_i x_{ik},\, 0 \right) + \max\!\left( 0,\, \sum_{i=1}^{N} t_i x_{ik} - u \right) \right).$$

This term penalizes the case where the expected test times fall outside the specified lower or upper bound.

γ: penalty for violating the common item constraint,

$$\gamma = \sum_{j \ne k} \left( \sum_{i=1}^{N} x_{ij} x_{ik} - f \right).$$

This term penalizes the case where the number of common items between two different tests exceeds the threshold f.

Function J(⋅) for evaluating the fitness of a particle x:

$$\text{Minimize} \quad J(x) = Z + w_1 \alpha + w_2 \beta + w_3 \gamma,$$

where $w_1$, $w_2$ and $w_3$ denote the relative weights of the three penalty terms. As such, the fitness of a particle x accounts for both quality (objective value) and feasibility (penalty terms). The smaller the fitness value, the better the particle.
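
A sketch of Step 2 as a single function follows (our reading, not the authors' code; the max(·, 0) clamps are our interpretation of "deficit" and "exceeds", so a term contributes only when its constraint is actually violated, and we divide by at least 1 to guard against an empty sheet):

```python
# Penalized fitness J(x): x is a K x N 0/1 matrix; d, r, t, h follow the
# model's coefficients; w1..w3 weight the three penalty terms.

def fitness_J(x, d, r, t, D, h, l, u, f, w1, w2, w3, p=2):
    K, N, M = len(x), len(x[0]), len(h)
    Z = sum(abs(sum(d[i] * x[k][i] for i in range(N))
                / max(1, sum(x[k])) - D) ** p for k in range(K)) ** (1 / p)
    alpha = sum(max(0.0, h[j] - sum(r[i][j] * x[k][i] for i in range(N)))
                for k in range(K) for j in range(M))       # relevance deficits
    times = [sum(t[i] * x[k][i] for i in range(N)) for k in range(K)]
    beta = sum(max(0.0, l - s) + max(0.0, s - u) for s in times)  # time bounds
    gamma = sum(max(0, sum(x[a][i] * x[b][i] for i in range(N)) - f)
                for a in range(K) for b in range(K) if a != b)    # overlaps
    return Z + w1 * alpha + w2 * beta + w3 * gamma
```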

Step 3. Determination of pbesti and gbest using the bounding criterion<br />

In the original PSO, the fitness evaluation of the particles, which is required to determine pbesti and gbest, is the most time-consuming part. Here we propose a bounding criterion to speed up the process. We observe that the fitness value of a particle is only used to determine pbesti and gbest; it is not directly used in the velocity update. Since Zk and J(⋅) are both monotonically increasing as their terms accumulate, we can use the fitness of the incumbent pbesti as a fitness bound and terminate the fitness evaluation of the ith particle once the intermediate fitness value exceeds the bound. Also, only those pbesti that have been updated at the current iteration need to be compared with gbest for its possible update. The bounding criterion saves computational time significantly.
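A minimal sketch of the bounding criterion, assuming (as stated above) that the fitness accumulates non-negative terms; the generator-style interface is our illustration:

```python
def bounded_fitness(terms, bound):
    """Sum fitness contributions, stopping early once the pbest bound is passed."""
    total = 0.0
    for term in terms:            # terms: non-negative contributions to J(x)
        total += term
        if total > bound:         # already worse than the incumbent pbest
            return None           # abandon evaluation of this particle
    return total
```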

Step 4. Update of velocities and particles<br />

The updating of velocities and particle positions follows the discrete version of PSO, i.e., the velocity is scaled into [0.0, 1.0] by a transformation function S(⋅) and is used as the probability with which the particle bit takes the value 1. In this paper, we adopt the linear interpolation function

$$S(v_{ij}) = \frac{v_{ij}}{2v_{\max}} + 0.5,$$

to transform velocities into probabilities.
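As an illustration, one velocity-and-position update could look like the following sketch (vmax, c1 and c2 are assumed values; the paper itself specifies only the transform S(⋅)):

```python
import random

def update_particle(x, v, pbest, gbest, vmax=4.0, c1=2.0, c2=2.0):
    """Discrete PSO update: bit j becomes 1 with probability S(v_j)."""
    for j in range(len(x)):
        # Pull the velocity toward the personal and global best positions.
        v[j] += (c1 * random.random() * (pbest[j] - x[j])
                 + c2 * random.random() * (gbest[j] - x[j]))
        v[j] = max(-vmax, min(vmax, v[j]))     # clamp to [-vmax, vmax]
        prob = v[j] / (2 * vmax) + 0.5         # linear transform S(v_ij)
        x[j] = 1 if random.random() < prob else 0
```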



C. An Illustrative Example<br />

Herein, an illustrative example of the STSCPSO algorithm is provided. Assume that two test sheets with target difficulty level D = 0.5 are required to be generated from 10 test items. The 10 test items are relevant to 3

concepts, and the relevance association (rij) between each test item and each concept is shown in Table 1. The<br />

estimated answering time (ti) and difficulty degree (di) for the 10 test items are tabulated in Table 2. Let h1 = 2,<br />

h2 = 2, h3 = 1, l = 10, u = 16, f = 3, w1 = w2 = w3 = 0.01. The algorithm proceeds as follows.<br />

Initial swarm generation<br />

Table 1. Relevance association between each test item and each concept<br />

C1 C2 C3<br />

Q1 1 0 0<br />

Q2 0 1 0<br />

Q3 0 0 1<br />

Q4 0 1 0<br />

Q5 0 0 1<br />

Q6 0 1 0<br />

Q7 1 0 0<br />

Q8 1 1 1<br />

Q9 0 0 0<br />

Q10 0 1 0<br />

Table 2. Estimated answering time and difficulty degree for each test item<br />

ti di<br />

Q1 4 0.5<br />

Q2 3 0.9<br />

Q3 3 0.1<br />

Q4 5 0.7<br />

Q5 3 0.4<br />

Q6 2 0.5<br />

Q7 4 0.2<br />

Q8 3 0.6<br />

Q9 5 0.3<br />

Q10 4 0.5<br />

Let the algorithm proceed with a swarm of two particles. To generate the initial swarm, the range of the feasible number of selected items in a test sheet is first determined using the integrity rule $l/\max_{i=1\sim N}\{t_i\} \le \sum_{i=1}^{N} x_{ik} \le u/\min_{i=1\sim N}\{t_i\}$. Hence, each test sheet can select 2 to 8 items from the test item bank.

According to our particle representation scheme, each particle is represented as a binary vector with 20 bits and is generated based on the selection rule $(S - |d_i - D|)/S$, which gives higher selection probability to items whose difficulty level is closer to the target D. It is observed from Table 2 that test items Q1, Q6, and Q10 will

have the highest selection probability. With the integrity and selection rules, the initial swarm can be generated<br />

as shown in the first generation in Figure 1. Particle 1 selects items Q1, Q6, and Q8 for the first test sheet, and<br />

chooses Q1, Q4, Q5, Q8, and Q10 for the second test sheet. As for particle 2, the first test sheet consists of Q2, Q4,<br />

Q8, and Q10 and the second test sheet is composed of Q1, Q3, Q7, and Q9.<br />

                          test sheet 1   test sheet 2   Zk      α   β   γ   J
Generation 1: particle 1  1000010100     1001100101     0.05    0   3   0   0.08
              particle 2  0101000101     1010001010     0.285   3   0   0   0.315
Generation 2: particle 1  0100010100     1000100101     >0.08   -   -   -   -
              particle 2  1000000100     1000101101     0.078   1   5   0   0.138

Figure 1. Swarm evolution and the corresponding fitness evaluation



Particle fitness evaluation<br />

The particle fitness evaluation function $J(x) = Z_k + w_1\alpha + w_2\beta + w_3\gamma$ consists of the objective value and penalty terms, which can be easily computed. In particular, particle 1 attains an objective value of 0.05 and incurs a β penalty of 3 because the expected test time exceeds the upper limit, resulting in a fitness value of 0.08. Particle 2 has an objective value of 0.285 and incurs an α penalty of 3 due to the deficit of concept relevance. The

fitness of particle 2 is thus 0.315. For the initial swarm, the personal best experience (pbesti) is the current<br />

particle itself. Since the fitness value of particle 1 is smaller, it is considered as gbest. The fitness values of pbesti<br />

and gbest will be used as bounds to expedite the process of next generation.<br />

Update of velocities and particles<br />

As for particle 1, the incumbent particle is equivalent to pbesti and gbest, resulting in the same vij values as<br />

previous ones and thus the same probabilities. Only a very small number of bits will be changed. Assume that<br />

particle 1 replaces Q1 by Q2 for the first test sheet and removes Q4 for the second test sheet (see generation 2 in<br />

Figure 1). For the case of particle 2, pbesti is the particle itself, but gbest is particle 1, so vij will be

changed at the bits where pbesti and gbest are different. In essence, particle 2 will be dragged, to some extent,<br />

toward particle 1 in the discrete space. Assume that particle 2 becomes [10000001001000101101].<br />

Use of bounding criterion for determining pbesti and gbest<br />

Now we proceed with the fitness evaluation for the two particles. For the case of particle 1, we find that the<br />

intermediate objective value during computation has already exceeded the current bound (0.08); hence, the<br />

computation is terminated according to the bounding criterion. There is also no need to derive penalty terms. As<br />

such the computational time is significantly reduced. As for particle 2, it attains an objective value of 0.078 and<br />

incurs α and β penalties of 1 and 5, resulting in a fitness of 0.138. Compared to incumbent pbest2 (fitness =<br />

0.315), the fitness is improved, so pbest2 is updated to the current particle 2, while gbest is unchanged (fitness =

0.08).<br />

The STSCPSO algorithm iterates this process until a given maximum number of iterations has been reached, and gbest is then taken as the best solution found by the algorithm.
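Composing the sketches given after Steps 1, 2 and 4 above, the overall loop can be outlined as follows (again illustrative; 10 particles and 100 generations match the experimental settings reported in the next section):

```python
def stscpso(t, d, D, r, h, l, u, f, K, Z, R=10, iters=100):
    """Top-level STSCPSO loop: initialize, evaluate, update, iterate."""
    swarm = initial_swarm(t, d, D, l, u, K, R)
    vel = [[0.0] * len(p) for p in swarm]
    pbest = [p[:] for p in swarm]
    pfit = [fitness(p, t, r, h, l, u, f, K, Z) for p in swarm]
    g = min(range(R), key=lambda i: pfit[i])
    gbest, gfit = pbest[g][:], pfit[g]
    for _ in range(iters):
        for i, p in enumerate(swarm):
            update_particle(p, vel[i], pbest[i], gbest)
            fit = fitness(p, t, r, h, l, u, f, K, Z)  # the bounding criterion
            if fit < pfit[i]:                          # could cut this short
                pbest[i], pfit[i] = p[:], fit
                if fit < gfit:
                    gbest, gfit = p[:], fit
    return gbest, gfit
```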

5. Experiment and Discussion<br />

The STSCPSO approach has been applied to the development of an Intelligent Tutoring, Evaluation and<br />

Diagnosis (ITED III) system, which contains a large item bank for many science courses. The interface of the<br />

developed system and the experiments for evaluating the performance of the novel approach are given in the<br />

following subsections.<br />

A. System Development<br />

The teacher interface of ITED III provides a step-by-step instruction to guide teachers in defining the goal and<br />

parameters of a test. In the first step, the teacher is asked to define the type and date/time of the test. The test<br />

type might be “certification” (for performing a formal test), “personal practice” (for performing a test based on the
to-be-enhanced concepts of each student), or “group practice” (for performing a test to evaluate the computer

knowledge of a group of students). Each student is asked to take the test, using an assigned computer in a<br />

monitored computer room, and is allowed to answer the test sheet only within the specified test date/time range.<br />

In the following steps, the teachers are asked to define the parameters for composing the test sheets, such as the<br />

lower bound and upper bound of the expected test times (i.e., l and u), and the lower bound on the expected<br />

relevance of each concept or skill to be evaluated (i.e., hj). Figure 2 demonstrates the third step for defining a test<br />

with lower bound and upper bound of the expected test times being 60 minutes and 80 minutes, respectively.<br />

Moreover, in this step, the teacher is asked to define the lower bounds on the expected relevance of the concepts<br />

for the test, which are all set to 0.9 in the given example.<br />



Figure 2. Teacher interface for defining the goal and parameters of a test

The entire test sheet is presented in one Web page with a scroll bar for moving the page up and down. After<br />

submitting the answers for the test items, each student will receive a scored test result and a personalized<br />

learning guide, which indicates the learning status of the student for each computer skill evaluated. Figure 3 is an<br />

illustrative example of a personalized learning guide for evaluating the skills of web-based programming. Such

learning guidance has been shown to be helpful to the students in improving their learning performance if<br />

remedial teaching or practice can be conducted accordingly (Hwang, 2003a; Hwang et al. 2003a; Hwang 2005).<br />

Figure 3. Illustrative example of a personalized learning guide



B. Experimental Design<br />

To evaluate the performance of the proposed STSCPSO algorithm, a series of experiments have been conducted<br />

to compare the execution times and the solution quality of three competing approaches: STSCPSO algorithm,<br />

Random Selection with Feasible Solution (RSFS), and exhaustive search. The RSFS program generates the test<br />

sheet by selecting test items randomly to meet all of the constraints, while the exhaustive search program<br />

examines every feasible combination of the test items to find the optimal solution. The platform of the<br />

experiments is a personal computer with a Pentium IV 1.6 GHz CPU, 1 GB of RAM and an 80 GB hard disk running at 5400 RPM. The programs were coded in C#.

To analyze the comparative performance of the competing approaches, twelve item banks with the number of

candidate items ranging from 15 to 10,000 were constructed by randomly selecting test items from a computer<br />

skill certification test bank. Table 3 shows the features of each item bank.<br />

Table 3. Description of the experimental item banks<br />

Item bank    Number of test items    Average difficulty    Average expected answer time of each test item (minutes)

1 15 0.761525 3.00000<br />

2 20 0.765460 3.25000<br />

3 25 0.770409 3.20000<br />

4 30 0.758647 2.93333<br />

5 40 0.720506 2.95000<br />

6 250 0.741738 2.94800<br />

7 500 0.746789 2.97600<br />

8 1000 0.751302 2.97200<br />

9 2000 0.746708 3.00450<br />

10 4000 0.747959 3.00550<br />

11 5000 0.7473007 2.99260<br />

12 10000 0.7503020 3.00590<br />

The experiment is conducted by applying each approach twenty times on each item bank with the objective<br />

values (Zk) and the average execution time recorded. The lower bounds and upper bounds of testing times are 60<br />

and 120 minutes, respectively, and the maximal number of common test items between each pair of test sheets is<br />

5. To make the solutions hard to obtain, we set the target difficulty level D = 0.5, which deviates substantially

from the average difficulty of item banks. The STSCPSO algorithm is executed with 10 particles for 100<br />

generations. The execution time of RSFS is set the same as that of the STSCPSO algorithm, while the maximal<br />

execution time of the exhaustive search is set to 7 days to obtain the optimal solution.<br />

Table 4. Experimental results

                STSCPSO                   RSFS                      Optimum Solution
N        Zk     Avg Time (sec)     Zk     Avg Time (sec)     Zk     Avg Time (days)

15 0.44 50 0.62 50 0.42 2 days<br />

20 0.44 63 0.70 63 - > 7 days<br />

25 0.44 80 0.66 80 - -<br />

30 0.44 102 0.68 102 - -<br />

40 0.40 134 0.76 134 - -<br />

250 0.36 815 0.54 815 - -<br />

500 0.28 1805 0.68 1805 - -<br />

1000 0.22 3120 0.46 3120 - -<br />

2000 0.18 6403 0.60 6403 - -<br />

4000 0.16 12770 0.62 12770 - -<br />

5000 0.16 15330 0.54 15330 - -<br />

10000 0.14 21210 0.56 21210 - -<br />



Table 4 shows the experimental results of objective values (Zk) and execution times using the three methods. We<br />

observe that the exhaustive search method can only obtain the optimal solution for the smallest test item bank<br />

with N = 15, since the computational complexity of the STSC problem is extremely high, as described in Section 3.

As for the STSCPSO algorithm and the RSFS, approximate solutions to all of the test item banks can be obtained<br />

with reasonable times ranging from 50 seconds to 5.89 hours, but the solution quality delivered by the STSCPSO<br />

algorithm is significantly better. In particular, for N = 15, the objective value obtained by the STSCPSO<br />

algorithm is 0.44 which is very close to the optimal value (0.42), while the objective value obtained by the RSFS<br />

is 0.62. For the other cases with larger item banks, the superiority of the STSCPSO algorithm over the RSFS<br />

becomes more prominent as the size of the item banks increases. Figure 4 shows the variations of the objective<br />

value obtained using the STSCPSO algorithm and the RSFS. The objective value derived by the RSFS fluctuates,<br />

while the objective value derived by the STSCPSO algorithm steadily decreases, since more candidate test items can be selected to construct better solutions as the test bank grows.

Figure 4. Variations of the objective value as the size of the test item banks increases

Figure 5 shows the fitness value obtained by gbest of the STSCPSO algorithm as the number of generations<br />

increases for the test item bank with N = 1000. We observe that the global swarm intelligence improves with a<br />

decreasing fitness value as the evolution proceeds. This validates that the proposed particle representation and fitness function fit the STSC problem scenario.

Figure 5. Variations of the fitness value of gbest as the number of generations increases



To analyze the convergence behavior of the particles, we examine whether the swarm evolves toward the same

optimization goal. We propose the information entropy for measuring the similarity convergence among the<br />

particles as follows. Let pij be the binary value of the jth bit for the ith particle, i = 1, 2, …, R, and j = 1, 2, …,<br />

NK, where R is the swarm size. We can calculate probj as the conditional probability that value one happens at<br />

the jth bit given the total number of bits that take value one in the entire swarm as follows.<br />

$$prob_j = \frac{\sum_{i=1}^{R} p_{ij}}{\sum_{i=1}^{R}\sum_{h=1}^{NK} p_{ih}}.$$

The particle entropy can then be defined as

$$Entropy = -\sum_{j=1}^{NK} prob_j \log_2\left(prob_j\right).$$

The particle entropy is smaller if the probability distributions are denser. As such, the variations of particle<br />

entropy during the swarm evolution measure the convergence of similarity among all the particles. If the

particles are highly similar to one another, the values of the non-zero probj would be high, resulting in denser<br />

probability distributions and less entropy value. This also means the swarm particles reach the consensus about<br />

which test items should be selected for composing the test sheets.<br />
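For concreteness, the entropy measure can be computed as in the sketch below (our illustration; the 0·log 0 = 0 convention is assumed):

```python
from math import log2

def swarm_entropy(swarm):
    """Particle entropy of a binary swarm (R particles of N*K bits each)."""
    total_ones = sum(sum(p) for p in swarm)      # all bits set to 1 in the swarm
    entropy = 0.0
    for j in range(len(swarm[0])):
        prob_j = sum(p[j] for p in swarm) / total_ones
        if prob_j > 0:                           # treat 0 * log2(0) as 0
            entropy -= prob_j * log2(prob_j)
    return entropy
```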

Figure 6 shows the variations of particle entropy as the number of generations increases. It is observed that the<br />

entropy value drops drastically during the first 18 generations since the particles exchange information by<br />

referring to the swarm’s best solution. After this period, the entropy value is relatively fixed due to the good<br />

quality solutions found and the high similarity among the particles, meaning that the particles are settling on the same high-quality solution as the swarm converges.

Figure 6. The particle entropy as the number of generations increases

6. Conclusions and Future Work

In this paper, a particle swarm optimization-based approach is proposed to cope with the serial test sheet<br />

composition problems. The algorithm has been embedded in an intelligent tutoring, evaluation and diagnosis<br />

system with large-scale test banks that are accessible to students and instructors through the World-Wide Web.<br />

To evaluate the performance of the proposed algorithm, a series of experiments have been conducted to compare<br />

the execution time and the solution quality of three solution-seeking strategies on twelve item banks.<br />

Experimental results show that serial test sheets whose average difficulty is near a specified target value can be obtained in reasonable time by employing the novel approach.

For further application, collaborative plans with several local e-learning companies are under way, in which the present approach is used for testing and assessing students in elementary and junior high schools.



Acknowledgement<br />

This study is supported in part by the National Science Council of the Republic of China under contract numbers<br />

NSC-94-2524-S-024-001 and NSC 94-2524-S-024 -003.<br />

References<br />

Chou, C. (2000). Constructing a computer-assisted testing and evaluation system on the World Wide Web: the

CATES experience. IEEE Transactions on Education, 43 (3), 266-272.<br />

Clerc, M., & Kennedy, J. (2002). The particle swarm explosion, stability, and convergence in a multidimensional<br />

complex space. IEEE Transactions on Evolutionary Computation, 6, 58-73.

Eberhart, R. C., & Shi, Y. (1998). Evolving artificial neural networks. In Proceedings of the International<br />

Conference on Neural Networks and Brain, PL5-PL13.<br />

Fan, J. P., Tina, K. M., & Shue, L. Y. (1996). Development of knowledge-based computer-assisted instruction<br />

system. The 1996 International Conference Software Engineering: Education and Practice, Dunedin, New<br />

Zealand.<br />

Feldman, J. M., & Jones, J. Jr. (1997). Semiautomatic testing of student software under Unix(R). IEEE<br />

Transactions on Education, 40 (2), 158-161.<br />

Garey, M. R., & Johnson, D. S. (1979). Computers and Intractability: A Guide to the Theory of NP-<br />

Completeness, San Francisco, CA: Freeman.

Hillier, F. S., & Lieberman, G. J. (2001). Introduction to Operations Research, 7th Ed., New York: McGraw-

Hill.<br />

Hwang, G. J. (2003a). A concept map model for developing intelligent tutoring systems. Computers &<br />

Education, 40 (3), 217-235.<br />

Hwang, G. J. (2003b). A Test Sheet Generating Algorithm for Multiple assessment criteria. IEEE Transactions<br />

on Education, 46 (3), 329-337.<br />

Hwang, G. J., Hsiao, J. L., & Tseng, J. C. R. (2003a). A Computer-Assisted Approach for Diagnosing Student<br />

Learning Problems in Science Courses. Journal of Information Science and Engineering, 19 (2), 229-248.<br />

Hwang, G. J., Lin, B. M. T., & Lin, T. L. (2003b). An effective approach to the composition of test sheets from<br />

large item banks. 5th International Congress on Industrial and Applied Mathematics, Sydney, Australia, 7-11<br />

July 2003.

Hwang, G. J. (2005). A Data Mining Algorithm for Diagnosing Student Learning Problems in Science Courses.<br />

International Journal of Distance Education Technologies, 3 (4), 35-50.<br />

Hwang, G. J., Yin, P. Y., Hwang, G. H., & Chan, Y. (2005). A Novel Approach for Composing Test Sheets from<br />

Large Item Banks to Meet Multiple Assessment Criteria. The 5th IEEE International Conference on Advanced<br />

Learning Technologies, Kaohsiung, Taiwan, July 5-8, 2005.

Kennedy, J., & Eberhart, R. C. (1995). Particle swarm optimization. In Proceedings of the IEEE International<br />

Conference on Neural Networks, IV, 1942-1948.<br />

Kennedy, J., & Eberhart, R. C. (1997). A discrete binary version of the particle swarm algorithm. In Proceedings<br />

of the IEEE International Conference on Systems, Man, and Cybernetics, 4104-4108.<br />

Linderoth, J. T., & Savelsbergh, M. W. P. (1999). A computational study of search strategies for mixed integer

programming. INFORMS Journal on Computing, 11 (2), 173-187.<br />



Olsen, J. B., Maynes, D. D., Slawson, D., & Ho, K. (1986). Comparison and equating of paper-administered,<br />

computer-administered and computerized adaptive tests of achievement. The Annual Meeting of American<br />

Educational Research Association, California, April 16-20, 1986.<br />

Rasmussen, K., Northrup, P., & Lee, R. (1997). Implementing Web-based instruction. In Web-Based Instruction,<br />

Khan, B. H. (Ed.), Englewood Cliffs, NJ: Educational Technology, 341–346<br />

Shigenori, N., Takamu, G., Toshiku, Y., & Yoshikazu, F. (2003). A hybrid particle swarm optimization for<br />

distribution state estimation. IEEE Transaction on Power Systems, 18, 60-68.<br />

Trelea, I. C. (2003). The particle swarm optimization algorithm: convergence analysis and parameter selection.<br />

Information Processing Letters, 85, 317-325.<br />

Wainer, H. (1990). Computerized Adaptive Testing: A Primer, Hillsdale, NJ: Lawrence Erlbaum Associates.

Yoshida, H., Kawata, K., Fukuyama, Y., & Nakanishi, Y. (1999). A particle swarm optimization for reactive<br />

power and voltage control considering voltage stability. In Proceedings of the International Conference on<br />

Intelligent System Application to Power Systems, 117-121.<br />



Lai, K. R., & Lan, C. H. (2006). Modeling Peer Assessment as Agent Negotiation in a Computer Supported Collaborative

Learning Environment. Educational Technology & Society, 9 (3), 16-26.<br />

Modeling Peer Assessment as Agent Negotiation in a Computer<br />

Supported Collaborative Learning Environment<br />

K. Robert Lai<br />

Department of Computer Science and Engineering, Yuan Ze University, Taiwan<br />

krlai@cs.yzu.edu.tw<br />

Chung Hsien Lan<br />

Department of Computer Science and Engineering, Yuan Ze University, Taiwan<br />

s929408@mail.yzu.edu.tw<br />

ABSTRACT<br />

This work presents a novel method for modeling collaborative learning as multi-issue agent negotiation<br />

using fuzzy constraints. Agent negotiation is an iterative process, through which, the proposed method<br />

aggregates student marks to reduce personal bias. In the framework, students define individual fuzzy<br />

membership functions based on their evaluation concepts and agents facilitate student-student negotiations<br />

during the assessment process. By applying the proposed method, agents can achieve mutually acceptable<br />

agreements that avoid subjective judgments and unfair assessments. Thus, the negotiated agreement

provides students with superior assessments, thereby enhancing learning effectiveness. To demonstrate the<br />

usefulness of the proposed framework, a web-based assessment agent was implemented and used by 49<br />

information management students who submitted assignments for peer review. Experimental results<br />

suggested that students using the system had significantly improved learning performance over three rounds<br />

of peer assessment. Questionnaire results indicated that students believed that the assessment agent<br />

provided increased flexibility and equity during the peer assessment process.<br />

Keywords<br />

Peer assessment, collaborative learning, agent negotiation, assessment agent, fuzzy constraint<br />

1. Introduction<br />

Peer assessment is a widely adopted technique that can be applied to improve learning processes (Arnold et al.,<br />

1981; Falchikov, 1995; McDowell and Mowl, 1996; Freeman, 1995; Strachan and Wilcox, 1996). Computer-based

environments typically enable students to develop their individual learning portfolios and conveniently<br />

assess those of their peers. Numerous researchers have investigated the effectiveness of computer-based peer<br />

assessment systems in various learning scenarios (Lin, 2001; Davies and Berrow, 1998). Davies developed a<br />

computer program to adjust the scores awarded to the coursework of others and encourage students to diligently<br />

and fairly review the coursework of their fellow students (Davies, 2002). Kwok and Ma developed a Group Support System (GSS) for collaborative and peer assessment (Kwok and Ma, 1999). Rada applied a Many

Using and Creating Hypermedia system (MUCH) to solve exercise problems and submit solutions for peer<br />

review (Rada, 1998).<br />

Rapid development of Internet technologies has spawned extensive use of online web learning in education.<br />

Some researchers have explored the feasibility of Internet supported peer assessment (Davis and Berrow, 1998;<br />

Zhao, 1998; Liu et al., 1999; Lin et al., 2001). For example, Sitthiworachart designed a web-based peer<br />

assessment system that successfully assists students in developing their understanding of computer programming<br />

(Sitthiworachart and Joy, 2003). Liu et al. developed a Networked Knowledge Management and Evaluation<br />

System (NetKMES) for an instructional method that emphasizes expressing ideas via oral presentations and<br />

writing, accumulating wisdom via discussion, critical thinking via peer assessment, and knowledge construction via project work. Students' achievement increased significantly as a result of the peer assessment process and the

number of students willing to take part in learning activities also significantly increased (Liu, 2003). Knowledge<br />

acquisition software was utilized for peer assessment systems to discover the conceptual framework and<br />

evaluation schemes for students during peer assessment. Instructors and students go online to exchange their<br />

understanding of criteria and portfolios, allowing them to reflect and thereby enhance the learning process (Ford<br />

et al., 1993; Liu et al., 2002).<br />

Computer-based peer assessment systems enhance the effectiveness of student-student and student-instructor<br />

communication and assist students in thinking reflectively. In the peer assessment process, each student assumes<br />

the role of a researcher who submits assignments as well as a reviewer who comments on their peers’<br />

assignments (Rogoff, 1991). Therefore, students are involved in the learning and assessment processes.<br />



Numerous studies have demonstrated that students benefit significantly from peer assessment; however, some<br />

students have revealed that peers may lack the ability to evaluate each others’ work or not take their role<br />

seriously, allowing friendships, entertainment value, etc., to influence the marks they give to peer coursework.<br />

Furthermore, students often lack experience in peer assessment and encounter difficulties in interpreting<br />

assessment criteria. These obstacles often result in subjective and unfair assessments and, thus, students’<br />

reflection and learning effectiveness are not enhanced.<br />

This work presents a novel approach that models collaborative learning as multi-issue agent negotiation using<br />

fuzzy constraints. Agent negotiation (Pruitt, 1981) is an iterative process through which the proposed<br />

methodology aggregates students’ marks to reduce personal bias. In this framework, students define individual<br />

fuzzy membership functions based on their evaluation concepts and agents facilitate student-student negotiations<br />

during the assessment process. By applying this method, agents can reach mutually acceptable agreements that<br />

overcome unfair assessments arising from students' varying degrees of understanding of the assessment criteria.

Thus, the negotiated agreement improves the assessment process and learning effectiveness.<br />

The remainder of this paper is organized as follows. Section 2 introduces the concept of peer assessment. Section<br />

3 describes the overall framework of the assessment agent and its computational model. Then, a walk-through<br />

example is utilized to illustrate the application of the proposed methodology. Section 4 presents the experimental<br />

design, analytical results, and evaluation of the questionnaire results. Finally, Section 5 draws the conclusion.<br />

2. Peer Assessment<br />

Assessment is a learning tool. However, most traditional assessment methods often encourage “surface<br />

learning,” which is characterized by information memorization and comprehension (Sitthiworachart and Joy,<br />

2003). Falchikov defined peer assessment as “the process whereby groups rate their peers” (Falchikov, 2001). It<br />

can include student involvement not only in the final judgement made of student work, but also in the prior<br />

setting of criteria and the selection of evidence of achievement (Biggs, 1999).<br />

Peer assessment is an interactive assessment method that enhances student interpretation and reflection, enabling<br />

instructors to improve their understanding of student performance. This method moves coursework assessment<br />

from instructor-centered to student-centered. Based on previous research (Ford et al., 1993; Liu et al., 2002;<br />

Sitthiworachart and Joy, 2003), students are capable of learning how to criticize the work of their peers and<br />

accept peer criticism, thereby developing their critical thinking skills and self-reinforcement through peer<br />

assessment.<br />

Peer assessment requires cognitive activities such as reviewing, summarizing, clarifying, providing feedback,<br />

diagnosing errors and identifying missing knowledge or deviations (Van Lehn et al., 1995). However, based on<br />

the framework described by Sitthiworachart (Sitthiworachart and Joy, 2003) and Liu (Liu et al., 1999), the peer<br />

assessment process can be divided into four separate stages (Fig. 1). During stage 1, students complete their<br />

assignments on their own time and then submit assignments. During stage 2, peer reviewers assess peer<br />

assignments and then discuss their marks with the other students in their group who marked the same<br />

assignments. After the reviewers generate preliminary scores and feedback (stage 2), each student in stage 3

evaluates the quality of marks given by the other markers. Finally, during stage 4, the results, which are<br />

aggregates of all marks, are sent to the original author who then revises the original assignment based on peer<br />

feedback. By this process, students typically develop a serious attitude toward their coursework.<br />

Figure 1. Peer assessment process (Stage 1: do the assignment and submit it; Stage 2: do the peer assessment exercise, producing marks and feedback; Stage 3: mark the quality of the marking, producing scores and feedback; Stage 4: show the results)

Thus, peer assessment entails formative reviewing of work to provide feedback, as well as summative grading. Moreover,

peer assessment also integrates a feedback mechanism into the peer assessment process. Some studies indicate<br />

that receiving feedback is correlated with effective learning (Van Lehn et al., 1995; Bangert-Drowns et al.,<br />

1991).<br />



Web-based peer assessment has recently gained popularity. Liu, who implemented a networked peer assessment<br />

system (NetPeas) to support peer assessment (Liu et al., 1999), indicated that web-based peer assessment<br />

positively affected learning effectiveness (Yu et al., 2004). A network peer assessment model allows students to<br />

learn or construct knowledge by submitting assignments and receiving comments from peers to improve their<br />

work. During the assessment process, students evaluate peer assignments via the Web, which ensures anonymity<br />

and positively affects students’ willingness to critique their peers’ work. Furthermore, web-based peer<br />

assessment allows instructors to monitor student progress at any point during the assessment process. That is,<br />

instructors can determine how well an assessor or assessee performs and constantly monitor the assessment<br />

process whereas this is nearly impossible during ordinary peer assessment when several rounds are involved<br />

(Chiu et al., 1998).<br />

Topping et al. described the potential advantages of peer assessment, including the development of the skills of<br />

evaluating, justifying, and using discipline knowledge (Topping et al., 2000). They also suggested that peer<br />

assessment can increase post hoc reflection, improve students' ability to generalize knowledge to new situations, and promote self-assessment and self-awareness. In addition to the potential advantages of peer assessment,

some students feel negatively about peer assessment as markers are also competitors (Lin et al., 2001). During<br />

the peer assessment process, students may utilize different assessment criteria that produce different assessment<br />

results. Additionally, students often lack the ability to evaluate each other’s work or may apply judgments that<br />

are excessively subjective. Zhao also pointed out that students commonly believe that only instructors have the<br />

ability and knowledge required to effectively evaluate work and provide critical feedback (Zhao, 1998). This<br />

study employed a novel negotiation mechanism to balance individual assessment standards and utilized web

technologies to implement this system to enhance student anonymity, reduce transmission costs and increase<br />

student interaction. Thus, the computational methodology employed combines fuzzy constraints and agent<br />

negotiation to overcome such difficulties. An assessment agent is deployed to support web-based peer<br />

assessment.<br />

3. Assessment Agent<br />

3.1 Methodology<br />

A novel methodology that provides computational support to elicit the assessment criteria and results is<br />

proposed. This methodology relies on fuzzy membership functions and agent negotiation to construct an<br />

assessment agent. Negotiation is a process of cooperative and competitive decision making between two or more<br />

entities. Agent negotiation is an iterative process through which a joint decision is made by two or more agents<br />

in order to reach a mutually acceptable agreement (Pruitt, 1981). Thus, the student, whose coursework is marked,<br />

can negotiate with markers using the assessment agent to reach a final assessment. Figure 2 presents the<br />

framework of peer assessment.<br />

Figure 2. The framework of peer assessment (students B, C and D each apply their own assessment criteria to student A's portfolio, student A applies self-assessment criteria, and the assessment agent negotiates the final assessment results)

In this framework, students are free to construct personal assessment criteria when assessing portfolios.<br />

Assessment criteria, which consist of several evaluation concepts, are used to measure the qualities of students’<br />

portfolios. Figure 2 shows a group of four students, A, B, C and D. Student A’s portfolio is assessed by students<br />



B, C and D, who rate the portfolio by defining fuzzy constraints after identifying their own evaluation concepts.<br />

The role of the assessment agent is to negotiate the assessment of students A, B, C and D and to achieve an<br />

agreement. Based on this process, the assessment agent is considered a distributed fuzzy constraint network (Lai<br />

and Lin, 2004). Assessment criteria are regarded as negotiation issues or constrained objects and student<br />

assessments are fuzzy constraints. Figure 3 presents the workflow of the assessment agent.<br />

Figure 3. The workflow of the assessment agent (define fuzzy constraints → apply negotiation strategy → negotiate through offer generation and evaluation → if an agreement is reached, produce the final results; otherwise adjust and retry → self-reflection and improvement)

In the first step of the assessment process, students evaluate the portfolios submitted by the other students and<br />

use their own fuzzy membership functions to mark. These fuzzy membership functions for assessment criteria<br />

are regarded as fuzzy constraints. After defining fuzzy constraints, the assessment agent applies concession<br />

and/or trade-off strategies to negotiate. Through offer generation and evaluation, if an agreement cannot be

reached, the fuzzy constraints or negotiation strategies must be adjusted. Conversely, if an agreement is reached,<br />

the interests of all students are considered to produce the final results. Then, the student whose portfolio was assessed by the other students receives the final scores, understands the peer assessments, and can reflect upon the

assessment and revise the portfolio.<br />

During the negotiation process, each agent begins negotiating by proposing an ideal offer. However, when an<br />

offer is unacceptable to the other agents, these agents make concessions using a concession strategy or derive<br />

new alternatives using a trade-off strategy to move toward an agreement. Adopted from the framework in (Lai<br />

and Lin, 2004), the fuzzy constraint-based negotiation context is formalized as follows.<br />

$\Re$ is the set of agents involved in the negotiation; $\Re_p \in \Re$ is one of the members of $\Re$, where $1 \le p \le m$ and $m$ is the number of agents in $\Re$.

$C^p$ is a distributed fuzzy constraint network that represents agent $\Re_p$.

$\Pi_{C^p}$ is the intent of the distributed fuzzy constraint network $C^p$ and represents the set of all potential agreements for agent $\Re_p$.

$\mu_{\Pi_{C^p}}(u)$ is the overall degree of satisfaction reached with a solution $u$:

$$\mu_{\Pi_{C^p}}(u) = \min_{q=1,\ldots,n}\left(\left(\mu_{C_q^p}(u)\right)^{w_q^p}\right) \qquad (1)$$

where $n$ is the number of negotiation issues, $w_q^p$ is the weight of issue $q$ in agent $\Re_p$, and $\mu_{C_q^p}(\cdot)$ is the degree of satisfaction for agent $\Re_p$ on issue $q$.

The process of negotiation is a series of decisions on how agents evaluate and generate alternatives from a designated space. $\Im_{\alpha_i}(\Pi_{C^p}, \Re)$ denotes finding a final agreement for all agents in $\Re$ from $\Pi_{C^p}^{\alpha_i}$. If $\Im_{\alpha_i}(\Pi_{C^p}, \Re)$ holds, the negotiation is complete and terminates; otherwise, the threshold $\alpha_i$ moves to the next lower threshold $\alpha_{i+1}$, and $\Im_{\alpha_{i+1}}(\Pi_{C^p}, \Re)$ is applied repeatedly to achieve an agreement. When the next lower threshold $\alpha_{i+1}^q$ over issue $q$ is smaller than the minimal satisfaction degree $\delta^q$ for issue $q$, the set of potential agreements over issue $q$ becomes $\Pi_{C_j}^{\delta^q}$, while that of the other issues remains $\Pi_{C_j}^{\alpha_{i+1}^k}$. Then $\Im_{\alpha_{i+1}}(\Pi_{C^p}, \Re)$ will be false, and the negotiation terminates once the next lower threshold $\alpha_{i+1}$ is lower than the overall minimal satisfaction degree, that is, $\alpha_{i+1} < \min_{q=1,\ldots,n} \delta^q$.



$\Psi_{C^p}: \Pi_{C^p}^{\alpha_i} \rightarrow [0, 1]$ is an evaluation function that represents the aggregate satisfaction value of agent $\Re_p$ for a potential agreement in $\Pi_{C^p}^{\alpha_i}$. The aggregate satisfaction value is a measure of human preference. Given an offer (or counteroffer) $u$, the aggregate satisfaction value of $u$ for agent $\Re_p$ can be defined as

$$\Psi^p(u) = \frac{1}{n}\sum_{q=1}^{n}\left(\mu_{C_q^p}(u)\right)^{w_q^p} \qquad (2)$$

In a concession strategy, agents generate new proposals to achieve a mutually satisfactory outcome by reducing their demands. $\wp_{\alpha_q^p, u}^p$ is the set of feasible concession proposals at threshold $\alpha_q^p$ for agent $p$, and is defined as

$$\wp_{\alpha_q^p, u}^p = \left\{\, v \mid \left(\mu_{C^p}(v) \ge \alpha_q^p\right) \wedge \left(\Psi^p(v) = \Psi^p(u) - r\right) \,\right\} \qquad (3)$$

where $r$ is the concession value.

In a trade-off strategy, agents can explore options for achieving a mutually satisfactory outcome by reconciling their interests. Agents can propose alternatives from a certain solution space, where the degrees of satisfaction for the constraints associated with each alternative are greater than or equal to an acceptable threshold. $\Phi_{\alpha_q^p, u}^p$ is the set of feasible trade-off proposals at threshold $\alpha_q^p$ for the alternatives of agent $p$, and is defined as

$$\Phi_{\alpha_q^p, u}^p = \left\{\, v \mid \left(\mu_{C^p}(v) \ge \alpha_q^p\right) \wedge \left(\Psi^p(v) = \Psi^p(u)\right) \,\right\} \qquad (4)$$

where $u$ is the latest offer.

Agents maximize their individual payoffs while also maximizing the outcome for all agents. Thus, a normalized Euclidean distance can be applied to measure the similarity between alternatives in order to generate the best offer. The similarity function is defined as

$$\Theta^p(v, U') = 1 - \sqrt{\frac{\sum_{q=1}^{n}\left(\mu_{C_q^p}(v) - \mu_{C_q^p}(u') + p_{C_q^p}(u')\right)^2}{n}} \qquad (5)$$

where $n$ is the number of fuzzy constraints for agent $p$ on the issues, $v$ is a feasible trade-off proposal of agent $p$, $U'$ is the set of counteroffers made by the other agents, $u' = \arg\max_{v' \in U'}\left(\mu_{C_q^p}(v) - \mu_{C_q^p}(v')\right)$, $\mu_{C_q^p}(v)$ and $\mu_{C_q^p}(u')$ denote the satisfaction degrees of the $q$th fuzzy constraint associated with $v$ and $u'$ for agent $p$, and $p_{C_q^p}(u')$ is the penalty from the $q$th dissatisfied fuzzy constraint associated with offer $u'$ made by agent $p$.

$u^*$ is the expected trade-off proposal for the next offer by agent $p$ and is defined as

$$u^* = \arg\max_{v \in \Phi_{\alpha_q^p, u}^p} \Theta^p(v, U') \qquad (6)$$

where $\alpha_q^p$ is the highest possible threshold such that $\Phi_{\alpha_q^p, u}^p \ne \{\}$ and $\Theta^p(v, U') > \Theta^p(u, U')$.

By applying the similarity function, the best proposal that can be accepted by all participants will be found. The

negotiation integrates the interests of all agents. If a final solution that satisfies all participators exists during a<br />

negotiation, a negotiation process will succeed. Otherwise, negotiation terminates and all agents adjust their<br />

interests prior to a new negotiation process.<br />

Multi-issue agent negotiation is supported in the assessment process. In particular, the core methodology of the<br />

assessment agent focuses on using fuzzy constraints to represent personal interests and applies negotiation<br />

strategies in making concessions or trade-offs between different possible values for negotiation issues. By<br />

applying constraints to express negotiation proposals, the model can execute the negotiation with increased<br />

efficiency and determine final results for overall assessments.<br />
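To make the computational model concrete, equations (1), (2), (5) and (6) can be sketched in Python as follows (a minimal illustration, not the authors' implementation; every argument is a per-issue list of satisfaction degrees, weights, or penalties, and the function names are ours):

```python
from math import sqrt

def overall_satisfaction(sat, w):
    """Eq. (1): min over issues of sat_q ** w_q."""
    return min(s ** wq for s, wq in zip(sat, w))

def aggregate_satisfaction(sat, w):
    """Eq. (2): mean over issues of sat_q ** w_q."""
    return sum(s ** wq for s, wq in zip(sat, w)) / len(sat)

def similarity(sat_v, sat_u, pen_u):
    """Eq. (5): 1 minus the normalized Euclidean distance between a
    proposal v and the most favorable counteroffer u'."""
    n = len(sat_v)
    return 1 - sqrt(sum((a - b + p) ** 2
                        for a, b, p in zip(sat_v, sat_u, pen_u)) / n)

def expected_tradeoff(proposals, sat_u, pen_u):
    """Eq. (6): among feasible trade-off proposals (each a per-issue
    satisfaction list), pick the one most similar to u'."""
    return max(proposals, key=lambda sat_v: similarity(sat_v, sat_u, pen_u))
```

For instance, applying expected_tradeoff to agent K's three candidate proposals in the walkthrough of Section 3.2 below would return the proposal with the largest Θ value, i.e. $v_{2c}^K$.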

3.2 Illustrative Example<br />

To demonstrate the application of this approach, the following scenario in peer assessment is considered. The<br />

peer assessment system is designed to assist information management undergraduates to learn database design<br />

methodologies during a Database System course. Following instruction in a conventional classroom, the

instructor assigned a database design project as homework. Students were required to submit portfolios that<br />

analyze the assigned project and the design methodology used to solve the database development problem.<br />



Solving this problem involved the following database development concepts: defining, constructing,<br />

manipulating, protecting and sharing databases. In this example, three students (students I, J and K) were in a<br />

group. Student K submitted his portfolio, and students I and J independently assessed it.

Students I, J and K were allowed to construct their own fuzzy membership functions for the same evaluation<br />

concepts such as Completeness, Security and Flexibility to assess the portfolio. Following the workflow<br />

presented in Fig. 3, the assessment process was divided into the following steps.<br />

Defining the Fuzzy Constraints<br />

The instructor first explained each evaluation concept to the students. Students I, J and K defined membership<br />

functions for each evaluation concept. Each membership function was considered as a fuzzy constraint. After<br />

assessing student K’s portfolio, students I, J and K illustrated their own fuzzy constraints (Figures 4(a)(b)(c)).<br />

Figure 4. Fuzzy constraints: (a) student I; (b) student J; (c) student K

Student I specified the assessments as fuzzy constraints, namely Low Completeness, Median Security and

Median Flexibility. Student J defined Median Completeness, Low Security and Median Flexibility as fuzzy<br />

constraints and student K defined High Completeness, High Security, and High Flexibility as fuzzy constraints.<br />

Suppose students I, J and K adopted concession and trade-off strategies.<br />

Using Fuzzy Constraints to Negotiate

Agents I, J and K represent students I, J and K respectively. Negotiation in this example is a multi-issue<br />

negotiation among agents I, J and K. Agreement is achieved when all participants agree. Agents I, J and K took<br />

turns attempting to reach an agreement. Agent I proposed its assessment $u_1^I = (60, 70, 70)$ on Completeness, Security and Flexibility at threshold $\alpha_1^I = 1$ (Fig. 4(a)). However, since by (1) $\mu_{\Pi_{C^J}}(u_1^I) = 0$ and $\mu_{\Pi_{C^K}}(u_1^I) = 0$, agents J and K did not accept $u_1^I$ as an agreement. Subsequently, agent J proposed its offer $u_1^J = (70, 60, 70)$ at threshold $\alpha_1^J = 1$; agents I and K likewise did not accept $u_1^J$. Agent K then proposed its offer $u_1^K = (90, 90, 85)$ at threshold $\alpha_1^K = 1$, but agents I and J still did not accept $u_1^K$ as an agreement.

Furthermore, assuming agent K adopted the fixed concession strategy and had no expected proposal at threshold $\alpha_1^K = 1$, agent K lowered its threshold to the next threshold $\alpha_2^K = 0.9$ and created a new set of feasible proposals: $v_{2a}^K = (90, 90, 83)$, $v_{2b}^K = (90, 88, 85)$, and $v_{2c}^K = (88, 90, 85)$. According to (5), the similarity of these feasible proposals to the counteroffers was computed by agent K as $\Theta^K(v_{2a}^K, u') = 0.361$, $\Theta^K(v_{2b}^K, u') = 0.366$, and $\Theta^K(v_{2c}^K, u') = 0.374$.



Thus, agent K selected the most likely acceptable solution $v_{2c}^K = (88, 90, 85)$ as its offer to agents I and J. This procedure of offer evaluation and generation for agents I, J and K continued until an agreement was reached or no additional solutions were proposed. After several rounds, the negotiation finally reached an agreement over (Completeness, Security, Flexibility) at $u = (76, 76, 75)$, for which the satisfaction degrees are (0.2, 0.4, 0.5) for agent I, (0.4, 0.2, 0.5) for agent J, and (0.3, 0.3, 0.5) for agent K, giving $\Psi^I(u) = \Psi^J(u) = \Psi^K(u) = 0.367$, $\mu_{\Pi_{C^K}}(u) = 0.3$, $\mu_{\Pi_{C^I}}(u) = 0.2$ and $\mu_{\Pi_{C^J}}(u) = 0.3$. This shows that the proposed approach, involving fuzzy constraint relaxation and similarity, helped the assessment agent arrange assessment criteria to meet each agent's needs, and assisted the agents in reaching an agreement that maximizes the overall degree of satisfaction for assessments in the multi-issue negotiation.

4. Experiment and Results<br />

This experiment examined the usability and effectiveness of the assessment agent, a computational model that<br />

emphasizes critical thinking via peer assessment. Ninety-two university students in their junior year majoring in<br />

information management and enrolled in a Database System course were selected as study subjects. These

students, who attended two sessions, were assigned to 26 teams, and each team was assigned to implement a<br />

database design project and undergo two examinations. The teams in each session were then randomly

divided into two groups. One group was the experimental group that participated in peer assessment using the<br />

web-based assessment agent, whereas the other group did not take part in any peer assessment activities.<br />

Therefore, 13 teams in the experimental group were assigned to group A and the remaining teams were in group<br />

B.<br />

The experimental process consisted of the following three steps.<br />

Analysis of Groups’ Difference<br />

The instructor gave all students an examination that tested knowledge of course content to determine whether the<br />

learning status of the two groups differed. After students completed the examination, the tests were marked by an<br />

instructor and analyzed using t-test analysis (Table 1).<br />

Table 1. The analysis of the test score difference between the two groups<br />

Session Group N Teams Mean t-value p-value<br />

Session 1 Group A 27 7 71.11 -0.005 0.995<br />

Group B 22 7 71.14<br />

Session 2 Group A 22 6 58.05 -0.172 0.864<br />

Group B 21 6 59.10<br />

Level of significance α = 0.05

No significant difference in learning status was noted between the two groups (Table 1) as both p-values exceed<br />

the level of significance α. Therefore, the 13 selected teams were assigned to use the web-based assessment

agent for peer assessment.<br />

Analysis of Teams’ Performance and Correlation<br />

During the peer assessment process, the instructor asked each team to design a project utilizing database theory<br />

that comprised a preliminary plan, requirement analysis, conceptual database design, logical database design,

physical database design, and implementation. All teams were instructed to complete the following rounds. During round 1,

they learned about database design theory and practiced on a case that needed to be assessed prior to attempting<br />

their projects. When all teams finished round 1, the 13 experimental teams submitted their projects and moved<br />

on to peer and self assessment using the evaluation concepts provided by the instructor via the assessment agent.<br />

Figure 5 displays the student interface of the assessment agent in defining fuzzy constraints. The instructor<br />

marked the submitted projects. After each team submitted fuzzy constraints for each project and obtained a<br />

negotiated agreement, these teams were allowed to revise their own projects and proceed to round 2 (preliminary<br />

plan and requirement analysis). Similarly, each team and the instructor completed the work. The overall<br />



assessment proceeded until round 3 (the conceptual, logical and physical database design, and the implementation)

was completed.<br />

Figure 5. The student interface for defining fuzzy constraints

Pearson’s correlation analysis was adopted to compare the correlations between peer and instructor marks. Student

assessments are significantly and positively correlated with the instructors’ assessments for each evaluation<br />

concept utilized in the 3 rounds of assessment (Table 2).<br />

Table 2. Correlation analysis between instructor and student marks<br />

Round 1 Round 2 Round 3<br />

Evaluation Concepts Correlation Correlation Correlation<br />

Completeness 0.457* 0.619* 0.752**<br />

Correctness 0.534* 0.745** 0.623*<br />

Originality 0.413* 0.712** 0.676*<br />

*: p < 0.05, **: p < 0.01<br />

Paired t-test analysis for performance during the three rounds indicates that the improvement in learning for the<br />

13 teams was significant (Table 3). Students improved their performance from round 1 to round 2, and especially<br />

from round 1 to round 3.<br />

Table 3. Performance analysis over the 3 rounds<br />

Round 1 and 2 Round 2 and 3 Round 1 and 3<br />

Evaluation Concepts t-value p-value t-value p-value t-value p-value<br />

Completeness 5.84 2.89E-05 0.60 0.278 5.04 1.13E-04<br />

Correctness 5.66 3.88E-05 1.25 0.116 5.25 7.82E-05<br />

Originality 4.32 4.18E-04 0.39 0.348 4.66 2.22E-04<br />

Level of significance α = 0.05

Furthermore, as the difference cannot be perceived in round 1, the instructor’s assessment is utilized in round 2<br />

and round 3 to evaluate the performance of the two groups. By t-test analysis, the difference between the two<br />

groups was significant (Table 4).<br />

Table 4. Performance analysis for the two groups<br />

Round 2 Round 3<br />

Evaluation Concepts t-value p-value t-value p-value<br />

Completeness 2.02 0.029 2.24 0.019<br />

Correctness 2.47 0.012 2.38 0.014<br />

Originality 1.46 0.082 3.09 0.003<br />

Level of significance α = 0.05



Analysis of Individual Performance<br />

The instructor then gave all students a post-assessment examination related to course content and the project to<br />

determine whether each student’s performance in the 13 experimental teams differed from that of students in the<br />

other teams. Table 5 presents the differences of individual performance via the paired t-test and t-test. Analytical<br />

results indicate that students who participated in peer assessment had acquired more knowledge than students<br />

who did not. Furthermore, quantitative scores also indicate that students in the experimental group improved<br />

their performance via peer assessment.<br />

Table 5. Performance analysis for each student

| Session | Group | Examination | N | Mean | t-value | p-value |
| Session 1 | Group A | First | 27 | 71.11 | -4.43 | 0.000 |
| | Group A | Second | 27 | 78.22 | | |
| | Group B | First | 22 | 71.14 | -0.32 | 0.375 |
| | Group B | Second | 22 | 71.36 | | |
| | Group A | Second | 27 | 78.22 | 1.79 | 0.039 |
| | Group B | Second | 22 | 71.36 | | |
| Session 2 | Group A | First | 22 | 58.05 | -6.52 | 0.000 |
| | Group A | Second | 22 | 71.77 | | |
| | Group B | First | 21 | 59.10 | -1.59 | 0.063 |
| | Group B | Second | 21 | 62.86 | | |
| | Group A | Second | 22 | 71.77 | 1.90 | 0.003 |
| | Group B | Second | 21 | 62.86 | | |

Level of significance α = 0.05

Evaluation of Questionnaire Results

Following the experiment, students provided feedback via a questionnaire containing twelve questions, with responses graded on a 5-point Likert scale. The questionnaire results indicate that students regarded the assessment agent as a satisfactory approach for flexibly assessing peer assignments, receiving fair feedback, and improving their performance. The results are listed below (a high mean indicates that students agree with the statement):

- The assessment model is easily used. (Mean = 4.02)
- The assessment model is flexible. (Mean = 4.00)
- The assessment model is fair and, therefore, personal bias can be reduced. (Mean = 3.66)
- The assessment model is complex. (Mean = 2.95)
- Students are provided with additional opportunities to reflect on and improve their assignments via peer assessments. (Mean = 4.39)
- The assessment standard can be constructed while reviewing peers' assignments. (Mean = 4.14)
- Peer assessment assists students in learning more than instructor assessment does. (Mean = 4.05)
- Peer assessment is time- and effort-consuming. (Mean = 3.22)
- Peer feedback encourages students to reflect on self-assessment results. (Mean = 4.05)
- Peer feedback enables students to improve their own assignments. (Mean = 4.00)
- Peer assessment enhances the degree of seriousness students ascribe to their work. (Mean = 3.82)

Overall, students agreed that the assessment agent benefits learning evaluation (Mean = 3.93). Although some students considered it time- and effort-consuming, most believed that the system helped them to reflect on and improve their learning activities. Additionally, students felt that, by relying on fuzzy membership functions and agent negotiation, the assessment model was flexible, fair and easy to use.

5. Conclusion

This study has presented a computational model for peer assessment learning. Using this model, students are able to reach agreement on peer assessment via agent negotiation. Experimental results indicate that the model significantly improved student performance, and students agreed that the assessment agent is flexible and benefits the learning process. Based on these results, the assessment agent, which relies on fuzzy constraints and agent negotiation to facilitate peer assessment, has the following important merits.



First, the assessment agent produces an objective assessment by considering the assessments of all markers and the self-marker. Second, flexibility increases through the use of fuzzy constraints, and agent negotiation augments interaction among students. Third, the assessed student is provided with individual scores for each assessment concept rather than only an overall result; students can therefore reflect on their performance on each assessment concept and think more deeply about improving the quality of their portfolio. Fourth, final scores integrate all assessments, so the method overcomes assessment bias: scores achieved through this process of peer assessment are more reliable than scores assigned by an individual. In addition to eliminating individual bias, learning effectiveness is enhanced through interaction with other students.

The assessment agent benefits student learning in many ways. The web-based assessment agent provides a higher degree of anonymity and lower transmission costs than traditional peer assessment processes. Furthermore, it eliminates time and location constraints and effectively offers interaction and feedback.

Finally, although the proposed methodology has yielded promising results in promoting learning effectiveness, considerable work remains, including further development of the methodology to allow students to select their own evaluation concepts, comparison of the assessment agent against other peer assessment methodologies, and experimental application of the methodology to other domains.

Acknowledgements

This research is supported in part by the National Science Council of R.O.C. through Grant Number NSC 94-2745-E-155-007-URD and by the Center for Telecommunication Research of YZU. The authors also thank the anonymous reviewers for their helpful comments on this paper.



Kiu, C.-C., & Lee, C.-S. (2006). Ontology Mapping and Merging through OntoDNA for Learning Object Reusability. Educational Technology & Society, 9 (3), 27-42.

Ontology Mapping and Merging through OntoDNA for Learning Object Reusability

Ching-Chieh Kiu
Faculty of Information Technology, Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
cckiu@mmu.edu.my

Chien-Sing Lee
Faculty of Information Technology, Multimedia University, Jalan Multimedia, 63100 Cyberjaya, Malaysia
cslee@mmu.edu.my

ABSTRACT

The issue of structural and semantic interoperability among learning objects and other resources on the Internet increasingly points towards Semantic Web technologies in general, and ontology in particular, as a solution provider. An ontology defines an explicit formal specification of the domains of learning objects. However, the ability to interoperate learning objects among various learning object repositories is often reduced by the use of different ontological schemes to annotate learning objects in each repository. Hence, structural differences and semantic heterogeneity between ontologies need to be resolved in order to generate a shared ontology that facilitates learning object reusability. This paper presents OntoDNA, an automated ontology mapping and merging tool. The significance of the study lies in: an algorithmic framework for mapping the attributes of concepts/learning objects and merging these concepts/learning objects from different ontologies based on the mapped attributes; identification of a suitable threshold value for mapping and merging; an easily scalable unsupervised data mining algorithm for modeling existing concepts and predicting the cluster to which a new concept/learning object should belong; and easy indexing, retrieval and visualization of concepts and learning objects based on the merged ontology.

Keywords

Learning object, Semantic interoperability, Ontology, Ontology mapping and merging, Semantic Web

Introduction

The borderless classroom revolves not only around the breaking down of geographical and physical borders but also of cultural and knowledge borders. Realizing this vision, however, requires orienting next-generation e-learning systems to address three challenges: first, refocusing the development of e-learning systems on pedagogical foundations; second, personalizing human-computer interaction in a seamless manner amidst a seemingly faceless technological world; and third, increasing interoperability among learning objects and among technological devices (Sampson, Karagiannidis & Kinshuk, 2002).

A corresponding ontological solution that addresses all three issues is Devedzic's (2003) GET-BITS framework for constructing intelligent Web-based systems, in which ontology is applied at three levels. The first is to provide a means of specifying and representing educational content (domain knowledge as well as pedagogical strategies). The second is to formulate a common vocabulary for human-machine, human-human and software-to-software agent interaction, enabling personalized presentation of learning materials, assessment and references adapted to different student models. The third lies in the systematization of functions (Mizoguchi & Bordeau, 2000) to enable intelligent help in learning systems, especially collaborative systems.

The concerns mentioned above point towards Semantic Web technologies as solutions. The layers of Semantic Web technologies are shown in Figure 1 (Arroyo, Ding, Lara, Stollberg & Fensel, 2004). These layers appear to provide the solution to the third aspect mentioned by Sampson et al. and Devedzic, which is also the focus of this paper. At the topmost layer is the Web of trust, which relies on the proof and logic layers. The proof layer verifies the degree of trust assigned to a particular resource through security features such as digital signatures. The logic layer, in turn, is formalized by knowledge representation data, which provides ontological support for the formation of knowledge representation rules; these rules enable inferencing and the derivation of new knowledge. Basic descriptions of data (metadata), such as author, year, location and ISBN, are defined in the Dublin Core. The RDF (Resource Description Framework) schema and XML (eXtensible Markup Language) schema both provide means of standardizing metadata descriptions: RDF's object-properties-values (OAV) building block creates a semantic net of concepts, whereas XML describes grammars to represent document structure through either data type definitions (DTD) or XML schemas.


Finally, all resources are tagged with uniform resource identifiers to enable efficient retrieval. Unicode, a 16-bit character scheme, creates machine-readable codes for all languages.

Problem Definition and Background

Background to the Problem

The Semantic Web layers discussed above can be applied to the retrieval and reuse of learning objects, as shown in Figure 2 (Qin & Finneran, 2002). According to Mohan and Greer (2003), "learning objects (LO) are digital learning resources which meet a single learning objective that may be reused in a different context". Basic metadata describing a learning object are defined in the Dublin Core, followed by contextual information that allows reuse of the learning object in different contexts. However, metadata expressed in XML address only lexical issues: XML does not allow interpretation of the data in a document, and therefore supports only lexical interoperability.

Figure 1. The Semantic Web stack

Figure 2. Representation granularity

Furthermore, metadata schemas are user-defined. To address differences in metadata specifications, metadata standardization initiatives such as ADL's Sharable Content Object Reference Model (SCORM) and IEEE's Learning Object Metadata (LOM) have been introduced.

However, LO reusability is complicated by differences among LO metadata standards: for example, IEEE LOM and Dublin Core are RDF bindings, whereas SCORM is an XML binding. Hence, there are admirable efforts to map different learning object standards (Verbert, Gasevic, Jovanovic & Duval, 2005) in order to increase interoperability between learning object repositories. The significance of this work lies in the retrieval and assembly of parts of learning objects to create new learning objects suitable for different learning objectives and contexts.



On a higher layer of abstraction, additional constraints on schemas are introduced through W3C's Web Ontology Language (OWL). OWL extends RDF's OAV schemas using DAML (DARPA Agent Markup Language) and OIL (Ontology Inference Layer). OWL creates a common vocabulary by adding constraints to class instances, property values, and the domain and range of an attribute (property), and by providing interoperability functions such as sameClassAs and differentFrom (Mizoguchi, 2000). However, there are also differences among ontologies; this paper deals with semantic and structural heterogeneity.

Ontological semantic heterogeneity arises from two scenarios. In the first, ontological concepts for a domain are described with different terminologies (synonymy). For instance, in Figure 3, the terms booking and reservation, and client and customer, are synonymous but termed differently. In the second, different meanings are assigned to the same word in different contexts (polysemy). On the other hand, different taxonomies cause structural heterogeneity among ontologies (Euzenat et al., 2004; Noy & Musen, 2000; de-Diego, 2001; Ramos, 2001; Stumme & Maedche, 2001). Instances of structural differences in Figure 3 occur between the concepts airline and duration.

Figure 3. Semantic differences and structural differences between ontologies

Hence, there is a need for ontology mapping and merging. Ontology mapping involves mapping the structure and semantics describing learning objects in different repositories, whereas ontology merging integrates the initial taxonomies into a common schematic taxonomy. [As such, in this paper the term merged ontology is used interchangeably with the term shared ontology.]

Problem Statement

Four major issues constrain ontology mapping and merging efforts:
1. Semantic differences: Some ontological concepts describe similar domains with different terminology (synonymy and polysemy), resulting in overlapping domains. Mapping tools therefore need to interpret metadata descriptions from a lexical and semantic perspective to resolve the problems of synonymy and polysemy.
2. Structural differences: Structure in ontology mapping and merging refers to the structural taxonomy associating concepts (objects). Different creators annotate LOs with different ontological concepts, creating syntactical differences and necessitating merging tools able to capture the different taxonomies and merge them into a common taxonomy.
3. Scalability in ontology mapping and merging: This is critical especially for large ontological repositories, where the growth of LO repositories over the network can be explosive.
4. Lack of prior knowledge: Supervised data mining methods for ontology mapping and merging require prior knowledge, which is not always available. Unsupervised methods are thus needed as an alternative in the absence of prior knowledge.



Research Questions

We aim to design an automated and dynamic ontology mapping and merging framework and algorithm to enable syntactical and semantic interoperability among ontologies. Our research questions are:
1. Semantic interoperability: How can we capture the attributes describing concepts (objects) in order to map the attributes of concepts from different ontologies? And what is the best threshold value for automated mapping and merging between two ontological concepts, based on experiments on 4 different ontologies?
2. Structural interoperability: How can we use these captured attributes to create a shared (merged) taxonomy from different ontologies?
3. Scalability: Which unsupervised data mining technique is sufficiently easy to scale?
4. Lack of prior knowledge for knowledge management: Which unsupervised data mining technique is easy to use for modeling existing concepts in the database and for predicting the cluster to which a new concept should be assigned?

Significance of the Study

The contributions of this paper are:
1. An algorithmic framework for mapping attributes of concepts in different ontologies and for merging concepts from different ontologies based on the mapped attributes.
2. Determination of a suitable threshold value for automated mapping and merging.
3. An easily scalable unsupervised data mining technique that models existing concepts in the database and predicts the cluster to which a new concept should be assigned, enhancing the management of knowledge.
4. Easy indexing and retrieval of LOs based on the merged ontology.
5. Easy visualization of the concept space based on the formal context.

The outline of this paper is as follows. First, we present related work on ontology mapping and merging. This is followed by a discussion of the OntoDNA framework; the OntoDNA algorithm; simulation results on 4 different ontologies; and a comparison with Ehrig and Sure's (2004) integrated approach to ontology mapping in terms of precision, recall and f-measure. Next, we present how OntoDNA is applied in the CogMoLab integrated learning environment, in particular in the OntoVis authoring tool. Finally, we conclude with future directions.

Related Work on Ontology Mapping and Merging

Ontology mapping is a precursor to ontology merging. As mentioned earlier, ontology mapping is the process of finding the closest semantic and intrinsic relationships between corresponding ontological entities, such as concepts, attributes, relations and instances, in two or more existing ontologies. Ontology merging, by contrast, is the process of creating new ontologies from two or more existing ontologies with overlapping or similar parts (Klein, 2001).

Given ontology A (the preferred ontology) and ontology B, as illustrated in Figure 4, ontology mapping is used to discover the intrinsic semantics of ontological concepts between ontology A and ontology B. Concepts between ontologies may be related or unrelated. The ontology merging process examines the semantics between the ontologies and restructures the concept-attribute relations between and among concepts in order to merge the different ontologies.

Figure 4. Ontology Mapping and Merging Process



Ontology mapping and merging methods can mainly be categorized into two approaches: the concept-based approach and the instance-based approach. Concept-based approaches are top-down approaches, which consider concept information such as names, taxonomies, constraints, relations and properties of concept elements for ontology merging. Instance-based approaches, on the other hand, are bottom-up approaches, which build up the structural hierarchy from instances of concepts and instances of relations (Gomez-Perez, Angele, Fernandez-Lopez, Christophides, Stutt, & Sure, 2002). Four ontology mapping and merging systems, namely Chimaera, PROMPT, ODEMerge and FCA-Merge, are described in this section.

Chimaera is a merging and diagnostic environment for light-weight ontologies. Ontologies are merged automatically if a linguistic match is found. Otherwise, name resolution lists are generated, suggesting terms from the different ontologies to guide users through the merging process. The name resolution lists consist of the candidates to be merged and the taxonomic relationships that have yet to be merged into the existing ontology (McGuinness, Fikes, Rice, & Wilder, 2000).

Similarly, PROMPT provides semi-automatic, guided ontology merging. PROMPT is plugged into Protégé 2000, and its merging process is interactive: it guides the user through the ontology merging process, identifies matching class names, iteratively performs automatic updates, and identifies conflicts while suggesting to the user means of removing them (Noy & Musen, 2000).

ODEMerge, a fully automated merging tool, is integrated with WebODE. ODEMerge performs supervised merging of concepts, attributes and relationships from two different ontologies, using synonym and hypernym tables to generate the merged ontology. It merges ontologies with the help of corresponding information from the user, who can modify the results derived from the ODEMerge process (de-Diego, 2001; Ramos, 2001; Gomez-Perez et al., 2002).

In contrast to Chimaera, PROMPT and ODEMerge, which are top-down approaches, FCA-Merge is a bottom-up ontology merging approach using formal concept analysis and natural language processing techniques. Given the source ontologies, FCA-Merge extracts instances from a given set of domain-specific text documents by applying natural language processing techniques. The concept lattice, the structural result of FCA-Merge, is derived from the extracted instances using formal concept analysis. The result is analyzed and merged with the existing ontology by the ontology engineer (Stumme & Maedche, 2001).

Differences between OntoDNA and Related Work

OntoDNA utilizes Formal Concept Analysis (FCA) to capture attributes and the inherent structural relationships among concepts (objects) in ontologies. FCA functions as a preprocessor to the ontology mapping and merging process. Semantic problems such as synonymy and polysemy are resolved because FCA captures the structure (taxonomy) of ontological concepts as background knowledge for resolving semantic interpretations in different contexts.

Table 1. Comparison between OntoDNA and four ontology mapping and merging systems

| | Chimaera | PROMPT | ODEMerge | FCA-Merge | OntoDNA |
| Problem addressed | Merging | Mapping and merging | Merging | Merging | Mapping and merging |
| Approach | Top-down | Top-down | Top-down | Bottom-up | Top-down |
| Integrated into another ontology tool? Which? | Yes | Yes (PROMPT Suite) | Yes (WebODE) | No (it is a method) | No |
| Level of mapping | Concept definitions and slot values; taxonomies | Concept definitions and slot values; taxonomies | Concept definitions and slot values; taxonomies | Instances of concepts | Concept definitions and slot values; taxonomies |
| Knowledge representation supported | No | No | No | Lexicons | Lexicons (string matching) |
| Suggestions provided by the method | Name resolution lists are generated to guide users in the merging process | A list of suggested concepts to be merged | - | - | No |
| Methodology or techniques supported | - | - | - | Natural language processing and conceptual clustering (FCA) | Conceptual clustering (FCA); unsupervised data mining (SOM and k-means) |
| Type of output | Merged ontology | Merged ontology | Merged ontology | Merged ontology in concept lattice | Merged ontology in concept lattice |
| Level of user interaction | - | Adjusting the system suggestions | Supplying synonym and polysemy files for merging | The produced merged ontology is fine-tuned by the user | - |
| Level of automaticity | Semi-automated | Semi-automated | Fully automated | Fully automated | Fully automated |

OntoDNA employs a hybrid unsupervised clustering technique, Self-Organizing Map (SOM) with k-means (both explained in Vesanto & Alhoniemi, 2000; Kiu & Lee, 2005), to organize data and reduce the problem size prior to string matching, in order to address semantic heterogeneity in different contexts more efficiently. Unsupervised techniques require no a priori knowledge, since the clustering results are derived from the natural characteristics of the data set. SOM organizes the data, clustering similar objects together, while k-means reduces the problem size of the SOM map for efficient discovery of semantic heterogeneity.

Most of the tools mentioned above are based on syntactic and semantic matching heuristics, and user interaction is required during the merging process to generate the merged ontology. OntoDNA, in contrast, provides automated ontology mapping and merging, using unsupervised clustering techniques and a string matching metric to generate the merged ontology. Like the above tools, OntoDNA allows the user to modify system-generated choices; if this option is not exercised, the system's output is adopted automatically.

Table 1 summarizes these comparisons in terms of the problem addressed, the type of approach, the level of mapping, the level of automation, the type of output and the presence or absence of knowledge representation support in each ontology mapping and merging system.

Figure 5. OntoDNA framework for ontology mapping and merging



OntoDNA Framework and Algorithm

OntoDNA Framework

The OntoDNA automated ontology mapping and merging framework is depicted in Figure 5. The framework enables learning object reuse through Formal Concept Analysis (FCA), Self-Organizing Map (SOM) and k-means, incorporated with string matching based on the Levenshtein edit distance.

Ontological Concept

An ontology is formalized as a tuple O := (C, S_C, P, S_P, A, I), where C denotes the concepts of the ontology and S_C the hierarchy of concepts. The relationships between concepts are defined by the ontology's properties P, and S_P corresponds to the hierarchy of properties. A denotes the axioms used to infer knowledge from existing knowledge, and I the instances of concepts (Ehrig & Sure, 2004).
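To make the notation concrete, here is a minimal sketch of this tuple as a Python structure; the field encodings (sets of names, pairs for the hierarchies) are our assumptions, not the authors' implementation:

```python
# A sketch of O := (C, S_C, P, S_P, A, I); representations are illustrative.
from dataclasses import dataclass, field

@dataclass
class Ontology:
    concepts: set                                   # C
    concept_hierarchy: set                          # S_C: (sub, super) pairs
    properties: set                                 # P
    property_hierarchy: set                         # S_P: (sub, super) pairs
    axioms: list = field(default_factory=list)      # A
    instances: dict = field(default_factory=dict)   # I: concept -> instances
```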

Clustering Techniques

Formal Concept Analysis (FCA) is a conceptual clustering tool for discovering conceptual structures in data (Ganter & Wille, 1997; Lee & Kiu, 2005). To allow significant data analysis, a formal context is first defined in FCA; the concept lattice is then depicted according to this context to represent the conceptual hierarchy of the data. Table 2 shows how the formal context for learning objects is contextualized: the concepts of the ontology fill the matrix rows, while the matrix columns represent the corresponding attributes and relations of the concepts. A '1' indicates the binary relation that concept g has attribute m. The source and target ontologies are first formalized using Formal Concept Analysis, followed by semantic discovery through string matching using the Levenshtein edit distance.

Table 2. Example of a formal context for an ontology

Attributes (M): version, description, title, publishedOn, abstract, softCopyURI, softCopyFormat, softCopySize, institution, volume, organization, school, chapter, publisher, journal, counter, type, note, keyword, pages, number, booktitle, series, address, edition, author, firstAuthor, editor, relatedProject, softCopy.

Concepts (G): PhdThesis, Misc, TechReport, MastersThesis, InBook, InProceedings and InCollection each carry a '1' for every attribute above, while SoftCopy carries a '0' throughout.
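A formal context of this kind reduces to a binary incidence matrix, as in the following sketch; the concept names and the abridged attribute list are taken from Table 2, but the per-concept attribute subsets here are illustrative:

```python
# A sketch of a formal context K = (G, M, I) as a binary incidence matrix.
attributes = ["title", "author", "publisher", "school", "journal"]  # M (abridged)
concepts = {                                                        # G
    "PhdThesis":     {"title", "author", "school"},
    "InProceedings": {"title", "author", "publisher"},
    "SoftCopy":      set(),
}
# I, a subset of G x M, encoded row by row: 1 if concept g has attribute m.
incidence = {g: [int(m in ms) for m in attributes] for g, ms in concepts.items()}
for g, row in incidence.items():
    print(f"{g:14s} {row}")
```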

To accelerate the ontology mapping and merging process, the source ontology is fed to the self-organizing map (SOM). The SOM is an unsupervised neural network that clusters a data set according to similarity, compressing complex, high-dimensional data to a lower dimension, usually a two-dimensional grid, with the most similar data grouped together in the same cluster. The SOM clustering algorithm provides effective modeling, prediction and scalability for clustering data.

An unsupervised clustering technique, k-means, is then applied to the learnt SOM to reduce the problem size of the SOM cluster. K-means iteratively divides a data set into a number of clusters, minimizing an error function. To compute the optimal number of clusters k for the data set, the Davies-Bouldin validity index is used (Vesanto & Alhoniemi, 2000; Kiu & Lee, 2004). Determining the best number of clusters k via the Davies-Bouldin index provides scalability in modeling ontological concepts. Figure 6 illustrates the trained SOM, the k-means clustering of the trained SOM with k = 3, and the clustering of new ontological concepts to the SOM cluster that best matches them.
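As a concrete illustration of the k-selection step, the sketch below runs k-means for several values of k over a stand-in for the trained SOM's codebook vectors and keeps the k with the lowest Davies-Bouldin index; the random data and the library choice (scikit-learn) are our assumptions, not OntoDNA's actual implementation:

```python
# A sketch only: choosing k by the Davies-Bouldin index (lower is better).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import davies_bouldin_score

rng = np.random.default_rng(0)
codebook = rng.random((100, 30))   # placeholder for trained SOM unit vectors

best_k, best_dbi = None, np.inf
for k in range(2, 10):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(codebook)
    dbi = davies_bouldin_score(codebook, labels)
    if dbi < best_dbi:
        best_k, best_dbi = k, dbi
print(f"optimal k = {best_k}, Davies-Bouldin index = {best_dbi:.3f}")
```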

When new concepts arrive, they are clustered into the same cluster as their best matching units. The new concepts are then merged based on the semantic similarity defined by the Levenshtein edit distance, and the older version of the source ontology is dynamically updated with the new concepts.



Figure 6. (a) The trained SOM for the ontology, (b) k-means clustering of the trained SOM and (c) new ontological concepts assigned to SOM's BMU (boxed)

String Matching Metric

String matching based on the Levenshtein edit distance was found to provide the best semantic similarity metric, as demonstrated in the empirical experiment section below; we therefore apply it in OntoDNA to measure the similarity between two strings. Given two ontological elements OE1 and OE2, the edit distance between them is calculated from simple editing operations: delete, insert and replace. The result returned by string matching lies in the interval [0, 1], where 1 indicates an exact match and 0 complete dissimilarity.
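The metric is easy to state in code. The sketch below computes the edit distance by dynamic programming and normalizes it into [0, 1]; the exact normalization OntoDNA uses is not specified in the paper, so dividing by the longer string's length is our assumption:

```python
# A sketch of normalized Levenshtein similarity: 1 = identical, 0 = disjoint.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # delete
                            curr[j - 1] + 1,             # insert
                            prev[j - 1] + (ca != cb)))   # replace
        prev = curr
    return prev[-1]

def similarity(a: str, b: str) -> float:
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a.lower(), b.lower()) / max(len(a), len(b))

print(similarity("organisation", "organization"))  # high: near-identical strings
print(similarity("booking", "reservation"))        # low: synonymy is not lexical
```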

OntoDNA Algorithm

The terms used in the OntoDNA framework are first explained:
Source ontology O_S: the local data repository ontology.
Target ontology O_T: a non-local data repository ontology.
Formal contexts K_S and K_T: K_S is the formal context representation of the conceptual relationships of the source ontology O_S, while K_T is the formal context representation of the conceptual relationships of the target ontology O_T.
Reconciled formal contexts RK_S and RK_T: formal contexts with normalized intents of the source and target ontological concepts' properties.

The prototypical implementation of the automated mapping and merging process illustrated in Figure 5 above is explicated below:

Input: Two ontologies to be merged, O_S (source ontology) and O_T (target ontology).

Step 1: Ontological contextualization
The conceptual patterns of O_S and O_T are discovered using FCA. Given an ontology O := (C, S_C, P, S_P, A), O_S and O_T are contextualized using FCA with respect to the formal contexts K_S and K_T. The ontological concepts C are denoted as G (objects), and the remaining ontology elements S_C, P, S_P and A as M (attributes). The binary relation I ⊆ G × M of the formal context records which ontology elements S_C, P, S_P and A correspond to which ontological concepts C.

Step 2: Pre-linguistic processing
A similarity calculation based on the Levenshtein edit distance is applied to discover correlations between the attributes (ontological properties) of K_S and K_T. Properties whose computed similarity is at or above the threshold are persisted in the context; otherwise they are appended to the context. This reconciles the formal contexts K_S and K_T, and the resulting RK_S and RK_T are used as input for the next step.



Step 3: Contextual clustering
SOM and k-means are applied to discover the semantics of ontological concepts based on the conceptual patterns discovered in the formal contexts K_S and K_T. This process consists of two phases:
(a) Modeling and training. First, SOM is used to model the formal context RK_S to discover the intrinsic relationships between ontological concepts of the source ontology O_S. Subsequently, k-means clustering is applied to the learnt SOM to reduce the problem size of the SOM cluster to the optimal number of clusters k, based on the Davies-Bouldin validity index.
(b) Testing and prediction. In this phase, new concepts from the target ontology O_T are placed by SOM's best-matching unit (BMU): the BMU clusters the formal concepts of RK_T into their appropriate clusters without any need for prior knowledge of the internal ontological concepts.

Step 4: Post-linguistic processing
The clusters containing target ontological concepts are evaluated with the Levenshtein edit distance to discover the semantic similarity between the ontological concepts within each cluster. If the similarity value between ontological concepts is at or above the threshold, the target ontological concepts are dropped from the context (since they are similar to the source ontological concepts) and the binary relations I ⊆ G × M are automatically updated in the formal context. Otherwise, the target ontological concept is merged with the source ontology. Finally, a compounded formal context is generated.

Output: The merged ontology, in the form of a concept lattice.

Mapping and Merging of Ontology O_T to Ontology O_S

Mapping between the source ontology O_S and the target ontology O_T is needed before merging O_T with O_S. Mapping between ontological elements (concepts or properties) is required to resolve semantic overlaps between O_S and O_T. To perform the mapping, each ontological element in the source ontology O_S is examined against each ontological element in the target ontology O_T; hence the mapping algorithm of OntoDNA runs in O(nm) time, where n and m are the numbers of source and target ontological elements. The structure and naming conventions of the source ontology O_S are preserved throughout the mapping and merging process.

OntoDNA addresses two types of mapping: the matched case and the unmatched case (or non-exact match). A matched case occurs in one-to-one mapping, where a source ontological element correlates with exactly one target ontological element. The unmatched case (or non-exact match) exists when:
1. there is no semantic overlap, i.e. a source ontological element has no correlation with any target ontological element (no mapping); or
2. there are semantic overlaps among several elements, i.e. a source ontological element correlates with more than one target ontological element (one-to-many mapping).

In OntoDNA, a simple mapping algorithm and a complex mapping algorithm are adopted to map the target ontology O_T to the source ontology O_S. The simple mapping algorithm handles one-to-one mappings as well as cases with no semantic overlap, using lexical similarity to perform mapping and merging between ontologies. It is outlined as follows (a code sketch follows the outline):

Given a source ontological element OelementS_i and a target ontological element OelementT_j, apply the lexical similarity measure (LSM) to map the target ontology O_T to the source ontology O_S at threshold value t, where i, j = 1, 2, 3, ..., n:
a) map(OelementT_j → OelementS_i) if LSM(OelementS_i, OelementT_j) ≥ t;
b) the target ontological element OelementT_j is then mapped to (integrated with) the source ontological element OelementS_i, and the naming convention and structure of OelementS_i are preserved;
c) merge(OelementT_j → O_S) if LSM(OelementS_i, OelementT_j) < t;
d) the target ontological element OelementT_j is then merged with (appended to) the source ontology, and its own naming convention and structure are preserved.
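A minimal sketch of this one-to-one rule, reusing the similarity function sketched earlier; representing elements as plain strings is a simplification of ours:

```python
# A sketch only: threshold-based map-or-merge of target elements into a
# source ontology, preserving source names for mapped elements.
# Assumes similarity() from the earlier Levenshtein sketch.
def simple_map(source_elements, target_elements, t=0.8):
    merged = list(source_elements)
    mappings = {}
    for te in target_elements:
        best = max(source_elements, key=lambda se: similarity(se, te))
        if similarity(best, te) >= t:
            mappings[te] = best      # mapped: source naming/structure preserved
        else:
            merged.append(te)        # no match: append the target element
    return merged, mappings

merged, mappings = simple_map(["author", "title"], ["writer", "title"])
print(merged, mappings)              # "title" maps; "writer" is appended
```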

A complex mapping algorithm handles one-to-many mapping, where multiple matches between target ontological elements and a source ontological element are resolved based on the relative frequency of instances of the ontological elements. It is outlined as follows (a code sketch follows the example):



Given a source ontological element OelementS_i with instances IelementS_i, and target ontological elements OelementT_j with instances IelementT_j, the lexical similarity measure (LSM) and the relative frequency of instances (f_I) are applied to map the target ontology O_T to the source ontology O_S at threshold value t, where i, j = 1, 2, 3, ..., n:
a) map(OelementT_j → OelementS_i) if LSM(OelementS_i, OelementT_j) ≥ t AND LSM(OelementS_i, OelementT_j+1) ≥ t, where LSM(OelementS_i, OelementT_j) = LSM(OelementS_i, OelementT_j+1), and f_I(IelementS_i, IelementT_j) > f_I(IelementS_i, IelementT_j+1);
the target ontological element OelementT_j is then mapped to the source ontological element OelementS_i, and the naming convention and structure of OelementS_i are preserved.
For example, given threshold value t = 0.8, target ontological elements B, C and D, and source ontological element X, consider the candidate mappings map(X → B), map(X → C) and map(X → D). If LSM = 0.8678 (above the threshold) for all three, and the relative instance frequencies f_I of the mappings are 7, 8 and 5 respectively, then C (the highest-frequency match) is mapped to the source ontological element X.
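The one-to-many resolution can be sketched as below; instance_counts is a hypothetical mapping from candidate elements to their relative instance frequencies with the source element, and the demo threshold is lowered so that the single-letter names of the paper's example all "match":

```python
# A sketch only: resolve one-to-many matches. Among candidates at or above
# the threshold, equal similarities are broken by higher instance frequency.
# Assumes similarity() from the earlier Levenshtein sketch.
def resolve_one_to_many(source_element, candidates, instance_counts, t=0.8):
    scored = [(c, similarity(source_element, c)) for c in candidates]
    scored = [(c, s) for c, s in scored if s >= t]
    if not scored:
        return None          # no mapping: the element will be merged instead
    return max(scored, key=lambda cs: (cs[1], instance_counts[cs[0]]))[0]

# Mirrors the paper's example: B, C, D tie on similarity; C has frequency 8.
print(resolve_one_to_many("X", ["B", "C", "D"],
                          {"B": 7, "C": 8, "D": 5}, t=0.0))   # -> "C"
```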

Empirical Experiment

The objectives of this experiment are:
a) to investigate which lexical measure, i.e. string matching, WordNet matching or a combination of the two, provides more accurate semantic similarity results. The Levenshtein edit distance is used for string matching (Cohen, Ravikumar & Fienberg, 2003) and the Leacock-Chodorow measure for WordNet linguistic matching (Pedersen, Patwardhan & Michelizzi, 2004);
b) to identify the best threshold value for semantic similarity discovery, so as to automate the ontology mapping and merging process.

The mapping results generated by OntoDNA are compared against human mappings to evaluate the precision of the system. For this experiment, threshold values from 0.6 to 1.0 were evaluated. Other comparative experimental results highlighting OntoDNA's degree of accuracy are found in Kiu & Lee (in press).

Data Sets

Four pairs of ontologies were used for evaluation. These ontologies and the human mapping results can be obtained from http://www.aifb.uni-karlsruhe.de/WBS/meh/mapping/. The details of the paired ontologies are summarized in Table 3.

Pairs 1, 2 and 3 are SWRC (Semantic Web Research Community) ontologies, which describe the domain of universities and research. SWRC1a contains about 300 entities, including concepts, properties and instances, while SWRC1b, SWRC1c and SWRC1d are small ontologies, each containing about 20 entities.

The Pair 4 ontologies describe Russia. Each contains about 300 entities; they were created by students to represent the content of two independent travel websites about Russia.

Table 3. Ontologies used in the experiment

| Experiment | Ontology 1 | Ontology 2 | # Concepts | # Properties | # Total | Manual Mapping |
| Pair 1 | SWRC1a | SWRC1b | 62 | 143 | 205 | 9 |
| Pair 2 | SWRC1a | SWRC1c | 59 | 144 | 203 | 6 |
| Pair 3 | SWRC1a | SWRC1d | 60 | 143 | 203 | 3 |
| Pair 4 | Russian2a | Russian2b | 225 | 122 | 347 | 215 |

Evaluation Metrics

Information retrieval metrics, namely precision, recall and f-measure, are used to evaluate the mapping results from OntoDNA against the human mappings (Do, Melnik, & Rahm, 2002). The objective and formula for each metric are given below:



Precision measures the number of correct mappings found against the total number of retrieved mappings:

precision = (number of correct mappings found) / (number of retrieved mappings)    (Eq. 1)

Recall measures the number of correct mappings found against the total number of existing mappings:

recall = (number of correct mappings found) / (number of existing mappings)    (Eq. 2)

F-measure combines precision and recall into a single efficiency measure:

f-measure = (2 × precision × recall) / (precision + recall)    (Eq. 3)
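A minimal sketch of Eqs. 1-3, treating each mapping as a (source, target) pair and comparing retrieved mappings against the human gold standard; the example pairs are invented:

```python
# A sketch only: precision, recall and f-measure over mapping pairs.
def evaluate(retrieved: set, gold: set):
    correct = len(retrieved & gold)
    precision = correct / len(retrieved) if retrieved else 0.0
    recall = correct / len(gold) if gold else 0.0
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f

gold = {("author", "writer"), ("title", "title"), ("school", "institution")}
retrieved = {("author", "writer"), ("title", "title"), ("note", "comment")}
print(evaluate(retrieved, gold))   # approximately (0.667, 0.667, 0.667)
```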

Results and Discussion

The results of each semantic similarity approach are presented in Table 4. In terms of recall and precision, the Levenshtein edit distance yields better results than the Leacock-Chodorow similarity measure and the combined WordNet-string matching approach in discovering correlations between two candidate mappings. String matching using the Levenshtein edit distance achieves an accuracy rate of 93.33% relative to the human experts' assessment, as shown in Figure 7.

Table 4. Comparison of lexical measures for the 4 paired ontologies at thresholds 0.6 to 1.0

String Matching Measure (SM):
| | t=0.6 | t=0.7 | t=0.8 | t=0.9 | t=1.0 |
| Pair 1 Precision | 0.6667 | 0.8000 | 1.0000 | 1.0000 | 1.0000 |
| Pair 1 Recall | 0.4444 | 0.4444 | 0.2222 | 0.5556 | 0.2222 |
| Pair 1 F-Measure | 0.5333 | 0.5714 | 0.3636 | 0.7143 | 0.3636 |
| Pair 2 Precision | 0.7500 | 0.6667 | 0.7500 | 0.7500 | 0.7500 |
| Pair 2 Recall | 0.5000 | 0.3333 | 0.5000 | 0.5000 | 0.5000 |
| Pair 2 F-Measure | 0.6000 | 0.4444 | 0.6000 | 0.6000 | 0.6000 |
| Pair 3 Precision | 0.5000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Pair 3 Recall | 0.6667 | 0.6667 | 0.3333 | 0.3333 | 0.3333 |
| Pair 3 F-Measure | 0.5714 | 0.8000 | 0.5000 | 0.5000 | 0.5000 |
| Pair 4 Precision | 1.0000 | 1.0000 | 0.9831 | 0.9752 | 0.9762 |
| Pair 4 Recall | 0.1767 | 0.1814 | 0.5395 | 0.5488 | 0.5721 |
| Pair 4 F-Measure | 0.3004 | 0.3071 | 0.6967 | 0.7024 | 0.7214 |
| Average Precision | 0.7292 | 0.8667 | 0.9333 | 0.9313 | 0.9315 |
| Average Recall | 0.4470 | 0.4065 | 0.3988 | 0.4844 | 0.4069 |
| Average F-Measure | 0.5013 | 0.5307 | 0.5401 | 0.6292 | 0.5463 |

WordNet Matching Measure (WN):
| | t=0.6 | t=0.7 | t=0.8 | t=0.9 | t=1.0 |
| Pair 1 Precision | 0.3333 | 0.5714 | 0.6667 | 1.0000 | 1.0000 |
| Pair 1 Recall | 0.2222 | 0.4444 | 0.4444 | 0.2222 | 0.4444 |
| Pair 1 F-Measure | 0.2667 | 0.5000 | 0.5333 | 0.3636 | 0.6154 |
| Pair 2 Precision | 0.7500 | 0.8000 | 0.7500 | 0.6667 | 0.8000 |
| Pair 2 Recall | 0.5000 | 0.6667 | 0.5000 | 0.3333 | 0.6667 |
| Pair 2 F-Measure | 0.6000 | 0.7273 | 0.6000 | 0.4444 | 0.7273 |
| Pair 3 Precision | 0.2857 | 0.3333 | 0.2857 | 1.0000 | 1.0000 |
| Pair 3 Recall | 0.6667 | 0.6667 | 0.6667 | 0.3333 | 0.3333 |
| Pair 3 F-Measure | 0.4000 | 0.4444 | 0.4000 | 0.5000 | 0.5000 |
| Pair 4 Precision | 0.7778 | 0.7500 | 0.6667 | 0.9091 | 1.0000 |
| Pair 4 Recall | 0.0326 | 0.0279 | 0.0186 | 0.0465 | 0.0465 |
| Pair 4 F-Measure | 0.0625 | 0.0538 | 0.0362 | 0.0885 | 0.0889 |
| Average Precision | 0.5367 | 0.6137 | 0.5923 | 0.8939 | 0.9500 |
| Average Recall | 0.3554 | 0.4514 | 0.4074 | 0.2339 | 0.3727 |
| Average F-Measure | 0.3323 | 0.4314 | 0.3924 | 0.3491 | 0.4829 |

Combined Measure (SM + WN):
| | t=0.6 | t=0.7 | t=0.8 | t=0.9 | t=1.0 |
| Pair 1 Precision | 0.5000 | 0.5714 | 0.5714 | 1.0000 | 1.0000 |
| Pair 1 Recall | 0.4444 | 0.4444 | 0.4444 | 0.2222 | 0.2222 |
| Pair 1 F-Measure | 0.4706 | 0.5000 | 0.5000 | 0.3636 | 0.3636 |
| Pair 2 Precision | 0.8000 | 0.8000 | 0.6000 | 0.7500 | 0.6667 |
| Pair 2 Recall | 0.6667 | 0.6667 | 0.5000 | 0.5000 | 0.3333 |
| Pair 2 F-Measure | 0.7273 | 0.7273 | 0.5455 | 0.6000 | 0.4444 |
| Pair 3 Precision | 0.2857 | 0.1667 | 0.2857 | 1.0000 | 1.0000 |
| Pair 3 Recall | 0.6667 | 0.3333 | 0.6667 | 0.3333 | 0.6667 |
| Pair 3 F-Measure | 0.4000 | 0.2222 | 0.4000 | 0.5000 | 0.8000 |
| Pair 4 Precision | 1.0000 | 0.9231 | 1.0000 | 0.9524 | 0.9643 |
| Pair 4 Recall | 0.0465 | 0.0558 | 0.0372 | 0.1860 | 0.2512 |
| Pair 4 F-Measure | 0.0889 | 0.1053 | 0.0717 | 0.3113 | 0.3985 |
| Average Precision | 0.6464 | 0.6153 | 0.6143 | 0.9256 | 0.9077 |
| Average Recall | 0.4561 | 0.3751 | 0.4121 | 0.3104 | 0.3683 |
| Average F-Measure | 0.4217 | 0.3887 | 0.3793 | 0.4437 | 0.5017 |

Figure 7. Average values of lexical measures


Generally, a threshold value of 0.8 or above improves OntoDNA's precision and recall. Based on the individual ontology mapping performance, a threshold value of 0.8 provides the best precision for ontology Pairs 1, 2 and 3. The recall values, however, peak at a threshold of 0.7, as evidenced by ontology Pairs 1 and 3. Graphical representations of the experimental results for the Levenshtein edit distance measure are shown in Figures 8, 9 and 10.
edit distance measure is shown in Figure 8, Figure 9 and Figure 10.<br />

Figure 8. Precision, recall and f-measure of the paired ontologies at threshold values 0.6 to 1.0

Figure 9. Average mapping results based on the threshold value


Figure 10. Improvement in matching accuracy of mapped ontologies

The average mapping results for all four pairs (Figure 9) show that a threshold value of 0.8 generates the best precision and recall, and contributes to improvements in the f-measure and matching accuracy of the mapping (Figure 10). Therefore, a threshold value of 0.8 is adopted to automate the ontology mapping and merging process. Further experiments are needed, however, to justify this threshold value for ontological domains beyond the academic domain.

Comparative Evaluation of OntoDNA and Ontology Mapping (An Integrated Approach)

The statistics for Ehrig and Sure's (2004) ontology mapping on measures at cut-off, using a neural net similarity strategy, are extracted and compared with those of OntoDNA in terms of precision, recall and f-measure, as shown in Table 5 below. The statistics indicate the best results obtained in terms of precision among the metric measures and similarity strategies.

Table 5. Summary of precision, recall and f-measure

| | OntoDNA Precision | OntoDNA Recall | OntoDNA F-Measure | Ehrig & Sure (2004) Precision | Ehrig & Sure (2004) Recall | Ehrig & Sure (2004) F-Measure |
| SWRC | 0.9167 | 0.4630 | 0.6048 | 0.7500 | 0.6667 | 0.7059 |
| Russia2 | 0.9752 | 0.5488 | 0.7024 | 0.7763 | 0.2822 | 0.4140 |

OntoDNA provides better precision than Ehrig and Sure's ontology mapping tool (Figure 11), and shows significant improvement in recall and f-measure for the Russia2 ontology. This is evidence that OntoDNA can effectively address structural complexities and differing ontological semantics. It is also noted that precision and recall are inversely related: an increase in precision tends to produce a decrease in recall, and vice versa (Soboroff, 2005). There is thus a tradeoff between precision and recall, and it is up to the designer to decide on a suitable level of tradeoff.

Figure 11. Comparison between OntoDNA and Ehrig and Sure's ontology mapping


Learning Object Interoperability in CogMoLab with OntoDNA

CogMoLab is an integrated learning environment comprising the OntoID authoring tool (Lee & Chong, 2005), the OntoVis authoring/visualization tool (Lee & Lim, 2005), the Merlin agent-assisted collaborative concept map (Lee & Kuan, 2005) and the OntoDNA (formerly named OntoShare) ontology mapping and merging tool (Lee & Kiu, 2005). In CogMoLab, the student model (Lee, in press) forms the base component for intelligent adaptation, with the applications offering interactive environments that support students' learning tasks and instructors' design tasks. Currently, we plan to use OntoDNA to enable retrieval of resources from external repositories to enrich the resources in OntoID, Merlin and OntoVis (Figure 12).

Figure 12. General view of learning object interoperability with OntoDNA

Figure 13. Visualization of concepts through OntoVis

Concepts and instances in OntoVis (Lim & Lee, 2005) are designed based on the formal context of Formal Concept Analysis. Currently, concepts and attributes are keyed in manually by the instructor into OntoVis’ formal context and visualized as shown in Figure 13. Hence, we also aim to enable visualization of the merged ontology through OntoVis.
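As background, a formal context in FCA is a binary object-attribute relation, and a formal concept is a pair (extent, intent) closed under the two derivation operators. The toy sketch below (Python) illustrates this; the context is invented for illustration and is not OntoVis’ internal representation.

# Toy formal context: objects (rows) x attributes (columns), invented for illustration
context = {
    "LectureNotes": {"text", "java"},
    "Applet":       {"interactive", "java"},
    "Quiz":         {"interactive", "text"},
}

def extent(attrs):
    # All objects possessing every attribute in attrs
    return {o for o, a in context.items() if attrs <= a}

def intent(objs):
    # All attributes shared by every object in objs
    sets = [context[o] for o in objs]
    return set.intersection(*sets) if sets else set()

# A formal concept is a closed pair: extent(intent(X)) == X
objs = extent({"java"})
print(sorted(objs), sorted(intent(objs)))  # ['Applet', 'LectureNotes'] ['java']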



Conclusion<br />

This paper has described the OntoDNA framework for automated ontology mapping and merging and for the dynamic update of new ontological concepts in an existing knowledge base or database. The utilization of OntoDNA to interoperate ontologies for learning object retrieval and reuse from local and external learning object repositories in CogMoLab has been explained. The merged ontology can be visualized through the OntoVis authoring/visualization tool in the form of a composited concept lattice.

For future work, we will experiment with the discovery of instances from the ontological clusters through SOM k-means to enable more efficient queries. We also intend to evaluate the performance of OntoDNA on other ontological domains in terms of matching accuracy and threshold value.

At the ASEAN seminar on e-learning, participating representatives agreed to share knowledge and expertise with regard to human resource development through Information and Communications Technologies (ICT). One of the means for sharing knowledge is the establishment of an ASEAN repository of learning objects. We hope that OntoDNA will be able to contribute towards interoperability among standards and schemas and enrich the teaching and learning process, especially with the sharing of various cultures, not only in ASEAN but also with communities of practice in other parts of the world.

References<br />

Arroyo, S., Ding, Y., Lara, R., Stollberg, M., & Fensel, D. (2004). Semantic Web Languages - Strengths and<br />

Weakness. International Conference in Applied computing (IADIS04), Lisbon, Portugal.<br />

Cohen, W., Ravikumar, P., & Fienberg, S. (2003). A Comparison of String Distance Metrics for Name-matching tasks. In IIWeb Workshop held in conjunction with IJCAI, retrieved June 20, 2006 from http://www.cs.cmu.edu/~wcohen/postscript/ijcai-ws-2003.pdf.

de-Diego, R. (2001). Método de mezcla de catálogos electrónicos, Final Year Project, Facultad de Informática<br />

de la Universidad Politécnica de Madrid. Spain.<br />

Devedzic, V. (2003). Key issues in next-generation Web-based education. IEEE Transactions on Systems - Man,<br />

and Cybernetics, Part C, 33 (3), 339-349.<br />

Do, H., Melnik, S., & Rahm, E. (2002). Comparison of schema matching evaluations. In Proceedings of the 2nd International Workshop on Web Databases (German Informatics Society).

Ehrig, M., & Sure, Y. (2004). Ontology Mapping - An Integrated Approach. Lecture Notes in Computer Science,<br />

3053, 76-91.<br />

Euzenat, J., Bach, T. L., Barrasa, J., Bouquet, P., Maynard, D., Stamou, G., Stuckenschmidt, H., Zaihrayeu, H., Hauswirth, M., Ehrig, M., Jarrar, M., Shvaiko, P., Dieng-Kuntz, R., Hernández, R. L., Tessaris, S., & Acker, S. V. (2004). D2.2.3: State of the art on current alignment techniques, retrieved April 5, 2006 from http://knowledgeweb.semanticweb.org/.

Ganter, B., & Wille, R. (1997). Applied Lattice Theory: Formal Concept Analysis, retrieved June 10, 2006 from http://citeseer.ist.psu.edu/rd/0%2C292404%2C1%2C0.25%2CDownload/http://citeseer.ist.psu.edu/cache/papers/cs/14648/http:zSzzSzwww.math.tu-dresden.dezSz%7EganterzSzconcept.pdf/ganter97applied.pdf.

Gomez-Perez, A., Angele, J., Fernandez-Lopez, M., Christophides, V., Stutt, A., & Sure, Y. (2002). A survey on<br />

ontology tools, OntoWeb deliverable 1.3. Universidad Politecnia de Madrid.<br />

Kiu, C. C., & Lee, C. S. (2004). Discovering Ontological Semantics using FCA and SOM, In Proceedings of<br />

M2USIC2004, Cyberjaya, Malaysia, 7-8 October 2004.<br />

Kiu, C. C., & Lee, C. S. (2005). Discovering Ontological Semantics for Reuse and Sharing of Learning Objects in a Contextual Learning Environment. The 5th International Conference on Advanced Learning Technology (ICALT2005), July 8-9, 2005, Kaohsiung, Taiwan.

Kiu, C. C., & Lee, C. S. (2006). A Data Mining Approach for Managing Shared Ontological Knowledge. The 6th International Conference on Advanced Learning Technology, July 5-7, 2006, Kerkrade, The Netherlands.



Klein, M. (2001). Combining and relating ontologies: an analysis of problems and solutions. In Proceedings of the 17th International Joint Conference on Artificial Intelligence (IJCAI-01), Workshop: Ontologies and Information Sharing, USA.

Lee, C. S. (in press). Diagnostic, predicting and compositional modeling with data mining in integrated learning<br />

environments. Computers & Education, Elsevier.<br />

Lee, C. S., & Chong, H. R. (2005). Synergistic design considerations for continuing education: Refocusing on<br />

instructional design. WSEAS Transactions on Advances in Engineering Education, 2 (4), 294-304.<br />

Lee, C. S., & Kiu, C. C. (2005). A concept-based graphical-neural approach to ontological interoperability. WSEAS Transactions on Information Systems and Applications, 2 (6), 761-770.

Lee, C. S., & Kuan, C. L. (2005). Intelligent scaffolds for collaborative concept mapping, WSEAS Transactions<br />

on Information Systems and Applications, 2 (8), 1157-1166.<br />

Lee, C. S., & Lim, W. C. (2005). Visualization for course modeling, navigation and retrieval: Towards<br />

contextual learning and cognitive affordance. WSEAS Transactions on Advances in Engineering Education, 2<br />

(4), 347-355.<br />

Lim, W. C., & Lee, C. S. (2005). Knowledge discovery through composited visualization, navigation and<br />

retrieval. Lecture Notes in Artificial Intelligence, 3735, 376-378.<br />

McGuinness, D. L., Fikes, R., Rice, J., & Wilder, S. (2000). An environment for merging and testing large ontologies. In Proceedings of the 7th International Conference on Principles of Knowledge Representation and Reasoning (KR’2000), Breckenridge, Colorado, USA, 483-493.

Mizoguchi, R., & Bordeau, J. (2000). Using ontological engineering to overcome AIED problems. Journal of<br />

Artificial Intelligence in Education, 11, 107-121.<br />

Mohan, P., & Greer, J. (2003). Reusable Learning Objects: Current Status and Future Directions. In D. Lassner & C. McNaught (Eds.), Proceedings of ED-MEDIA 2003 World Conference on Educational Multimedia, Hypermedia and Telecommunication, Honolulu, Hawaii, USA.

Noy, N. F., & Musen, M. (2000). PROMPT: Algorithm and Tool for Automated Ontology Merging and Alignment. In Proceedings of the 17th National Conference on Artificial Intelligence (AAAI’00), Austin, USA.

Qin, J., & Finneran, C. (2002). Ontological representation for learning objects. In Proceedings of the Workshop on Documents Search Interface Design and Intelligent Access in Large-Scale Collections, JCDL’02, Portland, OR, July 18, 2002.

Pedersen, T., Patwardhan, S., & Michelizzi, J. (2004). WordNet::Similarity - Measuring the Relatedness of Concepts. In Proceedings of the 19th National Conference on Artificial Intelligence (AAAI-04), San Jose, CA.

Ramos, J. A. (2001). Mezcla automática de ontologías y catálogos electrónicos. Final Year Project. Facultad de<br />

Informática de la Universidad Politécnica de Madrid. Spain.<br />

Soboroff, I. (2005). IR Evaluation, retrieved December 2, 2005 from<br />

http://www.csee.umbc.edu/~ian/irF02/lectures/09Evaluation.pdf.<br />

Sampson, D., Karagiannidis, C., & Kinshuk (2002). Personalized Learning: Educational, Technology and Standardization Perspective. Interactive Educational Multimedia, 4, 24-39.

Stumme, G., & Maedche, A. (2001). FCA-Merge: A Bottom-Up Approach for Merging Ontologies. In IJCAI ’01 - Proceedings of the 17th International Joint Conference on Artificial Intelligence, Morgan Kaufmann, USA.

Verbert, K., Gasevic, D., Jovanovic, J., & Duval, E. (2005). Ontology-based Learning Content Repurposing.<br />

WWW2005, Chiba, Japan, May 10-14, 2005.<br />

Vesanto, J., & Alhoniemi, E. (2000). Clustering of the Self-Organizing Map. IEEE Transactions on Neural Networks, 11 (3), 586-600.



Suhonen, J., & Sutinen, E. (2006). FODEM: developing digital learning environments in widely dispersed learning communities. Educational Technology & Society, 9 (3), 43-55.

FODEM: developing digital learning environments in widely dispersed<br />

learning communities<br />

Jarkko Suhonen and Erkki Sutinen<br />

Department of Computer Science, University of Joensuu, Finland, P.O.BOX 111, FI-80101 Joensuu, Finland<br />

jarkko.suhonen@cs.joensuu.fi<br />

erkki.sutinen@cs.joensuu.fi<br />

ABSTRACT<br />

FODEM (FOrmative DEvelopment Method) is a design method for developing digital learning<br />

environments for widely dispersed learning communities. These are communities in which the geographical<br />

distribution and density of learners is low when compared to the kind of learning communities in which<br />

there is a high distribution and density of learners (such as those that exist in urban areas where courses are<br />

developed and taken by hundreds or thousands of learners who are simultaneously present in the area).<br />

Since only limited resources can be allocated for the design of a digital learning environment for widely<br />

dispersed learning communities, it is necessary to use what limited funds are available to obtain valid<br />

feedback from stakeholders and to utilize such feedback in an optimal way. In terms of the FODEM model,<br />

the design process consists of asynchronous development threads with three interrelated components: (1)<br />

needs analysis, (2) implementation, and (3) formative evaluation. In needs analysis, both theory and<br />

practice are used to define specifications for the environment. In implementation, fast prototyping in<br />

authentic learning settings is emphasized. Finally, formative evaluation is used to evaluate the use of the<br />

environment within the thread. FODEM has been applied to develop ViSCoS (Virtual Studies of Computer<br />

Science) online studies and the LEAP (LEArning Process companion) digital learning tool in the rural regions of Finland, where the population is geographically widely dispersed.

Keywords<br />

Design Methods, Digital Learning Environments, Online Learning, Formative Evaluation<br />

Introduction<br />

Digital learning environments (DLEs) are technical solutions for supporting learning, teaching and studying<br />

activities (Suhonen, 2005). A digital learning environment can be educational software, a digital learning tool, an<br />

online study program or a learning resource. A digital learning environment may thus consist of a combination<br />

of different technical solutions. A DLE may thus be used as the basis for an e-learning program (Anohina, 2005).<br />

The development of effective DLEs is not an easy task. The challenge when developing DLEs is for us to use

technology ingeniously and creatively to solve problems and meet the needs that arise in various technological,<br />

educational and cultural contexts (Kähkönen et al., 2003). The best design methods are those that help designers<br />

to develop innovative and effective solutions by clearly depicting the most important procedures and aspects of<br />

the development process (Design-Based Research Collective, 2003).<br />

A widely dispersed learning community refers to a student population that is thinly distributed over a relatively<br />

large geographical region or throughout a long period of time. Since widely dispersed learning communities like<br />

this are restricted by cultural, geographical or temporal factors, they tend to be relatively small in number. A<br />

widely dispersed learning community might, for instance, consist of 30 students who live in an area with a radius<br />

of 200 kilometers and who are learning Java programming. Such characteristics have two consequences: firstly,<br />

a thinly distributed community needs outstandingly accessible DLEs because the students live too far away from<br />

one another to offer or receive assistance, and, secondly, not even the sum of fees collected from such a small<br />

number of students can finance excellent DLEs of the kind we are contemplating. Such difficulties have given<br />

rise to poorly designed ad hoc DLEs. Table 1 presents typical differences between widely dispersed and dense

learning communities. The United Kingdom Open University (UKOU) is an example of a dense learning<br />

community where the student numbers are reasonably high. For instance, the UKOU offers a full range of<br />

degrees and it has over 200,000 students. According to Castro et al. (2001) and Bork and Gunnarsdottir (2001),<br />

the UKOU has a full-scale preparation system for the courses. Several years and millions of euros are invested to<br />

make high-quality learning products that are used over several years. Evaluations and course improvements are often conducted on the final version of a DLE.

This paper explains how we applied FODEM (FOrmative DEvelopment Method) to create effective DLEs for the Finnish context, which is well known for its widely dispersed learning communities. Apart from meeting the

needs of a widely dispersed learning community, FODEM has to prove itself as a practicable DLE design<br />

method. Whatever method is used, it has to be responsive to the diversity of learners’ needs and situations, to<br />



whatever technologies are available, and to the cultural idiosyncrasies of the learning context (Soloway et al.,<br />

1996). The lack of unified theories for functional specifications of DLEs can make the design situation<br />

unpredictable (Moonen, 2002). Such uncertainty gives rise, for instance, to the need for a formative design

process. This is especially applicable when the whole design situation is new, and the requirements and<br />

expectations of the developed environment can easily change during development (Fallman, 2003).<br />

Understanding the core problems and needs of learners is a crucial aspect of the development process.<br />

Table 1. Features of widely dispersed (sparse) and dense learning communities

                            Sparse                                          Dense
Design costs per course     Not more than US$10,000                         US$100,000 or more
Number of students          Fewer than 100                                  From several hundred up to thousands
Design team                 A few people with multi-disciplinary skills     10-20 specialists
                            and several responsibilities
Development time            2-4 person months                               1-2 years
Primary use of technology   Information processing                          Information delivery
Production                  Tailor-made                                     Mass production
Feasible DLEs               Multipurpose digital learning tools             Re-usable learning objects

In this paper we describe how we applied FODEM to two DLE development scenarios – ViSCoS and LEAP.<br />

ViSCoS is an online study program for first-year university-level courses in computer science. The ViSCoS<br />

curriculum consists of three main areas: the preliminaries of ICT, the basics of programming with Java, and an<br />

introduction to computer science (Sutinen & Torvinen, 2001). Development of ViSCoS began in 2000.

Between 2000 and 2005, a total of 109 high school students completed the program. LEAP is a digital learning<br />

tool that was developed in response to challenges that arose during the ViSCoS Programming Project course<br />

(Suhonen & Sutinen, 2004). The main functions of LEAP are a digital learning portfolio and creative problem<br />

solving support (Suhonen, 2005).<br />

Formative development method – FODEM<br />

Thread as a core structure<br />

The most basic concept in FODEM is the thread, which represents an identified developmental theme within a DLE development process. Such a theme represents a particularly important area or need that has to be addressed if a learning process and its outcomes are to be improved by a DLE. An example of a thread is the design process of a crucial component in a system of DLEs, such as a visualization tool in a programming course. We use the term thread rather than stage or phase because a FODEM development process consists of several parallel, asynchronous threads. The thread structure differentiates the FODEM method from, say, many general software design methods, which proceed linearly, stage by stage (Whitten et al., 2004).

Threads in a development process are asynchronous and can progress independently of one another. Thread structures are used because they represent the most important aspects of the development process; in spite of this, threads do not necessarily reflect all aspects of a development process. Because resources for design and implementation in a widely dispersed learning community context are limited, it is better to focus on the most important features (the threads) and undertake quality work on them rather than create a flat learning environment in which one cannot penetrate deeply into any given area without doing so at the cost of other areas. The developers of FODEM were inspired in this regard by jagged study zones, which are a cross between open learning environments and intelligent tutoring systems (Gerdt et al., 2002).

A thread has three interdependent, dialectic components: needs analysis (NA), implementation (I) and formative<br />

evaluation (FE), and it can include several instances of such components. Each thread has a certain development<br />

theme which aligns the goals of a thread and relates the components to one another. The line (left thread in<br />

Figure 1) shows an indefinite interaction among components. One of the main principles of FODEM is to<br />

gradually develop the DLE on the basis of the feedback received from the stakeholders (students, teachers,<br />

administrative staff and designers alike). It is important to note that more threads can be added into the process<br />

on the basis of feedback received. In the beginning, there might only be one thread and a need to implement the<br />



first version of a digital learning environment as quickly as possible. As a formative design method, FODEM emphasizes a research orientation throughout the entire lifecycle of a DLE. Various thread types can also be

identified. In a cycle of development thread type, the components within a thread have an iterative dependency.<br />

With a cycle of development thread, one can, for instance, represent how a certain development theme<br />

progresses clearly in linear or spiral form. This means that typically a cycle of development thread includes<br />

several instances of components. Figure 1 illustrates how two threads might work together in a development<br />

process. The thread on the left illustrates a cycle of a development thread.<br />

Figure 1. Two threads in a development process: a cycle of development thread covering the first designs of courses (left) and a single-course thread focusing on a special case (right), each with NA, I and FE components

In a needs analysis, designers identify the needs and pedagogical objectives of the DLE and the consequent<br />

requirements of a design context. A needs analysis includes the definition of the main concepts, roles and desired<br />

goals of the environment in terms of the thread’s theme. In the early stages of development, the most urgent<br />

tasks are given the highest priority. The definition of pedagogical objectives can be grounded on both theory and<br />

practice. Learning theories, for example, can be a useful resource for suggesting various approaches and ideas to<br />

the designer. But design solutions may also be the product of human innovation and ingenuity, particularly in<br />

circumstances where no relevant theories exist. The requirements of the design context may also be constructed<br />

by extrapolating from experiences of successful practical developments in similar situations in the past. Needs

analysis may also include an analysis of formative evaluations in other threads.<br />

Table 2. Summary of the FODEM components

NA. Tasks: identify the design solutions and main concepts (Kerne, 2002). Methods: analysis of contextual factors, learning theories, and practical experiences; literature reviews; evaluation of experiences from other threads. Outcomes: pedagogical and technical design principles and solutions. Challenges: to incorporate and combine the design ideas from different origins in an effective and meaningful way (Abdelraheem, 2003).

I. Tasks: implement design solutions as quickly as possible so that experiments with learners can take place as soon as possible (Soloway et al., 1996). Methods: fast prototyping, sketching, experimenting (Bahn & Naumann, 1997). Outcomes: an environment that is usable in authentic learning settings. Challenges: to expose the environment to users at the right moment.

FE. Tasks: identify viable features by evaluating. Methods: use of environmental and experiential analysis, content analysis. Outcomes: information about the functions of the environments in future development. Challenges: to go beyond the structure of the design (Kommers, 2004).

The implementation component is used to implement the design solutions identified in needs analysis. One of the<br />

applied implementation methods is fast prototyping. This is used to test design ideas at an early stage for<br />

information about which design solutions are viable and which are unsuccessful. This can be achieved through<br />



constructive feedback elicited from stakeholders. The environment must be brought to a level of maturity where its core functionalities operate satisfactorily. While usability need not be fully refined, the attention of learners may be constantly distracted by secondary issues if there are too many usability problems. In

the worst case scenario, learners might not even realize the main functions of the environment. If learners are<br />

exposed to the development process too late, major changes to the core functionality of the environment will be

costly and complex to implement. Furthermore, learner participation in the implementation process (through<br />

making suggestions and proposing developmental ideas) also benefits the learners themselves.<br />

The third component is formative evaluation by means of which users’ experiences of the developed<br />

environment are evaluated. In FODEM, multiple data sources and pluralistic research methods ensure the<br />

production of a comprehensive, rich and layered overall picture. Activity logs, databases, and browsing histories<br />

can be used to evaluate learners’ usage of the environment. As part of FE, developers can also interpret users’<br />

personal opinions, experiences and perceptions of the environment by using (most typically) interviews and<br />

questionnaires to elicit data. Designers need insight, sensitivity and creative empathy to identify those often

implicit or hidden viewpoints, attitudes and emotions that users have but sometimes cannot or will not reveal. An<br />

evaluation can also focus on obtaining practical design ideas from users themselves. Research results can<br />

indicate new problems or design ideas for future development. Table 2 summarizes the tasks, methods, outcomes<br />

and challenges of FODEM components.<br />

Dependencies: interdependent structure of threads<br />

Dependencies in FODEM are used to represent the interdependent structure of threads, the interaction among<br />

components and the critical relationships among components in different threads. A dependency is represented<br />

pictorially by an arrow showing the direction of the interaction. A dependency between threads and single<br />

components can be one-directional or bi-directional (a bi-directional dependency represents a mutual<br />

dependence between two components or threads). The name of a dependency shows the nature and meaning of<br />

the interaction. A thread dependency shows that a whole thread is initiated by another thread. Figure 2 illustrates<br />

a possible FODEM scenario. The dependency between Threads 1 and 2 illustrates how Thread 1 created a need<br />

for the re-design of the environment’s technical architecture (a thread dependency). The arrow from Thread 3 to<br />

Thread 2 illustrates that the implementation component in Thread 2 is dependent on the needs analysis<br />

component in Thread 3.<br />

Figure 2. A possible FODEM scenario<br />
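Read this way, the method suggests a small data model: threads hold repeatable NA, I and FE components, and named dependencies connect whole threads or individual components, possibly bi-directionally. The sketch below (Python) is our reading of the structures described above, not code from any FODEM tool; the thread names and dependency labels are illustrative.

from dataclasses import dataclass, field

COMPONENT_KINDS = ("NA", "I", "FE")  # needs analysis, implementation, formative evaluation

@dataclass
class Thread:
    theme: str
    # Each component kind may occur several times within one thread
    components: dict = field(default_factory=lambda: {k: [] for k in COMPONENT_KINDS})

@dataclass
class Dependency:
    name: str        # the nature and meaning of the interaction
    source: tuple    # (thread, component kind), or (thread, None) for a thread dependency
    target: tuple
    bidirectional: bool = False

# The scenario of Figure 2, expressed in this model (labels illustrative)
t1, t2, t3 = Thread("Thread 1"), Thread("Thread 2"), Thread("Thread 3")
dependencies = [
    Dependency("need for re-design of the technical architecture", (t1, None), (t2, None)),
    Dependency("needs analysis feeds implementation", (t3, "NA"), (t2, "I")),
]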

In principle, many layers of threads can exist. When a development process is extensive, a thread may be divided<br />

into a set of sub-threads. Should a sub-thread become too large, it may be reconstituted into a separate<br />

development process. The dependencies in multi-layered threads exist among thread layers and sister sub-threads.

Figure 3 visualizes an example of a multi-layered thread structure.<br />

Methods that are similar to FODEM<br />

FODEM has been inspired by general software development and educational technology design models. The<br />

System Development Life Cycle (SDLC) and Context Design (CD) originate from the software development<br />

tradition while Instructional Design (ID) and Design Research (DR) are used specifically in educational<br />

technology. Common to all of these approaches are specification (S), implementation (I) and evaluation (E)<br />



phases. The specification phase is used to determine the needs, requirements and expectations for the solutions.<br />

In the implementation phases, the solution is implemented. Finally, the evaluation phase is used to analyze<br />

implementation. Table 3 shows how the different ideas (marked in bold) from similar methods relate to FODEM.<br />

Figure 3. Multi-layered thread structure<br />

Table 3. Characteristics of FODEM’s sister methods in the specification (S), implementation (I), and evaluation (E) phases

S. SDLC: de-contextualized requirements. CD: user needs. ID: learner-centered design. DR: practical knowledge.
I. SDLC: iterative through prototyping, sequential versions, structured approaches. CD: sketching for presenting the design ideas to users. ID: SDLC methods. DR: embedded in use.
E. SDLC: summative, phased development. CD: stories of experience, actual observations. ID: experimenting, reflection, formative evaluation. DR: based on theoretical models of human behavior.

SDLC models are often based on structured approaches (Whitten et al., 2004). The development of any software<br />

is thought to take place in a series of phases (Löwgren, 1995). In traditional methods, evaluation is often<br />

conducted on the final product. In newer SDLC models, the development process follows an iterative, spiral<br />

model in which design ideas are tested in the early stages of development. In CD models, the emphasis is on<br />

problems that relate to particular user concerns such as, for example, how to recognize the needs of users<br />

(Löwgren, 1995). The focus in ID models falls more on the pedagogical side of the development process<br />

(Moallem, 2001). Contemporary ID models, for example, stress learner-centered approaches, reflection and<br />

formative development (McKenna & Laycock, 2004; Moonen, 2002). DR includes a series of approaches that result in technological designs and practices for learning and teaching (Barab & Squire, 2004). The emphasis

there is on testing systems in authentic settings. The development of an environment is blended with a strong<br />

theoretical emphasis on human behavior. If one compares it with similar methods, the uniqueness of FODEM<br />

consists of the way in which it models the parallel, but interdependent and asynchronous units of the DLE<br />

development – units that we call threads.<br />

FODEM in action<br />

FODEM in ViSCoS development<br />

ViSCoS (Virtual Studies of Computer Science) is an online study program offered by the Department of<br />

Computer Science, University of Joensuu (ViSCoS, 2005). ViSCoS students study first-year university-level<br />



Computer Science courses by means of the Web. The curriculum consists of three main parts: the preliminaries<br />

of Information and Communication Technology (ICT), basics of programming with Java, and an introduction to<br />

Computer Science (Torvinen, 2004). The development of ViSCoS began in 2000, and the first students<br />

commenced studies that same year. One hundred and nine (109) students had completed the program by the end<br />

of 2004. While ViSCoS studies were initially offered only to high school students between the ages of 16 and 18<br />

who lived in the rural areas of Eastern Finland, subsequent collaboration between the Department of Computer<br />

Science and the Continuing Education Centre at the University of Joensuu made it possible for this course to be<br />

taken by anyone in Finland. Table 4 summarizes the content of ViSCoS courses.<br />

Table 4. ViSCoS courses

Introduction to Information and Communication Technology (ICT) and Computing: computer hardware components, computer programs and operating systems, data communications and local area networks, network services and digital communication, controlling the risks of information technology; practical skills needed for using word processing and spreadsheet applications, basics of Unix, introduction to HTML.
Programming I: algorithmic thinking; basic structures of programming with Java.
Programming II: introduction to object-oriented programming (objects, classes, inheritance).
Hardware, Computer Architecture and Operating Systems: an overview of the architecture of computers, parsers, system software, and databases.
Programming Project: software design, implementation, testing and documenting.
Introduction to the Ethics of Computing: general principles of the ethics of computing.
Design of Algorithms: introduction to basic issues in computer science; algorithms, computation, and data structures.
Research Fields in Computer Science: introduction to a selection of research fields in Computer Science.

There are five threads in ViSCoS development: first designs, loop of improvements, new technical solutions to<br />

support learners, ViSCoS Mobile, and English version. We implemented and ran the first versions of the courses<br />

in the first designs thread in 2000 and 2001. The design of ViSCoS is based on the CANDLE scheme (Torvinen,<br />

2004), the primary goal of which is to get courses up and running as quickly as possible. We went out of our way<br />

to rework these initial courses so that they reflected the needs and interests of the kind of young high school<br />

students who would take the course. Once they had been modified and adjusted, the course examples and<br />

assignments were more topical and reflective of the everyday concerns and interests of this target group. While<br />

these first courses were running, we collected feedback from all the stakeholders. Our main research efforts were<br />

dedicated to identifying and understanding the reasons why students encountered difficulties in the ViSCoS<br />

program (Meisalo et al., 2003). In pursuit of our aim of achieving a clear understanding of how students<br />

experienced the courses, we made use of questionnaires, interviews, log files, analyses of examination and<br />

submitted exercises and assignments, and analyses of student and tutor feedback to elicit the required<br />

information. First evaluations indicated that it was the programming courses that caused students the greatest<br />

difficulties since the highest drop-out rates had occurred there. The Programming Project course in particular<br />

was evidently a difficult one for students.<br />

In the second thread – loop of improvements – we addressed these problems by improving the programming courses in whatever ways we could (Torvinen, 2004). We made changes, for example, on the basis of feedback

from Thread 1 to the course structures and learning materials. One such change required us to merge two courses<br />

into one. After careful consideration of the student feedback we received, we then also modified the curriculum<br />

so that students could give more of their attention to the more demanding topics and so that they would have<br />

more time for Programming I (we increased the duration of the course from 10 to 12 weeks). We also created<br />

optional exercises, new examples and assignments, as well as interactive animations, to illuminate what we by<br />

then knew were the most difficult student topics (arrays, loops, and methods). Several studies within the loop of<br />

improvements also investigated the reasons why students were electing to drop out of the ViSCoS program.<br />

(This loop of improvements thread has been active since 2001, and research efforts to investigate the drop-out<br />

phenomenon are still continuing.)<br />

Our aim in the third thread, new technical solutions to support learners, was to develop and introduce new<br />

technical solutions that would enhance ViSCoS studies. We introduced the first of these new solutions in 2002.<br />



Four sub-threads can be identified: the LEAP digital learning tool, the ethical argumentation tool called Ethicsar,<br />

the Jeliot program visualization tool, and data mining techniques. LEAP is a tool for helping students to manage<br />

their programming projects in the Programming Project course. Details of LEAP development are presented in<br />

the next section. Ethicsar is a web-based tool for argumentation and evaluation of ethical issues and questions<br />

(Jetsu et al., 2004). Jeliot permits novice programmers to visualize their own Java code (Moreno et al., 2004).<br />

The development of data mining techniques focused specifically on processing data related to the course<br />

assignments. Our ultimate aim here was to create intelligent support for learners by means of instructional<br />

intervention (Hämäläinen et al., <strong>2006</strong>). Dependency between ViSCoS and the technological solutions is bidirectional<br />

(Figure 4). The introduction of these improved technologies has meant that students now receive<br />

better and more comprehensive support. The two most recent threads are ViSCoS Mobile and the English<br />

version. ViSCoS Mobile was created so that students would be able to use a mobile device to study their ViSCoS<br />

courses (Laine et al., 2005). Because this is a relatively new concept, only the first procedures in the needs<br />

analysis component are currently being implemented. An English version thread has also been created so that the<br />

ViSCoS courses can be offered to English-speaking students. The first English-medium learning materials were<br />

implemented in the courses Introduction to ICT and Computing, Programming I, Introduction to Ethics of<br />

Computing and Research Fields of Computer Science in September and December of 2005. Table 5 summarizes<br />

how the ViSCoS program was developed, while Figure 4 depicts the process graphically.<br />

Table 5. ViSCoS development (Th = Thread)

Th 1, First designs. NA: prioritization of the tasks; identifying the contextual factors of the high school students. I: running the first versions of the courses. FE: feedback obtained from all stakeholders through direct questions, interviews, content analysis.
Th 2, Loop of improvements. NA: identification of the most difficult aspects of the courses. I: improvement of the program on the basis of feedback received from learners. FE: drop-out phenomenon, questions, interviews, analysis of exercises.
Th 3, New technical solutions created to support learners. NA: needs identified while running the first versions of the course; practical experiences. I: development of Ethicsar, Jeliot, LEAP, and data mining. FE: evaluation of the implemented solutions from many different points of view.
Th 4, ViSCoS Mobile. NA: content adaptation of course materials; programming with a mobile device; student support activities in a mobile device (e.g. Mobile Blog). I: –. FE: –.
Th 5, English version. NA: analysis of the current Finnish versions. I: implementation of the English version; improvements included in the Finnish version. FE: –.

FODEM in the development of LEAP

Figure 4 shows ViSCoS development at a higher level of abstraction. Details may be examined by zooming into<br />

a particular thread. The LEAP digital learning tool development, for instance, can be divided into a set of sub-threads. The LEAP application has two main functions: a digital learning portfolio and creative problem-solving support (Suhonen & Sutinen, 2004). Three sub-threads can be identified: the first prototype, re-design of

the technical implementation, and a mobile adaptation extension.<br />

We based the first prototype in Thread 1 on the needs that we had identified in the ViSCoS Programming Project<br />

course. Some kind of tutorial mechanism was needed to help students to manage their projects in a creative way.<br />

This tool was also inspired by digital learning portfolio, creative problem-solving support and provocative agent<br />

concepts (Suhonen, 2005). But we did not restrict the implementation of LEAP to the Programming Project<br />

course. Our purpose was rather to implement a generic gadget that could be used in several different settings.<br />

The formative evaluation component in the first thread included two studies on use of the tool in authentic<br />

learning settings. In the first study, we used the tool in the Problem Solving contact teaching course at the<br />



University of Joensuu (we used the digital learning portfolio functionality in this study). In the second study, we<br />

used the tool in the ViSCoS Programming Project course. This study allowed us to test the LEAP tool in its<br />

original design context (we used both the digital learning portfolio and creative problem-solving support<br />

functionalities of the tool). In both studies we used a dual evaluation scheme. These two parts dealt with the<br />

analysis of how students were using the tool (use analysis) and the analysis of students’ opinions about the tool<br />

(experience analysis). We investigated usage by analyzing the content of students’ contributions with LEAP, and<br />

elicited learners’ reflective impressions and opinions from interviews.<br />

Figure 4. Development of ViSCoS (threads for the first designs, the Ethics of Computing and Programming Project courses, the loop of improvements, Ethicsar, Jeliot, data mining, the English version and ViSCoS Mobile, connected by dependencies such as focus on programming; compare research results, new ideas, data exchange; integration, improvements; improvement, data exchange; and content adaptation)
In the second thread, re-design of the technical architecture, we modified LEAP in light of the findings of the two studies within the first thread. In that thread we had identified problems with the usability of the tool. Our main finding was that LEAP was complicated to use and that it required simplification. We added no new features within the second thread, but merely refined existing features. We also re-implemented the technical architecture of the tool because of requirements posed by the third thread. We then conducted a third study within the second thread in the ViSCoS Programming Project course, and evaluated it with an evaluation scheme similar to the approach that we had used in our earlier studies. Our main finding was that LEAP should be customized for the context of the course, since students were of the opinion that the tool was not relevant (Suhonen, 2005).

Thread 3, a mobile adaptation extension, is still in the concept design phase (Kinshuk et al., 2004). Although we<br />

had decided that we should begin to implement the mobile adaptation after the core functions of LEAP had<br />

become sufficiently mature, the needs analysis for the technical implementation of the mobile adaptation<br />

extension affected the implementation in the second thread. The mobile adaptation extension will certainly<br />

influence the ViSCoS Mobile thread at the higher level.<br />

Figure 5 visualizes the development of LEAP. It shows the interaction among the three threads. Table 6 shows<br />

the details of the LEAP development process.<br />



Table 6. Summary of the LEAP development

Th 1, First prototype. NA: pedagogical and technical design; digital learning portfolio and creative problem-solving support. I: first designs. FE: two studies; Programming Project and Problem Solving courses.
Th 2, Re-design of the technical architecture. NA: ideas and knowledge from the first thread; requirements from the mobile adaptation extension thread. I: XML-based implementation. FE: a study in the ViSCoS Programming Project course; comparison with previous studies.
Th 3, Mobile adaptation extension. NA: concept design of the mobile adaptation extension. I: –. FE: –.

Figure 5. Visualization of the LEAP development (the first prototype thread, with two formative evaluations, feeds experiences into the re-design of the technical architecture thread, which exchanges comparisons and changes to the evaluation scheme; the mobile adaptation extension thread contributes requirements and needs)

Woven Stories-based environment to support FODEM

The multithreaded structure of FODEM allows us to specify a corresponding visual environment to support the<br />

design process. The first sketch for a FODEM design environment prototype has been implemented. The<br />

prototype is built on a client-server web-based application called Woven Stories (WS) (Gerdt et al., 2001;<br />

Nuutinen et al., 2004). The prototype can be used to manage the threads, components and dependencies in a<br />

development process. A designer can add, remove and edit threads to represent the main aspects of the<br />

development process. Information can be added to each component in a thread. Although the current version of<br />

the prototype allows the inclusion of web links to the components, users cannot add documents to the<br />

environment. Dependencies can be created between threads and components. The client-server implementation<br />

enables a number of users to work with the same design process from remote locations. The prototype also<br />

includes a chat facility which allows designers to interact while they work with the environment. Figure 6 shows<br />

the FODEM design environment prototype.<br />
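One plausible way for such an environment to persist a development process is as a simple structured document per process, covering threads, per-component web links and dependencies. The sketch below (Python) shows one possible layout; it is an assumption for illustration, not the actual Woven Stories data format, and the URL is a placeholder.

import json

# One possible per-process document (layout is an assumption, not the Woven Stories format)
process = {
    "name": "Example development process",
    "threads": [
        {"id": "t1", "theme": "First designs",
         "components": {"NA": {"links": []}, "I": {"links": []}, "FE": {"links": []}}},
        {"id": "t2", "theme": "Loop of improvements",
         "components": {"NA": {"links": ["http://example.org/feedback-summary"]},
                        "I": {"links": []}, "FE": {"links": []}}},
    ],
    "dependencies": [
        {"name": "focus on a special case", "from": "t1", "to": "t2", "bidirectional": False},
    ],
}
print(json.dumps(process, indent=2))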

The next challenge in the design environment development is to improve the preliminary prototype so that it offers more meaningful support to designers. The first new feature would be the addition of arbitrary documents to components in threads. Designers could, for example, add requirements documents, articles and review results to a needs analysis component. The environment would also need to include basic functionalities for document management, as well as a zooming functionality that would help designers to grasp the development process at various levels. The environment should also automatically construct different visualizations of the development process by, for example, presenting the process in a time-based view, or by separating the most important threads (as defined by the designers) from other threads.

A comprehensive design environment would also include built-in procedures that would facilitate working with<br />

components. A formative evaluation component, for instance, could include a list of possible evaluation<br />

methods, and the designers could get some information about their applicability in different situations. This kind<br />



of service would help designers to decide which evaluation methods would be most suitable for a given<br />

development phase (Reeves & Hedberg, 2003). Tacit knowledge about different aspects of the development<br />

process could also be stored in the environment. Finally, such a web-based environment would help designers to<br />

work collaboratively with one another, to exchange ideas and information about problems in different threads,<br />

and ultimately to undertake collaborative management of the development process.<br />

Figure 6. FODEM design environment<br />

The possibilities that FODEM opens up for a novel DLE design tool (such as the one described above) are yet another indication of the effectiveness and feasibility of the FODEM approach. The inherently parallel structure of FODEM allows designers to manage a complex but need-based process for the development of effective

DLEs.<br />

Conclusions<br />

In this paper we introduced the FODEM method for developing digital learning environments in the context of<br />

widely dispersed learning communities. We also used ViSCoS and LEAP development cases in a Finnish<br />

educational context to show how the method works. The ViSCoS program has now been running for five years<br />

and has proven to be sustainable. ViSCoS provides a flexible solution for studying first-year computer science courses over the net. The development of LEAP, by contrast, is still in its early stages. Three prototyping experiences have in the interim revealed both positive and negative features of LEAP.

The two cases have shown how FODEM can be used to develop different types of digital learning environments.<br />

FODEM provides tools to conceptualize the most important aspects of a development process with thread,<br />

component and dependency structures (Suhonen, 2005). FODEM includes tools that can capture the dynamics of<br />

a development process and that can be used to model various representations of the FODEM development<br />

process from different perspectives. FODEM also supports the integration of different development processes: a<br />

thread within a development process can be a part of another development process. An important aspect of<br />

FODEM is its emphasis on the utilization of all feedback from the users of the developed digital learning<br />

environment. Case studies, interviews, user observations and contextual analysis methods are among the<br />

appropriate evaluation techniques that developers utilize gradually to improve the environment.<br />



The unique features of FODEM correspond to three requirements for the development of DLEs for widely dispersed learning communities. Firstly, development costs should be low: both ViSCoS and LEAP have been developed with reasonably low investments (approximately US$10,000 per course). Secondly, the needs analysis and research orientation in FODEM can be used to create contextualized and appropriate DLEs; when we developed ViSCoS, for example, we used a number of methods to fit the course arrangements and digital learning materials to the students’ needs. Finally, the development process should follow a simple, yet structured, design method: FODEM models the development process with parallel, interrelated threads, and the formative evaluation component ensures that the development process is both rigorous and focused.

One of FODEM’s best-known characteristics is how well it adapts to the context of everyday school life and youth culture. Far too frequently, development methods (and hence the DLEs) have been too heavily loaded or too rich in technical features. This might account for the fact that ICT has in the past been widely underutilized in several educational contexts. FODEM – in contrast to the proliferating approach of comprehensive design methodologies – emphasizes simplicity. At the same time, FODEM’s inherently parallel approach, which represents and reflects the dynamic nature of most educational contexts, gives due weight to the complexity of learning and teaching needs. Unlike rapid prototyping, FODEM does not reduce the simultaneous needs that are identified in real school contexts into a sequential design process. FODEM therefore copes better with the uncertainty that is characteristic of real life, a common expectation of ICT in other areas of society as well. Because FODEM treats the individual needs of heterogeneous student communities with appropriate seriousness, it is an affordable alternative to designing information-delivery-oriented DLEs for dense communities, which take individual needs into account through adaptive technologies or re-usable learning objects.

Acknowledgements<br />

We are grateful to Teemu Laine for implementing the FODEM design environment prototype on the basis of the<br />

Woven Stories application. We also thank Roger Loveday for his work on revising the text.<br />

References<br />

Abdelraheem, A. Y. (2003). Computerized learning environments: Problems, design challenges and future<br />

promises. The Journal of Interactive Online Learning, 2 (2), 1-9.<br />

Anohina, A. (2005). Analysis of the terminology used in the field of virtual learning. Journal of Educational<br />

Technology & Society, 8 (3), 91-102.<br />

Bahn, D. L., & Naumann, J. D. (1997). Evolution versus construction: Distinguishing user review of software<br />

prototypes from conventional review of software specifications. In Proceedings of the ACM Conference on<br />

Computer Personnel Research (SIGCPR’97), New York: ACM Press, 240-245.<br />

Barab, S., & Squire, K. (2004). Design-based research: Putting a stake in the ground. The Journal of the<br />

Learning Sciences, 13 (1), 1-14.<br />

Bork, A., & Gunnarsdottir, S. (2001). Tutorial Distance Learning: Rebuilding Our Educational System, New<br />

York: Kluwer Academic/Plenum Publishers.<br />

Castro, M., López-Rey, A., Pérez-Molina, C. M., Colmenar, A., de Mora, C., Yeves, J., Carpio, J., Peire, J., & Daniel, J. S. (2001). Examples of distance learning projects in the European Community. IEEE Transactions on Education, 44 (4), 406-410.

Design-Based Research Collective. (2003). Design-based research: An emerging paradigm for educational<br />

inquiry. Educational Researcher, 32 (1), 5-8.<br />

Fallman, D. (2003). Design-oriented human-computer interaction. In Proceedings of the Conference on Human<br />

Factors in Computing Systems (CHI 2003), New York: ACM Press, 225-232.<br />

Gerdt, P., Kommers, P., Looi, C.-K., & Sutinen, E. (2001). Woven Stories as a cognitive tool. Lecture Notes in<br />

Computer Science, 2117, 233-247.<br />



Gerdt, P., Kurhila, J., Meisalo, V., Suhonen, J., & Sutinen, E. (2002). Rambling through the wilds: A concept of<br />

Jagged Study Zones. In Proceedings of the World Conference on Educational Multimedia, Hypermedia &<br />

Telecommunications (ED-MEDIA2002), Norfolk: AACE, 596-601.<br />

Hämäläinen, W., Laine, T. H., & Sutinen, E. (2006, forthcoming). Data mining in personalizing distance education courses. In Romero, C., & Ventura, S. (Eds.), Data Mining in E-learning, WitPress.

Jetsu, I., Meisalo, V., Myller, N., & Sutinen, E. (2004). Ethical argumentation with Ethicsar. In Proceedings of<br />

the IADIS International Conference on Web Based Communities (WBC2004), Lisbon, Portugal: IADIS, 255-261.<br />

Kerne, A. (2002). Concept-context-design: A creative model for the development of interactivity. In Proceedings of the 4th Conference on Creativity & Cognition (C&C’02), New York: ACM Press, 192-199.

Kinshuk, Suhonen, J., & Sutinen, E. (2004). Mobile adaptation extension of a digital learning tool. In<br />

Proceedings of the International Conference on Computers in Education (ICCE2004), Altona & Melbourne,<br />

Australia: Common Ground Publishing and RMIT University, 1581-1590.<br />

Kommers, P. (2004). Concepts in the world of the mind. In Kommers, P. A. M. (Ed.), Conceptual Support<br />

Systems for Learning - Imagining the Unknown, Amsterdam, The Netherlands: IOS Press, 3-30.<br />

Kähkönen, E., Bonnano, P., Duveskog, M., Faggiano, E., Kampf, C., Lahtinen, S.-P., Nuutinen, M., Polsani, P., Rossano, V., & Tedre, M. (2003). Approaches and methodologies for a culturally contextualized educational technology. In Kähkönen, E. (Ed.), Proceedings of the 2nd International Conference on Educational Technology in Cultural Context, Joensuu, Finland: University of Joensuu, 5-10.

Laine, T., Myller, N., & Suhonen, J. (2005). ViSCoS Mobile: Learning Computer Science on the Road.<br />

Accepted for publication. In Proceedings of the 5th Annual Baltic Sea Conference on Computer Science<br />

Education, November 17-20, Koli, Finland.<br />

Löwgren, J. (1995). Applying design methodology to software development. In Proceedings of the Conference<br />

on Designing Interactive Systems: Processes, Practices, Methods & Techniques, New York: ACM Press, 87-94.<br />

McKenna, P., & Laycock, B. (2004). Constructivist or instructivist: Pedagogical concepts practically applied to a computer learning environment. In Proceedings of the 9th Annual Conference on Innovation and Technology in Computer Science Education (ITiCSE2004), New York: ACM Press, 166-170.

Meisalo, V., Sutinen, E., & Torvinen, S. (2003). How to improve a virtual programming course. In Proceedings of the 32nd ASEE/IEEE Frontiers in Education Conference (FIE2002), November 6-9, Boston, MA, USA.

Moallem, M. (2001). Applying constructivist and objectivist learning theories in the design of a web-based<br />

course: Implications for practice. Educational Technology & Society, 4 (3), 113-125.<br />

Moonen, J. (2002). Design methodology. In Adelsberger, H. H., Collis, B., & Pawlowski, J. M. (Eds.), Handbook on Information Technologies for Education and Training, Springer-Verlag, 153-180.

Moreno, A., Myller, N., Sutinen, E., & Ben-Ari, M. (2004). Visualizing programs with Jeliot 3. In Proceedings<br />

of the International Working Conference on Advanced Visual Interfaces, New York: ACM Press, 373-376.<br />

Nuutinen, J., Sutinen, E., Liinamaa, K., & Vanharanta, H. (2004). Woven Stories as a tool for corporate strategy<br />

planning. In Proceedings of the IADIS International Conference on Web Based Communities (WBC2004),<br />

Lisbon, Portugal: IADIS, 438-441.<br />

Reeves, T. C., & Hedberg, J. G. (2003). Interactive Learning Systems Evaluation, Engelwood Cliffs, New<br />

Jersey: Educational Technology Publications.<br />

Soloway, E., Jackson S. L., Klein, J., Quintana, C., Reed, J., Spitulnik, J., Stratford, S. J., Struder, S., Eng, J., &<br />

Scala, N. (1996). Learning theory in practice: Case studies of learner-centered design. In Proceedings of the<br />

Computer Human-Interaction Conference (CHI'96), New York: ACM Press, 189-196.<br />

54


Suhonen, J. (2005). A Formative Development Method for Digital Learning Environments in Sparse Learning<br />

Communities, Doctoral Dissertation, Department of Computer Science, University of Joensuu, Finland,<br />

Retrieved April 6, <strong>2006</strong> from http://joypub.joensuu.fi/publications/dissertations/suhonen_learning/suhonen.pdf.<br />

Suhonen, J., & Sutinen, E. (2004). Formative development model in action - case of a digital learning tool. In<br />

Proceedings of the IEEE International Conference on Advanced Learning Technologies (ICALT 2004), Los<br />

Alamitos, CA: IEEE Computer Society Press, 720-722.<br />

Sutinen, E., & Torvinen, S. (2001). The Candle scheme for creating an on-line computer science program –<br />

experiences and vision. Informatics in Education, 2, 93-102.<br />

Torvinen, S. (2004). Aspects of the Evaluation and Improvement Process in an Online Programming Course –<br />

Case: the ViSCoS Program, Licentiate Thesis. Department of Computer Science, University of Joensuu, Finland,<br />

Retrieved April 6, <strong>2006</strong> from ftp://cs.joensuu.fi/pub/PhLic/2004_PhLic_Torvinen_Sirpa.pdf.<br />

ViSCoS (2005). Virtual Studies of Computer Science, Retrieved April 6, <strong>2006</strong>, from<br />

http://cs.joensuu.fi/viscos/?lang=eng.<br />

Whitten, J. L., Bentley, L. D., & Dittman, K. C. (2004). Systems Analysis and Design Methods (6 th Ed.),<br />

McGraw-Hill.<br />

55


Sierra, J. L., Fernández-Valmayor, A., Guinea, M., & Hernanz, H. (2006). From Research Resources to Learning Objects: Process Model and Virtualization Experiences. Educational Technology & Society, 9 (3), 56-68.

From Research Resources to Learning Objects: Process Model and Virtualization Experiences

José Luis Sierra
Dpto. Sistemas Informáticos y Programación, Fac. Informática, Universidad Complutense, Madrid, Spain
jlsierra@sip.ucm.es

Alfredo Fernández-Valmayor
Dpto. Sistemas Informáticos y Programación, Fac. Informática, Universidad Complutense, Madrid, Spain
alfredo@sip.ucm.es

Mercedes Guinea
Dpto. Historia de América II, Universidad Complutense de Madrid, 28040 Madrid, Spain
guinea@ghis.ucm.es

Héctor Hernanz
Telefónica I+D S.A., C/ Emilio Vargas 6, 28043 Madrid, Spain
hhb@tid.es

ABSTRACT
Typically, research and academic institutions own and archive a great number of objects and research-related resources that have been produced, used and maintained over long periods of time by different types of domain experts (e.g. lecturers and researchers). Although the potential educational value of these resources is very high, it may remain largely unexploited due to severe accessibility and manipulability constraints. The virtualization of these resources, i.e. their representation as reusable digital learning objects that can be integrated in an e-learning environment, would allow the full exploitation of their educational potential. In this paper we describe the process model that we have followed during the virtualization of the objects and research resources owned by two academic museums at the Complutense University of Madrid (Spain). In the context of this model we also summarize the main aspects of these virtualization experiences.

Keywords
Repositories of learning objects, Authoring of domain-specific learning objects, Virtual museums, Virtual campus

Introduction

A research center usually owns and archives a great number of objects and research-related resources whose pedagogical value is unquestionable. Unfortunately, in many cases the scarcity and the value of this material also hinder its use for educational purposes. Two paradigmatic examples are the museums and the archives owned and maintained by many academic and research institutions. The transformation of all these materials into reusable digital learning objects (LOs) (Koper, 2003; Polsani, 2003) that can be integrated and used in an e-learning environment is, in our opinion, a key step towards the full exploitation of their educational value. We have confirmed this during our experiences with the virtualization of two academic museums at the Complutense University of Madrid (Spain): the Antonio Ballesteros museum, an academic museum of archaeology and ethnology maintained by the Department of American History II, and the José García Santesmases Computing museum, an academic museum maintained at the School of Computer Science.

In this paper we present and illustrate the process model used in the two aforementioned virtualization experiences. Our virtualization process model establishes a set of guidelines for the construction of repositories of digital LOs from pre-existing research resources in specialized domains like the two mentioned above. This model, which is based on our previous experiences with a document-oriented approach to the development of content-intensive (e.g. educational, hypermedia and knowledge-based) applications, makes it easy for the virtualization of these resources to be carried out by the same experts who use, and in many cases have produced, them. In addition, this virtualization task should impose a minimal overhead on the everyday work of these experts. For this reason, the approach involves a community of developers supporting experts in the task of virtualization. Experts and developers collaborate in the definition of an adequate LO model specific to the domain. This model lets developers build a domain-specific application for the authoring and deployment of



these LOs. Since this application is especially adapted to the bodies of expertise and to the skills of the experts, the task of virtualization can be dramatically facilitated.

The rest of this paper is organized as follows. We begin by describing the virtualization process model: we present the different products and activities involved in the process, we show the sequencing of these activities, and we also outline the main participants and their responsibilities. Next we describe our virtualization experiences in the domain of the two aforementioned academic museums: we illustrate how the different activities described in the process model take form in each scenario, and we concentrate on the results of the virtualization. We finish the paper with some conclusions and lines of future work. A previous version of the work described in this paper can be found in (Sierra et al. 2005b).

The Virtualization Process Model

The collaboration between specialists in different knowledge areas, in order to be effective, must be governed by adequate rules. In our opinion, these rules must emerge from the working experience in each specific domain instead of being adapted from a priori universal principles; consequently, they must be refined and developed to accommodate the ongoing needs of experts and developers in each domain. The production and maintenance of reusable learning material from pre-existing research resources in specialized domains (e.g. the academic museums) is not an exception. The virtualization process model presented in this section is in accordance with these pragmatic considerations, since it has emerged from our practical experiences at the Complutense University. During these experiences we have recognized the typical scenarios contemplated by our previously mentioned document-oriented approach to the development of content-intensive applications (Sierra et al. 2004; Sierra et al. 2005a), and we have therefore adopted a strategy for structuring our virtualization process model similar to the strategy promoted in the document-oriented approach. Hence we introduce three different views, as established in Figure 1:

• In the products and activities view we include the activities in the approach along with the products produced and consumed by these activities.
• In the sequencing view we show how the activities considered are sequenced. Note that in this context the term sequencing does not mean the sequencing of the learning activities as it is usually understood in the e-learning domain, but the sequencing of the activities followed by the experts and the computer science technicians when they collaborate in the production and maintenance of the learning materials. Therefore, this view has nothing to do with any e-learning specification. It only reflects a pre-established characteristic of a process model.
• In the participants and responsibilities view we outline the participants in the activities along with their responsibilities in these activities.

In this section we examine the process model from these three different perspectives.

Products and activities

The products and activities contemplated in the virtualization process model are displayed in Figure 1a. As shown in this figure, the model introduces three different activities (Domain Analysis, Operationalization and Virtualization) and produces three different kinds of products (a domain-specific LO model, a domain-specific authoring and deployment tool, and the LO repository itself). Also notice how the model presupposes the existence of a huge body of pre-existing research resources. Next we describe these aspects.

The goal of the Domain Analysis activity is the formulation of an LO model that makes explicit the educational features of the research resources created and manipulated by the experts. This model must be able to integrate all the specific features needed by the experts in a knowledge area to describe and manipulate the objects and the research goals in that area. Therefore the model for a domain must closely mirror the nature of the actual resources in the domain (e.g. the LO model in the domain of archaeology can include resources and features different from the LO model used in natural history, because experts in each domain are interested in different characteristics of the objects and have different research objectives). The capability of the model to include all the domain-specific characteristics of the objects is critical to increasing its acceptance and usability by the domain experts. In addition, the model should be conceptually independent of existing LO technologies. While these technologies are very valuable from a developer's point of view, they must not condition domain experts unnecessarily during the characterization of the LOs in their domains of expertise. On the contrary, this



activity could be better based on techniques used in software engineering (Arango, 1989) and knowledge engineering (Studer et al., 1999) for the construction of domain models.

Figure 1. The three views of the virtualization process model: (a) products and activities; (b) sequencing; (c) participants and responsibilities

During the Operationalization activity a suitable authoring and deployment tool is constructed. This activity is driven by the LO model formulated during the Domain Analysis activity. Therefore, the resulting tool will also be domain-specific. This activity can take advantage of existing e-learning technologies. Hence, recommendations and standards like those proposed by IMS Content Packaging (IMS CP, 2004), IMS Learning Design (IMS LD, 2003; Koper & Tatersall, 2005) and the ADL Shareable Content Object Reference Model (SCORM) (ADL, 2003) can be adopted in order to promote the interoperability of the resulting tool and the shareability and reusability of the LOs produced. These technologies must be considered implementation mechanisms providing additional functionality to the tool, and as such their potential complexity must be hidden from the domain experts. Thus, while IMS CP can easily be adopted to add import and export facilities for LOs to the tool under development, IMS LD and SCORM are notably more complex in nature. Therefore the authoring/deployment tool, being domain-specific (i.e. being oriented to let experts author and deploy LOs in a specific and well-established domain), will only provide navigation facilities through the contents of the learning material organized according to these general-purpose information models. In addition, it can provide an application programming interface to connect with standard editors and players for these recommendations. For this purpose, suitable mappings between the generic models and the domain-specific model supported by the domain-specific tool must be defined. These mappings can actually be incorporated into the tool using pluggable facilities of the application programming interface. When mapping a general representation of an LO to a domain-specific one, some features may be lost, although the information items that enable these features can be preserved in the domain-specific representation as hidden resources in order to make the revival of the original material possible when exported. On the other hand, domain-specific LOs can be represented in general-purpose formats in order to allow their use in external general-purpose authoring and playing environments, e.g. visualization systems for SCORM 2004 such as those described in (Roberts et al., 2005).



Finally, during the Virtualization activity, the repository of LOs is populated with the virtualizations of the research objects and resources. This virtualization is carried out using the tool produced during the Operationalization activity. Nevertheless, if standard import and export facilities for LOs are added to the tool, LOs can be exported, edited with external tools, and afterwards re-imported into the repository.

Sequencing of the activities

The diagram in Figure 1b shows the sequencing of the activities in the virtualization process model. Instead of proceeding sequentially, performing an exhaustive domain analysis followed by exhaustive operationalization and virtualization, the three activities are interleaved in time. According to this iterative-incremental conception, the LO model and the associated LO authoring and deployment tool are refined whenever new domain knowledge is acquired during virtualization.

Our process model introduces two types of iterations in the construction of the repositories, which are highlighted in Figure 1b. On the one hand, there are corrective iterations, which are related to the process of updating and fine-tuning the LO authoring and deployment tool to accommodate it to the needs of the experts (e.g. by introducing an enhanced interaction style in its user interface). On the other hand, the model also contemplates evolutionary iterations, which are related to the evolution of the LO model to capture new research or educational features of the virtualized resources (e.g. by considering new kinds of attributes for the LOs). Both types of iterations can be started during the Virtualization activity in response to the specific needs manifested by the domain experts.

During our experiences with the approach we have realized that continuous maintenance and evolution of the LO model and its associated tools are mandatory to better accommodate them to the desires and changing expressive needs of the experts. This obligation entails intensive interaction between the experts and the developers of the applications, which can decrease overall productivity. To manage this interaction we are studying the application of the specific techniques proposed by our document-oriented approach for dealing with the intrinsic evolutionary nature of content-intensive applications (Sierra et al. 2005c).

Participants and their responsibilities in the activities

Domain experts and developers are, as mentioned before, the two main participants involved in the construction of LO repositories. The different responsibilities that they have in the model's three activities are depicted in Figure 1c. Next we detail these responsibilities.

During the Domain Analysis activity, the main role of developers is to formulate the LO model for the objects and resources managed by the domain experts. In turn, domain experts must describe these resources to the developers, explaining how they are used and how they are interrelated, thus letting developers perform an adequate conceptualization. The acceptability and usability of the resulting LO model will strongly depend on the participation of the domain experts, because they know the actual resources, they can describe these resources to the developers, and they can help them in the elicitation of the possible educational uses of this material.

During the Operationalization activity, the main responsibility falls on the developers. They must construct the LO authoring and deployment tool. During its construction, they are driven by the LO model, and they can also be advised by the domain experts regarding aspects not contemplated in the model (e.g. presentation and editing styles).

Finally, during the Virtualization activity, domain experts use the LO authoring and deployment tool to populate the LO repository. In this activity developers can react to the needs manifested by the experts and can start new corrective and/or evolutionary iterations when required.

Experiences in the Domain of the Academic Museums

We have successfully applied the principles of virtualization explained in the previous section during the virtualization of two different museums at the Complutense University of Madrid, as mentioned in the introduction: the museum of archaeological and ethnographical material maintained at the Department of American History II (the Antonio Ballesteros museum) and the museum of computing at the School of Computer Science (the José



García Santesmases Computing museum). Using these experiences we illustrate the analysis performed in these domains and we introduce the LO model formulated. Next we briefly outline the operationalization and the architecture of the authoring and deployment tools produced. Finally we detail our working experiences during virtualization.

Domain analysis: virtual objects

Academic museums contain collections of real objects that can be directly chosen as the most suitable candidates for conversion into LOs. In these scenarios it is natural to distinguish between such real objects and their virtual representations. These virtual representations will be called virtual objects (VOs) because they come from the virtualization of real objects, initially for educational purposes (Figure 2a). The VO model was originally proposed in (Fernández-Valmayor et al. 2003) in relation to the archaeology and ethnology museum, but it has also been used in the computing museum, although with different specific features (Navarro et al. 2005).

Figure 2. (a) Real and virtual objects; (b) virtual objects can be related using foreign and reference resources

Figure 2a outlines the structure of a VO. A VO is thus characterized by a set of data, a set of metadata, and a set of resources (a small illustrative sketch follows this list):

• The data in a VO represent all the features of the real object that are considered useful for its scientific study. Examples of this kind of data are the dimensions and the results of the chemical analysis of a piece of pottery in the archaeology museum, or the model and the components of a computer in the computing museum.
• The metadata are the information used to describe and classify the VO from a pedagogical point of view. Examples of metadata are the name of the VO's author, its version number or its classification. The different features covered by metadata are chosen from existing exhaustive metadata schemas like the IEEE LTSC Learning Object Metadata (LOM) (LOM, 2002).
• The resources are all the other informational items associated with the VO. These are further classified into own, foreign and VO resources. The own resources of a VO are the multimedia archives resulting from the virtualization of the real objects (e.g. a set of photographs of a pottery vessel or a video showing the operation of a computer). The foreign resources are references to resources belonging to another VO but related to the first one (e.g. documents describing different aspects of the culture that manufactured the pottery vessel, or those related to the research and design process of a computer model). Finally, VO resources are references to other VOs in the repository that are related in some way to the first one (e.g. a VO for another piece discovered at the same excavation as the vessel, or a set of VOs for the different components of the computer).
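To make this structure concrete, the following is a minimal sketch of one possible in-memory representation of a VO, written in Python; the class, field and identifier names are our own illustrative choices and are not taken from the Chasqui or MIGS implementations.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VirtualObject:
    """Illustrative in-memory view of a VO: data, metadata and resources."""
    vo_id: str
    # Domain-specific attribute-value pairs chosen by the researchers,
    # e.g. the dimensions or chemical analysis of a pottery vessel.
    data: Dict[str, str] = field(default_factory=dict)
    # Pedagogical description, with fields drawn from a schema such as LOM.
    metadata: Dict[str, str] = field(default_factory=dict)
    # Own resources: multimedia archives produced by the virtualization.
    own_resources: List[str] = field(default_factory=list)
    # Foreign resources: references to resources belonging to other VOs.
    foreign_resources: List[str] = field(default_factory=list)
    # VO resources: references to other, related VOs in the repository.
    vo_resources: List[str] = field(default_factory=list)


# A composite VO, e.g. a thematic guided tour built from existing VOs.
tour = VirtualObject(
    vo_id="vo-tour-01",
    metadata={"title": "Thematic guided tour", "author": "a domain expert"},
    vo_resources=["vo-0001", "vo-0002"],  # hypothetical identifiers
)
print(tour.vo_resources)
```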

Foreign and VO resources allow the establishment of basic relationships between different VOs (Figure 2b). Indeed, this mechanism can be used to build new VOs based on existing ones. These resulting composite VOs may or may not correspond to real objects in the museum. In the latter case they usually represent newly constructed knowledge and/or new educational facilitators that arise during the virtualization process (e.g. thematic guided tours, based on the objects owned by the museum, about a cultural or technical subject).

Although the real objects of both museums are very different in nature, the VO model has proved to be flexible enough to deal with the domain-specific data in each knowledge area. Indeed:


• Although the set of data needed to describe a computer prototype is very different from the set of data needed to describe a cultural artifact, in both cases researchers can choose the set of attribute-value pairs needed to describe the set of features in which they are interested.
• In the same way, the resources associated with the VO can gather all the digital archives that result from the virtualization process of a computer prototype or from that of an archaeological object. Moreover, through foreign and VO resources, the relation of a computer prototype with its components and with the research work, documents and tests that preceded its construction can be expressed in a manner similar to the relation between a pottery vessel and the other artifacts that were dug up at the same archaeological site. In addition, when dealing with cultural artifacts, foreign resources will mainly be research documents describing different aspects of the culture that produced them or, alternatively, the VO representing all the field work involved in a specific archaeological project. Similar relationships also exist between computers and digital devices and their manufacturing processes.

It is important to point out that, although VOs and their associated resources can resemble SCOs and assets in SCORM, there are still important differences between them. In SCORM, SCOs can refer neither to assets in other SCOs nor to other SCOs themselves, while the possibility of these relations is a main feature of VOs. Conceptually, a VO gathers all the information relevant to a physical or conceptual object. Its associated resources can be as simple as SCORM assets, but they can also be complex structures describing a learning process and the workflow of the learning activities in which the VO can be involved.

Operationalization: web-based authoring and deployment tools for virtual objects

The VO model has led us to develop simple web-based authoring and deployment tools for the two aforementioned museums (Figure 3). Basically, these tools enable authors to create VOs, upload their resources and establish their relations with other VOs and/or their resources. Users can navigate the repository of VOs and their associated resources; thus the complexity of the navigation depends on the navigational structure of the resources. The tool for the museum of archaeology (Figure 3a) is called Chasqui. The word chasqui means messenger in Quechua, the language spoken in the Inca Empire. Chasqui has gone through a strong evolution, as described in (Navarro et al. 2005). In the initial version of Chasqui (Chasqui Web Site, 2005), VOs were directly mapped onto their database representations. While the resulting application enabled users to author learning objects using domain-specific authoring tools, we found serious difficulties regarding portability and interoperability with other systems and authoring tools. To overcome these difficulties we have adopted a more sophisticated architecture, based on well-known interoperability standards. The resulting version can be visited at its testing site (Chasqui2 Web Site, 2005), and it is scheduled to be in production in the first quarter of 2006. The architecture of this new version of Chasqui has also been reused in the development of the tool for the computing museum (Figure 3b). This tool is called MIGS (MIGS Web Site, 2005), the acronym for the name of the museum in Spanish.

Figure 3. Snapshots of (a) Chasqui; (b) MIGS

The final architecture proposed during Operationalization is depicted in Figure 4. According to this architecture, the repository of VOs continues to be supported by a relational database. The tools include pre-established web interfaces tailored to the museums being virtualized. In addition, these tools also include programmatic interfaces accessible via web services (Cerami, 2002). Web services facilitate the interoperation with other repositories, enable different access mechanisms (e.g. mobile devices) and permit the use of external tools with alternative and more powerful interaction styles (e.g. IMS LD and SCORM editors and players). As suggested in Figure 4, this architecture is entirely implemented using open source technologies.
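The paper states that the repository is backed by a relational database (MySQL) but does not give its schema. Purely as a hypothetical sketch of how VOs and their three kinds of resources could be laid out relationally, the following toy schema uses Python's built-in sqlite3 module in place of MySQL; every table and column name is invented.

```python
import sqlite3

# In-memory stand-in for the MySQL repository; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE virtual_object (
    vo_id  INTEGER PRIMARY KEY,
    title  TEXT NOT NULL
);
CREATE TABLE resource (
    res_id  INTEGER PRIMARY KEY,
    vo_id   INTEGER NOT NULL REFERENCES virtual_object(vo_id),
    kind    TEXT CHECK (kind IN ('own', 'foreign', 'vo')),
    target  TEXT NOT NULL  -- a file path or the id of a referenced VO
);
""")
conn.execute("INSERT INTO virtual_object VALUES (2, 'Early hardware piece')")
conn.execute("INSERT INTO resource VALUES (1, 2, 'own', 'photos/piece2.jpg')")
for row in conn.execute("SELECT kind, target FROM resource WHERE vo_id = 2"):
    print(row)  # ('own', 'photos/piece2.jpg')
```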

Figure 4. Architecture of the web-based authoring and deployment tools

To enable interoperability, the tools incorporate import and export facilities for VOs in accordance with the IMS CP specification. This way, VOs can be packaged according to this specification and exported to other IMS-aware Learning Management Systems able to conduct a more elaborate learning process. Direct importation is also possible between repositories associated with museums sharing a common VO model. More complex importation processes can also be achieved automatically by connecting the appropriate adapters to the web service interface. Indeed, this interface is the point where mappings between general-purpose representation formats (e.g. SCORM or IMS LD compliant learning materials) and the VO model can be readily incorporated. By incorporating the appropriate import and export mappings into it, many current and future learning application profiles could be readily supported, perhaps with slight evolutions of the current VO model.
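As an illustration of what packaging a VO according to IMS CP involves, the following sketch emits a skeletal imsmanifest.xml with Python's standard xml.etree module. The mapping from VO resources to manifest entries is our own simplification (a real IMS CP manifest also carries namespace, metadata and schema declarations); it is not the actual export code of Chasqui or MIGS.

```python
import xml.etree.ElementTree as ET
from typing import List


def vo_to_manifest(vo_id: str, resource_files: List[str]) -> bytes:
    """Render one VO as a skeletal IMS CP manifest (simplified mapping)."""
    manifest = ET.Element("manifest", identifier=f"MANIFEST-{vo_id}")
    ET.SubElement(manifest, "organizations")  # no sequencing information
    resources = ET.SubElement(manifest, "resources")
    resource = ET.SubElement(
        resources, "resource",
        identifier=f"RES-{vo_id}", type="webcontent", href=resource_files[0],
    )
    for path in resource_files:
        ET.SubElement(resource, "file", href=path)
    return ET.tostring(manifest, encoding="utf-8")


# Hypothetical VO with an HTML page and a photograph as its resources.
print(vo_to_manifest("1183", ["index.html", "photos/vessel.jpg"]).decode())
```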

Virtualization

The tools described in the previous section are being used in the virtualization of the two aforementioned museums. While the virtualization of the computing museum is in its initial stages (115 VOs included as of December 2005), the virtualization of the archaeology museum is in an advanced state. Indeed, Chasqui has been in production since 2003 and has therefore had enough time to prove the usefulness of the iterative-incremental conception of the aforementioned virtualization process. Currently, the resulting virtual museum contains more than 1500 VOs and the virtualization process continues to be active. Beyond its capabilities for creating simple collections of VOs, the Chasqui tool, being in a more advanced virtualization state than MIGS, has proven very valuable in several research and pedagogical activities oriented to students with very different backgrounds. In particular, as will be detailed in the next sections, Chasqui has evolved from its original purpose to become a very valuable tool for supporting the active learning of the research process by junior students, and of its refinement by senior and PhD students. In addition, Chasqui has also proven very valuable as a basic tool for the just-in-time publishing of research resources. Next we summarize some of these experiences.

Virtualization and dissemination of pre-existing materials

The primary use of Chasqui and MIGS was to let students and the general public learn about the objects found or collected during an archaeological or ethnological field season, or about the computing devices built at the university by pioneering professors. For this purpose, in the initial virtualization iterations the repositories of both museums were populated with such elementary VOs.

The publishing of this kind of stored and archived resources makes tools like Chasqui and MIGS very valuable artefacts for disseminating pioneering work and the patrimony of the academic museums among the general public. This way, by exhibiting many pieces of hardware, like the one shown in Figure 5a (MIGS VO number 2), as VOs, the computing museum has gained considerable popularity among students. The museum itself is discovered on the Internet, and instructors and students can use MIGS for documentation prior to visiting the real museum at the Computer Science School. As another recent example, 23 pieces in the archives of the museum of archaeology were located using Chasqui and selected for the exhibition Pueblos Amazónicos: Un Viaje a otras Estéticas y Cosmovisiones (Amazon Cultures: A Journey to other Aesthetics and Cosmovisions), organized for the general public by the Museum of Science of Castilla-La Mancha (Cuenca, Spain, 1-30 June 2005).

Figure 5. (a) A VO in MIGS; (b) unpublished pottery vessel neck from Proyecto Esmeraldas that was recovered during the virtualization process

Another interesting experience, this time related to Chasqui, has been its use for reviving unpublished archaeological material for research and pedagogical purposes (Figure 5b). This is indeed the case with several archaeological and ethnological projects already finished at the Department of American History II: Chinchero, Incapirca and Esmeraldas.

Work Assignments for Undergraduate Students

In Chasqui, the research materials collected at different archaeological sites have been used to design work assignments for undergraduate students. The flexibility of the VO model lets teachers conceive these assignments as new composite VOs.

In Figure 6a we show some snapshots corresponding to one of these assignments for an undergraduate introductory course on the Andean archaeology area (Chasqui VO number 1183). As a deployment tool, Chasqui can be combined with typical Learning Management Systems (LMSs), as has been the case with these work assignments, which have been made accessible to students using the virtual campus of the Complutense University, currently supported by the WebCT platform (Rehberg et al. 2004). This lets us take advantage of the communication and learning management features of a typical LMS (e.g. discussion forums and student management facilities) in conjunction with the other features of Chasqui.

Involving Advanced Undergraduate and Graduate Students in the Virtualization Process

We have also used Chasqui to involve advanced undergraduate and graduate students in the construction of new composite VOs, thereby promoting active learning among these students. Again, the flexibility of the VO model enables students to construct new VOs by referencing pre-existing foreign resources and also pre-existing VOs. In addition, students can also collaborate in the elaboration of new original resources.

The snapshots shown in Figure 6b correspond to a VO produced by a group of graduate students enrolled in a course on Aztec culture (Chasqui VO number 1683). In addition to the pedagogical benefits of their active involvement, the work performed by the students is made available to other users, and it can be used as support material in future editions of the course. As with the undergraduate-level experiences, we have realized the advantages of integrating the domain-specific virtualization activity with the generic facilities provided by a general-purpose LMS like WebCT. In this case the groups of students can share a file area on the server to exchange the digital materials needed for the virtualization, as well as a private newsgroup and personal e-mail for communication purposes.

Figure 6. Snapshots of (a) a work assignment for undergraduate students; (b) a VO produced by graduate students

Figure 7. (a) An archaeological marked document (Sonidos de América: un recorrido por los instrumentos musicales de las Colecciones de Arqueología y Etnografía); (b) presentation generated from (a); (c) VO resource with the presentation in (b)



Involving PhD Students in the Virtualization Process

We have used Chasqui to implement a new pedagogical strategy in a PhD course on New Information Technologies in Andean Archaeology. Students enrolled in this course are integrated as active members of our research group. This is an interdisciplinary group formed by both archaeologists and computer scientists. One of the main research interests of the computer science branch of the group is the formulation of domain-specific descriptive markup languages (Goldfarb, 1981; Coombs et al. 1987) to structure documents of different knowledge areas for multiple purposes. Our interest in the use of markup languages is directly related to the aforementioned document-oriented approach for the development of content-intensive applications. Therefore, we propose that PhD students define their own markup languages to structure the documents that they produce for different purposes: research, dissemination or education.

The languages created by the students (with the help of their teacher) are conceived as descriptive markup languages defined using XML (XML, 2004). A typical workflow of the process is as follows:

• The teacher assigns research papers about different subjects to each student.
• Then the students must synthesise the main ideas in the papers they have studied and, in a following session, discuss their syntheses with the rest of the group.
• When the students have enough knowledge about the subject, they must create composite VOs, gathering and structuring as much information as possible regarding this subject. Among these resources, they must include documents describing the results of their research. These documents, which are the most common in the domain, can be classified by: (i) their final purpose (site reports, essays, reviews, research or dissemination papers); (ii) their primary sources (artefacts, monuments, site plans or collections); or (iii) their targeted audience (public or academic).
• In the next stage, each of the documents elaborated is analyzed to identify its structural elements, its hierarchical organization, and the labels and attributes to be used to make all the information that it conveys explicit.
• The documents are marked up, and a first attempt to abstract the type of these documents as XML document grammars is carried out with the help of Chasqui's developer community (an illustrative validation sketch follows this list). Note that these document grammars define new domain-specific descriptive markup languages used for authoring purposes. Therefore, they should not be confused with extensions to the XML bindings for the information models proposed by the different e-learning specifications. These document grammars can also be used and refined by subsequent groups of students, and are therefore subject to continuous evolution. In order to manage this evolution, developers can use suitable XML schema technologies (Lee & Chu, 2000) as well as the techniques for the incremental definition of domain-specific descriptive markup languages described in (Sierra et al. 2005c).
• Finally, the documental resources associated with the composite VOs already created are automatically generated from the marked documents, without the need for further manual processing by the students. For this purpose, they also get support from the developer community, which produces suitable processors for the domain-specific descriptive markup languages defined by the students. To do so, it can use standard XML processing technologies (Birbeck et al. 2001) or the more advanced solutions, oriented to facilitating the incremental construction of such processors, described in (Sierra et al. 2005c).
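By way of illustration, the following sketch validates a marked document against a toy document grammar expressed as a DTD. It relies on the third-party lxml library (the Python standard library does not validate against DTDs), and both the grammar and the element names are invented for this example; they are not the languages actually defined by the students.

```python
import io
from lxml import etree  # third-party library, assumed to be installed

# Toy document grammar in the spirit of the students' markup languages;
# the element names are invented for illustration only.
dtd = etree.DTD(io.StringIO(
    "<!ELEMENT tour (title, stage+)>"
    "<!ELEMENT title (#PCDATA)>"
    "<!ELEMENT stage (#PCDATA)>"
))

# A marked document that should conform to the grammar above.
doc = etree.fromstring(
    "<tour><title>Sonidos de América</title>"
    "<stage>Partiendo de América Central</stage></tour>"
)

print(dtd.validate(doc))  # True when the document follows the grammar
print(dtd.error_log)      # empty on success, diagnostics otherwise
```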

In Figure 7a we show part of a document marked up with a domain-specific markup language developed by a group of PhD students during the 2004-2005 edition of the course. Note again that this document is used only during authoring. Indeed, the resource finally integrated in the VO is the result of transforming it into HTML; in this particular case an XSLT style sheet (XSLT, 2004) was used to carry out the transformation. Therefore, the markup used here is domain-specific and has nothing to do with the tags used in the XML bindings for the different information models proposed by the e-learning community (e.g. markup for representing IMS CP manifests). In Figure 7b we show the resource (an HTML page) generated from this document. In Figure 7c we show some snapshots of the Chasqui VO where the resource is finally integrated (Chasqui VO 1430).
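The actual style sheet is not reproduced in the paper, but the following hedged sketch shows the shape of such a transformation: an invented miniature XSLT sheet, applied with the third-party lxml library, renders a domain-specific document as the kind of simple HTML page attached to a VO.

```python
import io
from lxml import etree  # third-party library, assumed to be installed

# Invented miniature style sheet standing in for the students' XSLT sheet:
# it renders a domain-specific <tour> document as a simple HTML page.
stylesheet = etree.XSLT(etree.parse(io.StringIO("""\
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/tour">
    <html><body>
      <h1><xsl:value-of select="title"/></h1>
      <xsl:for-each select="stage">
        <p><xsl:value-of select="."/></p>
      </xsl:for-each>
    </body></html>
  </xsl:template>
</xsl:stylesheet>
""")))

doc = etree.fromstring(
    "<tour><title>Sonidos de América</title>"
    "<stage>Bajando por Sudamérica</stage></tour>"
)

# The transformation result is the HTML resource attached to the VO.
print(str(stylesheet(doc)))
```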

Just-In-Time Diffusion of the Research Results

Domain-specific authoring tools like Chasqui allow researchers to update VOs continuously and just in time. The only requirement is an Internet connection, since the usability of the tool makes the presence of developers unnecessary. That way, research results and field work reports can be published as they are obtained, letting interested researchers and students access these results without waiting for their diffusion over more conventional publishing channels (e.g. specialized conferences and journals).



In Figure 8 we show some snapshots of a VO (Chasqui VO number 1743) related to the field work and the preliminary reports of the Project Manabí Central, which is being carried out by an international research consortium at Chirije and San Jacinto de Japoto, on the Ecuadorian coast (Bouchard, 2004). Another related use of Chasqui is the organization as VOs of the content and development of other research activities, such as symposiums (see Chasqui VO number 1431 on the Chasqui web site).

Figure 8. VO with the preliminary reports of the Project Manabí Central

Conclusions and Future Work

The virtualization process model described in this paper lets domain experts create repositories of LOs in a very dynamic way. For this purpose, domain-specific LO models are formulated, and supporting authoring and deployment tools based on these models are developed. These tools are used by the experts to produce the repositories. We have realized that the approach is very valuable for exploiting the educational potential of otherwise underused and/or access-limited research materials. We have also realized that the approach allows the creation of new knowledge in the form of reusable composite LOs for many different pedagogical and research purposes. But perhaps the most relevant, and in some ways unexpected, result of our work from an e-learning perspective has been how teachers and students are using the system: they are finding that Chasqui is a very useful tool for supporting active and collaborative learning, involving both student-student and teacher-student relations, as has been shown in the virtualization experiences described in this paper.

Currently we are finishing the first stage in the virtualization of the computing museum, and we are also starting several virtualization experiences regarding the virtual campus of the Complutense University of Madrid. We are also working on another evolution of the VO concept, extending VOs with scripts documenting the sequencing of their resources. These scripts will be implemented using the IMS Learning Design Specification (IMS LD, 2003; Koper & Tatersall, 2005). As future work we plan to undertake the virtualization of other museums at this university in order to refine the concept of the VO. We are also planning to further use our document-oriented approach (Sierra et al. 2004; Sierra et al. 2005a) to improve the maintenance of domain-specific LO authoring tools by enabling their full documental description.

Acknowledgements

This work is supported by the Spanish Committees of Science and Technology (TIC2002-04067-C03-02, TIN2004-08367-C02-02 and TIN2005-08788-C04-01) and of Industry, Tourism and Commerce (PROFIT, FIT-350100-2004-217). We also want to thank Antonio Navarro for his participation in early versions of this work.



References

ADL (2003). Shareable Content Object Reference Model (SCORM), Retrieved April 7, 2006, from http://www.adlnet.org/scorm.

Arango, G. (1989). Domain Analysis: From Art Form to Engineering Discipline. ACM SIGSOFT Software Engineering Notes, 14 (3), 152-159.

Birbeck, M., Kay, M., Livingstone, S., Mohr, S. F., Pinnock, J., Loesgen, B., Livingston, S., Martin, D., Ozu, N., Seabourne, M., & Baliles, D. (2001). Professional XML (2nd Ed.), Birmingham, UK: WROX Press.

Bouchard, J. F. (2004). Project Manabí Central. Rapport d'Activité Année 2004, Paris, France: Division des Sciences Sociales de l'Archéologie del Ministére des Affaires Etrangéres.

Cerami, E. (2002). Web Services Essentials, Sebastopol, CA, USA: O'Reilly.

Chasqui Web Site (2005). Chasqui website, Retrieved April 7, 2006, from http://macgalatea.sip.ucm.es/chasqui.html.

Chasqui2 Web Site (2005). Chasqui2 website, Retrieved April 7, 2006, from http://macgalatea.sip.ucm.es/web3/principal/principal.html.

Coombs, J., Renear, A., & DeRose, S. (1987). Markup Systems and the Future of Scholarly Text Processing. Communications of the ACM, 30 (11), 933-947.

Fernández-Valmayor, A., Guinea, M., Jiménez, M., Navarro, A., & Sarasa, A. (2003). Virtual Objects: An Approach to Building Learning Objects in Archeology. In Llamas-Nistal, M., Fernández-Iglesias, J., & Anido-Rifon, L. E. (Eds.), Computers and Education: Towards a Lifelong Learning Society, Dordrecht, The Netherlands: Kluwer Academic Publishers, 191-202.

Goldfarb, C. F. (1981). A Generalized Approach to Document Markup. ACM SIGPLAN Notices, 16 (6), 68-73.

IMS CP (2004). IMS Content Packaging Information Model Version 1.1.4 Final Specification, Retrieved April 7, 2006, from http://www.imsglobal.org/content/packaging.

IMS LD (2003). IMS Learning Design 1.0, Retrieved April 7, 2006, from http://www.imsglobal.org/learningdesign.

Koper, R. (2003). Combining re-usable learning resources and services to pedagogical purposeful units of learning. In Littlejohn, A. (Ed.), Reusing Online Resources: A Sustainable Approach to eLearning, London, UK: Kogan Page, 46-59.

Koper, R., & Tatersall, C. (2005). Learning Design: A Handbook on Modelling and Delivering Networked Education and Training, Heidelberg: Springer-Verlag.

Lee, D., & Chu, W. W. (2000). Comparative Analysis of Six XML Schema Languages. ACM SIGMOD Record, 29 (3), 76-87.

LOM (2002). IEEE Standard for Learning Object Metadata, IEEE Standard 1484.12.1-2002.

MIGS Web Site (2005). MIGS website, Retrieved April 7, 2006, from http://macgalatea.sip.ucm.es/migs/publico/control.php.

Navarro, A., Sierra, J. L., Fernández-Valmayor, A., & Hernanz, H. (2005). From Chasqui to Chasqui 2: An Evolution in the Conceptualization of Virtual Objects. Journal of Universal Computer Science, 11 (9), 1518-1529.

Polsani, P. R. (2003). Use and Abuse of Reusable Learning Objects. Journal of Digital Information, 3 (4), Retrieved June 30, 2006, from http://jodi.tamu.edu/Articles/v03/i04/Polsani/.

Rehberg, S., Ferguson, D., McQuillan, J., Eneman, S., & Stanton, L. (2004). The Ultimate WebCT Handbook: A Practical and Pedagogical Guide to WebCT 4.x, Clarkston, GA, USA: Ultimate Handbooks.

Roberts, E., Freeman, M. W., Wiley, D., & Sampson, D. (Eds.) (2005). Special Issue on SCORM 2004 Sequencing & Navigation. Learning Technology Newsletter, 7 (1), Retrieved June 30, 2006, from http://lttf.ieee.org/learn_tech/issues/january2005/index.html.

Sierra, J. L., Fernández-Valmayor, A., Fernández-Manjón, B., & Navarro, A. (2004). ADDS: A Document-Oriented Approach for Application Development. Journal of Universal Computer Science, 10 (9), 1302-1324.

Sierra, J. L., Fernández-Manjón, B., Fernández-Valmayor, A., & Navarro, A. (2005a). Document-Oriented Construction of Content-Intensive Applications. International Journal of Software Engineering and Knowledge Engineering, 15 (6), 975-993.

Sierra, J. L., Fernández-Valmayor, A., Guinea, M., Hernanz, H., & Navarro, A. (2005b). Building Repositories of Learning Objects in Specialized Domains: The Chasqui Approach. In Goodyear, P., Sampson, D. G., Yang, D. J. T., Kinshuk, Okamoto, T., Hartley, R., & Chen, N.-S. (Eds.), Proceedings of the 5th IEEE International Conference on Advanced Learning Technologies (ICALT'05), Los Alamitos, CA, USA: IEEE Computer Society Press, 225-229.

Sierra, J. L., Navarro, A., Fernández-Manjón, B., & Fernández-Valmayor, A. (2005c). Incremental Definition and Operationalization of Domain-Specific Markup Languages in ADDS. ACM SIGPLAN Notices, 40 (12), 28-37.

Studer, R., Fensel, D., Decker, S., & Benjamins, V. R. (1999). Knowledge Engineering: Survey and Future Directions. Lecture Notes in Artificial Intelligence, 1570, 1-23.

XML (2004). Extensible Markup Language (XML) (3rd Ed.), Retrieved April 7, 2006, from http://www.w3.org/TR/REC-xml/.

XSLT (2004). XSL Transformations (XSLT) Version 1.0, Retrieved April 7, 2006, from http://www.w3.org/TR/xslt.



Chen, C.-M., Hong, C.-M., Chen, S.-Y., & Liu, C.-Y. (2006). Mining Formative Evaluation Rules Using Web-based Learning Portfolios for Web-based Learning Systems. Educational Technology & Society, 9 (3), 69-87.

Mining Formative Evaluation Rules Using Web-based Learning Portfolios for Web-based Learning Systems

Chih-Ming Chen
Institute of Learning Technology, National Hualien University of Education, Hualien 970, Taiwan, R.O.C.
cmchen@mail.nhlue.edu.tw

Chin-Ming Hong
Institute of Applied Electronics Technology, National Taiwan Normal University, Taipei 106, Taiwan, R.O.C.
t07026@ntnu.edu.tw

Shyuan-Yi Chen
Institute of Industrial Education, National Taiwan Normal University, Taipei 106, Taiwan, R.O.C.
69370034@ntnu.edu.tw

Chao-Yu Liu
Institute of Learning Technology, National Hualien University of Education, Hualien 970, Taiwan, R.O.C.
rank@enjust.com

ABSTRACT
Learning performance assessment aims to evaluate what knowledge learners have acquired from teaching activities. Objective technical measures of learning performance are difficult to develop, but are extremely important for both teachers and learners. Learning performance assessment using learning portfolios or web server log data is becoming an essential research issue in web-based learning, owing to the rapid growth of e-learning systems and their real application in teaching settings. Traditional summative evaluation, performed through examinations or feedback forms, is usually employed to evaluate learning performance in both traditional classroom learning and web-based learning. However, summative evaluation only considers final learning outcomes, without considering the learning processes of learners. This study presents a learning performance assessment scheme that combines four computational intelligence theories, i.e., the proposed refined K-means algorithm, the neuro-fuzzy classifier, the proposed feature reduction scheme, and fuzzy inference, to identify learning performance assessment rules using the web-based learning portfolios of individual learners. Experimental results indicate that the evaluation results of the proposed scheme are very close to the summative assessment results of grade levels. In other words, this scheme can help teachers to assess individual learners precisely, utilizing only the learning portfolios in a web-based learning environment. Additionally, teachers can devote themselves to teaching and designing courseware, since they save a lot of time in evaluating learning. This idea can be beneficially applied to immediately examine the learning progress of learners and to perform interactive learning control in e-learning systems. More significantly, teachers could understand the factors influencing learning performance in a web-based learning environment according to the obtained interpretable learning performance assessment rules.

Keywords
Learning Performance Assessment, Web-based Learning, Web-based Learning Portfolio, Data Mining

Introduction

Gagné's research on the internal processes of learning indicates that a complete learning process should include the assessment of learning performance (Gagné, 1997). Learning performance evaluation instruments can generally be classified into summative assessment and formative assessment (Torrance & Pryor, 1998; Campbell et al., 2000). While summative evaluation is generally performed after finishing an instruction unit or class, formative assessment emphasizes the learning process. Teachers can assess overall learning performance using summative assessments. Conversely, the use of formative assessments helps teachers obtain feedback about how well students are learning and the particular difficulties they might be having. The standard summative assessment scheme at course level is to administer either an examination or an assessment form. However, these methods cannot collect all learning process information. Since formative assessment strongly emphasizes integrating "what people learned" with "how people learned", it occurs when teachers feed information back to students in ways that enable the students to learn better, or when students can engage in a self-reflective process. Therefore, while assessing learning performance using portfolios has become increasingly popular, technology is facilitating its evolution and management (Torrance & Pryor, 1998;



Campbell et al., 2000; Lankes, 1995). Moreover, learning performance assessment based on the learning portfolio is a very useful concept in some fields of learning assessment, such as psychological testing, where knowing how subjects take a test can be more important than their answers.

Traditional portfolio assessment relies on manual data collection and a writing-centered learning process. Difficulties in data storage, search and management after long-term implementation have hindered the development and implementation of portfolio assessment (Chang, 2002). In contrast, a web-based learning portfolio can be collected, stored and managed automatically by computers as learners interact with an e-learning platform. Therefore, learning performance assessment using web-based learning portfolios has received significant attention recently (Lankes, 1995; Rahkila & Karjalainen, 1999; Wu & Leung, 2002). Rasmussen et al. (1997) suggested that Internet-based instruction could also allow student progress to be evaluated through participation in group discussions and portfolio development. Lankes (1995) stated that implementing computer-based student assessment portfolios is an educational innovation, owing to its ability not only to offer an authentic demonstration of accomplishments but also to enable students to take responsibility for their completed tasks. Rahkila and Karjalainen (1999) also proposed using user profiles with log data for learning performance assessment, finding that this can be applied to interactively control learning effectively; they emphasized that learning portfolio assessment is supported by the cognitive-constructive theory of learning (Rahkila & Karjalainen, 1999; Bruner, 1996). Zaiane and Luo (2001) proposed applying web access logs and advanced data mining techniques to extract useful patterns that can help educators evaluate on-line course activities. Wang et al. (2003) found that learning behavior information, commonly referred to as a learning portfolio, can help teachers understand why a learner obtained a high or low grade. Carreira and Crato (2004) proposed monitoring users' reading behaviors to infer their interest in particular articles.

However, developing a precise learning performance assessment scheme using web-based learning portfolios is a challenging task for web-based learning systems. Data mining, or knowledge discovery, attempts to obtain valuable knowledge from data stored in large repositories, and has been considered an appropriate method for excavating such implicit information (Dunham, 2002). This study presents a data mining approach that integrates four computational intelligence schemes, i.e., the refined K-means clustering algorithm, the neuro-fuzzy classifier (Nauck & Kruse, 1999), the proposed feature reduction scheme, and fuzzy inference (Lin & Lee, 1996), to evaluate on-line learning behavior and learning performance. The four schemes are employed, respectively, to logically determine fuzzy membership functions for the neuro-fuzzy classifier, to discover the fuzzy rules for learning performance assessment, to reduce the feature dimension of the discovered fuzzy rules, and to infer the learning performance of an individual learner from the discovered fuzzy rules and his or her learning portfolio.

The proposed learning performance assessment scheme was implemented on the personalized e-learning system, and its effectiveness was demonstrated in a real-world teaching scenario. Experimental results show that the proposed scheme can correctly measure learners' learning performance from their learning portfolios and can help teachers save considerable time on learning evaluation during the learning process. Significantly, the learning evaluation results were applied to help teachers examine the learning progress of learners and to interactively control learning in the personalized e-learning system. The remainder of this study is organized as follows. Section 2 describes the problem, based on the gathered learning portfolios, addressed by the proposed learning performance assessment scheme. Section 3 explains the proposed learning performance assessment scheme. Section 4 presents the detailed experiments. Conclusions are drawn in Section 5.

Problem Description

Personalized E-learning System (PELS) with Learning Performance Assessment Mechanism Based on Learning Portfolios

The personalized e-learning system (PELS) based on Item Response Theory, which includes an off-line courseware modeling process, four intelligent agents and four databases, was presented in our previous studies for adaptive courseware recommendation (Chen, Lee & Chen, 2005; Chen, Liu & Chang, 2006). However, since PELS mainly focuses on performing adaptive learning based on the difficulty parameters of courseware and the ability of individual learners, learning performance assessment is a missing feature of that system. In this paper, the functionality of PELS is extended to include a learning performance assessment agent



in order to evaluate learning performance using the gathered learning portfolios of individual learners. The extended system architecture is shown in Figure 1.

[Figure 1, an architecture diagram, is not reproduced here. Its components are the learner, the personalized learning module (learning interface agent, learning performance assessment agent, test agent and courseware recommendation agent, backed by the user account, user profile, XML-based courseware and testing items databases), and the courseware construction module (courseware modeling process, courseware management agent and teacher account database).]

Figure 1. The system architecture

Figure 2. The learning interface for learner



Figure 2 shows the entire layout of the learning interface of PELS. In the left frame, the system shows the course categories, course units and the list of all courseware in the courseware database using a hierarchical tree structure. When a learner clicks a courseware item, the content of the selected courseware is displayed in the upper-right window, and the feedback interface is arranged in the bottom-right window. The system obtains the learner's feedback response through this interface: the learner replies to one randomly selected testing question related to the presented learning content. The answer to the testing question helps the system gauge the learner's comprehension of the recommended courseware. The system passes the feedback response to the courseware recommendation agent to infer the learner's ability using item response theory (Baker, 1992), as detailed in our previous studies (Chen, Lee & Chen, 2005; Chen, Liu & Chang, 2006). After the learner presses the "submit" button, the system reveals a list of recommended courseware based on his or her current ability. The learner then selects the next courseware item according to the suggestions of the courseware recommendation agent and continues learning with it. PELS runs this learning cycle until the evaluated learner ability satisfies the stop criterion. The learning procedure then enters the final testing stage, in which a summative assessment is performed by answering 25 randomly selected testing questions. The final test results are used to verify the evaluation quality of the proposed learning performance assessment approach in the analysis of experimental results. In addition, the learner can log onto the discussion board on PELS to post his or her own learning questions or to contribute helpful ideas about the learned course unit. PELS automatically gathers the useful learning portfolios of individual learners for learning performance assessment during the learning process.

Considered Learning Portfolio in the User Profile Database

The IMS Global Learning Consortium defines ePortfolios as collections of personal information about a learner that represent accomplishments, goals, experiences, and other personalized records that a learner can present to schools, employers, or other entities (http://support.imsglobal.org/ePortfolio/). The ePortfolio specification developed by the IMS Global Learning Consortium enables the use of the Internet for personalized, competency-based learning and assessment, enhanced career planning, and improved employment opportunities. The specification supports many kinds of portfolios, including assessment ePortfolios, presentation ePortfolios, learning ePortfolios, personal development ePortfolios, and working ePortfolios. The IMS Global Learning Consortium also notes that an ePortfolio can contain personal information about the owner, achievements, interests and values, test and examination results, and so on. The learning portfolio information collected by PELS for the proposed learning performance assessment approach is presented next, following the IMS ePortfolio specification. This information is saved into the user profile database as the learner interacts with PELS; learning performance is then graded according to each learner's PELS learning portfolio. The eight gathered learning factors are described in detail as follows:

1. Reading rate of course materials (RR)

After a learner has logged onto PELS, the system automatically counts and accumulates the number of course materials the learner has studied; this parameter increases progressively during each section's learning process. The reading rate of course materials is defined as the proportion of the course materials in a course unit that have been studied, and the notation RR is used to represent this learning factor. Table 1 presents an example of calculating the reading rate for learners A and B in a learning course unit.

Table 1. A calculating example for the learning factor of reading rate

Learner | Learning path of course materials | Number of studied course materials | Total number of course materials in the learned course unit | Reading rate (RR)
A | 1, 5, 8, 10, 11, 12, 13, 14, 15, 16, 17 | 11 | 11 | 1
B | 1, 5, 4, 6, 7, 8, 9, 10, (9), (10), 11 | 9 | 11 | 0.82

( ) indicates a repeated course material during the learning process



2. Total accumulated reading time of all learned course materials (RT)

The total accumulated reading time of each learner is calculated by summing the reading times of all learned course materials on PELS, and the notation RT is used to represent this learning factor. Table 2 presents an example of calculating the total accumulated reading time for learners A and B in a learning course unit.

Table 2. A calculating example for the learning factor of total accumulated reading time

Learner | Learning path of course materials | Total accumulated reading time (RT)
A | 1(25), 5(32), 8(33), 10(30), 11(36), 12(43), 13(47), 14(43), 15(45), 16(40), 17(46) | 420 secs
B | 1(42), 5(56), 4(50), 6(62), 7(67), 8(72), 9(79), 10(93), 9(70), 10(81), 11(118) | 790 secs

( ) indicates the reading time for the learned course material

3. Learner ability evaluated by PELS (LA)

After a learner studies the recommended courseware, PELS (Chen, Lee & Chen, 2005; Chen, Liu & Chang, 2006) dynamically estimates the learner's ability according to item response theory (Baker, 1992) from the learner's responses to randomly selected testing questions in the learned course unit. The value denotes the learner's ability in the learned course unit as measured by PELS during the learning process, and the range of learner ability is limited from -3 (lowest ability) to +3 (highest ability). When a learner logs onto the system, if the user account database contains no history records for this learner in the selected course unit, the initial ability is set to 0; that is, the system assumes a beginner's ability is at the moderate level. Each time the learner clicks recommended courseware for learning, his or her ability in the course unit is re-evaluated according to the feedback responses and the difficulty parameter of the learned courseware. The notation LA is used to represent the learning factor of learner ability in this paper.

In PELS, the quadrature form proposed by Bock and Mislevy (Baker, 1992) was employed to approximately estimate the learner's ability as follows:

$$\hat{\theta} = \frac{\sum_{k=1}^{q} \theta_k \, L(u_1, u_2, \ldots, u_n \mid \theta_k) \, A(\theta_k)}{\sum_{k=1}^{q} L(u_1, u_2, \ldots, u_n \mid \theta_k) \, A(\theta_k)} \qquad (1)$$

where $\hat{\theta}$ denotes the estimated learner ability, $L(u_1, u_2, \ldots, u_n \mid \theta_k)$ is the value of the likelihood function at ability level $\theta_k$ given the learner's responses $u_1, u_2, \ldots, u_n$, $\theta_k$ is the $k$th split value of ability in the standard normal distribution, and $A(\theta_k)$ represents the quadrature weight at ability level $\theta_k$.

In Eq. (1), the likelihood function $L(u_1, u_2, \ldots, u_n \mid \theta_k)$ can be further described as follows:

$$L(u_1, u_2, \ldots, u_n \mid \theta_k) = \prod_{j=1}^{n} P_j(\theta_k)^{u_j} \, Q_j(\theta_k)^{1-u_j} \qquad (2)$$

where $P_j(\theta_k) = e^{D(\theta_k - b_j)} / \left(1 + e^{D(\theta_k - b_j)}\right)$ denotes the probability that a learner at ability level $\theta_k$ understands the $j$th courseware, $Q_j(\theta_k) = 1 - P_j(\theta_k)$ represents the probability that a learner at ability level $\theta_k$ does not understand the $j$th courseware, $b_j$ is the difficulty parameter of the $j$th courseware, $D$ is the constant 1.702, and $u_j$ is the understanding/not-understanding answer obtained from learner feedback on the $j$th courseware, i.e., $u_j = 1$ if the answer is "understanding" and $u_j = 0$ otherwise.
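To make the quadrature estimation concrete, the following minimal Python sketch evaluates Eqs. (1) and (2) over a fixed grid of ability points with standard-normal quadrature weights. It is an illustration, not the authors' implementation; the difficulty parameters and responses are invented, and the number and placement of quadrature points are assumptions.

```python
import math

D = 1.702  # scaling constant in Eq. (2)

def p_understand(theta, b):
    """P_j(theta): probability of understanding courseware of difficulty b (Eq. 2)."""
    return 1.0 / (1.0 + math.exp(-D * (theta - b)))

def likelihood(theta, responses, difficulties):
    """L(u_1,...,u_n | theta): product over the learned courseware (Eq. 2)."""
    L = 1.0
    for u, b in zip(responses, difficulties):
        p = p_understand(theta, b)
        L *= p if u == 1 else (1.0 - p)
    return L

def estimate_ability(responses, difficulties, q=13):
    """theta_hat from Eq. (1): likelihood-weighted mean over q points in [-3, 3]."""
    thetas = [-3.0 + 6.0 * k / (q - 1) for k in range(q)]
    # A(theta_k): standard-normal density used as the quadrature weight
    weights = [math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi) for t in thetas]
    num = sum(t * likelihood(t, responses, difficulties) * w
              for t, w in zip(thetas, weights))
    den = sum(likelihood(t, responses, difficulties) * w
              for t, w in zip(thetas, weights))
    return num / den

# Hypothetical feedback: 1 = understanding, 0 = not understanding
responses = [1, 1, 0, 1, 0, 1]
difficulties = [-0.5, 0.2, 0.8, 1.1, 1.6, 0.4]  # assumed b_j values
print(round(estimate_ability(responses, difficulties), 3))
```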

4. Correct response rate of randomly selected testing questions (CR)

After a learner has studied the recommended course material, PELS tests the learner's understanding by randomly selecting a relevant question from the testing item database. The rate of correct responses to these test questions helps determine the learner's degree of understanding over all learned courseware, and the notation CR is used to represent this learning factor. For example, if a learner gave seven correct and three incorrect responses to ten randomly selected testing questions, the correct response rate is 0.7.

5. Posted amount of articles on the forum board (PN)

PELS automatically counts the total number of articles posted and questions answered on the forum board by each learner during the learning process. This value generally indicates a learner's level of participation in group discussion, and the notation PN is used to represent this learning factor. For example, if a learner posted three articles and answered five questions on the forum board, the posted amount of articles on the forum board is eight pieces.

6. Accumulated score on the forum board (AS)

PELS assigns different scores to the various interactive behaviors of learners on the forum board. In this study, PELS automatically increments a learner's score by one point when he or she logs onto the forum board, and by two points when he or she posts an article there. The accumulated score on the forum board represents the level of useful feedback given by a learner, and the notation AS is used to represent this learning factor. For example, if a learner logged onto the forum board twice and posted three articles, the accumulated score on the forum board is eight points.

7. Effort level of studying course materials (EL)

Since each course material in PELS is assigned a required minimum reading time by course experts based on its content, the effort level is defined as the actual reading time relative to the required minimum reading time over the learned courseware, and the notation EL is used to represent this learning factor. Suppose the actual reading times of learners A and B in the learned course unit are 420 seconds and 790 seconds, respectively, as listed in Table 2. Table 3 presents an example of calculating the effort level for learners A and B in the learned course unit. In this study, the range of the effort level is limited from 0 to 1; restated, the effort level is set to 1 if the calculated ratio exceeds 1.

Table 3. A calculating example for the learning factor of effort level

Learner | Learning path of course materials | Effort level (EL)
A | 1(30), 5(60), 8(75), 10(60), 11(75), 12(75), 13(60), 14(60), 15(75), 16(75), 17(90) | 420/735 ≈ 0.57
B | 1(30), 5(60), 4(45), 6(60), 7(60), 8(75), 9(60), 10(60), 9(60), 10(60), 11(75) | 790/585 → 1

( ) indicates the required minimum reading time determined by course experts for the learned course material



8. Final test grade (G)

This study measures the final test grade through a summative, fixed-length testing examination administered after the entire learning process is completed, and the notation G is used to represent this learning factor.
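As a sketch of how factors such as RR, RT, EL, PN and AS might be computed from logged interactions (the log layout and field names are hypothetical; the sample path follows learner B in Tables 1-3):

```python
# Each log entry: (material_id, actual_reading_time_sec, required_min_time_sec)
path = [(1, 42, 30), (5, 56, 60), (4, 50, 45), (6, 62, 60), (7, 67, 60),
        (8, 72, 75), (9, 79, 60), (10, 93, 60), (9, 70, 60), (10, 81, 60),
        (11, 118, 75)]
total_materials_in_unit = 11

rr = len({m for m, _, _ in path}) / total_materials_in_unit  # reading rate: 9/11 = 0.82
rt = sum(t for _, t, _ in path)                              # accumulated reading time: 790 s
# Effort level, capped at 1.  Note: Table 3 reports 585 s of required time for
# learner B, so repeated materials may be handled differently there; this
# sketch simply sums the required time over all visits.
el = min(1.0, rt / sum(req for _, _, req in path))

logins, articles, answers = 2, 3, 5
pn = articles + answers                 # posted amount: articles plus answered questions
as_score = logins * 1 + articles * 2    # one point per login, two per posted article

print(f"RR={rr:.2f}, RT={rt}s, EL={el:.2f}, PN={pn}, AS={as_score}")
```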

Mining Learning Performance Assessment Rules Based on Learning Portfolios

Flowchart of Mining Learning Performance Assessment Rules

Figure 3 shows the entire flowchart of the proposed learning performance assessment scheme. The proposed refined K-means algorithm is initially employed to logically determine the fuzzy membership functions of each learning factor from the real data distribution of the learning portfolios, so that learning performance assessment rules can be mined by the neuro-fuzzy classifier (Nauck & Kruse, 1999). After the fuzzy rules for learning performance assessment have been identified by the neuro-fuzzy classifier, the feature reduction scheme is employed to simplify the linguistic terms of the discovered rules while keeping a satisfactory accuracy rate of learning performance assessment. This procedure aims to obtain simplified learning performance assessment rules, both to improve inference efficiency in web-based learning systems and to reveal the main learning factors that affect learning outcomes. Finally, fuzzy inference is employed to grade the learning performance of individual learners using the simplified fuzzy rules. The following sections describe the proposed learning performance assessment scheme for individual learners in detail.

[Figure 3, a flowchart, is not reproduced here. It shows the gathered learning portfolios of learner behavior feeding the refined K-means algorithm, which clusters each learning factor (centers C1, C2, C3) to determine the membership functions; the neuro-fuzzy classifier then produces the primal fuzzy rules, the feature reduction algorithm refines them, and fuzzy inference of the learning performance reports the result to the teacher.]

Figure 3. The flowchart of learning performance assessment

Mining Learning Performance Rules Using Web-based Learning Portfolios


Determining Fuzzy Membership Functions by the Refined K-means Algorithm for the Neuro-Fuzzy Classifier

This study employed the neuro-fuzzy classifier to discover fuzzy knowledge rules for learning performance assessment from the learning portfolios. To identify the fuzzy knowledge rules with the neuro-fuzzy classifier, the membership functions it uses must first be logically determined according to the data distribution of the learning portfolios. In this work, the proposed refined K-means clustering algorithm, improved from the original K-means clustering algorithm (Xu & Wunsch, 2005), was employed to determine the centers of the triangular fuzzy membership functions automatically according to the data distribution of each learning factor in the learning portfolios. To obtain appropriate and interpretable learning performance assessment rules, this study sets the number of clusters in the refined K-means clustering algorithm to three, based on the requirements of learning performance assessment in real teaching scenarios. In other words, each considered learning factor contains three linguistic terms, i.e., low, moderate, and high, to describe a fuzzy rule. The membership functions of the triangular fuzzy sets are then determined automatically from the three cluster centers of each learning factor. Supposing the centers of the three linguistic terms determined by the refined K-means clustering algorithm are $c_1$, $c_2$, and $c_3$, respectively, the membership functions for the linguistic terms "low", "moderate", and "high" can be formulated as follows:

i. The membership function for the linguistic term "low":

$$\mu_{Low}(x) = \begin{cases} 1 & \text{if } x \le c_1 \\[4pt] \dfrac{c_2 - x}{c_2 - c_1} & \text{if } c_1 < x < c_2 \\[4pt] 0 & \text{if } x \ge c_2 \end{cases} \qquad (3)$$

ii. The membership function for the linguistic term "moderate":

$$\mu_{Moderate}(x) = \begin{cases} 0 & \text{if } x \le c_1 \\[4pt] \dfrac{x - c_1}{c_2 - c_1} & \text{if } c_1 < x < c_2 \\[4pt] 1 & \text{if } x = c_2 \\[4pt] \dfrac{c_3 - x}{c_3 - c_2} & \text{if } c_2 < x < c_3 \\[4pt] 0 & \text{if } x \ge c_3 \end{cases} \qquad (4)$$

iii. The membership function for the linguistic term "high":

$$\mu_{High}(x) = \begin{cases} 0 & \text{if } x \le c_2 \\[4pt] \dfrac{x - c_2}{c_3 - c_2} & \text{if } c_2 < x < c_3 \\[4pt] 1 & \text{if } x \ge c_3 \end{cases} \qquad (5)$$
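A direct Python transcription of Eqs. (3)-(5) may clarify the shapes; the sample centers are the revised centers for factor G reported later in Table 8, used here purely for illustration:

```python
def mu_low(x, c1, c2):
    # Eq. (3): 1 left of c1, linear descent between c1 and c2, 0 beyond c2
    if x <= c1: return 1.0
    if x >= c2: return 0.0
    return (c2 - x) / (c2 - c1)

def mu_moderate(x, c1, c2, c3):
    # Eq. (4): triangle peaking at c2, zero outside (c1, c3)
    if x <= c1 or x >= c3: return 0.0
    if x < c2:  return (x - c1) / (c2 - c1)
    if x == c2: return 1.0
    return (c3 - x) / (c3 - c2)

def mu_high(x, c2, c3):
    # Eq. (5): 0 left of c2, linear ascent between c2 and c3, 1 beyond c3
    if x <= c2: return 0.0
    if x >= c3: return 1.0
    return (x - c2) / (c3 - c2)

c1, c2, c3 = 69.45, 87.25, 102.94   # revised centers of factor G (Table 8)
for score in (75.0, 87.25, 95.0):
    print(score, mu_low(score, c1, c2), mu_moderate(score, c1, c2, c3),
          mu_high(score, c2, c3))
```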

Next, the difference between the proposed refined K-means clustering algorithm and the original K-means clustering algorithm is explained. Suppose there are $s$ patterns in the gathered learning portfolios; the original K-means clustering algorithm is employed to find $k$ centers for each learning factor, where $k \le s$, and the set of the $k$ cluster centers is represented as follows:

$$C_s = \{c_1, \ldots, c_l, \ldots, c_r, \ldots, c_k\} \qquad (6)$$

where $C_s$ is the set of the $k$ cluster centers, $c_l$ is the cluster center with the smallest Euclidean distance to the left boundary, and $c_r$ is the cluster center with the smallest Euclidean distance to the right boundary.

In the proposed refined K-means clustering algorithm, the cluster centers $c_l$ and $c_r$ determined by the original K-means clustering algorithm are tuned by the refined value $\delta$, while the other cluster centers remain unchanged. The refined value is computed as follows:

$$\delta = \frac{\max(S) + \min(S)}{2} \qquad (7)$$

where $\max(S)$ and $\min(S)$ represent the maximum and minimum feature values of the patterns in each learning factor, respectively.

Therefore, the new cluster centers determined by the refined K-means clustering algorithm can be represented as follows:

$$C_s^{(new)} = \{c_1, \ldots, c_l - \delta, \ldots, c_r + \delta, \ldots, c_k\} \qquad (8)$$

where $C_s^{(new)}$ is the new set of the $k$ cluster centers.
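A minimal sketch of the refinement step in Eqs. (6)-(8), layered on a plain one-dimensional K-means pass (the simple Lloyd iteration below is an assumption; the authors do not specify their K-means implementation):

```python
import random

def kmeans_1d(values, k=3, iters=50):
    """Plain 1-D K-means (Lloyd's algorithm); returns sorted cluster centers."""
    centers = sorted(random.sample(values, k))
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            buckets[nearest].append(v)
        centers = sorted(sum(b) / len(b) if b else c
                         for b, c in zip(buckets, centers))
    return centers

def refined_centers(values, k=3):
    """Shift the outermost centers outward by delta, per Eqs. (7)-(8)."""
    centers = kmeans_1d(values, k)
    delta = (max(values) + min(values)) / 2.0   # Eq. (7)
    centers[0] -= delta    # c_l, the center nearest the left boundary
    centers[-1] += delta   # c_r, the center nearest the right boundary
    return centers

random.seed(0)
sample = [min(1.0, max(0.0, random.gauss(0.5, 0.2))) for _ in range(200)]  # fake RR data
print([round(c, 3) for c in refined_centers(sample)])
```

Shifting only the outer centers widens the outer membership functions, which is what enlarges the boundary range discussed below.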



Figure 4 gives an example of how the membership functions are tuned by the refined value $\delta$ defined in the proposed refined K-means clustering algorithm. In Fig. 4, the circle and square markers in black and grey represent the cluster centers determined by the K-means and the refined K-means clustering algorithms, respectively; the dotted and solid lines represent the corresponding membership functions. Compared with the original K-means clustering algorithm, the refined K-means clustering algorithm improves the classification of boundary patterns between different classes and reduces the number of unknown patterns, because it expands the boundary range when the neuro-fuzzy classifier is employed to discover the learning performance assessment rules from the learning portfolios. The experimental results presented later confirm these benefits.

[Figure 4 is not reproduced here.]

Figure 4. Tuning membership functions by the refined value $\delta$, where the circle and square markers in black and grey represent the cluster centers determined by the K-means clustering algorithm and the refined K-means clustering algorithm, respectively, and the dotted and solid lines represent the membership functions determined by the K-means and refined K-means clustering algorithms, respectively

Neuro-Fuzzy Classifier for Mining Learning Performance Assessment Rules

After the triangular fuzzy membership functions were determined by the refined K-means clustering algorithm, the neuro-fuzzy classifier proposed by Nauck (Nauck & Kruse, 1999) was employed to infer the fuzzy rules for learning performance assessment from the learning portfolios. Figure 5 shows the learning architecture of the neuro-fuzzy classifier, which can be represented as a four-layer feed-forward network. The first layer is the input layer, consisting of input neurons, each corresponding to an input feature. The second layer is the fuzzification layer, consisting of fuzzy set neurons, each corresponding to a linguistic term such as low, moderate, or high. The third layer is the rule layer, consisting of rule neurons, each corresponding to a fuzzy rule. The fourth layer is the class layer, consisting of class neurons, each corresponding to a class label. To explain the detailed operation of the neuro-fuzzy classifier, the notation used in Fig. 5 is first defined: $x_k$ is the $k$th input feature; $\mu_{kl}$ denotes the $l$th membership function of the $k$th input feature; $Rule_j$ is the $j$th fuzzy rule; $Class_i$ is the $i$th output class; $ar_j$ denotes the activation strength of the $j$th neuron in the rule layer; and $ac_i$ denotes the activation strength of the $i$th neuron in the class layer. The input layer only receives input features from the gathered learning portfolios and passes them directly to the fuzzification layer, which computes the fired membership degrees. Each fuzzy set neuron in the fuzzification layer can connect to several rule neurons, and each rule neuron connects to exactly one class neuron, but each class neuron can be connected to by several rule neurons, since different antecedent parts of fuzzy rules can lead to the same consequent part.



[Figure 5 is not reproduced here.]

Figure 5. Four-layer learning architecture of the employed neuro-fuzzy classifier

In this study, the employed neuro-fuzzy classifier follows six steps to generate fuzzy rules for learning performance assessment, summarized as follows:

Step 1. Compute the activation strength of each fuzzy neuron in the rule layer using the minimum operator, formulated as follows:

$$ar_j = \min_{k=1,2,\ldots,n} \left\{ u_k^{(j)} \right\} \qquad (9)$$

where $ar_j$ is the activation strength of the antecedent part of the $j$th rule neuron, $u_k^{(j)}$ represents the membership degree of the $j$th rule fired by the $k$th input feature, computed as $u_k^{(j)} = \max\{u_{k1}, u_{k2}, \ldots, u_{km}\}$, and $m$ is the number of defined linguistic terms of each input feature.

Step 2. Compute the activation strength of each class neuron in the class layer using the maximum operator, formulated as follows:

$$ac_i = \max_{j=1,2,\ldots,z_i} \left\{ ar_j^{(i)} \right\} \qquad (10)$$

where $ac_i$ is the activation strength of the consequent part of the $i$th class neuron, $ar_j^{(i)}$ represents the activation strength of the $i$th class neuron fired by the $j$th rule, drawn from the set of rule-neuron activation strengths $\{ar_1, ar_2, \ldots, ar_j, \ldots, ar_r\}$, and $z_i$ is the number of fired fuzzy rules of the $i$th class neuron, with $z_i \le r$.

Step 3. Compute the corresponding class register values by summing, for each fuzzy rule, the activation strengths fired by all training patterns, formulated as follows:

$$t_i(j) = \sum_{p=1}^{y} ac_i^{(j)}(X_p), \qquad i = 1, 2, \ldots, c, \; j = 1, 2, \ldots, r \qquad (11)$$

where $t_i(j)$ is the class register value of the $i$th class for the $j$th rule, $ac_i^{(j)}(X_p)$ is the activation strength of the $i$th class neuron fired by the $j$th rule for the $p$th training pattern, and $y$ is the total number of training patterns.

τ j ) = arg max{<br />

t ( j )}<br />

(12)<br />

( i<br />

i= 1,<br />

2,...,<br />

c<br />

where τ ( j ) represents the class label with largest class register value for the<br />

Step 5. Evaluating the performance for each fuzzy rule, and formulated as follows:<br />

where p ( j ) represents the performance of the<br />

th<br />

register value for the j fuzzy rule.<br />

p(<br />

j )<br />

= t<br />

τ(<br />

j )<br />

(<br />

j )<br />

−<br />

i=<br />

1,<br />

2<br />

th<br />

j fuzzy rule, ( j )<br />

∑ti( ,..., c,<br />

i≠<br />

τ<br />

th<br />

j rule.<br />

j )<br />

(13)<br />

τ is the class label with the largest class<br />

Step 6. Collect the fuzzy rule with the best performance in each class as the learning performance assessment rules.
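Read together, Steps 1-6 amount to the following compressed Python sketch (an illustration of Eqs. (9) and (11)-(13), not the Nauck & Kruse implementation; Step 2's maximum operator, Eq. (10), is used when classifying rather than when mining, and the data layout of per-feature linguistic-term degrees is an assumption):

```python
from collections import defaultdict

def fired_rule(pattern):
    """Step 1: each feature fires its strongest linguistic term; the rule is
    that tuple of terms, and its activation strength ar is the minimum fired
    degree (Eq. 9, with u_k taken as the max degree over a feature's terms)."""
    terms = tuple(max(range(len(f)), key=f.__getitem__) for f in pattern)
    ar = min(max(f) for f in pattern)
    return terms, ar

def mine_rules(patterns, labels, n_classes):
    """Steps 3-6: accumulate class register values t_i(j) (Eq. 11), pick each
    rule's winning class (Eq. 12), score it (Eq. 13), keep the best per class."""
    register = defaultdict(lambda: [0.0] * n_classes)
    for pattern, label in zip(patterns, labels):
        terms, ar = fired_rule(pattern)
        register[terms][label] += ar
    best = {}
    for terms, t in register.items():
        winner = max(range(n_classes), key=t.__getitem__)                  # Eq. (12)
        perf = t[winner] - sum(v for i, v in enumerate(t) if i != winner)  # Eq. (13)
        if winner not in best or perf > best[winner][1]:
            best[winner] = (terms, perf)
    return best

# Two toy patterns over two features, each with (low, moderate, high) degrees
patterns = [([0.1, 0.8, 0.1], [0.7, 0.3, 0.0]),
            ([0.0, 0.2, 0.8], [0.1, 0.6, 0.3])]
labels = [1, 2]   # grade-level class indices
print(mine_rules(patterns, labels, n_classes=3))
```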

Feature Reduction Scheme for Simplifying the Learning Performance Assessment Rules

To generate more efficient and even more accurate fuzzy rules, the original fuzzy rules discovered by the neuro-fuzzy classifier can be refined by the proposed feature reduction scheme. The feature reduction process assumes that the original feature set contains sufficient relevant features to discriminate clearly between categories, and that some irrelevant features can be eliminated to improve efficiency and even accuracy. Generally, eliminating redundancy improves efficiency without losing accuracy, and eliminating noise improves both efficiency and accuracy (Chakraborty & Pal, 2004). In particular, the antecedent parts of fuzzy rules discovered by the neuro-fuzzy classifier always contain all features, so the main features relevant to classification accuracy cannot be revealed clearly and inference efficiency is degraded. More significantly, antecedent parts containing all features lead to overly strict firing conditions, so that many patterns cannot be inferred; such patterns are called unknown patterns herein. The number of unknown patterns obviously reduces the classification accuracy rate. Moreover, simplified fuzzy rules for learning performance assessment are more easily interpreted by teachers, helping them tune their teaching strategies. For these reasons, this study proposes a feature reduction scheme to simplify the discovered fuzzy rules. To obtain the simplified fuzzy rules, feature reduction is applied only to the fuzzy rule with the best performance in each class. It is then performed, based on the proposed performance evaluation method, over all feature combinations of the antecedent parts contained in each discovered fuzzy rule.

Figure 6 shows that the linguistic terms used in a fuzzy rule usually have overlapping intervals between two neighboring linguistic terms, such as the intervals $I_1$, $I_2$, $I_3$, and $I_4$. In these overlapping intervals, a training pattern whose feature value falls in the overlapping interval simultaneously fires the antecedent parts of two linguistic terms defined on the feature dimension, and the fired strengths of the two terms differ. For example, the membership degree of the linguistic term "Small" fired by a training pattern with a feature value in interval $I_1$ is larger than that of the linguistic term "Moderate"; in contrast, the membership degree of "Small" fired by a training pattern with a feature value in interval $I_2$ is smaller than that of "Moderate". Based on this observation, an excellent feature combination in a discovered fuzzy rule should make the membership degree of the fired linguistic term as large as possible, and the membership degrees of the other, non-fired linguistic terms as small as possible, when the rule is fired by a training pattern. Herein, the fired linguistic term is the linguistic term with the largest membership degree fired by a training pattern, and the other linguistic terms are viewed as the non-fired linguistic terms. This study names the difference between the membership degree of the fired linguistic term and the summation of the membership degrees of the non-fired linguistic terms the match degree. To develop the proposed feature reduction scheme, the match degree is first defined as follows:

$$u_j^l(i) = u_{j\lambda}^l(i) - \sum_{k=1,\, k \ne \lambda}^{n} u_{jk}^l(i) \qquad (14)$$

where $u_j^l(i)$ represents the match degree of the $j$th feature of the $l$th fuzzy rule fired by the $i$th training pattern, $u_{jk}^l(i)$ is the membership degree of the $k$th linguistic term of the $j$th feature of the $l$th fuzzy rule fired by the $i$th training pattern, $n$ represents the number of defined linguistic terms in each fuzzy rule, and $\lambda$ indicates the linguistic term with the largest membership degree fired by the $i$th training pattern.

After the match degree is obtained, the performance of a discovered fuzzy rule with the considered feature combination can be measured as follows:

$$v_q^l = \sum_{i=1}^{t} \sum_{j=1}^{\varphi} u_j^l(i) \qquad (15)$$

where $v_q^l$ represents the performance of the $l$th fuzzy rule under the $q$th feature combination, $t$ is the number of training patterns, $\varphi$ is the number of considered features, and $u_j^l(i)$ is the match degree (Eq. (14)) of the $j$th feature of the $l$th fuzzy rule fired by the $i$th training pattern.

Moreover, the total number of feature combinations for a discovered fuzzy rule can be computed as follows:

$$g = \sum_{j=1}^{p} \frac{p!}{j!\,(p - j)!} \qquad (16)$$

where $g$ is the total number of feature combinations for a discovered fuzzy rule, $p$ is the total number of feature dimensions, and $j$ is the number of considered features.
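The search over combinations can be sketched with the standard library: the match degree of Eq. (14), the performance index of Eq. (15), and the $2^p - 1$ combinations counted by Eq. (16) enumerated with itertools (the rule-firing data below is hypothetical):

```python
from itertools import combinations

def match_degree(term_degrees):
    """Eq. (14): fired-term degree minus the sum of the non-fired-term degrees."""
    fired = max(range(len(term_degrees)), key=term_degrees.__getitem__)
    return term_degrees[fired] - sum(d for k, d in enumerate(term_degrees)
                                     if k != fired)

def rule_performance(feature_subset, fired_patterns):
    """Eq. (15): sum match degrees over the chosen features and all firing patterns."""
    return sum(match_degree(pattern[j])
               for pattern in fired_patterns for j in feature_subset)

# Per-feature (low, moderate, high) degrees for two patterns firing one rule
fired_patterns = [([0.8, 0.2, 0.0], [0.1, 0.7, 0.2], [0.6, 0.4, 0.0]),
                  ([0.9, 0.1, 0.0], [0.0, 0.8, 0.2], [0.5, 0.5, 0.0])]
p = 3  # feature dimensions; Eq. (16) gives 2**p - 1 = 7 combinations
ranked = sorted(((combo, rule_performance(combo, fired_patterns))
                 for j in range(1, p + 1)
                 for combo in combinations(range(p), j)),
                key=lambda cp: cp[1], reverse=True)
print(ranked[0])   # the feature combination with the best performance index
```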

[Figure 6 is not reproduced here.]

Figure 6. The overlapped regions between various linguistic terms, where $u_{jk}^l(i)$ is the membership degree of the $k$th linguistic term of the $j$th feature of the $l$th fuzzy rule fired by the $i$th training pattern, and $x_j^l(i)$ is the feature value of the $i$th training pattern corresponding to the $j$th feature of the $l$th fuzzy rule

Fuzzy Inference by the Discovered Fuzzy Rules for the Learning Performance Assessment

This section explains how the learning performance is inferred from the discovered fuzzy rules by fuzzy inference. The discovered fuzzy production rules take the IF-THEN form:

IF $X_1 = A_1$ and $X_2 = A_2$ THEN $Y = B$



where $X_i$ and $Y$ denote linguistic variables, and $A_i$ and $B$ represent linguistic terms.

A defuzzification strategy converts the outcome of fuzzy inference into a crisp class label. In this study, the maximum operator (Lin & Lee, 1996) is employed as the defuzzification scheme to infer the crisp class label from the fired membership degrees. Restated, the result of the learning performance assessment is the class label with the largest membership degree fired by a pattern among the discovered fuzzy rules.
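A sketch of this inference step under maximum defuzzification, using single-factor rules of the kind obtained after feature reduction later in this paper (Table 13) and the revised centers of Table 8; the rule encoding and portfolio fields are illustrative assumptions:

```python
def tri_degrees(x, c1, c2, c3):
    """(low, moderate, high) degrees per Eqs. (3)-(5); here they sum to 1."""
    if x <= c1:   low = 1.0
    elif x < c2:  low = (c2 - x) / (c2 - c1)
    else:         low = 0.0
    if x >= c3:   high = 1.0
    elif x > c2:  high = (x - c2) / (c3 - c2)
    else:         high = 0.0
    return low, 1.0 - low - high, high

# (factor, index of fired term, class label), mirroring Table 13's rules
RULES = [("LA", 0, "Low"), ("RT", 1, "Moderate"), ("RR", 2, "High")]
CENTERS = {"RR": (-0.26, 0.44, 1.13),            # revised centers from Table 8
           "RT": (-2414.76, 1585.02, 6438.04),
           "LA": (0.10, 1.58, 2.77)}

def assess(portfolio):
    """Maximum defuzzification: the rule with the largest fired degree wins."""
    fired = [(tri_degrees(portfolio[f], *CENTERS[f])[term], label)
             for f, term, label in RULES]
    return max(fired)[1]

print(assess({"RR": 0.630, "RT": 2212, "LA": 1.68}))   # learner T1 of Table 4
```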

Experiments

The personalized e-learning system (PELS) was published on the web site http://192.192.6.86/irt4 to provide personalized e-learning services and to assess the learning performance of individual learners from their web-based learning portfolios. To verify the evaluation quality of the discovered fuzzy rules for learning performance assessment, third-year students of Jee-May Elementary School (http://www.jmes.tpc.edu.tw/), who were studying the "Fractions" course unit of elementary school mathematics, were invited to test the system. The experimental results are described as follows.

The Format of Learning Portfolio Gathered by PELS

The "Fractions" unit currently includes 17 course materials of various difficulty levels, each conveying similar concepts. The system was used by 583 learners from 18 classes of Taipei County Jee-May Elementary School (http://www.jmes.tpc.edu.tw/). Among the eight gathered learning factors, the final test score G was obtained through the summative, fixed-length testing examination administered after the entire learning process was completed. Table 4 shows the format of part of the learning portfolios in the user profile database gathered by PELS.

Table 4. The format of the learning portfolio in the user profile database

Learning record | G (0~100) | RR (0~1) | RT (sec) | LA (-3~+3) | CR (0~1) | PN (piece) | AS (point) | EL (0~1)
T1 | 83.24 | 0.630 | 2212 | 1.68 | 0.815 | 0 | 41 | 1.00
T2 | 74.17 | 0.600 | 1413 | 0.26 | 0.533 | 0 | 42 | 1.00
... | ... | ... | ... | ... | ... | ... | ... | ...
T583 | 84.81 | 0.436 | 3747 | 1.36 | 0.846 | 1 | 27 | 0.95

Evaluating Accuracy Rate of Learning Performance Assessment

Evaluation Method

To measure the accuracy rate of learning performance assessment for the proposed method, the score-level method was used to evaluate the accuracy of the predicted learning performance. In this evaluation method, each learner is assigned one of three score levels according to the membership degrees of learning factor G. For example, a learner with a final test score of 86.16 is assigned the moderate level if the linguistic term of the moderate level has the largest membership degree among all linguistic terms of learning factor G.
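As a quick check of this mapping, with the revised centers for G reported below in Table 8 (69.45, 87.25, 102.94), the score 86.16 indeed fires "moderate" most strongly:

```python
c1, c2, c3 = 69.45, 87.25, 102.94   # revised centers for factor G (Table 8)
x = 86.16
low = max(0.0, min(1.0, (c2 - x) / (c2 - c1)))
high = max(0.0, min(1.0, (x - c2) / (c3 - c2)))
moderate = 1.0 - low - high         # the three triangles partition unity here
print(max(("low", low), ("moderate", moderate), ("high", high),
          key=lambda t: t[1]))      # -> ('moderate', ~0.94)
```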

Experimental Results

In our experiments, half of the learning portfolios of the 583 learners, selected randomly, were used as training data, and the remaining learning portfolios were used as testing data. Table 5 lists the number of learning records in each grade level. To obtain simple and interpretable fuzzy rules for learning performance assessment, the number of cluster centers in the K-means algorithm was set to 3; that is, each learning factor contains three linguistic terms, named "low", "moderate", and "high" herein, to describe a fuzzy rule. Table 6 shows the centers of the linguistic terms for the eight learning factors determined by the K-means algorithm. To obtain more accurate fuzzy rules for learning performance assessment, the proposed refined K-means algorithm was employed to tune the centers of the linguistic terms listed in Table 6. Tables 7 and 8 display, respectively, the revised amount of the cluster centers for each learning factor and the revised centers of the linguistic terms of the eight learning factors obtained by the refined K-means algorithm. Figure 7 shows the membership functions of the triangular fuzzy sets for the eight learning factors determined by the revised centers. Basically, the left and right widths of the triangular membership function of the linguistic term "moderate" for each learning factor are determined by the differences between its center and the left and right neighboring centers, respectively, while the widths of the triangular membership functions of the linguistic terms "low" and "high" are determined by the difference between their centers and the neighboring center on one side and by the boundary of the learning factor on the other.

Table 5. The number of learning records in each grade level

Data set | Low | Moderate | High | Total number of patterns
Training data set | 51 | 194 | 47 | 292
Testing data set | 65 | 188 | 38 | 291
Whole data set | 116 | 382 | 85 | 583

Table 6. The determined centers of the linguistic terms for eight learning factors by the K-means algorithm

Learning factor | G | RR | RT | LA | CR | PN | AS | EL
Center of "low" | 81.95 | 0.25 | 739.24 | 1.25 | 0.77 | 1.52 | 28.38 | 0.74
Center of "moderate" | 87.25 | 0.44 | 1585.02 | 1.58 | 0.87 | 2.61 | 53.88 | 1.05
Center of "high" | 90.44 | 0.63 | 3284.04 | 1.62 | 0.91 | 5.08 | 90.12 | 1.08

Table 7. The revised amount of cluster centers for each learning factor

Learning factor | G | RR | RT | LA | CR | PN | AS | EL
Revised amount δ | 12.5 | 0.5 | 3154 | 1.15 | 0.5 | 13 | 175 | 1.87

Table 8. The revised centers of the linguistic terms of eight learning factors by the refined K-means algorithm

Learning factor | G | RR | RT | LA | CR | PN | AS | EL
Revised center of "low" | 69.45 | -0.26 | -2414.76 | 0.10 | 0.27 | -11.48 | -146.62 | -1.13
Revised center of "moderate" | 87.25 | 0.44 | 1585.02 | 1.58 | 0.87 | 2.61 | 53.88 | 1.05
Revised center of "high" | 102.94 | 1.13 | 6438.04 | 2.77 | 1.41 | 18.80 | 256.12 | 2.95

Next, the neuro-fuzzy network, with the refined K-means algorithm logically determining its membership functions, was employed to discover the formative evaluation rules from the learning portfolios of the training data set. A total of 24 fuzzy rules were discovered by the employed neuro-fuzzy network in our experiment: 10, 9, and 5 rules for assessing the low, moderate, and high grade levels, respectively. Table 9 shows the three fuzzy rules with the best performance for assessing the three grade levels based on the gathered learning portfolios. In our experiment, ten independent runs were performed to yield an average performance for the training and testing data sets. Table 10 displays the averaged accuracy rate of learning performance assessment evaluated by the discovered rules listed in Table 9. The averaged accuracy rates for the training and testing data sets are 74.3% and 71.1%, respectively.

[Figure 7, plots of the triangular membership functions of the eight learning factors (G, RR, RT, LA, CR, PN, AS, EL) built on the revised centers of Table 8, is not reproduced here.]

Figure 7. The membership functions of learning factors determined by the refined K-means algorithm

Table 9. The learning performance assessment fuzzy rules discovered by the neuro-fuzzy network with the refined K-means algorithm for determining fuzzy membership functions

The discovered rule | RR | RT | LA | CR | PN | AS | EL | G (consequent)
Rule 1 | M | M | L | L | M | M | M | Low
Rule 2 | M | M | M | M | M | M | M | Moderate
Rule 3 | H | M | M | M | M | M | M | High

Table 10. The accuracy rate of learning performance assessment evaluated by the discovered rules listed in Table 9

Item | Training data set | Testing data set
Number of patterns | 292 | 291
Number of the discovered rules | 3 | 3
Number of the used fuzzy sets in the discovered rules | 10 | 10
Number of patterns with correct learning performance assessment | 217 | 207
Number of patterns with incorrect learning performance assessment | 66 | 73
Number of patterns with unknown learning performance assessment | 9 | 11
Averaged accuracy rate of learning performance assessment | 0.743 | 0.711

Table 11. All combinations of seven learning factors in the antecedent parts of the discovered fuzzy rules evaluated in various training cycles

Training cycle | The combined learning factors
1~7 | x1 ~ x7
8 | x1, x2
9 | x1, x3
10 | x1, x4
... | ...
125 | x1, x3, x4, x5, x6, x7
126 | x2, x3, x4, x5, x6, x7
127 | x1, x2, x3, x4, x5, x6, x7

[Figure 8, a plot of the performance index (about -10 to 90) against the training cycle (0 to 140), is not reproduced here.]

Figure 8. The performance evaluation plot of the discovered rule 1 for assessing the low score level

Although the neuro-fuzzy classifier with the refined K-means algorithm can discover few fuzzy rules that accurately assess the learning performance of an individual learner, the obtained fuzzy rules always contain all learning factors, so the main learning factors affecting learning performance cannot be revealed clearly. To simplify the obtained fuzzy rules, the feature reduction scheme was employed to further reduce the fuzzy rules of learning performance assessment. In the proposed feature reduction scheme, the performance index defined in Eq. (15), computed for all combinations of learning factors in the antecedent parts of the discovered rules, is used to find the main learning factors affecting learning performance. Table 11 lists all combinations of the seven learning factors contained in the antecedent parts of the discovered fuzzy rules, where $x_i$ represents the $i$th considered learning factor; their performances were evaluated in successive training cycles. Figures 8, 9 and 10 show the performance evaluation plots of the discovered rules 1, 2 and 3 for assessing the low, moderate and high score levels, respectively. Table 12 shows the six learning factor combinations with the best performance for each of the three discovered fuzzy rules. In particular, the learning factor combinations with the best performance index for the three discovered fuzzy rules are $x_3$, $x_2$ and $x_1$, respectively. This result indicates that the third learning factor of the discovered fuzzy rule 1, i.e., learner ability (LA), is the main learning factor for assessing learners at the low score level. Similarly, the results indicate that the second and first learning factors are the main learning factors for the discovered fuzzy rules 2 and 3, respectively. Table 13 shows the simplified learning performance assessment rules after performing the proposed feature reduction scheme. Table 14 displays the accuracy rate of learning performance assessment obtained with these simplified fuzzy rules.



Compared with the results of learning performance assessment listed in Table 10, the accuracy rate of learning performance assessment using the simplified fuzzy rules is only slightly lower than that of the discovered fuzzy rules without feature reduction. However, the simplified fuzzy rules benefit teachers by making it easy to understand the main learning factors that affect learning performance for learners at various score levels (Paiva & Dourado, 2004; Mikut, Jakel & Groll, 2005). Moreover, this idea can be beneficially applied to immediately examine the learning progress of learners and to interactively control learning in web-based learning systems. Meanwhile, the simplified fuzzy rules clearly improve the inference efficiency of learning performance assessment in web-based learning systems (Paiva & Dourado, 2004; Mikut, Jakel & Groll, 2005).

[Figure 9, a plot of the performance index (about -10 to 20) against the training cycle (0 to 140), is not reproduced here.]

Figure 9. The performance evaluation plot of the discovered rule 2 for assessing the moderate score level

[Figure 10, a plot of the performance index (about -10 to 30) against the training cycle (0 to 140), is not reproduced here.]

Figure 10. The performance evaluation plot of the discovered rule 3 for assessing the high score level

Table 12. The top six combined learning factors with excellent performance of learning performance assessment for three discovered rules

The discovered rule 1:
Training cycle | The combined learning factors | Performance index
3 | x3 | 82.71
19 | x3, x4 | 62.98
14 | x2, x3 | 49.75
44 | x2, x3, x4 | 47.58
9 | x1, x3 | 47.57
21 | x3, x6 | 46.59

The discovered rule 2:
Training cycle | The combined learning factors | Performance index
2 | x2 | 16.78
15 | x2, x4 | 16.29
4 | x4 | 15.79
30 | x1, x2, x4 | 15.00
8 | x1, x2 | 14.61
49 | x2, x4, x6 | 14.34

The discovered rule 3:
Training cycle | The combined learning factors | Performance index
1 | x1 | 27.50
8 | x1, x2 | 22.14
10 | x1, x4 | 21.65
30 | x1, x2, x4 | 20.02
12 | x1, x6 | 18.98
32 | x1, x2, x7 | 18.25



Table 13. The discovered learning performance assessment rules after performing feature reduction

The discovered rule | RR | RT | LA | G (consequent)
Rule 1 | - | - | L | Low
Rule 2 | - | M | - | Moderate
Rule 3 | H | - | - | High

Table 14. The accuracy rate of learning performance assessment by the simplified fuzzy rules listed in Table 13

Item | Training data set | Testing data set
Number of patterns | 292 | 291
Number of the discovered rules | 3 | 3
Number of the used fuzzy sets in the discovered rules | 3 | 3
Number of patterns with correct learning performance assessment | 212 | 204
Number of patterns with incorrect learning performance assessment | 80 | 87
Number of patterns with unknown learning performance assessment | 0 | 0
Accuracy rate of learning performance assessment | 0.726 | 0.701

Conclusion

This study presents an effective learning performance assessment approach that combines a neuro-fuzzy network with a refined K-means algorithm for systematically determining membership functions, together with the proposed feature reduction scheme, to discover a small set of simplified fuzzy rules for evaluating the learning performance of individual learners based on their gathered learning portfolios. The proposed method can help teachers perform precise formative assessment according to the web-based learning portfolios of individual learners in a web-based learning environment. The inferred learning performance can be applied as a reference guide for teachers and as learning feedback for learners. Such a feedback mechanism enables learners to understand their current learning status and make suitable learning adjustments. Additionally, teachers can determine the main factors affecting learning performance in a web-based learning environment according to the interpretable learning performance assessment rules obtained, and can use these factors to tune their teaching strategies. Meanwhile, teachers can devote themselves to teaching, since they save significant time in performing learning evaluation.
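To make the final inference step concrete, the following minimal Python sketch evaluates a normalized learning-factor value against three simplified rules of the kind listed in Table 13. This is only an illustrative sketch: the triangular membership shapes, the breakpoints, and all names are our own assumptions for exposition, not the membership functions actually learned by the proposed neuro-fuzzy network.

# A minimal sketch of inference with simplified fuzzy assessment rules in the
# spirit of Table 13. Membership shapes and breakpoints are illustrative
# assumptions, not the values learned by the proposed system.

def triangular(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# One fuzzy set per simplified rule: Low, Moderate, High.
FUZZY_SETS = {
    "Low":      lambda x: triangular(x, -0.5, 0.0, 0.5),
    "Moderate": lambda x: triangular(x,  0.0, 0.5, 1.0),
    "High":     lambda x: triangular(x,  0.5, 1.0, 1.5),
}

def assess(score):
    """Return the consequent of the rule whose antecedent fires strongest."""
    memberships = {label: mf(score) for label, mf in FUZZY_SETS.items()}
    return max(memberships, key=memberships.get)

if __name__ == "__main__":
    for s in (0.2, 0.5, 0.9):  # normalized learning-factor values
        print(s, "->", assess(s))  # -> Low, Moderate, High

Selecting the rule with the maximal membership degree means that every pattern receives exactly one assessment, which is consistent with the zero counts of patterns with unknown assessment in Table 14.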

Acknowledgment

The authors would like to thank the National Science Council of the Republic of China, Taiwan, for financially supporting this research under Contract No. NSC94-2520-S-026-002.

References

Baker, F. B. (1996). Item Response Theory: Parameter Estimation Techniques, New York: Marcel Dekker.

Bruner, J. S. (1996). Toward a Theory of Instruction, Cambridge, MA, USA: Harvard University Press.

Campbell, D., Nettles, D., & Melenyzer, B. (2000). Portfolio and Performance Assessment in Teacher Education, Boston: Allyn and Bacon.

Carreira, R., & Crato, J. M. (2004). Evaluating Adaptive User Profiles for News Classification. International Conference on Intelligent User Interfaces, January 13-16, 2004, 206-212.

Chakraborty, D., & Pal, N. R. (2004). A Neuro-Fuzzy Scheme for Simultaneous Feature Selection and Fuzzy Rule-Based Classification. IEEE Transactions on Neural Networks, 15, 110-123.

Chang, C.-C. (2002). Building a Web-Based Learning Portfolio for Authentic Assessment. Proceedings of the International Conference on Computers in Education (Vol. 1), 129-133.

Chen, C.-M., Lee, H.-M., & Chen, Y.-H. (2005). Personalized E-Learning System Using Item Response Theory. Computers & Education, 44 (3), 237-255.

Chen, C.-M., Liu, C.-Y., & Chang, M.-H. (2006). Personalized Curriculum Sequencing Using Modified Item Response Theory for Web-based Instruction. Expert Systems with Applications, 30 (2), 378-396.

Dunham, M. H. (2002). Data Mining: Introductory and Advanced Topics, Prentice Hall.

Gagné, R. (1997). The Conditions of Learning and Theory of Instruction, New York: Holt, Rinehart & Winston.

Lankes, A. M. D. (1995). Electronic Portfolios: A New Idea in Assessment (ERIC Digest EDO-IR-95-9), retrieved April 11, 2006 from http://searcheric.org/digests/ed390377.html.

Lin, C.-T., & Lee, C. S. G. (1996). Neural Fuzzy Systems: A Neuro-Fuzzy Synergism to Intelligent Systems, Prentice-Hall.

Mikut, R., Jakel, J., & Groll, L. (2005). Interpretability Issues in Data-Based Learning of Fuzzy Systems. Fuzzy Sets and Systems, 150 (2), 179-197.

Nauck, D., & Kruse, R. (1999). Obtaining Interpretable Fuzzy Classification Rules from Medical Data. Artificial Intelligence in Medicine, 16, 149-169.

Paiva, R. P., & Dourado, A. (2004). Interpretability and Learning in Neuro-Fuzzy Systems. Fuzzy Sets and Systems, 147 (1), 17-38.

Rahkila, M., & Karjalainen, M. (1999). Evaluation of Learning in Computer Based Education Using Log Systems. 29th ASEE/IEEE Frontiers in Education Conference, 1, 12A3/16-12A3/21.

Rasmussen, K., Northrup, P., & Lee, R. (1997). Implementing Web-based Instruction. In Khan, B. H. (Ed.), Web-Based Instruction, Englewood Cliffs, NJ: Educational Technology, 341-346.

Torrance, H., & Pryor, J. (1998). Investigating Formative Assessment: Teaching, Learning and Assessment in the Classroom, Buckingham: Open University Press.

Wang, W., Weng, J.-F., Su, J.-M., & Tseng, S.-S. (2003). Learning Portfolio Analysis and Mining in SCORM Compliant Environment. The 34th ASEE/IEEE Frontiers in Education Conference, TC2-17-TC2-34.

Wu, A. K. W., & Leung, C. H. (2002). Evaluating Learning Behavior of Web-Based Training Using Web Log. Proceedings of the International Conference on Computers in Education, 736-737.

Xu, R., & Wunsch, D. II (2005). Survey of Clustering Algorithms. IEEE Transactions on Neural Networks, 16 (3), 645-678.

Zaiane, O. R., & Luo, J. (2001). Towards Evaluating Learners' Behavior in a Web-based Distance Learning Environment. IEEE International Conference on Advanced Learning Technologies, 357-360.



Morimoto, Y., Ueno, M., Kikukawa, I., Yokoyama, S., & Miyadera, Y. (2006). Formal Method of Description Supporting Portfolio Assessment. Educational Technology & Society, 9 (3), 88-99.

Formal Method of Description Supporting Portfolio Assessment

Yasuhiko Morimoto
Fuji Tokoha University, 325 Obuchi, Fuji, Shizuoka 417-0801, Japan
morimoto@mis.nagaokaut.ac.jp

Maomi Ueno
Nagaoka University of Technology, 1603-1 Kamitomioka, Nagaoka, Niigata 940-2188, Japan
ueno@kjs.nagaokaut.ac.jp

Isao Kikukawa
Tokyo Polytechnic University, 1583 Iiyama, Atsugi, Kanagawa 243-0297, Japan
kikukawa@cc.t-kougei.ac.jp

Setsuo Yokoyama and Youzou Miyadera
Tokyo Gakugei University, 4-1-1 Nukuikita, Koganei, Tokyo 184-8501, Japan
yokoyama@u-gakugei.ac.jp
miyadera@u-gakugei.ac.jp

ABSTRACT

Teachers need to assess learner portfolios in the field of education. However, they need support in deciding, during both design and practice, which portfolios are to be assessed. To solve this problem, a formal method was developed for describing the relations between lesson forms and the portfolios that need to be collected, and the relations between practices and these collected portfolios. These relations are indispensable in portfolio assessment. A support system based on the formal method was also developed. As the formal method of description can precisely and consistently describe these relations, the system makes it possible to support the assessment of portfolios in the design and practice phases.

Keywords

Portfolio assessment, Formal description, Support system, Assessment tool, Educational evaluation

Introduction

Although traditional methods of evaluation based on tests have improved in the field of education, portfolio assessment has attracted attention as a more authentic means of evaluating learning directly. The use of Electronic Portfolios (Digital Portfolios), in which the portfolio is saved electronically, is spreading with computerized learning systems (Mochizuki et al., 2003).

However, portfolio assessment has not been carried out adequately in the educational field because of two problems, and teachers and learners both need support (Morimoto et al., 2005).

Problem 1: When teachers decide to use portfolios for lessons that they intend to assess, they do not know what kinds of portfolios need to be collected. In other words, they need support in the design phase of portfolio assessment.

Problem 2: Learners do not know which collected portfolios should be used for practices. For example, when a learner carries out a self-assessment, he or she does not know which collected portfolios should be checked while doing so. In other words, learners need support on which portfolios to use in the practice phase.

Support for portfolio assessment is currently carried out with tools that handle electronic portfolios (Chang, 2001; Chen et al., 2000; Fukunaga et al., 2001). However, these tools cannot support teachers in selecting which portfolios to assess. Moreover, teachers have to match the type of collected portfolio and the method of practice to the tools being used, because these tools individually determine the types of collected portfolios and the practice method. Therefore, support of portfolio assessment in the lesson that the teacher intends to assess cannot be expected. The relation between practices and collected portfolios is not clear, and practices are not supported. These tools cannot solve Problems (1) or (2).

The purpose of this study is to achieve support for portfolio assessment at the design and practice phases, and thus to solve these problems at the same time. We therefore developed a formal method of describing the relations among elements that are indispensable in assessing portfolios, and a support system based on this method. As the formal method of description can precisely and consistently describe the relations between the lesson forms and portfolios that need to be collected and the relations between practices and collected portfolios, the system makes it possible to support the assessment of portfolios in the design and practice phases.

First, we outline the support of portfolio assessment that this study is aimed at. Second, we discuss the relations between elements indispensable to the support of portfolio assessment; these relations become the framework for the formal description method that we developed. Third, we explain the formal description method. Finally, we discuss the portfolio assessment support system (PASS), which is based on this method.

Support of portfolio assessment in this study

Requirements for solving problems

To solve Problem (1) in the design phase, we have to extract and clarify the relations between lesson forms and portfolios that need to be collected, and support portfolio-assessment design based on these relations. Moreover, to solve Problem (2) in the practice phase, we have to extract and clarify the relations between practices and collected portfolios, and support users' portfolio-assessment practices based on these relations. Here, a person who carries out practices is called a “user”; a user may be a learner, a teacher, or another person.

To achieve these two phases of support, we need to faithfully express the extracted relations. If a formal method that precisely and consistently describes the extracted relations is developed, a system becomes possible that systematically supports portfolio assessment based on the formal method. Therefore, the requirements for solving Problems (1) and (2) are as follows.

Requirement (a): Extract and clarify the relations between the lesson forms and the portfolios that need to be collected, and develop a formal method of describing these relations (corresponds to Problem (1)).

Requirement (b): Extract and clarify the relations between a user's practices and the collected portfolios that a user needs in order to carry out practices, and develop a formal method of describing these relations (corresponds to Problem (2)).

Requirement (c): Develop a system based on the formal methods of description developed in (a) and (b) (corresponds to Problems (1) and (2)).

Blueprints to support portfolio assessment

It is possible to support teachers' portfolio-assessment designs and users' practices by formally describing the relations between the lesson forms and portfolios that need to be collected (PDS: Portfolio assessment Design Semantics), which satisfies Requirement (a), and the relations between practices and collected portfolios (PPS: Portfolio assessment Practice Semantics), which satisfies Requirement (b), and by developing a system based on these relations (see Figures 1 and 2).

Figure 1 outlines a model illustrating how a teacher uses lesson forms to check PDS to select portfolios, and how he or she uses portfolios to check PDS to select lesson forms, in the design phase. When the teacher specifies the lesson form that he or she intends to teach, the system checks PDS and dynamically presents the portfolios that need to be collected according to this form. Moreover, when a teacher first specifies the portfolios that users need to collect, the system presents the lesson forms conforming to those portfolios according to PDS. Thus, both lesson forms and portfolios are supported using PDS, and teachers can design portfolio assessment more easily. Both Requirements (a) and (c) are satisfied, and a solution to Problem (1) is expected.

Figure 1. Model supporting teacher's portfolio-assessment design


Figure 2 outlines a model illustrating how a user uses practices to check PPS to select portfolios, and how a user uses portfolios to check PPS to select practices, in the practice phase. When a user specifies the practice that he or she wants to carry out, the system checks PPS and presents the collected portfolios that the user needs for the practice. The user can therefore practice effectively by checking the necessary collected portfolios. Moreover, when the user specifies a collected portfolio, the system dynamically supports practice that fits the portfolio according to PPS. Thus, both practices and portfolios are supported using PPS. Here, both Requirements (b) and (c) are satisfied, and a solution to Problem (2) is expected.

Figure 2. Model supporting user's portfolio-assessment practices

Advantages of formal description

We want to describe the PDS and PPS by using a formal method. The advantages of formally describing PDS and PPS are as follows:
• The relations between elements indispensable to the support of portfolio assessment can be precisely described, removing contradictions and vagueness;
• The system developer can build a support system that works systematically according to the formally described semantics, by constructing a system that interprets the framework of the formal description method and operates based on it;
• As a result, a portfolio assessment support system that provides adaptive support according to the lesson form or a user's practices can be achieved; and
• Changing the semantics of the framework with the formal description rules can easily change the support methods of the developed system.

Extracting relations to support portfolio assessment

Outline

In this section, we discuss our extraction of the relations between elements indispensable for supporting portfolio assessment, which became the framework for the formal description method developed in this study. PDS and PPS were extracted.

It is necessary to extensively investigate actual portfolios to assess them. However, the ideas and methods for assessing portfolios are inconsistent and varied (Burke, 1992; Hart, 1993; Barton & Collins, 1996; Puckett & Black, 1999; Shores & Grace, 1998). We therefore focused on assessing indispensable portfolios that teachers actually use. We analyzed 298 practical records, such as lesson plans, that had actually been used, and extracted and analyzed the methods and ideas applied in these practical portfolio-assessment records, which teachers had arrived at by trial and error. This paper discusses support for portfolio assessment based on these extracted results. The extraction might nevertheless be incomplete, because methods of assessing portfolios may exist that cannot be derived from our analytical targets. We used the extraction results to devise a formal method of description. Therefore, although the relations we extracted form the framework for supporting portfolio assessment that was our study target, the formal description method we developed, which is discussed in the next section, must allow this framework to be changed.

Extracting PDS

We clarified the relations between lesson forms and portfolios that needed to be collected (Table 1). These relations are represented by “Portfolio assessment Design Semantics (PDS)”. The rows in Table 1 are for lesson forms, and the columns are for collected portfolios. Here, “X” means that the portfolio corresponding to the lesson form needs to be collected. For example, if a teacher wants to carry out portfolio-assessment design, he or she decides the lesson form and finds the “X”s in the corresponding row; the portfolios marked with “X” need to be collected. Teachers can also identify corresponding lesson forms by specifying portfolios first. Therefore, the portfolios that need to be collected according to lesson forms can be adequately identified by PDS.

We found from the rows that lesson forms conforming to the assessment of portfolios changed under three conditions, i.e., the lesson style, the person who set tasks for the lesson, and the person who created rubrics. In extracting lesson styles, we paid attention to learners' behaviors in a class and classified lesson forms into eleven lesson styles. These were “Lecture”, where the lesson is given in the form of a lecture; “Question and Answer”, where learners answer a teacher's questions; “Exercise”, where learners repeat exercises; “Skill acquisition”, where skills are acquired; “Expression”, where the body is used to express ideas; “Creation”, where something is created; “Discussion”, where there is a discussion; “Seminar”, where people in a group learn from one another; “Experiment or Observation”, where phenomena are verified through experiments or observations; “Experience”, where social and technical experience is used; and “Problem Solving”, where learners attempt problem solving by themselves. In extracting the “person who assigns tasks (Setting Tasks)”, we found two kinds of cases: the teacher set them or the learner set them. In extracting the “person who creates rubrics (Rubric Creation)”, we found three kinds of cases: the teacher created them, the learner created them, or the teacher and the learner created them collaboratively.

Collected portfolios in the columns were classified into portfolios concerning records of assessment (Assessment Portfolios), portfolios concerning learners' work (Work Portfolios), portfolios concerning records of learning processes (Process Portfolios), and other portfolios. We classified them according to their original role in learning, without being bound by the common name of the portfolio. Work Portfolios were classified into ten kinds. We classified Production into two types: the first was “Sample Based”, which is based on a sample, and the second was “Original Based”, which is the learner's original. We also classified Reports and Notes into two types each. We called one's own assessments in the Assessment Portfolios with rubrics “Self-assessment”, and called looking back on one's activities without rubrics “Reflection”. We also called mutual assessments with rubrics “Other-assessment”, and called advising one another freely without rubrics “Advice”. We classified Process Portfolios into three types: the first was “Learning-record”, where the learner objectively describes his or her ongoing learning in a log; the second was “Anecdotal-record”, where the teacher describes a learner's spontaneous events and behaviors; and the third was “Conversation-record”, which describes conversations. We also classified the Learning-record into two types: the first was “Premeditated-record”, which describes premeditation according to lesson plans, and the second was “Situation-record”, which describes situations. Other portfolios consisted of records of portfolio conferences (“Portfolio Conference”), learning plans (“Learning Plan”), and learning materials (“Learning Material”).

Extracting PPS

We extracted the relations between practices and collected portfolios (Table 2). These relations are represented by “Portfolio assessment Practice Semantics (PPS)”. Here, “X” means that the portfolio corresponding to the practice is necessary to carry it out. For example, when a learner carries out a practice, he or she finds the “X”s in the corresponding row; the portfolios marked with “X” are the collected portfolios needed for carrying out the practice. The learner can also identify corresponding practices by specifying portfolios first. Therefore, the portfolios that are needed for practices can be adequately identified by PPS.

We found ten kinds of practices that could be carried out during the process of assessing portfolios. These were “Browsing”, where collected portfolios are perused; “Self-assessment”, where one assesses oneself; “Other-assessment”, where others are assessed; “Reflection”, where one reflects on his or her learning; “Advice”, where advice is given to someone; “Registration of Work Portfolios”, where work portfolios are registered; “Selection of Collected Portfolios”, where one's collected portfolios are carefully selected; “Process Recording”, where process portfolios are recorded; “Rubric Creation”, where rubrics are created; and “Portfolio Conference”, where the state of the portfolio conference is recorded. We also found that teachers, learners, and others (e.g., parents and specialists) were the people who carried out practices.


[Table 1. PDS matrix: rows are lesson forms (lesson style × person who sets tasks × person who creates rubrics); columns are collected portfolios (Assessment, Work, Process, and Other Portfolios); an “X” marks a portfolio that needs to be collected for the lesson form.]


[Table 2. PPS matrix: rows are practices (Browsing, Self-assessment, Other-assessment, Reflection, Advice, Registration of work portfolios, Selection of collected portfolios, Process Recording, Rubric Creation, Portfolio Conference) carried out by learners, teachers, or others; columns are collected portfolios; an “X” marks a collected portfolio needed to carry out the practice.]


Formal description method

Development policy

This section describes the development of the method for formally describing PDS and PPS. Formal methods of description are currently being developed and used for the syntax definitions of programming languages, in software engineering, and in other applications. They are indispensable for achieving systematic processing by computers, and with them specialists can effectively obtain precise and consistent specifications for system models.

The formal method of description we have aimed at in this study should be able to precisely and consistently describe the relations between elements indispensable for supporting portfolio assessment. In other words, it must be able to describe the structure and content of the model supporting teachers' portfolio-assessment designs (Figure 1) and the model supporting users' portfolio-assessment practices (Figure 2). Moreover, it is necessary to design the description rules so that they can be changed dynamically, to provide further portfolio assessment support in the future. However, no existing formal method specialized for describing the relations between these elements has been found. We have therefore aimed at developing an original method for describing the relations between the elements. The framework to describe PDS and PPS precisely and consistently can be acquired by further developing our formal method of description. Therefore, a solution to Requirement (c) can be expected.

Definitions

Here, we provide definitions for PDS and PPS as follows.

Definition 1

PDS is the relation between the set of lesson forms (LF) and the set of collected portfolios (CP). Therefore:

PDS: LF → CP.

Here, LF is:

LF = LT × PT × PR.

And CP is the subset of all portfolios:

CP = WP ∪ PP ∪ AP ∪ OP.

LT is the set of lesson styles, PT is the set of persons who assign tasks, PR is the set of persons who create rubrics, WP is the set of work portfolios, PP is the set of process portfolios, AP is the set of assessment portfolios, and OP is the set of other portfolios.

The relation between the details of a lesson form and the kinds of collected portfolios is written:

[lt, pt, pr] → [wp, pp, ap, op].

Here, lt ∈ LT, pt ∈ PT, pr ∈ PR, wp ⊆ WP, pp ⊆ PP, ap ⊆ AP, op ⊆ OP.

Definition 2

PPS is the relation between the set of practices (UP) and the set of collected portfolios (CP). Therefore:

PPS: UP → CP.

Here, UP is:

UP = PN × UU.

And CP is the subset of all portfolios:

CP = WP ∪ PP ∪ AP ∪ OP.

PN is the set of practice names, UU is the set of users, WP is the set of work portfolios, PP is the set of process portfolios, AP is the set of assessment portfolios, and OP is the set of other portfolios.

The relation between the details of a practice and the kinds of collected portfolios is written:

[pn, uu] → [wp, pp, ap, op].

Here, pn ∈ PN, uu ∈ UU, wp ⊆ WP, pp ⊆ PP, ap ⊆ AP, op ⊆ OP.

Notation

This section proposes a notation based on the definitions in the previous section. Tables 1 and 2 list the PDS and PPS we extracted. Therefore, the developed notation needs to satisfy the following two requirements.


i. To be able to freely add and delete the items that comprise the tables (i.e., lesson forms, collected portfolios, and others), and to rewrite the relations.
ii. To be able to describe the relations precisely and clearly, so that a computer system can read text written in the notation and work systematically while checking its correspondence to the description in the text.

The notation consists of three parts: declaring symbols (Symbol), showing the structure of relations (Structure), and describing relations (Relations). Figure 3 shows a concrete example, which describes part of the PDS in Table 1 with the notation.

#Symbol:
LT=”Lesson type”; PT=”Person who sets tasks”; PR=”Person who creates rubrics”;
AP=”Assessment portfolio”; WP=”Work portfolio”; PP=”Process portfolio”; OP=”Other portfolio”;
lec=”Lecture”; q&a=”Question and answer”; exe=”Exercise”; skl=”Skill acquisition”; cre=”Creation”;
dis=”Discussion”; sem=”Seminar”; eob=”Experiment or observation”; exp=”Experience”; plb=”Problem solving”;
tcr=”Teacher”; lnr=”Learner”; clb=”Collaboration by teacher and learner”;
rbc=”Rubric”; sfa=”Self-assessment”; ota=”Other-assessment”; ref=”Reflection”; adv=”Advice”;
sam=”Sample based”; org=”Original based”; con=”Conclusion”; imp=”Impression”; lnt=”Lesson note”;
ont=”Other Note”; col=”Collection”; tst=”Test”; prt=”Presentation”; pra=”Practical ability”;
sit=”Situation-record”; prd=”Premeditated-record”; cnv=”Conversation-record”; anc=”Anecdotal-record”;
pcf=”Portfolio conference”; lpl=”Learning plan”; lmt=”Learning material”;

#Structure:
[LT,PT,PR] [AP,WP,PP,OP]
LT={lec,q&a,exe,skl,cre,dis,sem,eob,exp,plb};
PT={tcr,lnr};
PR={tcr,lnr,clb};
AP={rbc,sfa,ota,ref,adv};
WP={sam,org,con,imp,lnt,ont,col,tst,prt,pra};
PP={sit,prd,cnv,anc};
OP={pcf,lpl,lmt};

#Relations:
[lec,tcr,tcr] [{rbc,sfa,ref}, {lnt,tst}, {sit,anc}, {pcf,lmt}];
[q&a,tcr,tcr] [{rbc,sfa,ref}, {lnt,ont,col,tst}, {sit,cnv,anc}, {pcf,lmt}];
[exe,tcr,tcr] [{rbc,sfa,ref}, {ont,tst}, {sit,anc}, {pcf,lmt}];
[exe,tcr,clb] [{rbc,sfa,ref}, {ont,tst}, {sit,anc}, {pcf,lmt}];
[skl,tcr,tcr] [{rbc,sfa,ref}, {sam,lnt,col}, {sit,anc}, {pcf,lmt}];
[skl,tcr,clb] [{rbc,sfa,ota,ref,adv}, {sam,org,lnt,col}, {sit,anc}, {pcf,lmt}];
[exp,tcr,tcr] [{rbc,sfa,ref}, {prt,pra}, {sit,anc}, {pcf,lmt}];
[exp,tcr,clb] [{rbc,sfa,ota,ref,adv}, {col,prt,pra}, {sit,prd,anc}, {pcf,lpl,lmt}];
[exp,lnr,clb] [{rbc,sfa,ota,ref,adv}, {col,prt,pra}, {prd,anc}, {pcf,lpl,lmt}];
[exp,lnr,lnr] [{rbc,sfa,ota,ref,adv}, {col,prt,pra}, {prd,anc}, {pcf,lpl,lmt}];
[cre,tcr,tcr] [{rbc,sfa,ref}, {sam,lnt,col}, {sit,anc}, {pcf,lmt}];
[cre,tcr,clb] [{rbc,sfa,ota,ref,adv}, {sam,org,lnt,col}, {sit,prd,anc}, {pcf,lpl,lmt}];
[cre,lnr,clb] [{rbc,sfa,ota,ref,adv}, {sam,org,col}, {prd,anc}, {pcf,lpl,lmt}];
[cre,lnr,lnr] [{rbc,sfa,ota,ref,adv}, {org,col}, {prd,anc}, {pcf,lpl,lmt}];

Figure 3. Description of PDS with notation


For example, when a teacher designs the assessment of portfolios for a lesson form that consists of “Lesson Style: Creation”, “Setting Tasks: Teacher”, and “Rubric Creation: Collaboration by Teacher and Learner”, the notation rule

[cre,tcr,clb] [{rbc,sfa,ota,ref,adv}, {sam,org,lnt,col}, {sit,prd,anc}, {pcf,lpl,lmt}]

specifies that the portfolios that need to be collected are “Rubric”, “Self-assessment”, “Other-assessment”, “Reflection”, “Advice”, “Production: Sample based”, “Production: Original based”, “Note: Lesson Note”, “Collection”, “Learning-record: Situation-record”, “Learning-record: Premeditated-record”, “Anecdotal-record”, “Portfolio Conference”, “Learning Plan”, and “Learning Material” in Table 1.
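To illustrate how a support system might consume such a description, the following minimal Python sketch stores two of the relations from Figure 3 in a dictionary and answers both design-phase queries: the portfolios required by a lesson form, and the lesson forms that require a given portfolio. The dictionary encoding and the function names are illustrative assumptions of ours, not the internal representation of the system described later.

# A minimal sketch of querying PDS relations such as those in Figure 3.
# The dictionary encoding is an illustrative choice, not the internal
# format of the support system described later.

PDS = {
    # (lesson style, person who sets tasks, person who creates rubrics)
    ("cre", "tcr", "clb"): {
        "AP": {"rbc", "sfa", "ota", "ref", "adv"},
        "WP": {"sam", "org", "lnt", "col"},
        "PP": {"sit", "prd", "anc"},
        "OP": {"pcf", "lpl", "lmt"},
    },
    ("lec", "tcr", "tcr"): {
        "AP": {"rbc", "sfa", "ref"},
        "WP": {"lnt", "tst"},
        "PP": {"sit", "anc"},
        "OP": {"pcf", "lmt"},
    },
}

def portfolios_for(lesson_form):
    """Design-phase lookup: which portfolios must be collected?"""
    return PDS.get(lesson_form)

def lesson_forms_for(symbol):
    """Reverse lookup: which lesson forms require a given portfolio?"""
    return [lf for lf, parts in PDS.items()
            if any(symbol in group for group in parts.values())]

print(portfolios_for(("cre", "tcr", "clb")))
print(lesson_forms_for("tst"))  # -> [('lec', 'tcr', 'tcr')]

The reverse lookup mirrors the fact, noted above, that teachers can also identify corresponding lesson forms by specifying portfolios first.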



Although Figure 3 is a description corresponding to Table 1, the structure of the table can be changed, and the relations can also be changed individually. In other words, even if relations are changed or added, they can still be expressed in the notation.

PPS can be described like PDS. Figure 4 describes part of the PPS in Table 2 with the notation.

#Symbol:
PN=”Practice name”; UU=”User”;
AP=”Assessment portfolio”; WP=”Work portfolio”; PP=”Process portfolio”; OP=”Other portfolio”;
bro=”Browsing”; prsa=”Self-assessment”; proa=”Other-assessment”; prrf=”Reflection”; prad=”Advice”;
rgt=”Registration of work portfolios”; slt=”Selection of collected portfolios”; prec=”Process Recording”;
rcre=”Rubric Creation”; prcf=”Portfolio Conference”; tcr=”Teacher”; lnr=”Learner”; otr=”Other”;
rbc=”Rubric”; sfa=”Self-assessment”; ota=”Other-assessment”; ref=”Reflection”; adv=”Advice”;
sam=”Sample based”; org=”Original based”; con=”Conclusion”; imp=”Impression”; lnt=”Lesson note”;
ont=”Other Note”; col=”Collection”; tst=”Test”; prt=”Presentation”; pra=”Practical ability”;
sit=”Situation-record”; prd=”Premeditated-record”; cnv=”Conversation-record”; anc=”Anecdotal-record”;
pcf=”Portfolio conference”; lpl=”Learning plan”; lmt=”Learning material”;

#Structure:
[PN,UU] [AP,WP,PP,OP]
PN={bro,prsa,proa,prrf,prad,rgt,slt,prec,rcre,prcf};
UU={tcr,lnr,otr};
AP={rbc,sfa,ota,ref,adv};
WP={sam,org,con,imp,lnt,ont,col,tst,prt,pra};
PP={sit,prd,cnv,anc};
OP={pcf,lpl,lmt};

#Relations:
[bro,lnr] [{rbc,sfa,ota,ref,adv}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv}, {pcf,lpl,lmt}];
[bro,tcr] [{rbc,sfa,ota,ref,adv}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv,anc}, {pcf,lpl,lmt}];
[bro,otr] [{rbc,sfa,ota,ref,adv}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv}, {pcf,lpl,lmt}];
[prsa,lnr] [{rbc,sfa,ota,ref,adv}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv}, {lpl}];
[proa,tcr] [{rbc,sfa,ota,ref,adv}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv,anc}, {lpl}];
[proa,otr] [{rbc,ota}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv}, {}];
[prrf,lnr] [{ref}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv}, {}];
[prad,tcr] [{adv}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv,anc}, {}];
[prad,otr] [{adv}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv}, {}];
[slt,lnr] [{}, {sam,org,con,imp,lnt,ont,col,tst,prt,pra}, {sit,prd,cnv}, {}];

Figure 4. Description of PPS with notation

As previously discussed, we developed a framework that formally describes PDS and PPS. As the framework can be used to describe the relations between the lesson forms and the portfolios that need to be collected, and the relations between a user's practices and the collected portfolios that the user needs to carry out those practices, Requirements (a) and (b) are satisfied.

The system can systematically analyze relations by reading text described with the notation. As a result, we can also expect a solution to Requirement (c).

Portfolio Assessment Support System based on the formal method of description

Outline of system

This section discusses our development of the Portfolio Assessment Support System (PASS), based on the formal method of description. Figure 5 shows the system configuration. PASS consists of four subsystems: semantics analysis, design support, practice support, and DB management, each of which consists of various modules. We used Perl as the development language for the system, together with an XML database (XMLDB).


Semantics analysis
The subsystem consists of PDS and PPS analysis modules. Each module checks PDS or PPS semantics. PDS and PPS are prepared as text files beforehand (a parsing sketch follows this list).

Design support
The subsystem supports teachers' portfolio-assessment designs through cooperation between the design-interface and design-control modules. After receiving the teacher's orders from the design-interface module, the design-control module requests a check of the relations from the semantics analysis subsystem and receives the results. The design-interface module dynamically generates adaptive interfaces according to the results and provides them to the teacher.

Practice support
The subsystem supports the user's practice through cooperation between the practice-interface and practice-control modules. After receiving the user's orders from the practice-interface module, the practice-control module requests a check of the relations from the semantics analysis subsystem and receives the results. The practice-interface module dynamically generates adaptive interfaces according to the results and provides them to the user.

DB management
The subsystem consists of a DB control module and the Portfolio DB, and reads, writes, updates, and stores portfolios.
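As an indication of what the parsing step of the semantics-analysis subsystem might look like, the following minimal Python sketch reads the #Relations part of a notation file (cf. Figure 3) into lookup tuples. It is a sketch under stated assumptions: the regular expressions and names are ours, error handling and #Symbol/#Structure validation are omitted, and the actual modules were written in Perl.

# A minimal sketch of parsing the #Relations part of a notation file
# (cf. Figure 3). Illustrative only; the actual PASS modules are in Perl.
import re

RELATION = re.compile(r"\[([^\]]*)\]\s*\[(.*)\];")

def parse_relations(text):
    relations = {}
    in_relations = False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#Relations:"):
            in_relations = True
            continue
        m = RELATION.match(line) if in_relations else None
        if not m:
            continue
        lhs = tuple(s.strip() for s in m.group(1).split(","))
        groups = re.findall(r"\{([^}]*)\}", m.group(2))
        rhs = [set(filter(None, (s.strip() for s in g.split(","))))
               for g in groups]
        relations[lhs] = rhs  # e.g. ('lec','tcr','tcr') -> [AP, WP, PP, OP]
    return relations

sample = """#Relations:
[lec,tcr,tcr] [{rbc,sfa,ref}, {lnt,tst}, {sit,anc}, {pcf,lmt}];
[exe,tcr,tcr] [{rbc,sfa,ref}, {ont,tst}, {sit,anc}, {pcf,lmt}];
"""
print(parse_relations(sample)[("lec", "tcr", "tcr")])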

Figure 5. System configuration

Practical example

In the design phase, the teacher first decides the lesson style to be supported in his or her portfolio-assessment design (left screen of Figure 6). He or she then decides who sets tasks and who creates rubrics (right screen). The right screen in Figure 6 shows the interface that the system generated dynamically, based on PDS, according to the teacher's decisions. This is the case of a lesson form that consists of “Lesson Style: Creation”, “Setting Tasks: Teacher”, and “Rubric Creation: Collaboration by Teacher and Learner”. The system then determines the necessary portfolios based on PDS according to the lesson form, generates the adaptive interface, and provides it to the teacher. Thus, the system supports portfolio-assessment designs that adjust to the teacher's intended lesson.

In the practice phase, when the learner clicks the “Self Assessment” button in the menu, the system checks PPS and provides him or her with the collected portfolios needed for self-assessment (Figure 7). The learner can therefore self-assess efficiently with the collected portfolios.


Figure 6. Screen of teacher's portfolio-assessment design

Figure 7. Screen of user's portfolio-assessment practice

We developed a system that works systematically based on PDS and PPS, which satisfies Requirement (c). PASS thereby satisfies Requirements (a), (b), and (c), and Problems (1) and (2) are solved.

Conclusion

We extracted and clarified the relations between lesson forms and the portfolios that need to be collected, i.e., Portfolio assessment Design Semantics (PDS), and the relations between practices and collected portfolios, i.e., Portfolio assessment Practice Semantics (PPS). We also developed a formal method capable of describing both PDS and PPS precisely and consistently. Moreover, we developed a portfolio assessment support system (PASS) based on this formal description method. The system makes it possible to assess portfolios according to lesson forms and practices based on the formal description method. Storing past lessons for which portfolios have been assessed as case studies, based on the formal description framework, also makes it possible to design portfolio assessment more efficiently by reusing those lessons.

There are two existing specifications related to assessment: QTI (2005) and ePortfolio (2005). QTI (2005) is a specification that enables questions and tests to be exchanged, and ePortfolio (2005) is a specification that allows comparability of portfolio data across organizations and interoperability of applications with other systems. However, it is impossible to achieve support for portfolio assessment at the design and practice phases by using these specifications alone. We therefore expect that the formal description method we developed will become useful as a standard for portfolio assessment, and consider that its integration with QTI (2005), ePortfolio (2005), and other e-learning standards (e.g., Learning Design, 2003; SCORM2004, 2004) will make it even more efficient.

We intend to use the system in actual lessons and evaluate it in the near future.

References

Barton, J., & Collins, A. (1996). Portfolio Assessment, New Jersey, USA: Dale Seymour Publications.

Burke, K. (1992). The Mindful School: How to Assess Authentic Learning (3rd Ed.), Illinois, USA: IRI/SkyLight Training and Publishing.

Chang, C. (2001). Construction and Evaluation of a Web-Based Learning Portfolio System: An Electronic Assessment Tool. Innovations in Education and Teaching International, 38 (2), 144-155.

Chen, G., Liu, C., Ou, K., & Lin, M. (2000). Web Learning Portfolios: A Tool for Supporting Performance Awareness. Innovations in Education and Teaching International, 38 (1), 19-30.

ePortfolio (2005). IMS ePortfolio Specification, IMS ePortfolio Best Practice Guide, IMS ePortfolio Binding, IMS Information Model, IMS Rubric Specification. Version 1.0 Final Specification, retrieved February 28, 2006 from http://www.imsglobal.org/ep/.

Fukunaga, H., Nagase, H., & Shoji, K. (2001). Development of a Digital Portfolio System That Assists Reflection and Its Application to Learning. Japan Journal of Educational Technologies, 25 (Suppl.), 83-88.

Hart, D. (1993). Authentic Assessment: A Handbook for Educators, New Jersey, USA: Dale Seymour.

Learning Design (2003). IMS Learning Design. Information Model, Best Practice and Implementation Guide, XML Binding, Schemas. Version 1.0 Final Specification, retrieved February 28, 2006 from http://www.imsglobal.org/learningdesign/.

Mochizuki, T., Kominato, K., Kitagawa, T., Nagaoka, K., & Kato, H. (2003). Trends and Application of Portfolio Assessment for e-Learning. Media and Education, 10, 25-37.

Morimoto, Y., Ueno, M., Takahashi, M., Yokoyama, S., & Miyadera, Y. (2005). Modeling Language for Supporting Portfolio Assessment. Proceedings of the 5th IEEE International Conference on Advanced Learning Technologies, Los Alamitos, CA: IEEE Computer Society, 608-612.

Morimoto, Y., Ueno, M., Yonezawa, N., Yokoyama, S., & Miyadera, Y. (2004). Meta-Language for Portfolio Assessment. Proceedings of the 4th IEEE International Conference on Advanced Learning Technologies, Los Alamitos, CA: IEEE Computer Society, 46-50.

Puckett, M. B., & Black, J. K. (1999). Authentic Assessment of the Young Child: Celebrating Development and Learning (2nd Ed.), New Jersey, USA: Prentice Hall.

QTI (2005). IMS Question and Test Interoperability Information Model, Overview, Information Model, XML Binding, Implementation Guide, Conformance Guide, Integration Guide, Meta-data and Usage Data, Migration Guide. Version 2.0 Final Specification, retrieved February 28, 2006 from http://www.imsglobal.org/question/index.html#version2.0.

SCORM2004 (2004). Sharable Content Object Reference Model: SCORM 2nd Edition, Overview, SCORM Run-Time Environment Version 1.3.1, SCORM Content Aggregation Model Version 1.3.1, SCORM Sequencing and Navigation Version 1.3.1, Addendum Version 1.2, Advanced Distributed Learning, retrieved December 16, 2005 from http://www.adlnet.org/scorm/index.cfm.

Shores, F. E., & Grace, C. (1998). The Portfolio Book: A Step-by-step Guide for Teachers, Maryland, USA: Gryphon House.


Von Brevern, H., & Synytsya, K. (2006). A Systemic Activity based Approach for holistic Learning & Training Systems. Educational Technology & Society, 9 (3), 100-111.

A Systemic Activity based Approach for Holistic Learning & Training Systems

Hansjörg von Brevern and Kateryna Synytsya
International Research and Training Center for Information Technologies and Systems, Kiev, Ukraine
Tel. +380.44.242-3285
researcher@SAFe-mail.net
ksinitsa@hotmail.com

ABSTRACT

Corporate environments treat the work activity of processing information and that of learning & professional training (L&T) separately, keeping L&T at a low priority and perceiving little dependency between the two activities. Yet, our preliminary analysis of an organisation has revealed that both activities mutually affect each other. Wholes or parts of corporate information and L&T content must be simultaneously available in “real time”. To manage technical systems and their content from either source, we postulate an approach that embeds employees, learners, artefacts, and the object of their activity in a systemic-structural system based on activity theory. Our suggested approach is applicable to resolving key issues of work activity, professional training, and the modelling of adaptive technical system behaviours; it converges human activity and engineering, and positions the latter appropriately within the superordinate and encompassing human activity system.

Keywords

Systemic-Structural Activity Theory, Task Analysis, Adaptive Learning & Training Systems

Introduction

L&T, as a way to ensure up-to-date cognition and skills as well as relevance to emerging organizational needs and goals, has become a characteristic feature of our time, implemented through traditional processes or technology-based environments. At the same time, the spread of information technologies is leading to a heavy dependence of business processes on the information processing and knowledge management that a variety of software systems support. Despite the contemplated connection and interaction between information and cognition processing activities, little attention has been paid so far to modelling a holistic view of computer-supported professional activities in learning organizations.

Von Brevern and Synytsya (2005) posited the need for a shift of paradigm in L&T, treating learning and training as one holistic activity and using the Systemic-Structural Theory of Activity (SSTA) as the underlying methodology. This paper integrates L&T into a corporate realm that continuously seeks to improve organisational performance, i.e., efficiency, effectiveness, and cost-saving. L&T is no longer an isolated unit within an organisational environment but is directly and actively interwoven with corporate interests and issues. Equally, changes in work processes have a mutual effect not only on work activity and L&T but also on the goals of these activities. These facets and more challenge the structure of activities and the type of roles that Information Systems (IS) and Learning Technology Systems (LTS) play within such a dynamic environment.

Up to the present, research in educational technology has not addressed the interrelationship between IS and LTS, and has largely failed to model technical solutions from dispositional viewpoints, which model systems responsive to human cognitive behaviour. Instead, today we have educational technology based on positional viewpoints; i.e., mere engineering practice prevails and, in doing so, ignores the dispositional aspect arising from human behaviours. We conjecture that the accord of the processes involved is the prerequisite to answering the “why's”, and only once we grasp the “why's” can we determine the “what's” to solve the “how's”. The same applies to L&T, as Savery and Duffy (1995, p. 31) argue: “We cannot talk about what is learned separately from how it is learned, as if a variety of experiences all lead to the same understanding. Rather, what we understand is a function of the content, of the context, the activity of the learner, and perhaps, most importantly, the goals of the learner”.

In this connection, the paper discusses the need for a unified consideration of both professional information processing and training, and suggests a systemic approach to analyzing needs for training and cognitive support that may be addressed by technical systems. The goal of this research is to provide a holistic view of human activities related to information processing, learning, and training, and to offer psychologically grounded mechanisms for their study. It is expected that this approach facilitates revealing generic patterns in work activities, bottlenecks caused by information overload, and the typical tasks and types of content engaged, and thus helps determine the priorities, functions, and content structure of technical support systems.

Context

An informal analysis of call agents' work processes at a major Swiss mobile phone company made us realise that L&T is part of a larger context that comprises work activity and unites organisational goals, requirements, principles, etc.

Corporate L&T in a learning organization has to cover a wide range of needs, such as training new workforces, updating employees on new products and services offered to customers, training on new tools used in the company, raising professional skills and educational levels, providing corporate information for smooth coordination of activities, teaching so-called “soft skills”, etc. L&T is tightly connected to organisational information and corporate knowledge structures. The diversity of information about new, altered, or obsolete products, services, promotions, etc. needed in responding to telephone enquiries makes call agents rely heavily on IS and LTS. Thus, the job of call agents requires them to process information spatiotemporally while answering customers' queries or engaging in problem solving. In doing so, answering enquiries and troubleshooting is a matter of processing information whose wholes or parts reside not only in IS but also in LTS, as illustrated in Figure 1.

Figure 1. Context Diagram of Processing Information and L&T

Availability and accessibility of L&T content affect the efficiency of work activity, so both IS- and LTS-residing material should be made available in “real time”. Both systems therefore need to respond to human stimuli and behaviours by returning meaningful and significant material to the call agent from a situated viewpoint. A significant response is one that has personal sense to the call agent at a particular point in time. Complexity and informativeness (Bedny & Meister, 1997) characterise the meaningfulness of a response; the social context of the work activity embeds, rather than predetermines, the essence of a meaningful response. However, L&T and information processing include not only “stimuli – response” but also complex human cognitive activities that have to be supported by the systems, as shown in Figure 1. Moreover, the activities of processing information and L&T are no longer independent but represent two mutually dependent and equally important activities in the corporate domain. Both activities share one common end goal, which in our case is optimal customer satisfaction that contributes to the corporate mission.

To illustrate the importance of the availability of content and learning experience, consider the following: at some point in time a call agent may be preoccupied with the task of a new problem while having forgotten how he or she resolved a corresponding subtask earlier. This means that L&T content, including decompositions of problems and lessons learnt, may need to remain available and accessible at later points in time; yet changes in work processes and activities may require updates of relationships and dependencies, instantaneous availability, and continual reflection of the state-of-the-art relationships between work and L&T activities. Thus, arranging a supportive environment means ensuring that employees are always au courant with what was learnt earlier, the latest lessons learnt in customer care-taking, the latest product information, promotion details, repair processes, etc. Call agents' efficiency is measured by the number of calls answered within a certain time. The problem-solving nature of their job requires not only that information be available instantaneously but also that systems facilitate content adaptation by “what is important to them” at a given point in time. The continuously changing nature of the information requires call agents to be trained. This process is the basis of an iterative and inseparable interrelationship between processing information and L&T, and it requires technical support instruments enabling creation, updates, and focused access to specific content.

Yet, we have seen another reality. IS and LTS are separate and hardly share any content; in doing so, they do not facilitate call agents' work processes effectively and efficiently, so that, on balance, processes are very expensive. On top of that, IS and LTS deliver large chunks of loosely related information; the output is neither structured according to the current problem nor presented in a fitting format. Hence, our call agents have learnt to rely on their own personal notes, printouts of L&T material, or the like (Figure 1). The severe result of such irrelevant and excessive information is cognitive overload. Moreover, when call agents continuously face ill-structured situations, such as the absence or non-accessibility of appropriate information, the decision-making process involved in responding to customers' enquiries is accompanied by emotionally motivated stress. This is also why call agents rarely sustain their job for more than three years. And the more complicated and indeterminate the situation, the more the mental mechanisms of imagination and image manipulation are involved, as discussed by Bedny and Meister (1997).

From an engineering standpoint, we would apparently need to conduct a gap analysis between the present and desired system states to depict what it takes to have technical systems adapt to human behaviour and return meaningful, significant, and situated responses. Despite efforts made in performance support systems (Gery, 1991), today's technical systems still cannot respond satisfactorily to cognitive behaviours. New requirements for technical systems like IS, LTS, and their content processing arise from human work activities whose evolving processes we may neither have known nor been able to predict earlier. So, we need a methodology that in the first place helps us decompose and formalise human work activity, which "… is a coherent system of internal mental processes and external behaviour and motivation that are combined and directed to achieve goals" (Bedny & Meister, 1997, p. 1). As the past decades of experience have proved, a mere engineering-based methodology that ignores mental processes will not resolve the task. The pitfall of attempting to resolve human activity by conventional engineering is that it models systems on "stimulus – response" behaviours. This approach largely ignores the cognitive mechanisms of humans and, with them, the embedded roles, context, tasks, and goals of technical systems. Instead, the answer to how to envision, understand, and model technical systems should be sought in human activity and its manifestation. From this it follows that technical systems ought to present content according to how humans process it during work and L&T activities. Hence, we need a methodology that helps us formally analyse and decompose work and L&T activities as an embedded scrutiny; equally indispensable is an engineering methodology that enables us to extract identifiable (technical) system events and responses to actions from the context of the higher-order activity system, followed by conventional engineering methods to ultimately model technical solutions.

Methodology

The activities of processing information and L&T become meaningful when their processes are coupled with situated actions that form part of superordinate tasks. These are subject to the goal of the activity and are altogether coupled with individual internalisation processes. In this respect, we recognise that there is no one-size-fits-all process and, with it, no one-size-fits-all technical system, because activity "A" differs from activity "B" in the same way as organisations and their behaviours vary. Yet, unless the individual systematically captures corporate learning experiences and adopts and continuously transforms new experiences and lessons, they will be lost. Organisations are therefore interested in preserving learning experiences, cognition, and information, and in improving the efficiency and effectiveness of their workforce. Hence, technological empowerment requires us to observe, understand, and measure human activities in the first place, which in turn "… requires studying the structure of activity, which is a system of actions involved in task performance that are logically organised in time and space" (Bedny & Meister, 1997, p. 31).



In von Brevern and Synytsya (2005), we presented the Systemic-Structural Theory of Activity (SSTA), which allows us to study cognitive and social impacts that can equally be experienced by call agents and customers, as well as by learners and instructors, during work activity that includes the mediated use of technology. With the term systemic, "… one is not here speaking of man-machine systems, rather describing human activity and behaviour as a system, which, of course, is in dynamic interaction with machinery. A system is a set of interdependent elements that is organized and mobilized around a specific purpose or goal. System analyses entail extracting the relevant elements and their dynamic interactions. Systemic analyses not only differentiate elements in terms of their functionality, but also describe their systematic interrelationship and organization. Whether or not there is a system approach depends not so much on the qualities and specificity of the object under consideration, but rather on the perspective and methods of analysis" (Bedny et al., 2000, p. 187). Bedny and Meister (1997) describe SSTA as a pragmatic and systemic methodology that enables us to analyse human behaviours and processes. As we have addressed in von Brevern and Synytsya (2005, p. 746), an activity is a "goal directed system in which cognition, behaviour and motivation are integrated and organized by goals and the mechanisms of self-regulation" (Bedny et al., 2000).

SSTA presents us with a model that recognises that "at the root of the gnostic dynamic is the self-regulation process, which provides the mental transformation, evaluation, and correction of the situation" (Bedny & Karwowski, 2004). The goal-oriented activity builds an interrelated, complex, and dynamic system between subjects, between subjects and the goal-oriented object of activity, and their mediating artefacts or tools. Figure 2 presents the triadic schemata of the call agent's work activity concerned with processing information; Figure 3 presents the triadic schemata of the L&T activity. Both triadic activity schemata share the same targeted end result, while their mediating technical tools (IS and LTS) have distinct contextual functionalities. While the functionality of IS is concerned with processing corporate information and providing appropriate content upon call agents' requests during information processing, LTS mediate between instructors, learners, and peers and are aimed at facilitating L&T. In this sense, the activity system of SSTA widens and extends the understanding gained from a mere engineering-based approach to analysing and modelling the roles of technical artefacts, because it incorporates cognitive processes and models the context in which the dynamic activity system lives.

Figure 2. Triadic Schemata of the Processing Information Activity System

Figure 3. Triadic Schemata of the L&T Activity System



SSTA builds the necessary formal grounds for analysing human work activity, goals, tasks, and actions, according to which technical support may be arranged. Although it is outside the scope of this paper to discuss each formal method of SSTA in depth, they are informational or cognitive, parametric, morphological, and functional: "Informational or cognitive analysis studies the cognitive processes that are concerned with task performance; Parametric analysis is quantitative and qualitative and allows evaluation of the effectiveness of work activity in terms of quality, quantity, and reliability; Morphological analysis looks at describing the logical and constructive features of activity on the basis of actions and operations; Functional analysis looks at the function blocks" (von Brevern & Synytsya, 2005, p. 747). SSTA separates the goal-driven activity into objects of study, i.e., activity and tasks, and units of analysis such as actions, operations, and function blocks (Bedny & Harris, 2005).

For better comprehension, Figure 4 visually illustrates the relationship between the objects of study and the units of analysis in a simplified form. Once we have identified the objects of study, the units of analysis become important for analysing human algorithmic behaviours. A human algorithm is the representation of a "…logically organized system of actions through which an individual subject transforms some initial material in accordance with the required goal of the task. These algorithms are distinguished from similar devices such as flow charts by their use of actions as the basic units of analysis" (Bedny & Harris, 2005, p. 8). It is the composite of different actions that makes an activity a complicated dynamic structure. Therein, all elements of an activity are "… tightly interconnected and influence each other. From this, the requirement of using the systemic method of study follows. As can be seen, an activity is a process. However, this process consists of a sequence of a hierarchically organised structure of elements (actions and operations)" (Bedny & Meister, 1997, p. 48). So, for us to understand and pragmatically analyse a goal-oriented activity and its elements, we must first engage in task analysis, as Figure 4 demonstrates.
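To make the distinction between objects of study and units of analysis concrete, the following is a minimal sketch of the hierarchy in Figure 4; the task and action names are hypothetical examples from the call-centre context, and the class layout is our illustration, not a formalism prescribed by SSTA.

```python
# A sketch of the hierarchy: activity -> tasks (objects of study)
# -> actions -> operations (units of analysis). Names are hypothetical.
from dataclasses import dataclass

@dataclass
class Operation:
    name: str

@dataclass
class Action:
    goal: str
    kind: str                     # e.g. "motor" or "mental"
    operations: list[Operation]   # lowest-level units of analysis

@dataclass
class Task:
    goal: str
    actions: list[Action]         # a human algorithm: logically ordered actions

# The activity "processing information" decomposed into one example task.
activity = [
    Task(
        goal="identify the customer's problem",
        actions=[
            Action("ask for promotion name and code", "motor",
                   [Operation("speak"), Operation("type query")]),
            Action("classify the enquiry", "mental",
                   [Operation("compare with known problem classes")]),
        ],
    ),
]
```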

Task Analysis

Figure 4. The Objects of Study and Units of Analysis of an Activity System

The task holds an essentially central role in the activity system. In fact, our degree of understanding of the task and the precision of task analysis determine whether or not we model technical systems correctly.

Under such critical aspects, we need to know the principal task characteristics, context, definition, and influences for system modelling. According to Leont'ev (1978), tasks set by an individual for himself or herself "… in thought originate in the social conditions of his [or her] life". Leont'ev (1978) enlarges our viewpoint of the integral embodiment of a task, its subordinate actions, the situation requiring goal-achievement, and conditions. He argues that "every purpose, even one like the "reaching of point N," is objectively accomplished in a certain objective situation. For the consciousness of the subject, the goal may appear in the abstraction of this situation, but his action cannot be abstracted from it. For this reason, in spite of its intentional aspect (what must be achieved), the action also has its operational aspect (how, by what means this can be achieved), which is determined not by the goal in itself but by the objective-object conditions of its achievement. In other words, the action being carried out is adequate to the task; the task then is a goal assigned in specific circumstances" (Leont'ev, 1978). Among various definitions, a task "… is inherently a problem-solving endeavour" and "… a set of human actions that can contribute to specific functional objective and ultimately to the output of a goal of a system" (Bedny & Meister, 1997, p. 19). When it comes to determining how to improve organisational performance, we need to know the requirements and conditions of the activity. In this respect, Bedny and Meister (1997, p. 19) claim that "Anything that is presented to the operator or known by him or her is a condition of the task, the requirements of which include the finding of a solution (or what we need to prove)". Therefore, it is indispensable to first conduct task analysis in any activity system.

Accordingly, the kinds of tasks that call agents face during their daily activity of processing information have become our primary focus of interest. Nevertheless, even an informal task analysis has proved a major challenge because most tasks have never been formulated precisely. The following five examples from our findings reflect the importance, breadth and width, impacts, and values contained in task understanding:

1. The task to understand ill-structured problems. A call agent's job requires a high degree of sensory, perceptual, and imagery responses to solve situational problems. This may subjectively dominate sensory tasks. A call agent's goals are often imprecise due to ambiguous task definitions by customers, superiors, or systems, which leave space for goal formation. So, systems need to help with problem identification, for which IS could provide alternative parameters of problem descriptions. Additionally, call agents need training to recognise and classify customers' input and associate it with appropriate goals.

2. The task to do a spatiotemporal search for specific information. Irrelevant, insignificant, or excessive feedback by IS causes unfavourable conditions for the call agent. Contradictions between task requirements and unfavourable conditions cause interaction breakdowns within the activity system. Task decomposition provides detailed information on the causes of breakdowns, which serves as a starting point for modifying the content and structure of support system responses to the agent.

3. The task to switch between tasks. The pre-task instruction to a call agent is to answer the incoming call, and this instruction regulates the call agent's behaviour. When training sessions are embedded in the daily work, the agent should be able to change activities and, together with them, goals, tasks, and actions. Call answering and learning activities are different in nature, so we need to investigate the motives and actions behind task switching to determine how IS or LTS could best facilitate it, stimulate the motivation for each activity, help refresh the knowledge of the last visit, etc.

4. The task to classify information. Information distributed by IS is not personalised, so it has to be sorted and prioritised by each agent. The IS may, for instance, alert the call agent about an upcoming summer promotion during which the agent will be on holiday. Simultaneously with the system alert, customers may be complaining about a new model for which no training has been received. So, at that very point in time, the information about the summer promotion is not appropriate for the call agent. Analysis of personal interpretations – meaning and sense – can be absolutely crucial, as Bedny and Karwowski (2004) argue. Therefore, information needs to be spatially and temporally significant, which altogether imposes critical system requirements.

5. The task to master skills consciously. For example, a routinised agent may know how to repair even the latest model of a mobile phone of a certain brand. So, motor actions that have become automatised may be unconscious to the call agent. Bedny and Meister (1997) argue that the motor actions of a higher-order task can disintegrate. That means that if, for any reason, the call agent suddenly consciously deliberates each hierarchical motion involved in the repairing process and becomes aware of each individual goal associated with each step, the process will slow down and regress. L&T therefore needs to shape a call agent's command of the conditions of the task and enable the call agent to master it.

Apprehension of the tasks of the activity of processing information is crucial and integral to L&T because both activities share the same end-goal or result, as illustrated in Figure 2 and Figure 3. We also need to understand the tasks of the L&T activity, because analogous activities mutually affect each other, albeit this is not discussed herein. For example, the requirement for modular content is equally important to the activities of processing information and L&T, and so is the notion of different zones of proximal development (ZPD) (Giest & Lompscher, 2003) as in learning and teaching (von Brevern & Synytsya, 2005). Moreover, making sense of informal tasks is difficult when we do not have a clear picture of the object of study. However, this deficiency is not uncommon in organisational environments. Therefore, if we want to ameliorate organisational performance, we need precise foci on the objects of study, whose tasks then become the subject of task decomposition.



Task Decomposition

Task decomposition is the next step in the study of the algorithmic description of the call agent's activity. Following our preliminary cognitive analysis of the task, we classify actions. "The purpose of classifying actions is to present an activity as a structure with a systemic organisation. All actions are organised as a system because of the existence of a general goal of activity. Any changes in individual action can immediately influence other actions. A person continually evaluates his or her own activity from the viewpoint of the goal and conditions under which he or she performs the activity" (Bedny & Meister, 1997, p. 32). In our study, we have discovered indeterminate tasks in which the goal was ill-defined and the method of implementation partially unknown or ambiguous.

Now, let us contemplate an example and demonstrate how to decompose a task into subtasks. We can use subtasks during the decomposition of activity and build a hierarchical order of the units of analysis (Figure 4). "In the algorithmic description of task, a subtask is … composed of one to three homogeneous, tightly connected actions" (Bedny & Meister, 1997, p. 32). Now, suppose the following simple case: "If a customer's promotion name is X, check in the IS the promotion procedure and inform the customer. If the promotion name is Y, transfer the call to department Z" (where X, Y = getPromotion (promotionName:character, promotionCode:integer):character{1 character}; Z = department():numeric{2 digits}). Table 1 presents the method of decomposition, i.e., an informal algorithmic description of how a semi-routinised call agent solves the task, the hierarchical order of the units of analysis, the duration of time, and a classification of actions. Although the semantic task description appears simple, this task creates an indeterminate situation that leaves space for options. Bedny and Meister (1997, p. 22) argue, "When task instructions do not contain a sufficiently detailed description of how to achieve the goal and consequently the task requires conscious deliberation about how to accomplish that goal, the task becomes one of a problem solving." Based on the task decomposition of Table 1, the following addresses some of our findings (a code sketch of the routing rule follows the list):

1. Task dependency. Our task presumes only two promotions, which is unlikely. Hence, our task depends on others, as Bedny and Meister (1997, p. 23) claim: "An aspect of complexity derived from engineering theory is the degree of dependency of one task and action on another". When we change the task, this in turn affects the algorithmic description.

2. Dependency between actions. Changes of the algorithmic description of tasks affect the structure between actions. Figure 5 illustrates a concept map of the structure of mental actions (based on Bedny & Meister, 1997), which is appended to illustrate the combination of motor and mental actions in the activities of processing information and L&T. Although the object of study can have its main focus on mental actions, the analysis of algorithmic behaviours cannot ignore either kind of action; neither can we when we aim at improving organisational performance.

3. Taxonomy of standard activity and duration of time. Besides the structural effect, it is important to regard the factor of time, because every action has its duration. Time is an important indicator for detecting issues. With reference to the effectiveness of skilful work activity, Bedny and Meister (1997, p. 27) suggest deploying a taxonomy.

4. System responses to mental and motor actions. Owing to the problem space of the task, the task decomposition of Table 1 and the object of study, unknown at this stage, create open options for both mental and motor actions. For example, the task does not include any validation subtask with regard to misspelling, which involves motor and mental actions. Misspellings could be caused by a simple typing error or a fallacy on the part of the customer or the call agent. It would then be helpful if IS could facilitate resolving the case. Though no technical system is able to infer the cause or motive, we may wish to model technical systems that "foresee" such kinds of events. We can neither know such a need nor model such technical systems unless we first decompose the task.

5. Interaction breakdowns. "In systemic-structural terms, a breakdown can be defined as forced changes of subjects' strategies of action caused by their evaluation of an unacceptable divergence between the actual results of actions and the conscious goals of those actions. Typically, breakdowns are characterized by subjects' (temporary or permanent) abandonment of the task in hand, which … may lead them to reformulate the task-goal" (Harris, 2004, pp. 4-5). Contradictions between task requirements and unfavourable conditions cause interaction breakdowns within the activity system. Although our task does not require the call agent to ask the customer for the name of the promotion and the promotion code, the call agent has wisely done so in advance (Subtask 1) out of previous experience. The most severe interaction breakdown, however, is the requirement to inform the customer (see Table 1, "average time" of Subtask 5), because the information stored in IS is not modular. Apprehending interaction breakdowns and actions is extremely crucial when it comes to goal-formation, task-fulfilment, the actions involved, and the efficiency of mediating tools. One of the critical aspects of interaction breakdowns is the possible dilemma between the objective goal and the subjective goal of the task, because of its effect on operations: "One of the critical aspects of action is the formulation of a conscious goal and an associated capacity to regulate progress toward achievement of the goal voluntarily" (Bedny et al., 2000, p. 174).
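As a bridge towards the engineering view, the following minimal sketch expresses the routing rule of the example task in code; the promotion lookup table, the department number, and the fallback branch for the indeterminate case are hypothetical illustrations, not part of the authors' decomposition.

```python
# A sketch of the example task: route a call depending on the promotion.
# The lookup table and the department number are hypothetical.

def get_promotion(promotion_name: str, promotion_code: int) -> str:
    """Return the promotion identifier ('X' or 'Y'), as looked up in the IS."""
    known_promotions = {("Summer", 17): "X", ("Winter", 23): "Y"}
    return known_promotions.get((promotion_name, promotion_code), "?")

def handle_call(promotion_name: str, promotion_code: int) -> str:
    promotion = get_promotion(promotion_name, promotion_code)
    if promotion == "X":
        # Subtask: check the promotion procedure in the IS, inform the customer.
        return "check procedure in IS and inform customer"
    if promotion == "Y":
        # Subtask: transfer the call to department Z (a two-digit number).
        return "transfer call to department 42"
    # The task description leaves this case open (e.g. a misspelled name);
    # this is exactly the indeterminate situation that turns the task into
    # problem solving and may trigger an interaction breakdown.
    return "clarify promotion details with the customer"

print(handle_call("Summer", 17))
```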

Table 1. Decomposition of Activity during Task Performance

Figure 5. Concept Map of Mental Actions



Preliminary task decomposition builds the ground for the objects of study and depicts desired system requirements. It provides guidelines for the optimisation of the operator's work and is suitable for the design of human-computer interactions, including communication mode, visualisation features, and modularisation of content. Herein, we have not formally demonstrated the stages and their fragmentation in space (motor and mental activity) and time, as the formal methods of SSTA would allow us to do, because this is outside the scope of this paper. Task decomposition reveals the preciseness of the semantic task formulation. Exactitude and preciseness in task description are integral prerequisites not only for analysing or modelling the activity system but also for optimising technical artefacts and the work activity itself.

Subject Domain

After obtaining information from the detailed analysis of the call agent's activity, we have a picture of current ways of operation, from which we can identify desired activities and certain requirements for processes and technical systems. At this point, findings related to cognitive activity should be incorporated into the engineering description of the support system, so we need an engineering method that is capable of extracting events, entities, state transitions, and the like to construct technical systems. Thus, we turn to the description of the subject domain, which is "… the part of the world that the messages received and sent by the system are about" (Wieringa, 2003, p. 16). The subject domain separates identifiable entities and events from non-identifiable ones (e.g., thinking actions, motives) and lets us recognise and organise identifiable responses and non-identifiable stimuli. In line with our previous discussion, we propose classifying messages and events according to the object of activity and the relationships between subjects and mediating tools. Under the umbrella of this notion, and as discussed by von Brevern (2004), we can bring the higher-order human activity systems of Figure 2 and Figure 3 into the context of technical system engineering because of their shared end-goal or result.

It is important to order the desired levels of system functionality. We can do so by integrating functionality into, e.g., a mission statement, function refinement trees, and service descriptions. In the same way that we have decomposed the task, we decompose a technical system by subject-oriented, functional, communication-oriented, and behaviour-oriented decomposition (Wieringa, 2003). This allows us to specify desired system properties and functional requirements that truly originate from the activity system. Thus, technical systems acquire a dispositional stance within an activity system regardless of whether they hold a mediating role or are the actual object of activity.
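As an illustration of the first of these instruments, here is a minimal sketch of a function refinement tree for the support system; the mission statement and service names are hypothetical and serve only to mirror the call-centre context.

```python
# A sketch of a function refinement tree: mission -> functions -> services.
refinement_tree = {
    "mission": "support call agents in answering enquiries efficiently",
    "functions": {
        "provide task-relevant content": [
            "search content by problem description",
            "assemble modular content blocks for the current task",
        ],
        "support learning and teaching": [
            "deliver training modules between calls",
            "record lessons learnt for later reuse",
        ],
    },
}

def services(tree: dict) -> list[str]:
    """Flatten the tree into its leaf-level service descriptions."""
    return [s for leaves in tree["functions"].values() for s in leaves]

print(services(refinement_tree))
```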

Observation and analysis of the operator's activity provide information that may enhance learning and training in multiple ways:

1. Novice training programmes may be focused on the most common and most complex tasks, and training priorities and sequences in arranging training modules may be aligned with current company priorities and the current state.

2. When a company uses blended learning, the learning objectives pursued in face-to-face and technology-supported learning may be better balanced, and stable information and knowledge building, a framework of the activity, may be separated from dynamic information.

3. Training may be further individualised by employing the concept of ZPD (Vygotsky, 1930) to maximise the effect of online training, presenting learning content relevant to currently performed tasks and preparing for more complicated situations.

4. Presentation of material, both for learning purposes and for information processing, is a matter of facilitating each core element of an action, that is, orientation, execution, and control. A number of principles, such as split-attention, spatial contiguity, temporal contiguity, modality, redundancy, and coherence, as delineated by Mayer and Moreno (2003), may be employed to derive rules for content delivery.

5. In online courses focused on education and learning, as opposed to just-in-time training, organisational aspects contain generic and routine frameworks that are of general interest, such as course overview, description, objectives, terms and conditions, etc. (Horton, 2000). These should be aligned with corporate norms, culture, and vision.

These and other aspects may be depicted in the description of the L&T processes that are to be implemented. The key issue at this stage is bridging cognition-grounded analysis with sound engineering synthesis leading to technological implementation. For this purpose, the Educational Modelling Language suggested in the IMS Learning Design specification (http://www.imsglobal.org/learningdesign/index.html) may be a valuable tool, as it is pedagogically neutral and equally appropriate for describing technology-based and traditional learning, both individual and in groups. Moreover, through the concept of a Support Activity, it might be possible to describe information support services and connect their results directly to the L&T.
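To hint at what such a description could look like, the following Python structure schematically mirrors a unit of learning that pairs a learning activity with a support activity; this is our simplified illustration, not the IMS Learning Design XML binding, and all titles and service names are hypothetical.

```python
# A schematic sketch of a unit of learning with a support activity.
unit_of_learning = {
    "roles": ["learner", "staff"],
    "activities": [
        {"type": "learning-activity",
         "title": "Handle promotion enquiries",
         "environment": ["LTS module: promotions"]},
        {"type": "support-activity",
         "title": "Look up current promotion terms",
         "supported-role": "learner",
         "environment": ["IS service: promotion search"]},
    ],
    # The method sequences both activities so that the results of the
    # information support feed directly into the L&T activity.
    "method": {"play": ["learning-activity", "support-activity"]},
}
```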

Technical Considerations

A major requirement for technical support systems derived from the task analysis is the provisioning of relevant information and the avoidance of cognitive overload. This can be achieved by understanding what kind of task an agent is working on and the current situation in his or her problem solving, and by providing content focused on the current needs. To be able to do so, the system should operate with modular content that can be adapted to user needs, assembled from parts, or selected according to the request. Thus, the underlying basis for enabling adaptation of organised and well-structured information is the decomposition of material. Practice has shown that the issue of granularity cannot be generalised but depends on individual and customised needs of use. The two aspects that fall into place here are technical efficiency and, when it comes to L&T, psychological and didactic orientation according to the method of L&T.

Modular content may be re-arranged in a way that addresses particular needs of the user (learner), and specific "modules" – content blocks – may be reused across LTS and IS. However, the bottom line of granularity is determined by our ability to describe and discern similar blocks. For this purpose, metadata are designed that provide a multi-facet description facilitating indexing, management, search, and exchange of the described objects. Based on these descriptions, the content structure and the relations between its components may be revealed, conditions for their use may be set up, and various sequencing rules created.
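A minimal sketch of such a multi-facet description follows, assuming a hypothetical metadata schema; the field names are ours, not those of any particular metadata standard.

```python
# Content blocks carry metadata so that they can be indexed, searched,
# and reused across IS and LTS. Schema and values are hypothetical.
from dataclasses import dataclass, field

@dataclass
class ContentBlock:
    block_id: str
    title: str
    topics: set = field(default_factory=set)      # e.g. {"promotion"}
    audience: str = "call_agent"                  # intended user role
    valid_until: str = "9999-12-31"               # ISO date; supports updates
    requires: list = field(default_factory=list)  # prerequisite block ids

def find_blocks(catalogue, topic, today):
    """Select blocks that match a topic and are still valid."""
    return [b for b in catalogue if topic in b.topics and b.valid_until >= today]

catalogue = [
    ContentBlock("p-001", "Summer promotion terms", {"promotion"},
                 valid_until="2006-08-31"),
    ContentBlock("r-014", "Repair process, model A", {"repair", "model-a"}),
]
print([b.title for b in find_blocks(catalogue, "promotion", today="2006-07-01")])
```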

Moreover, a structured and well-organised presentation of content equally requires us to abstract generic, continuously recurring behaviours and rules. Alexander's patterns allow us to do so (Alexander, 1979). From instructional design principles, norms and standards, generic and routine frameworks, logical dependencies between learning objects, and the like, we can derive generic rules and behaviours. At some point we could arrive at certain frames – abstract structures with didactic guidelines that are filled in with various specific information blocks (designed according to the modularity principle) depending on the learning context.
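A minimal sketch of such a frame, with hypothetical slot names and block identifiers, showing how one abstract didactic structure can be instantiated with different modular blocks:

```python
# A frame is an abstract structure of didactic slots; the blocks that fill
# the slots vary with the learning context. All names are hypothetical.
PROCEDURE_TRAINING_FRAME = ["orientation", "worked_example", "practice", "control_check"]

def fill_frame(frame, blocks_by_slot):
    """Instantiate a frame by selecting one content block per didactic slot."""
    return [(slot, blocks_by_slot[slot]) for slot in frame]

lesson = fill_frame(PROCEDURE_TRAINING_FRAME, {
    "orientation": "r-014 intro: repair process, model A",
    "worked_example": "r-015 video: disassembly steps",
    "practice": "r-016 simulator exercise",
    "control_check": "r-017 quiz",
})
print(lesson)
```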

Concerning the method(s) of L&T, Gal'perin's (1969) work on the stages in the development of mental acts has revealed that learning includes various psychic stages in the process from "understanding" to "mastery". Gal'perin's (1969) reference points advert to those informational parts of an activity that are required for the affirmative regulation of actions and for faultless performance. If the learner is capable of presenting the required reference points in the new material, this will permit affirmative execution of the entire task by moving from one point to another. Hypothetically, reference points could also form a type of hierarchical arrangement of parameters for resolving ill-structured problem descriptions in the processing information and L&T activities. In a broader sense, reference points can be any type of organised and well-structured information, as a whole or in part, that we need in the analysis of an objective task or condition.

There are a number of engineering issues related to static and dynamic content management; however, we would like to stress the importance of taking the human user's viewpoint when designing technical solutions. This step has already been taken – or declared – in LTS. The next might be a holistic view of user activity in complex human-computer systems comprising various functions, modes, and services. Growth of processing speed and storage in technical systems does not automatically make them more convenient for users; thus, support systems design requires closer attention to their usability and usefulness for human users.

Conclusion

Organisations continuously strive to improve work efficiency and effectiveness and to save costs. While corporate information is constantly growing, employees are challenged to manage it, retrieving adequate portions under stress and in a limited timeframe. Unfortunately, present technical systems, regardless of whether they are IS or LTS, are not sufficiently intelligent to support information management. They cannot effectively respond to human cognitive actions, and they return large chunks of information that cause cognitive overload and stress, which ultimately leads to employees' work dissatisfaction. Since conventional system engineering methods cannot cope with these issues – not even when it comes to understanding requirements that originate from cognitive behaviours – we obviously need another methodology that tackles today's dilemma at its very root, i.e., human activity, and then converges with conventional system engineering methods.

The very essence of this paper is that we change the conventional viewpoint that separates and isolates L&T from work activity. In the same manner, we diverge from the common custom of modelling technical systems solely on mere engineering principles. Instead, we subordinate technical systems to the psychological level of human activity. The theoretical methodology that allows us to do so is SSTA, which we have presented herein.

Starting from the ambitious corporate end goal or result that we investigated under the roof of the activity system, we have discovered that the activities of processing information and of L&T are adjacent, which has made us realise that we must regard work activity and L&T activity as one universe. Also, owing to diverse corporate cultures and social contexts, work and L&T activities are flavoured with cultural and social facets. In such rich contexts, it is no surprise that "one-size-fits-all" kinds of technical systems struggle to survive, because they are not adaptive to human cognitive behaviours.

The dynamic nature of human activity requires systemic analysis to differentiate elements related to functionality, interrelationship, and organisation. Although it is outside the scope of this paper to discuss and practically demonstrate the formal methods of SSTA, we have adverted to the foremost step in understanding an activity, i.e., task analysis of the object of study. Task analysis is undoubtedly critical because the task of an activity is the placeholder of contextual requirements. From experience, an informal task analysis has proved difficult when the foci of the object of study are uncertain. Once tasks are known, task decomposition is powerful because it reveals the structure and validity of the (usually semantic) task description, which is relevant for efficiency and effectiveness, the first stages of human algorithmic behaviours, possible interaction breakdowns, temporal aspects, and more. Among numerous other findings, task analysis and decomposition demonstrated the importance of content modularity. So, although SSTA requires high expertise, this approach provides precise and accurate formal methods to analyse, measure, and optimise work activity, even to the degree of depicting motivational components and self-regulation. As such, SSTA is applicable in a wide range of domains. Moreover, this approach can be smoothly converged with engineering methods, as shown in the discussion of the subject domain and other technical issues concerning the modularity of content.

Our next research steps include formal analysis according to the methods of SSTA, e.g., morphological analysis; concise requirements identification abstracted from formal aspects of task analysis and algorithmic behaviours; modelling of the subject domain; and system decomposition. We are also looking for practical examples that may guide the creation of an implementation framework. It is our human processes and the methods of L&T that determine and impose the stringent system behaviours, for we need IS and LTS that are adaptive in response to tasks and derivative human actions.

In conclusion, we would like to transform Bedny and Meister's (1997) original statement into:

"Formalization of procedures of description, their standardization, and the convergence of psychological methods with those developed in engineering and computer science, provide greater opportunities for the use of psychology in L&T technology".

References

Alexander, C. (1979). The Timeless Way of Building, New York: Oxford University Press.

Bedny, G., & Karwowski, W. (2004). Meaning and sense in activity theory and their role in the study of human performance. International Journal of Ergonomics and Human Factors, 26 (2), 121-140.

Bedny, G., & Meister, D. (1997). The Russian Theory of Activity: Current Applications to Design and Learning, Mahwah: Lawrence Erlbaum.

Bedny, G. Z., & Harris, S. R. (2005). The Systemic-Structural Theory of Activity: Applications to the Study of Human Work. Mind, Culture and Activity, 12 (2), 1-19.

Bedny, G. Z., Seglin, M. H., & Meister, D. (2000). Activity theory: history, research and application. Theoretical Issues in Ergonomics Science, 1 (2), 168-206.

Gal'perin, P. Y. (1969). Stages in the Development of Mental Acts. In Cole, M. & Maltzman, I. (Eds.), A Handbook of Contemporary Soviet Psychology (1st Ed.), New York: Basic Books, 249-273.

Gery, G. (1991). Electronic Performance Support Systems: How and Why to Remake the Workplace through the Strategic Application of Technology, Boston: Weingarten Publications.

Giest, H., & Lompscher, J. (2003). Formation of Learning Activity and Theoretical Thinking in Science Teaching. In Kozulin, A., Gindis, B., Ageyev, V. & Miller, S. M. (Eds.), Vygotsky's Educational Theory in Cultural Context, Cambridge: Cambridge University Press, 267-288.

Harris, S. R. (2004). Systemic-Structural Activity Analysis of HCI Video Data. In Bertelsen, O. W., Korpela, M. & Mursu, A. (Eds.), Proceedings of ATIT2004: First International Workshop on Activity Theory Based Practical Methods for IT Design, Copenhagen, Denmark: Aarhus University Press, 48-63.

Horton, W. (2000). Designing Web-Based Training, New York: John Wiley.

Leont'ev, A. N. (1978). Activity, Consciousness, and Personality, Englewood Cliffs: Prentice Hall.

Mayer, R. E., & Moreno, R. (2003). Nine Ways to Reduce Cognitive Load in Multimedia Learning. Educational Psychologist, 38 (1), 43-52.

Savery, J. R., & Duffy, T. M. (1995). Problem Based Learning: An instructional model and its constructivist framework. Educational Technology, 35 (5), 31-38.

von Brevern, H. (2004). Cognitive and Logical Rationales for e-Learning Objects. Journal of Educational Technology & Society, 7 (4), 2-25.

von Brevern, H., & Synytsya, K. (2005). Systemic-Structural Theory of Activity: A Model for Holistic Learning Technology Systems. In Goodyear, P., Sampson, D. G., Yang, D. J.-T., Kinshuk, Okamoto, T., Hartley, R. & Chen, N.-S. (Eds.), Proceedings of the 5th IEEE International Conference on Advanced Learning Technologies (ICALT 2005), Los Alamitos, CA: IEEE Computer Society Press, 745-749.

Vygotsky, L. S. (1930). Mind in Society (Blunden, A. & Schmolze, N., Trans.), Harvard University Press.

Wieringa, R. J. (2003). Design Methods for Reactive Systems: Yourdon, Statemate, and the UML, San Francisco: Morgan Kaufmann.



Cutshall, R., Changchit, C., & Elwood, S. (2006). Campus Laptops: What Logistical and Technological Factors are Perceived Critical? Educational Technology & Society, 9 (3), 112-121.

Campus Laptops: What Logistical and Technological Factors are Perceived Critical?

Robert Cutshall
College of Business, Texas A&M University – Corpus Christi, TX 78412 USA
Tel: +1 361-825-2665
Fax: +1 361-825-5609
rcutshall@cob.tamucc.edu

Chuleeporn Changchit
College of Business, Texas A&M University – Corpus Christi, TX 78412 USA
Tel: +1 361-825-5832
Fax: +1 361-825-5609
cchangchit@cob.tamucc.edu

Susan Elwood
College of Education, Texas A&M University – Corpus Christi, TX 78412 USA
Tel: +1 361-825-2407
Fax: +1 361-825-6076
elwoods@falcon.tamucc.edu

ABSTRACT

This study examined university students' perceptions about a required laptop program. In higher education, providing experience with computer tools tends to be one of the prerequisites to professional success because employers value extensive experience with information technology. Several universities are initiating laptop programs in which all students are required to purchase laptop computers. The success of these laptop programs relies heavily on the extent to which the laptop environment is accepted and wholeheartedly implemented by students and faculty. Defining the logistical and technological factors necessary to successfully implement a laptop program thus becomes a critical issue for the success of the program. By understanding which logistical and technological factors are critical to students, such a program can be made more useful to students as well as more beneficial to universities.

Keywords

Portable computing initiative, Technology in higher education, Critical success factors, Laptop initiative

Introduction

Students' use of laptop computers is becoming more prevalent in today's universities. Previous research has shown that laptop computers in the classroom can lead to positive educational outcomes (Finn & Inman, 2004; Gottfried & McFeely, 1998; Varvel & Thurston, 2002). This more ubiquitous use of technology has caused several universities to uncover and manage new perceptual issues in addition to some of the more familiar issues from the era of computers found only in university labs. University students' views are paramount to the newer perceptual issues regarding laptop initiatives.

Weiser (1998) envisioned a third wave of ubiquitous computing where computer use is fully and seamlessly integrated into student life. Studies conducted by Finn and Inman (2004) and Lim (1999) reveal that the majority of university laptop program alumni agree that portable computers were beneficial during their college careers. They also agree that it is important to continue the laptop program for future students.

Positive student-body responses encourage universities to continue their efforts in laptop program implementation. Such efforts to integrate technology, however, take much support and careful planning for effective implementation. Vital to implementing a successful laptop initiative are examining successful initiatives and uncovering students' perceptions of the critical success factors for a laptop initiative.

The Basis of Successful Initiatives

Prior studies indicate that the success of computer use is largely dependent upon the attitudes of both instructors and students (Al-Khaldi & Al-Jabri, 1998; Liaw, 2002). A study reported that 77 percent of the variance of intent to use information technology was attributed to users' attitudes toward computers (Hebert & Benbasat, 1994; Liaw, 2002). No matter how capable the technology, its effective implementation depends upon users having positive attitudes towards the technology (Liaw, 2002).

Experience with technology influences users' attitudes toward computer technology (Rucker & Reynolds, 2002). Those who have had positive experiences with technology tend to develop positive attitudes towards it. Prior attitudes towards educational technology use, or the lack thereof, could impact present and future perceptions of technology (Steel & Hudson, 2001).

An exemplary university that consistently demonstrates students' positive attitudes towards a laptop initiative is the University of Minnesota – Crookston (UMC). UMC has the oldest successfully established university-wide laptop program (Lim, 1999). Annual evaluative studies have shown that UMC students were highly satisfied with the laptop initiative and concluded that the high percentage of satisfaction with computer training is likely to contribute toward a high usage of computers for learning.

Uncovering Students' Perceived Critical Success Factors

The campus-wide laptop initiative literature can be divided into two strands. The first is devoted to guidance and encouragement, ranging from campus-wide implementation to how curriculum and courses could be transformed (Brown, Burg, & Dominick, 1998; Brown, 2003; Brown & Petitto, 2003; Candiotti & Clarke, 1998; Kiaer, Mutchler, & Froyd, 1998; Spodark, 2003). The second is composed of systematic studies documenting the effects of mandatory computing programs on student attitudes and learning as well as on modes of teaching (Hall & Elliott, 2003; Kuo, 2004).

Several prior studies focus on student experience with technology (Demb, Erickson & Hawkins-Wilding, 2004; Finn & Inman, 2004; Kuo, 2004; Mitra, 1998; Mitra & Steffensmeier, 2000; Platt & Bairnsfather, 2000; Varvel & Thurston, 2002). For instance, one study pointed out that convenience, hardware/configuration choices, and cost were major issues for laptop initiatives (Demb et al., 2004). Demb et al. (2004) observed that a majority of students noted that the cost of a laptop computer was a very important factor. Nevertheless, a vast majority of students appreciated the convenience of the laptop, especially in terms of portability and size. Although a majority of students liked their laptops and used them heavily, their inability to make choices was very frustrating to them.

Despite several studies on laptop initiatives, no prior study has directly examined students' perceptions of the critical success factors necessary for the implementation of a laptop initiative. Obtaining a perceptual overview of the student body regarding these critical success factors would support actions towards positive attitudinal development during the implementation of a laptop program.

Methodology

A direct survey was used to collect the data for this study. The survey questions were based on logistical and technological factors identified in previous studies by Demb et al. (2004), Luarn and Lin (2005), and Moore and Benbasat (1991). Additional questions recommended by researchers and students were added to the survey. These questions were designed to gather data on students' perceptions of the logistical and technological factors necessary to implement a laptop initiative, as well as their demographics. It should be noted that there are additional dimensions that influence the success of a laptop program. While pedagogical, social, and other dimensions are important to a laptop program's success, the focus of this study is on the logistical and technological dimensions. To check the clarity of the questions, three professors and three students were asked to read through the survey questions and provide feedback. Revisions to the survey were made based on the feedback received.

The survey consisted of 55 items. Fifty-four of the survey items were five-point Likert-scaled questions with end points ranging from "strongly disagree" to "strongly agree." Survey items Q1 to Q28 collected demographic data such as age, gender, computer ownership, experience using computers, etc. Survey items Q29 to Q53 measured students' perceptions of the logistical and technological factors necessary to implement a laptop initiative. Survey item Q54 measured students' willingness to support a laptop initiative. Survey item Q55 was an open-ended question, included to allow the respondents to list other factors they deemed important to the successful implementation of a campus-wide laptop program.
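For readers scoring the instrument, the item grouping described above can be captured as follows; the grouping follows the paper, while the variable names are our own illustration.

```python
# Survey layout: 55 items, grouped as described in the text.
SURVEY_SECTIONS = {
    "demographics": range(1, 29),                # Q1-Q28
    "logistical_technological": range(29, 54),   # Q29-Q53, 5-point Likert
    "support_for_initiative": range(54, 55),     # Q54
    "open_ended": range(55, 56),                 # Q55
}

def section_of(item: int) -> str:
    """Return the section a given item number belongs to."""
    for name, items in SURVEY_SECTIONS.items():
        if item in items:
            return name
    raise ValueError(f"unknown item Q{item}")

assert section_of(30) == "logistical_technological"
```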



Data Collection

Surveys were distributed to 272 undergraduate and graduate students enrolled at a mid-sized four-year university. The participants were given the 55-item survey and allowed class time to complete it. All participants were informed that participation in the study was voluntary and that all individual responses would be kept anonymous. The students were asked to rate each of the survey items on a Likert scale from 1 to 5, with 1 being "strongly disagree" and 5 being "strongly agree." Two hundred and sixty-nine usable surveys were returned. The average age of the respondents was 26.75 years. Approximately 18.1 percent of the respondents agreed or strongly agreed with requiring all students to purchase a laptop computer for use in their education. Approximately 55.9 percent of the respondents disagreed or strongly disagreed with a laptop computer initiative. The remaining 26 percent of the respondents were neutral on a laptop computer initiative. Table 1 summarizes additional demographic characteristics of the respondents.

Table 1: Demographic Characteristics

Age (in years): Under 18: 2 (0.7%); 18-21: 105 (39%); 22-25: 83 (30.9%); 26-29: 29 (10.8%); 30-33: 14 (5.2%); 34-37: 12 (4.4%); Over 37: 24 (8.9%)
Gender: Female: 166 (61.7%); Male: 102 (37.9%)
Ethnicity: African American: 17 (6.3%); Asian: 15 (5.6%); Caucasian: 133 (49.4%); Hispanic: 98 (36.4%); Native American: 5 (1.9%)
First Generation College Student: Yes: 127 (47.2%); No: 139 (51.7%)
Own a Computer: Desktop: 227 (84.4%); Laptop: 109 (40.5%)

Analysis and Discussion

All of the 272 surveys distributed were returned. Virtually all of the questions were answered on 269 of the surveys; the remaining three surveys were returned incomplete. The research data showed an odd-even reliability score of 0.901, suggesting internal consistency of the data. In addition, a Cronbach's alpha score of 0.921 was calculated as a second measure of reliability. It should be noted that these high levels of reliability relate to the data resulting from the measurement, not to the instrument itself.
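For illustration, the two reliability measures reported above can be computed as follows on a response matrix (rows are respondents, columns are Likert items); the data here are randomly generated placeholders, not the study's data.

```python
import numpy as np

def cronbach_alpha(x: np.ndarray) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / total variance)."""
    k = x.shape[1]
    item_variances = x.var(axis=0, ddof=1).sum()
    total_variance = x.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances / total_variance)

def odd_even_reliability(x: np.ndarray) -> float:
    """Split-half reliability over odd vs. even items, Spearman-Brown corrected."""
    odd_half = x[:, 0::2].sum(axis=1)    # items 1, 3, 5, ...
    even_half = x[:, 1::2].sum(axis=1)   # items 2, 4, 6, ...
    r = np.corrcoef(odd_half, even_half)[0, 1]
    return 2 * r / (1 + r)

rng = np.random.default_rng(0)
responses = rng.integers(1, 6, size=(269, 25))   # placeholder 5-point data
print(cronbach_alpha(responses), odd_even_reliability(responses))
```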

Logistical and Technological Factors Perceived as Critical

To isolate which logistical and technological factors were deemed critical to the successful implementation of a laptop computer initiative, the mean responses to each question were calculated and examined. The survey items were on a Likert scale ranging from 1 to 5. Figure 1 below shows the five factors rated as most critical, with means ranging from 4.37 to 4.45.
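The ranking step itself is straightforward; here is a minimal sketch, again on placeholder data, assuming that items Q29-Q53 hold the factor ratings.

```python
import numpy as np

# Placeholder ratings for the 25 factor items (Q29-Q53) from 269 respondents.
ratings = np.random.default_rng(1).integers(1, 6, size=(269, 25))
item_means = ratings.mean(axis=0)

# Pair each mean with its questionnaire number and sort in descending order.
ranked = sorted(zip(range(29, 54), item_means), key=lambda kv: kv[1], reverse=True)
print(ranked[:5])   # the five factors perceived as most critical
```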

The students believe that the university must provide a wireless network for them to access information stored at various points on campus and to access the Internet. Wireless network access is viewed as the most important factor in the success of a laptop initiative. This factor had a mean score of 4.45 out of 5, and 87 percent of the students agreed or strongly agreed with it. This finding is no surprise, since one of the main benefits of laptop computers in education is the 'learning anywhere, anytime' ability. In addition, this finding follows the wireless networking trend that currently exists in the industry.

Having sufficient power is also seen as a critical factor in the success of a laptop initiative. The mean student rating on this survey item was 4.43 out of 5. The majority of the students, 87.4 percent, rated the availability of power outlets in class as a necessity. Laptop batteries are easily drained when the laptop is in constant use, so this is definitely an issue that needs to be addressed for a successful laptop initiative.



Figure 1. The logistical and technological factors perceived as critical: provide a wireless network (mean 4.45), provide sufficient power outlets in the class (4.43), provide students with access to printers (4.42), provide onsite maintenance (4.39), and provide a breakdown of all associated costs (4.37)

In addition, students believe that access to printers is a critical factor in the success of a laptop initiative. This factor had a mean score of 4.42 out of 5, with 86.2 percent agreeing or strongly agreeing. This finding is consistent with observed student behavior in physical computer labs: many students complete their assignments on computers off-campus and bring their work to the computer lab to print hard copies.

Students also believe that it is critical for the University to provide onsite maintenance for the laptop computers.<br />

This is evidenced by the high mean score of 4.39 out of 5. The majority of the students, 85.9 percent, agreed or<br />

strongly agreed with the necessity of onsite maintenance. If the students are expected to use their laptops as a<br />

learning tool, they must be able to have their laptop quickly serviced when needed.<br />

Another factor rated as critical by students was the issue of providing them with a breakdown of all associated<br />

costs of owning a laptop computer. With the increasing cost of tuition and books, money is almost always an<br />

issue with students. Approximately 85.5 percent of the students agreed or strongly agreed with this item. The<br />

mean response for this issue was 4.37 out of 5.<br />

Logistical and Technological Factors Perceived as Not so Critical

To determine which logistical and technological factors were deemed not so critical to the successful implementation of a laptop program, the mean responses to each question were calculated and examined. Figure 2 below shows the five factors rated as less important, with means ranging from 2.92 to 3.58.

While students may see the benefits of using laptop computers in their education, they do not believe that purchase should be a requirement. This factor had a mean score of 2.92 out of 5, with only 36.4 percent agreeing or strongly agreeing. This result is consistent with the cost breakdown that students deemed a critical success factor.

Students also believe that it is not critical to exchange to a new laptop after two years. The mean score of this issue was 3.00. While only 33.1 percent agreed or strongly agreed with exchanging to a new laptop after two years, 46.9 percent agreed or strongly agreed with not requiring a two-year exchange. This finding is consistent with the question asking about the need to instantly upgrade hardware: about 75 percent do not perceive instantly upgrading the laptop's hardware as critical.

In addition, students believe that it is not critical to require the purchase of a backup battery. This factor had a mean score of 3.15 out of 5, with only 42.8 percent agreeing or strongly agreeing. This finding is consistent with the observed critical factor of providing sufficient power outlets in class; with sufficient outlets available, a backup battery becomes a non-critical issue.



Figure 2. The factors perceived as not so critical
[Bar chart of response means by factor: provide physical storage space (3.58), instantly upgrade hardware (3.51), require all students to purchase a backup battery (3.15), require all students to exchange to a new laptop after two years (3.00), require all students to purchase a laptop (2.92).]

The factor of upgrading the hard drive of the laptop was also a less critical issue. This factor had a mean score of 3.51, with only 50.5 percent agreeing or strongly agreeing. This is consistent with the findings regarding a required two-year hardware update.

Figure 3. Differences between support and reject groups
[Bar chart comparing support-group and reject-group response means on the factors showing significant differences: exchange after two years, backup battery, all students must purchase a laptop, all faculty to use the laptop, instantly upgrade software, demonstrate laptop benefits, demonstrate how to use the laptop, basic training, standardized software, and maintenance support.]

Students also perceived the provision of physical storage space for the laptop computer, when not in use, as a less critical issue. This factor had a mean score of 3.58, with 56.1 percent agreeing or strongly agreeing. This finding is consistent with the portability of the laptop computer: laptop and notebook computers are small and light, which makes them easy to carry around when not in use. Hence, physical storage space on campus is not a necessity.



Support Group vs. Reject Group

It is also interesting to examine whether students who favor the laptop program and those who are against it differ in their perceptions of each factor. The responses were therefore divided into two groups: those who favored the laptop initiative (support group) and those who did not support the initiative (reject group). The students who were uncertain about the laptop initiative were excluded. T-tests were then conducted to determine whether there were significant differences between these two groups. Figure 3 above shows the logistical and technological factors exhibiting a significant difference between the two groups at a p-value < 0.01.
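For readers who want to reproduce this kind of comparison, the sketch below runs an independent two-sample t-test per survey factor; the matrix layout, function name, and factor list are assumptions for illustration, not the authors' code.

import numpy as np
from scipy import stats

def significant_factors(support, reject, factor_names, alpha=0.01):
    """Run an independent two-sample t-test per factor (columns of the two
    respondents-by-factors rating matrices); return factors with p < alpha."""
    flagged = []
    for j, name in enumerate(factor_names):
        t_stat, p_value = stats.ttest_ind(support[:, j], reject[:, j])
        if p_value < alpha:
            flagged.append((name, t_stat, p_value))
    return flagged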

There was a statistically significant difference between the two groups on ten of the factors (see Figure 3): (1) exchange the laptop for a new one after two years, (2) purchase a backup battery, (3) require all students to purchase a laptop, (4) encourage all faculty to integrate the laptop in their teaching, (5) instantly upgrade software installed on the laptop, (6) demonstrate to students how to use the laptop, (7) demonstrate the laptop's benefits to the students, (8) provide basic training on laptop operation, (9) provide all students with standardized software, and (10) provide onsite maintenance support.

The fact that the support group rated all of these factors higher than the reject group suggests that supporters tend to pay more attention to the details of program implementation. These findings suggest that the university may want to consider these logistical and technological factors before implementing the laptop program.

Conclusion

Due to the ever-increasing use of technology in primary and secondary education, and the increasing demand by industry for more computer-savvy graduates, the use of technology in higher education will continue to grow. Although many higher education institutions view the use of laptop computers as advantageous, in order to smooth the transition, the logistical and technological factors critical to a successful laptop program must be identified and addressed before implementing the requirement.

This study identifies the logistical and technological factors that are important to students when requiring them to purchase a laptop computer. The results indicated that students, both those who support and those who do not support laptop initiatives, place a critical level of importance on the following factors: 1) having a wireless network in place, 2) having sufficient power outlets available in class, 3) having access to printers, 4) having onsite maintenance, and 5) having a breakdown of all associated costs. In addition, the two groups perceived the following five factors as not so critical: 1) requiring all students to purchase a laptop, 2) requiring all students to exchange to a new laptop after two years, 3) requiring all students to purchase a backup battery, 4) instantly upgrading the hardware, and 5) providing physical storage space/locker for students to store the laptop when not in use.

The results also revealed that ten logistical and technological factors were perceived differently by the groups who support and do not support the laptop program. These ten factors were: (1) exchange the laptop for a new one after two years, (2) purchase a backup battery, (3) require all students to purchase a laptop, (4) encourage all faculty to integrate the laptop in their teaching, (5) instantly upgrade software installed on the laptop, (6) demonstrate to students how to use the laptop, (7) demonstrate the laptop's benefits to the students, (8) provide basic training on laptop operation, (9) provide all students with standardized software, and (10) provide onsite maintenance support. These findings suggest that the university may want to carefully consider these factors before implementing the laptop program.

This study identified the logistical and technological factors perceived as critical by students regarding a laptop program. Determining such factors may give educational institutions a base level of awareness of students' perceptions, which could provide insights into what needs to be done towards an effective laptop program.

These initial findings provide avenues for future research. To achieve a better understanding of all factors important to the laptop program, future research should include the perceptions of faculty, administrators, and staff as well as those of students.





Appendix 1: Critical Success Factor Survey Instrument

Please answer the following questions.

CSF LAPTOP INITIATIVE QUESTIONNAIRE

Do you own a desktop computer? 1. yes 2. no
Do you own a laptop computer? 1. yes 2. no
How would you rate yourself with respect to your knowledge about computers? 1 (very poor) 2 3 4 5 6 7 (excellent)
How long have you been using the Internet? 1. never 2. less than 1 year 3. 1-2 years 4. more than 2 years
How often do you use the Internet per week? 1. never 2. one time 3. 2-3 times 4. more than 3 times
How often did you use the desktop computer in high school per week? 1. never 2. one time 3. 2-3 times 4. more than 3 times
How often did you use the laptop computer in high school per week? 1. never 2. one time 3. 2-3 times 4. more than 3 times
How often do you use the desktop computer for classroom assignments per week? 1. never 2. one time 3. 2-3 times 4. more than 3 times
How often do you use the laptop computer for classroom assignments per week? 1. never 2. one time 3. 2-3 times 4. more than 3 times
How often do you use the desktop computer for leisure per week? 1. never 2. one time 3. 2-3 times 4. more than 3 times
How often do you use the laptop computer for leisure per week? 1. never 2. one time 3. 2-3 times 4. more than 3 times
Have you worked before? 1. yes 2. no
Did your previous work require the use of a desktop computer? 1. yes 2. no
Did your previous work require the use of a laptop computer? 1. yes 2. no
What is your current employment status? 1. full-time 2. part-time 3. unemployed
Does your current work require the use of a desktop computer? 1. yes 2. no
Does your current work require the use of a laptop computer? 1. yes 2. no
What is your expected cost per semester for the purchase of a laptop computer (including a service fee for onsite maintenance support)? 1. less than $100 2. $100-$150 3. $151-$200 4. $201-$250 5. $251-$300 6. $301-$350 7. $351-$400 8. $401 and up
I prefer to pay ………..
1. …. $350 per semester (including onsite maintenance support) and graduate with a TWO year old laptop computer.
2. …. $200 per semester (including onsite maintenance support) and graduate with a FOUR year old laptop computer.
Your status: 1. student 2. faculty 3. staff



What is your classification? 1. Freshman 2. Sophomore 3. Junior 4. Senior 5. Graduate
Your college: 1. Arts & Humanities 2. Business 3. Education 4. Nursing 5. Science & Technology
Are you a first generation college student? 1. yes 2. no
Your gender: 1. male 2. female
Your age: 1. under 18 2. 18-21 3. 22-25 4. 26-29 5. 30-33 6. 34-37 7. 38-41 8. 42-45 9. 46-49 10. over 49
Your ethnicity: 1. African American 2. Anglo 3. Asian 4. Hispanic 5. Native American
How do you support your education? (Mark all that apply) 1. self 2. parents 3. spouse 4. children 5. scholarship 6. financial aid 7. loan 8. employer 9. other
Your annual income: 1. under $20,000 2. $20,000-$39,999 3. $40,000-$59,999 4. $60,000-$79,999 5. $80,000 and over

How important do you think each of the following is?
1 --- not important 2 --- somewhat not important 3 --- neutral 4 --- somewhat important 5 --- important

For the laptop initiative to be successful, the University must….
…. provide a wireless network 1 2 3 4 5
…. provide onsite maintenance support 1 2 3 4 5
…. provide a loaner computer while the laptop is in for service 1 2 3 4 5
…. provide sufficient power outlets in the class 1 2 3 4 5
…. provide sufficient power outlets outside the class 1 2 3 4 5
…. provide basic training to all students after they purchase the laptop 1 2 3 4 5
…. provide all students with a university e-mail account 1 2 3 4 5
…. provide a standardized package of software to all students 1 2 3 4 5
…. provide a help desk to answer basic laptop operating questions 1 2 3 4 5
…. provide a breakdown of all associated costs of owning a laptop 1 2 3 4 5
…. provide students a laptop lease option 1 2 3 4 5
…. provide students with network storage space 1 2 3 4 5
…. provide students with access to printers 1 2 3 4 5
…. provide updates for virus protection 1 2 3 4 5
…. provide physical storage space/locker for students to store the laptop when not in use 1 2 3 4 5
…. require all students to exchange to a new laptop after two years 1 2 3 4 5
…. not require all students to exchange to a new laptop after two years 1 2 3 4 5
…. require all students to purchase a backup battery 1 2 3 4 5
…. require all students to purchase a laptop 1 2 3 4 5
…. demonstrate the benefits of using the laptop before requiring them to purchase it 1 2 3 4 5
…. demonstrate how to fully utilize the laptop before requiring them to purchase it 1 2 3 4 5
…. encourage all professors to fully utilize the laptop in the class 1 2 3 4 5
…. instantly upgrade the hard drive of the laptop 1 2 3 4 5
…. instantly upgrade the software installed in the laptop 1 2 3 4 5
…. enable students to transfer work on their laptops to the campus computers 1 2 3 4 5

It is a good idea to require all students to purchase a laptop computer. 1 2 3 4 5

List any other items that you feel are necessary to successfully implement a campus-wide laptop computer program.



Cagiltay, N. E., Yildirim, S., & Aksu, M. (2006). Students' Preferences on Web-Based Instruction: linear or non-linear. Educational Technology & Society, 9 (3), 122-136.

Students' Preferences on Web-Based Instruction: linear or non-linear

Nergiz Ercil Cagiltay
Computer Engineering Department, Atilim University, Ankara, Turkey
Tel: +90 312 586 83 59
Fax: +90 312 586 80 90
nergiz@atilim.edu.tr

Soner Yildirim
Department of Computer Education and Instructional Tech, Middle East Technical University, Ankara, Turkey
Tel: +90 312 210 40 57
Fax: +90 312 210 11 05
soner@metu.edu.tr

Meral Aksu
Department of Educational Sciences, Middle East Technical University, Ankara, Turkey
Tel: +90 312 210 40 05
Fax: +90 312 210 11 05
aksume@metu.edu.tr

ABSTRACT

This paper reports the findings of a study conducted on a foreign language course at a large mid-west university in the USA. In the study, a web-based tool which supports both linear and non-linear learning environments was designed and developed for this course. The aim of this study was to find out students' preferences pertaining to the learning environment and to address the factors affecting their preferences. The results of this study showed that the individual characteristics of the students affected their preferences on the learning path (linear or non-linear).

Keywords

Linear instruction, Non-linear instruction, Web-based course, Individual differences

Introduction

In traditional classroom settings, instructors present information using a linear model. For example, a video may be shown from beginning to end, or a textbook is covered from one chapter to the next. Generally, most early applications of modern technology were also based on structured, linear instruction through an electronic platform and were mainly based on the delivery of course material (Roblyer & Edwards, 2000; Dalgarno, 2001; Simonson & Thompson, 1997). On the other hand, according to Howell, Williams and Lindsay (2003), instruction is becoming more personalized: learner-centered, non-linear and self-directed. Social-constructivist pedagogical approaches have introduced different (active, learner-centered and community-centered) models and pose strong arguments against the structured knowledge consumption approach (Koper & Oliver, 2004).

As Silva stated (1999, p. 1), "The use of technology is important for Second Language courses, more important for Foreign Language courses and even more important in the curricula of the less-commonly taught foreign languages …". Several studies have found that computers have positive effects on language teaching. Stepp (2002) summarized some of the positive benefits of technology for foreign language learners. Research on Computer-Assisted Instruction (CAI) showed a significant increase in students' scores in reading comprehension, vocabulary and spelling (Stone, 1996; Kulik, 1994; SIIA, 2000) in classrooms where computers are used. Students using computer software designed for developing spelling had significantly higher scores than the others (Stone, 1996; Anderson-Inman, 1990 cited in SIIA, 2000). Researchers have found that when students use word processors, they show a higher level of writing skill (SIIA, 2000). Hirata (2004) also showed that native English speakers who used a Japanese pronunciation training tool improved their overall test scores significantly. However, other studies found no significant differences between classrooms using computer applications and those using traditional lecture-based courses (Wilson, 1996 cited in Gilbert & Han, 1999; Goldberg, 1997; Hokanson & Hooper, 2000). For example, Cuban, Kirkpatrick, and Peck's study (2001) shows that investments in infrastructure and increased access to technology did not lead to increased integration; instead, most teachers remained "occasional" or "non-users" of classroom technology (p. 813). They state that limited time to learn and implement new technology was considered a serious barrier, as were poorly implemented professional development and defects in the technology itself. In parallel to this finding, Hokanson and Hooper (2000) pointed out that the expanded use of computers in education continues despite research having failed to demonstrate definite benefits in learners' performance. According to Gilbert and Han (1999), the main reason for finding no significant difference between the traditional education system and systems using technology lies in the instructional methods. Barker, Giller, Richards, Banerji, and Emery (1993) reported that early implementations of computer-assisted language learning (CALL) also had several limitations: they were mainly built on text-based instruction with very limited end-user interaction and participation.

For some researchers, presenting information in a linear form is not a problem when the information being presented is well structured and simple. Often, however, as the difficulty of the material increases, so does the lack of structure. When the knowledge domain to be taught is complex and ill structured, the use of traditional linear instruction becomes ineffective (Spiro, Feltovich, Jacobson, & Coulson, 1991). In that context Barker (1997, p. 5) states that:

… for one reason or another, many academic organizations are now therefore exploring the possibilities of using these new technologies to support student-managed, self-study activities in a more extensive way than they have in the past.

However, research findings on learner control have proven contradictory. Some findings show that learner-controlled environments lead to higher performance, whereas others show no significant difference, or even that instructor- or program-controlled systems work better. These findings illustrate that although learners' control over the learning environment is important to improve the learning process, there must be factors affecting their learning and preferences in such an environment.

The existing studies outline the trends in the evolution of CALL and its development from the perspective of pedagogy and language learning (Warschauer & Kern, 2000); however, more research into CALL is needed (Chambers, 2001; Davies, 2001; Levy, 2001). CALL assists language learning and is intended to enhance the way in which a language is taught and learned (Decoo, 2003). Decoo (2003) summarizes some of the levels of language teaching methods, such as the label method, program method, textbook method, teacher method and student method, and concludes that CALL is used to strengthen and improve these existing methods. As Chan and Kim (2004) report, there is a shift in second language curricula from declarative knowledge, or "what we know about", to procedural knowledge, or "what we know how to do". This places a greater emphasis on learners' learning process (Chan & Kim, 2004). They believe that appropriate use of suitably designed Internet-based materials can make a significant contribution towards facilitating autonomous learning, the ability to take charge of one's own learning (Chan & Kim, 2004). Accordingly, developers now try to design interfaces that give learners more autonomy (Lonfis & Vanparys, 2001). Chan and Kim (2004) also claim that any foreign language curriculum that aims to promote autonomy must focus on putting learners in control of their linguistic and learning process. However, there is very limited research examining students' preferences and performance in such learner-controlled learning environments.

In this study, a web-based tool that supports both linear and non-linear learning paths was designed and developed for an entry-level foreign language course. The aim of this study is to find out students' preferences regarding learner-controlled environments and to address the factors affecting these preferences. The next section examines the factors affecting students' preferences regarding learner-controlled learning environments; different instructional approaches, such as direct instruction and indirect instruction, are also discussed there. In the third section, the research method is described. The fourth section reports the results of this study, while the final section presents the conclusions and discussion.

Background

This section investigates the factors affecting students' preferences in a learner-controlled environment and gives a brief explanation of direct and indirect teaching.

Factors affecting students' preferences in a learner-controlled learning environment

Individual Differences

One prominent theory of individual differences is Howard Gardner's theory of Multiple Intelligences. Gardner suggests that all people have varying degrees of innate talents developed from a mixture of biological factors, evolution and culture (Gardner, 1983). Each intelligence represents an area of expertise with a specific body of knowledge, as well as a way of approaching learning in any domain. Students may experience new ways of expression, helping them, individually, to understand multiple perspectives. In parallel to Gardner's theory (1983), studies on learning and information processing suggest that individuals perceive and process information differently (Hitendra, 1998). According to Gilbert and Han (1999), how much individuals learn is related to how well the educational experience is geared toward their particular style of learning. McManus (1997, p. 1) argues that:

One of the great promises of computer based instruction is the idea that someday the instruction could be adapted to meet the specific needs and styles of individual learners, thereby enhancing their learning. In order for this to happen, educators need to know which instructional and presentation strategies, or combination thereof, is most effective for individuals with certain learning styles and differences, in a given learning environment.

Hitendra (1998) found that certain cognitive styles might suit certain types of test tasks. Cho (1995) also found that individual learning styles and preferences are presumed to affect the moment-to-moment selection of options in non-linear learning environments. For example, several studies showed that a field-independent (FI) learning style seems to facilitate understanding of the structure intended by the designer of the instruction more than a field-dependent (FD) style (SIIA, 2000). Kelley and Stack (2000) found that learners having an external locus of control usually tend to perceive reinforcements as coming from other people, whereas people with an internal Locus of Control (LOC) seek more control over life's circumstances and like to have more personal responsibility for outcomes.

Age

Additionally, there is some evidence to suggest that the age of the learner may be an important variable in learner control. Shin, Schallert and Savenye (1994) indicated that students (seven or eight years of age) who were given only limited access through a hierarchical hypertext structure answered more questions correctly in the post-test than students given free access to a network hypertext structure. Hannafin (1984) concluded from a review of the relevant research that learner control, compared with program control, is likely to be most successful when learners are older.

Gender

Research studies also show that gender affects learner control. For example, Kinzie et al. (1992) found that the use of a program-controlled environment resulted in better post-test performance for male students. Similarly, Braswell and Brown (1992) showed that in interactive video learning environments, females performed better than males.

Student's prior knowledge and familiarity with the material

Research studies also show that a student's prior knowledge of and familiarity with the material and subject domain to be learned affect students' success in a learner-controlled instructional environment. For example, students whose prior understanding of a topic is low should be provided with more structured information, whereas students whose prior understanding of a topic is high can be given more control over the instructional system (Gay, 1986). According to Charney (1987), students who are new to the subject domain and to non-linear learning systems may sequence the information poorly or omit important information altogether. Cho (1995) reports that learners make poor decisions about what to study and in what order (i.e., selecting links) because they have insufficient knowledge about the new content.

In summary, students vary in their cognitive or learning styles and would benefit from teaching techniques that appeal to their individual styles (Brown & Liedholm, 2004). By providing several different instructional methods, the use of technology in education can significantly improve educational performance (Gilbert & Han, 1999), and the web offers a rich environment for this purpose. Accordingly, as Internet technology has improved, Gardner's theory has gained more popularity. The web offers a variety of instructional materials that could be incorporated into effective learning environments.



Direct Teaching (Teacher Instruction)

Direct teaching, or direct instruction, is a systematic way of planning, communicating, and delivering instruction in the classroom. This method provides students with a strong structure that helps them concentrate on their academic tasks. The direct instruction approach assumes that all students learn at the same speed and in the same way; the role of the learners is to stay on task and perform. In this context, program-controlled learning environments are considered "direct instruction".

Indirect Teaching (Indirect Instruction)

Indirect instruction is mainly student-centered and seeks a high level of student involvement. It takes advantage of students' interest and curiosity, and it is flexible in that it frees students to explore diverse possibilities. Students often achieve a better understanding of the material and ideas under study and develop the ability to draw on these understandings. The current interoperability specifications have to be extended to include the multi-role interactions and the various pedagogical models that are needed to provide real support for learners and teachers in more advanced and newly developing educational practices. In our study, the learner-controlled environment of the tool is considered "indirect instruction".

Several research efforts have shown that computer programs offer students greater control over their learning environments and have beneficial effects on students (Shoener & Turgeo, 2001; Wooyong & Robert, 2000; Hargis, 2000; Kemp, Morrison & Ross, 1994; Hannafin & Sullivan, 1995; Shyu & Brown, 1992; Santiago & Okey, 1992; Knowles, 1975). According to Schank (1993), students, not computers, should control the educational process. To increase the learner's control over the learning environment, organizing the instruction in a non-linear manner becomes important. In such an environment, students have a higher degree of freedom regarding the method of study and the study material.

However, there are studies showing that some students do not succeed in learner-controlled learning environments. For some, it is hard to make decisions about what to study and in what order, and they may have trouble monitoring their own learning (Cho, 1995). In support of this, the findings of McNeil's study (1991) show that a learner-controlled program is less effective than a computer-controlled program in CAI at the elementary level. According to McCombs (1988, cited in Sleight, 1997), since students do not know how to use strategies in a non-linear learning environment, they have adjustment problems. Chang (2003) found that teacher-directed CAI was more effective in improving students' achievement than student-controlled CAI, given the same learning content and overall learning time. Results from that study also revealed that students in the teacher-directed CAI group showed significantly more positive attitudes toward the subject matter than did students in the student-controlled CAI group (Chang, 2003). According to Ellis and Kurniawan (2000), the flexibility of non-linear learning paths may increase complexity.

Method

This study was not designed to examine the effect of the web-based tool on students' learning; instead, its purpose was to reveal some of the factors affecting students' preferences pertaining to linear and non-linear learning paths. In this sense, the tool for this study was designed and developed to support both linear and non-linear instruction.

By addressing the factors that affect students' preferences in this environment, the study also aimed to guide other research studies toward further insight into the different behaviors of learners. Accordingly, the following factors were analyzed:

(a) Individual differences among the students: age, gender, preferences regarding instruction (preferring teacher-based instruction (direct instruction) or self-paced learning (indirect instruction)), perceptions of problem solving (whether they are a good problem solver or not), and learning preferences such as visual and audio.

(b) Students' prior knowledge and familiarity with the material: e.g., computers, computer experience, familiarity with Windows-based applications, and their backgrounds.

This qualitative case study focuses on the perspectives of the participants in order to uncover the complexity of human behavior in such a framework and to present a holistic interpretation of what is happening in this context. As McMillan and Schumacher (2001) advocate, case studies focus on a phenomenon ". . . which the researcher selects to understand in depth regardless of the number of sites or participants" (p. 398). Qualitative case studies also yield a better understanding of abstract concepts, as is the case for this study. Therefore, the critical point in this study is not the number of subjects but the depth and quality with which the subjects' learning experience with the system is presented.

The Course

This study was conducted on an entry-level Turkish language course at a mid-west university in the USA. Turkish is a less-commonly taught foreign language in the USA. The course is offered as an elective for students with different backgrounds, and the number of students registering each semester is usually very limited. The courses are offered by instructors whose native language is Turkish but who have no formal language-teaching education; accordingly, the instructor turnover rate is very high. Additionally, new instructors sometimes need additional information and sources to prepare the lectures and to answer some of their questions.

The course was organized in the form of 45-minute lessons five days a week. In general, on Mondays the instructor would introduce the topics, on Tuesdays she would introduce the grammar issues related to that lesson, the next day she would do different exercises related to the topic, on Thursdays she would introduce songs and audio-visual materials related to the topics, and on Fridays she would help the students in group exercises. The course instructor also organized after-school conversation hours every three weeks, which students were free to attend. For these conversation hours, the course instructor invited people interested in speaking Turkish, to help students communicate with native Turkish speakers about topics related to daily life.

The Tool

In this study, a web-based learning tool was developed for the entry-level Turkish language class (available at http://clio.dlib.indiana.edu/~ncagilta/tlepss.html and http://www.princeton.edu/~turkish/practice/tlepss.html). Several different forms of instruction, such as sound, image and text, were provided. The contents of the tool were all adapted from the course textbook; the first unit of the textbook bounded the scope of this study. The contents were enhanced with sound files (480 in the system) and with pictures: most of the images used for the lessons and examples were taken from Microsoft's clip art gallery (Microsoft, 2000), and 307 images were used in total. To assist the students in different applications, tools such as text boxes, simulated Turkish keyboards, and indexes were also provided. The tool includes instructions, examples and exercises. An example page is shown in Figure 1.

Figure 1. An Example in the Tool

The tool was designed and developed to support both linear and non-linear instruction and several practice alternatives.

Non-linear instruction

Non-linear instruction was provided through the indexes (organized in Turkish or English) (Figure 2), which help students select any topic from the indexed list and initiate and direct their own learning. Students may choose to select and study a concept, or go through the instruction and the examples using the index.



Figure 2. English Index providing non-linear instruction

Linear instruction

Linear instruction was provided through the main menu shown in Figure 3. In this menu, the content is organized in the same order as it is introduced in the classical classroom environment. Students may follow the instruction in this linear order.

Figure 3. Main Menu providing linear instruction
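The difference between the two paths can be pictured as two access structures over the same lesson content: an ordered main menu for linear study and a topic index for non-linear jumps. The Python sketch below is purely illustrative; the lesson titles, identifiers, and index entries are hypothetical and are not taken from the actual tool.

# Hypothetical content for illustration; not the tool's real data.
lessons = [
    {"id": "alphabet", "title": "The Turkish Alphabet"},
    {"id": "greetings", "title": "Greetings"},
    {"id": "pronouns", "title": "Subject Pronouns"},
]

# Linear instruction: the main menu presents lessons in classroom order,
# and a student steps through them from first to last.
linear_path = [lesson["id"] for lesson in lessons]

# Non-linear instruction: an index keyed by topic lets a student jump
# directly to any lesson, initiating and directing their own learning.
english_index = {
    "letters": "alphabet",
    "hello": "greetings",
    "I / you / he": "pronouns",
}

def jump_to(term: str) -> str:
    """Resolve an index entry to the lesson the student opens."""
    return english_index[term]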

Data Collection

The actual data collection for this study was conducted during the first six weeks of the semester; the content of the tool covers the first three weeks of the course. At the beginning of the semester, an orientation session was organized to introduce the main purpose of the research and how to use the web-based environment. Afterwards, students used the tool in parallel with their regular classes. The tool was not used during classroom activities; rather, students were provided with a CD version of the tool and were also able to access it via the Internet. During this period, students were asked to use the tool whenever they needed help. They had opportunities to choose any subject within the tool and study it, or practice a chosen subject. In this sense, students had to decide when to use the tool, how long to use it and how to use it according to their preferences. They could repeat sounds, lessons and any activities in the tool whenever they needed help. After three weeks, an interview was conducted with each student in the classroom and with the course instructor.

After the individual interviews, an observation session was conducted with each student. During the observations, each student was asked to use the tool as if they were alone, and the observer recorded each step that the student followed. The observation sessions took approximately half an hour. Their main purpose was to find out the students' preferences regarding the tool and to investigate their preferred learning path (linear or non-linear). These interviews and observations were all conducted within five days. During the second half of the six-week period, students followed the lessons without the support of any specific web-based tool designed for these lessons.

After this second three-week period, another round of interviews was conducted with the students and the course instructor to obtain comparative feedback on the benefits and weaknesses of the tool. These interviews were all conducted within three days.

During the first and second interviews, several questions were presented to the students and the instructor to get a better view of the major variables analyzed in this study. During the interviews with the course instructor, a separate session was conducted for each student.

Students

There were ten students in the classroom. For the sake of anonymity, each student was assigned a code, such as S1 and S2. Students were from different age groups and had different backgrounds, and their preferences and expectations during the learning process varied widely. Only S4, S5, S6, S8 and S10 were a little familiar with the Turkish language, because of a Turkish person whom they knew; the others were not familiar with the course content at all. There were four male students in the class, and three students had low familiarity with Windows-based applications. Five students preferred to study by themselves (self-study) while the others preferred to have teacher instruction. Tables 1 and 2 summarize the students' profiles. All the data shown here was collected by means of interviews and observations, as mentioned in the data collection section.

Table 1. Students' Profiles (I)

Code  Age  Gender  Problem solving  Teacher/self-study  Learning preferences
S2    25   M       Good             Self                Visual learner
S3    21   F       Not good         Self                Learning by listening
S4    53   F       Not good         Self                Learning by writing and repetition
S7    25   M       Pretty good      Self                Audiovisual
S10   18   F       Good             Self                (not stated)
S1    42   M       Excellent        Teacher             Drill-and-practice, hands on, strode it on the board, do the exercises, verbalize it
S5    25   M       Very good        Teacher             Learning by talking
S6    50   F       Good             Teacher             Group learner, learning with traditional methods; writing, talking and reading
S8    22   F       Good             Teacher             Watching, writing, talking and listening
S9    61   F       Good             Teacher             Talking and listening, learning with traditional methods
F: Female, M: Male

The students' learning preferences differed from one another. Some students preferred to learn by drill and practice and by doing exercises, while others preferred visual illustrations; the latter group felt that visual illustrations helped them learn easily and remember what they had learned.



Table 2. Students' Profiles (II)

Code  Likes computers?                                               Computer experience (years)  Familiarity with Windows-based applications  Background
S1    Not always                                                     30                           Very high                                    Central Eurasian Studies
S3    Very much                                                      3                            Very high                                    Computer Information Systems
S2    Very much                                                      8                            High                                         Criminal Justice
S7    Yes, but cannot sit in front of a computer for several hours   19                           High                                         Law, Mathematics
S8    A lot                                                          10                           High                                         Biology
S4    Yes                                                            3                            Middle                                       Psychology
S5    Very much                                                      3                            Middle                                       History
S6    Yes, but cannot sit in front of a computer for several hours   10                           Low                                          Educator
S9    Not very much                                                  16                           Low                                          MS in education
S10   Not very much                                                  3                            Low                                          Chemistry
Average computer experience: 10 years

Limitations of the study

Since the tool developed for this study is not a professional one, it has some limitations in terms of the content included and the user interface. Additionally, only ten subjects participated in the research.

Results

In this section, the data collected for this study is presented as evidence of the students' preferences in the provided environment. First, how students used the tool was analyzed. Then, students' preferences pertaining to linear and non-linear instruction were analyzed according to their individual differences and their backgrounds.

How students used the tool

The results indicated that each student used the tool in several different ways. While some of them preferred linear instruction, others preferred non-linear instruction. One student who preferred to use the tool in a non-linear order stated that:

If the teacher gives me a specific example, I go through the assignment; otherwise, I try to find the topic that we are learning in class and I study the topic simultaneously on the computer with the class. … I liked to use the system interactively. My preference was going through the index and looking for the specific things there. If I found something interesting I would explore it…

Similarly, another student stated that he used the tool in a non-linear way to get a better view of the tool, but that if it were like an exam, he would prefer to use it in a linear order:

I like to play with things first and then when I get tired I go through each. I kind of experiment with it first and then go through each item randomly. But for an exam it helps to have it in linear, run through it from the beginning to the end.

One student stated that in some cases she preferred linear instruction whereas in other cases she preferred non-linear instruction:

I liked to search through the content independently. I liked to study the tool by using the indexes. But, I preferred to look at different Turkish books while studying grammatical components, instead of studying it through the tool. I believe that manual methods work better for me while studying grammar.

With regard to this issue, one student stated that even though she preferred non-linear instruction, she liked seeing linear and non-linear instruction together:

If everything had been designed in a non-linear form, it would have been like a dictionary. Having the structured menu is very helpful. I preferred the dictionary part mostly, but I also liked seeing the content of unit one, and sometimes I went through it. After I studied the first unit, I wanted to come back to Main Menu. But if everything had been just like in the dictionary, it would have been harder.

On the other hand, some students preferred only the linear instruction. One student stated that:

I just started from the Main Menu, then I clicked on each one, I kind of way down the list, I did not get a chance to do the exercises in the book, but I worked the exercises on the CD.

Another student who liked linear instruction reported that she followed the instructions in the book, and then went through the CD:

I like some linear instruction because I am not trained completely [to be an] independent learner. The younger generation [is] yes. So I need some instruction. But I can do a little bit of independent searching. I just need practice.

While some of the students preferred mostly to study the lessons, others did the exercises. Some of them used the tool on a daily basis after the lessons, whereas others preferred to use it during the weekends. Most of the students studied the lessons from both the tool (the web-based environment) and the course textbook; only one student preferred to study the lessons from the web-based environment and not to use the textbook at all. Some students preferred to use the tool by reading the instructions and studying the examples, while others preferred to study by means of exercises and went through the instructions as necessary.

All students reported that they used the tool easily. They found the user interface easy to use, easy to learn and self-explanatory, and said they could easily find the answers to their questions using the tool. However, S6 reported that she encountered difficulty while studying the lessons using the tool. For example, she needed help from other people while using the tool, but when she needed help she could not find anybody. Additionally, while she was using non-linear instruction she got lost in the application. She reported:

I should have followed the book but I did different things on the CD. So, I was lost on the CD. I blindly clicked on this that and was lost in the computer.

As the course instructor described, technology is not S6's favorite. The course instructor said that she helped S6 use the tool four or five times in the computer laboratory, and added that other students did not ask for help. The instructor reported:

Sometimes she becomes intimidated because technology is something that she is not familiar with. It took some time for her to get used to the technology. After she got used to it, she liked it. But still technology is not her favorite thing. She likes studying by traditional methods. For this reason we study together with her to correct some mistakes in her homework. She prefers studying by writing, speaking and grammar; it is her preference.

Individual differences<br />

According to the course instructor, using the tool in conjunction with the course curriculum improved classroom<br />

performance. She believes that the individual differences among students affect the classroom’s performance:<br />

The individual differences among students affect class performance seriously. In general freshmen are<br />

better in audio than the more senior students whereas the more senior students usually have problems<br />

in hearing the words and understanding the words in an audio exercise. If the students’ individual<br />

characteristics and their preferred ways of learning of in the classroom are similar, then they can<br />

become more active in the classroom [and] learn a lot. They are motivating each other as well as<br />

helping each other. If these individual characteristics are not similar, then I need to spend more time to<br />

building a common instructional style, which will be helpful for all [/most]of the students in the<br />

classroom. Most of the time, this also affects the general classroom performance. For example,<br />

sometimes, some students distract the others by asking a lot of questions and by not following the<br />

common preferred way of learning of the classroom.<br />

According to the course instructor, when students used the research tool, they were more prepared for the lessons, and she did not spend much time repeating those lessons. The instructor also reported that students did not ask as many questions during the lessons taught in parallel with the tool as during the other lessons. She believes that students found answers to most of their questions in the tool. Using the tool was time-saving for the course instructor as well: she gained more time to organize her lessons and do other activities in the classroom.

Students' preferences regarding the provided environment were analyzed according to the data collected by means of interviews and observations. Accordingly, Tables 3 and 4 summarize students' preferences for the linear and non-linear ways of using the tool with respect to individual differences.

Table 3. Students preferring linear paths of instruction

Code   Age   Prefers teacher or self-study   Problem solving   Preferred way of learning                                                       Gender
S10    18    Self                            Good              --                                                                              Female
S8     22    Teacher                         Good              Watching, writing, speaking and listening                                       Female
S6     50    Teacher                         Good              Group learner, learning with traditional methods; writing, speaking, reading    Female
S4     53    Self                            Not good          Learning by writing and repetition                                              Female
S9     61    Teacher                         Good              Talking and listening, learning with traditional methods                        Female
Average age: 41

Table 4. Students preferring non-linear paths of instruction

Code   Age   Prefers teacher or self-study   Problem solving   Preferred way of learning                                                        Gender
S1     42    Teacher                         Excellent         Drill-and-practice, hands on, strode it on the board, exercises, verbalization   Male
S5     25    Teacher                         Very good         Learning by speaking                                                             Male
S7     25    Self                            Pretty good       Audiovisual                                                                      Male
S2     25    Self                            Good              Visual learner                                                                   Male
S3     21    Self                            Not good          Learning by listening                                                            Female
Average age: 28

Age: In this study, "age" was a factor affecting students' preferences for the linear or non-linear paths in the tool (Tables 3 and 4). For example, the average age of the students who preferred linear instruction was 41, whereas the average age of the students who preferred non-linear instruction was 28. In this context, during the interviews the course instructor said that "age" is an important factor affecting the way students learn a language. She further reported:

"Age" is also an important factor, not personally how old they are but when they learn a language, what system they use, what materials and techniques they use. So, when I say subject pronoun, the older students know what I am talking about. But I have to introduce these terms for the freshmen because they did not learn it that way. Similarly when I give them an Internet exercise, the freshmen can use it easily, but I have to spend hours with the older ones.

Accordingly, while using the tool, mostly the students belonging to the younger generation (S2: 25 years old, S3: 21 years old, S5: 25 years old, S7: 25 years old) preferred to use non-linear instruction. On the other hand, the more senior students (S4: 53 years old, S6: 50 years old, S9: 61 years old) preferred to use linear instruction in the tool.
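The group averages in Tables 3 and 4 follow directly from the listed ages. As a minimal sketch in Python (the student codes and ages are copied from the tables above; everything else is illustrative):

    # Reproduce the average ages reported in Tables 3 and 4.
    linear_ages = {"S10": 18, "S8": 22, "S6": 50, "S4": 53, "S9": 61}
    nonlinear_ages = {"S1": 42, "S5": 25, "S7": 25, "S2": 25, "S3": 21}

    mean_linear = sum(linear_ages.values()) / len(linear_ages)           # 204 / 5 = 40.8
    mean_nonlinear = sum(nonlinear_ages.values()) / len(nonlinear_ages)  # 138 / 5 = 27.6

    print(round(mean_linear), round(mean_nonlinear))  # prints: 41 28

The same two-line computation applied to the experience columns of Tables 5 and 6 below reproduces the reported averages of 8.4 and 12.6 years of computer experience.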

Gender: All male students preferred to use the tool through non-linear instruction. Only one female student (S3) preferred to use the tool through non-linear instruction.

Preferred way of learning: Students who preferred to use the tool through linear instruction (S4, S6, S8, and S9) preferred the following ways of learning: writing, repetition, watching, group learning, or traditional learning methods. Students who preferred to use the tool through non-linear instruction (S1, S2, S3, S5, and S7) preferred the following ways of learning: drill and practice, visuals, listening, speaking, and audiovisuals.



Teacher or self-study preferences: Most students who preferred self-study (studying on their own) (S2, S3, and S7) also preferred to use the tool through non-linear instruction. On the other hand, students who preferred studying with the help of the teacher (S6, S8, and S9) also preferred to use the tool through linear instruction.

Perceptions of problem-solving capacity: Most students who preferred non-linear instruction described their problem-solving capacity as very good, pretty good, or excellent (S1: excellent, S5: very good, S7: pretty good).

Prior knowledge and familiarity with the material

Tables 5 and 6 summarize students' preferences for the linear or non-linear way of learning with the tool according to their prior knowledge and familiarity with the material. Since they were not familiar with the course content before entering this course, students' prior knowledge of the course content will not be discussed here.

Table 5. Students preferring linear paths of instruction

Code   Familiarity with Windows applications   Likes computers?                                                Computer experience (years)
S8     High                                    A lot                                                           10
S4     Middle                                  Yes                                                             3
S6     Low                                     Yes, but cannot sit in front of a computer for several hours    10
S9     Low                                     Not very much                                                   16
S10    Low                                     Not very much                                                   3
Average computer experience: 8.4 years

Table 6. Students preferring non-linear paths of instruction

Code   Familiarity with Windows applications   Likes computers?                                                Computer experience (years)
S1     Very high                               Not always                                                      30
S3     Very high                               Very much                                                       3
S7     High                                    Yes, but cannot sit in front of a computer for several hours    19
S2     High                                    Very much                                                       8
S5     Middle                                  Very much                                                       3
Average computer experience: 12.6 years

According to these results, only familiarity with Windows-based applications was an effective factor in students' preferences. We could not find any relation between students' preferences and either their liking of computers or their computer experience.

Familiarity with Windows-based applications (FWWA): All students whose FWWA was low or middle preferred the linear learning path, except S5; all students whose FWWA was high or very high preferred the non-linear learning path, except S8.

Classroom performance: Although this study was not designed to investigate the effect of the tool on students' or the classroom's performance, according to the course instructor it improved classroom performance. She reported that she did not spend much time repeating the lessons included in the tool. According to the instructor, students did not ask as many questions during the lessons studied in parallel with the developed tool as in the other lessons. She believes that students found answers to most of their questions in the tool. She declared that using this tool was also time-saving for her: she gained more time for organizing her lessons and doing other activities in the classroom. According to the instructor, this tool helps students study on their own:

They [students] can study on their own, or study with the tool. It [the tool] will also, perhaps, help develop a skill on using other technological tools, to learn a language. Again, you make it for them more fun and easier. [It helps] most of them to study on their own, so they are not always dependent on the teacher. So, upon that point of use, it is [the tool] very useful.



Computer experience: The average computer experience of the students who preferred the linear path of instruction (8.4 years) was slightly lower than that of those who preferred the non-linear path (12.6 years).

Computer affinity: Most students who preferred the non-linear path of instruction like to use computers very much (S2, S3, S5). The students who do not like to use computers much (S9 and S10) preferred the linear path of instruction.

Discussion and Conclusion

The results of this study underline that learners' preferred learning path (linear or non-linear) depends on personal characteristics such as age, perceptions of problem solving, teacher- or self-study preferences, familiarity with Windows-based computer applications, gender, and preferred way of learning.

Earlier studies show that young students of age 7 or 8 are more successful with structured learning materials (Shin, Schallert, & Savenye, 1994). Also, learner-controlled programs are more successful than program-controlled ones when the learners are older (Hannafin, 1984). In our study, we found that older students (around age 40 on average) also prefer linear and more structured instruction, whereas the middle age group of students around 20-30 usually prefers non-structured, non-linear instruction. This suggests that while very young students and older students prefer a more structured, linear way of learning, middle age group students prefer a non-linear way of learning.

We believe that such tools can be used in or outside the classroom to support current methods of instruction. The following sections offer recommendations for instructors, designers, and further research.

Implications for the Instructors

Since students' preferences for the path of instruction (linear or non-linear) differ, it is not always possible to provide an appropriate method for each student in the classroom. In such cases, the instructor has to choose a method that is common for most of the students. Accordingly, technological tools providing several alternative instructional methods could help guide students according to their individual preferences.

Implications for Further Research

The factors analyzed in this study need to be evaluated in other learning environments with different groups of students. For example, learners' familiarity with the most recent technological innovations and with older technological applications should be analyzed as separate factors to get a better understanding of the design issues for such tools.

Implications for the Designers

Studies carried out over the last 20 years showed that older students benefit most from direct instruction (Volet, 1995). In our study, even though the average computer experience of the students preferring linear instruction was 8.4 years (Table 5), the older students still preferred direct instruction and chose to follow a linear path of instruction. Additionally, the students who preferred the non-linear path of instruction were more familiar with Windows-based computer applications. This shows that students are not always well prepared for new technologies, since technological developments occur very rapidly. This could be a big barrier to adopting technology in traditional educational systems. To handle this problem, involving both previous technologies and current new technological approaches at the same time when designing new instructional systems could be helpful. For example, in our study we found that individual differences among the students affect their preferences pertaining to linear and non-linear paths. The linear way of learning is more traditional; in that sense, it is not always easy to move learning from a linear to a non-linear organization. In order to benefit more from the non-linear way of learning and to ease students' transition between these approaches, providing both methods at the same time and leaving the choice to the learner could be a good strategy. Such an approach helps students move between the two approaches and get used to non-linear instruction. This approach could be applicable to any instructional tool involving more recent technologies.

In our study we could not find any relation between students' preferences for linear or non-linear instruction and their previous knowledge of the course content, whether they liked working with computers, or their computer experience. These factors need to be tested in other research studies, and the results of this study need to be supported by further qualitative and quantitative research.

References

Barker, P. (1997). Enhancing learning opportunities through electronic course delivery. Paper presented at the EDUTEC 97 Conference, University of Malaga, Spain, 26 October 1997.

Barker, P., Giller, S., Richards, S., Banerji, A., & Emery, C. (1993). Interactive Technologies for Language Learning. In Proceedings of the ED-MEDIA '93 World Conference on Educational Multimedia and Hypermedia, Florida, USA, 1993, 26-31.

Braswell, R., & Brown, J. (1992). Use of interactive videodisc technology in a physical education methods class. Paper presented at the annual meeting of the American Educational Research Association, San Francisco, CA, USA.

Brown, B. W., & Liedholm, C. E. (2004). Student Preferences in Using Online Learning Resources. Social Science Computer Review, 22 (4), 479-492.

Chambers, A. (2001). Introduction. In Chambers, A. & Davis, G. (Eds.), ICT and language learning: A European perspective, Lisse: Swets & Zeitlinger.

Chan, W. M., & Kim, D. (2004). Towards Greater Individualization and Process-Oriented Learning Through Electronic Self-Access: Project "e-daf". Computer Assisted Language Learning, 17 (1), 83-108.

Chang, C. (2003). Teaching earth sciences: should we implement teacher-directed or student-controlled CAI in the secondary classroom? International Journal of Science Education, 25 (4), 427-438.

Charney, D. (1987). Comprehending non-linear text: The role of discourse cues and reading strategies. In Proceedings of Hypertext '87, Chapel Hill, North Carolina: The University of North Carolina, 109-120.

Cho, Y. (1995). The nature of learner's cognitive processes in learner- and program-controlled hypertext learning environments. UMI, DAI-A 56/06, p. 2207, Publication Number AAT 9534749.

Cuban, L., Kirkpatrick, H., & Peck, C. (2001). High access and low use of technologies in high school classrooms: Explaining an apparent paradox. American Educational Research Journal, 38 (4), 813-834.

Dalgarno, B. (2001). Interpretations of constructivism and consequences for Computer Assisted Learning. British Journal of Educational Technology, 32 (2), 183-194.

Davies, G. (2001). New technologies and language learning: A suitable subject for research. In Chambers, A. & Davis, G. (Eds.), ICT and language learning: A European perspective, Lisse: Swets & Zeitlinger.

Decoo, W. (2003). Language Methods and CALL: redefining our Relations. Computer Assisted Language Learning, 16 (4), 269-274.

Ellis, R. D., & Kurniawan, S. H. (2000). Increasing the usability of online information for older users: A case study in participatory design. International Journal of Human-Computer Interaction, 12 (2), 263-276.

Gardner, H. (1983). Frames of Mind: The theory of Multiple Intelligences, New York, USA: Basic Books.

Gay, G. (1986). Interaction of learner control and prior understanding in computer assisted video instruction. Journal of Educational Psychology, 78 (3), 225-227.

Gilbert, J. E., & Han, C. Y. (1999). Adapting instruction in search of 'a significant difference'. Journal of Network and Computer Applications, 22 (3), 149-160.

Goldberg, M. W. (1997). CALOS: First Results from an Experiment in Computer-Aided Learning. In Proceedings of the ACM's 28th SIGCSE Technical Symposium on Computer Science Education.

Hannafin, M. J. (1984). Guidelines for determining locus of instructional control in the design of computer-assisted instruction. Journal of Instructional Development, 7 (3), 6-10.

Hannafin, R. D., & Sullivan, H. (1995). Learner control in full and lean CAI programs. Educational Technology Research and Development, 43 (1), 19-30.

Hargis, J. (2000). The Self-Regulated Learner Advantage: Learning Science on the Internet. Electronic Journal of Science Education, 4 (4), retrieved June 30, 2006, from http://unr.edu/homepage/crowther/ejse/hargis.html.

Hirata, Y. (2004). Computer Assisted Pronunciation Training for Native English Speakers Learning Japanese Pitch and Duration Contrasts. Computer Assisted Language Learning, 17 (3-4), 357-376.

Hitendra, P. (1998). An Investigation of the Effect of Individual Cognitive Preferences on Learning Through Computer-Based Instruction. Educational Psychology, 18 (2), 171.

Hokanson, B., & Hooper, S. (2000). Computers as cognitive media: examining the potential of computers in education. Computers in Human Behaviour, 16 (5), 537-552.

Howell, S. L., Williams, P. B., & Lindsay, N. K. (2003). Thirty-two Trends Affecting Distance Education: An Informed Foundation for Strategic Planning, retrieved April 13, 2006, from http://www.westga.edu/~distance/ojdla/fall63/howell63.html.

Kelley, T. M., & Stack, S. A. (2000). Thought recognition, locus of control, and adolescent well-being. Adolescence, 35, 531-551.

Kemp, J. E., Morrison, G. R., & Ross, S. M. (1994). Designing effective instruction, NY, USA: Macmillan.

Kinzie, M. B., Sullivan, H. J., & Berdel, R. L. (1992). Motivational and achievement effects of learner control over content review within CAI. Journal of Educational Computing Research, 8 (1), 101-114.

Knowles, M. S. (1975). Self-directed learning: A guide for learners and teachers, Chicago: Association Press.

Koper, R., & Olivier, B. (2004). Representing the Learning Design of Units of Learning. Educational Technology & Society, 7 (3), 97-111.

Kulik, J. A. (1994). Meta-analytic studies of findings on computer-based instruction. In Baker, E. & O'Neil, H. (Eds.), Technology Assessment in Education and Training, Hillsdale, NJ and Hove, UK: Lawrence Erlbaum.

Levy, M. (2001). Coherence and direction in CALL research: Comparative designs. In CALL & The Challenge of Change, Exeter: Elm Bank.

Lonfis, C., & Vanparys, J. (2001). How to Design User-Friendly CALL Interfaces. Computer Assisted Language Learning, 14 (5), 404-417.

McManus, T. F. (1997). Self-regulated learning and web based instruction. In Carlson, P. & Makedon, F. (Eds.), Proceedings of the World Conference on Educational Multimedia and Hypermedia, VA, USA: AACE.

McMillan, J. H., & Schumacher, S. (2001). Research in Education: a conceptual introduction (5th Ed.), Addison Wesley.

McNeil, B. J. (1991). Meta-analysis of interactive video instruction: A 10-year review of achievement effects. Journal of Computer-based Instruction, 18 (1), 1-6.

Microsoft (2000). Microsoft's clip art gallery, retrieved April 13, 2006, from http://cgl.microsoft.com/clipgallerylive/.

Roblyer, M. D., & Edwards, J. (2000). Integrating educational technology into teaching (2nd Ed.), Upper Saddle River, NJ, USA: Merrill.

Santiago, R., & Okey, J. (1992). The effects of advisement and locus of control on achievement in learner-controlled instruction. Journal of Computer-Based Instruction, 19 (2), 47-53.

Schank, R. C. (1993). Learning via multimedia computers. Technology in Education, 36 (5), 54-56.

Shin, E. C., Schallert, D. L., & Savenye, W. C. (1994). Effects of learner control, advisement, and prior knowledge on young students' learning in a hypertext environment. Educational Technology Research and Development, 42 (1), 33-46.

Shoener, H. A., & Turgeon, A. J. (2001). Web-Accessible Learning Resources: Learner-Controlled versus Instructor-Controlled. Journal of Natural Resources and Life Sciences Education, 30, 9-13.

Shyu, H., & Brown, S. (1992). Learner control versus program control in interactive videodisc instruction: What are the effects in procedural learning? International Journal of Instructional Media, 19 (2), 85.

SIIA (2000). Research Report on the Effectiveness of Technology in Schools (7th Ed.), Software & Information Industry Association, Washington, DC, retrieved April 13, 2006, from http://www.sunysuffolk.edu/Web/Central/InstTech/projects/iteffrpt.pdf.

Silva, R. (1999). Computer-Aided Instruction of Foreign Languages: the case of Beginning Brazilian Portuguese. Paper presented at the National Association for Self Instructional Language Programs (NASILP) Conference, Washington, D.C., USA, October 1999.

Simonson, M. R., & Thompson, A. (1997). Educational computing foundations (3rd Ed.), Upper Saddle River, NJ, USA: Prentice Hall.

Sleight, D. A. (1997). Self-Regulated Learning during Non-Linear Self-Instruction, retrieved April 13, 2006, from http://www.msu.edu/~sleightd/srl.html.

Spiro, R. J., Feltovich, P. J., Jackson, M. J., & Coulson, R. L. (1991). Cognitive flexibility, constructivism, and hypertext: random access instruction for advanced knowledge acquisition in ill-structured domains. Educational Technology, 31 (5), 24-33.

Stepp, J. (2002). Student Perceptions on Language Learning in a Technological Environment: Implications for the Millennium. Language Learning & Technology, 6 (1), 165-180.

Stone III, T. T. (1996). The academic impact of classroom computer usage on middle-class primary grade level elementary school children. Dissertation Abstracts International, 57/06-A (Order No. AAD96-33809).

Volet, S. (1995). Process-oriented instruction: A discussion. European Journal of Psychology of Education, 10 (4), 449-459.

Warschauer, M., & Kern, R. (Eds.) (2000). Network-based language teaching: Concepts and practice, Cambridge: Cambridge University Press.

Wooyong, E., & Robert, A. R. (2000). The effects of self-regulation and instructional control on performance and motivation in computer-based instruction. International Journal of Instructional Media, 27 (3), 247.



Hsieh, Y.-C. J., & Cifuentes, L. (2006). Student-Generated Visualization as a Study Strategy for Science Concept Learning. Educational Technology & Society, 9 (3), 137-148.

Student-Generated Visualization as a Study Strategy for Science Concept Learning

Yi-Chuan Jane Hsieh
Department of Applied Foreign Languages, College of Commerce, Ching Yun University, Jung-Li 320, Taiwan
Tel: (+886) 3-4581196 ext. 7903
Fax: (+886) 3-250-3014
ychsieh@cyu.edu.tw

Lauren Cifuentes
Department of Educational Psychology, College of Education and Human Development
Texas A&M University, College Station, Texas 77843-4225, USA
Tel: (+1) 979-845-7806
Fax: (+1) 979-862-1256
laurenc@tamu.edu

ABSTRACT

Mixed methods were adopted to explore the effects of student-generated visualization on paper and on computers as a study strategy for middle school science concept learning. In a posttest-only control-group design, scores were compared among a control group (n=28), a group that was trained to visualize on paper (n=30), and a group that was trained to visualize on computers (n=34). The paper group and the computer group performed significantly better on the posttest than the control group. Visualization as a study strategy had positive effects whether students used paper or computers to generate their visualizations. However, no significant difference existed between the paper group's and the computer group's scores. Qualitative results indicated that students who were trained to visualize on paper or on computers had positive attitudes toward the use of visualization as a study strategy, and they engaged more in purposeful tasks during study time than the control group. However, several participants in the computer group felt negatively about the use of visualization; they claimed that visualization demanded too much time and effort. Attitudes toward visualization as a study strategy differed within groups, indicating that visualization might interact with learner characteristics.

Keywords

Visualization, Study strategy, Concept mapping, Science learning

Theoretical Background

Constructivist theorists contend that learning occurs when learners actively construct their own knowledge and think reflectively when information and concepts are presented to them (Lee, 1997). Students build their understanding in the context of mentoring from instructors who help them find connections among concepts (Bransford, 2000; Julyan & Duckworth, 1996). "Connections that are obvious to an instructor may be far from obvious to a pupil" (Driver, 1983, p. 2). Such connections are made through experience over time. One strategy for providing students with constructivist learning experiences that also gives teachers access to students' understandings for effective mentoring is student-generated visualization.

Student-generated visualization is defined as student-created graphical or pictorial representations of sequential, causal, comparative, chronological, oppositional, categorical, and hierarchical relationships among concepts, whether hand drawn on paper or constructed on computers (Cifuentes & Hsieh, 2004a, 2004b, 2004c; Cifuentes, 1992; Wileman, 1993). Students' study notes are visualizations if the concepts are presented in the form of direct representations, diagrams, matrices, charts, trees, tables, graphs, pyramids, causal chains, timelines, or even outlines (Cifuentes & Hsieh, 2004a, 2004b, 2004c).

Wittrock (1990) stressed that successful comprehension requires learners to actively construct relationships among parts of a text, and between the text and knowledge and experience, and to invest their effort in generating headings, summaries, flow charts, pictures, tables, metaphors, and analogies during or after the reading process. Plotnick (1997) later listed several advantages of constructing visual representations of text: (a) visual symbols are quickly and easily recognized; (b) minimum use of text makes it easy to scan for a word, phrase, or the general ideas; (c) visual representation allows for development of a holistic understanding that words alone cannot convey (p. 3).




One of the visualization techniques commonly used by learners for knowledge construction is concept mapping. Concept mapping is the "process for representing concepts and their relationships in graphical form, providing students with a visually rich way to organize and communicate what they know" (Anderson-Inman & Ditson, 1999, p. 7). As students construct concept maps, they actively revise and manipulate the information to be learned and look for meaningful concept connections (Kinchin, 2000). Concept mapping also "enhances students' self-esteem through the sense of accomplishment of constructing a map and the subsequent realization that they can extract concepts to construct their own meaning" (Novak, 1998, p. 124).

Several tools facilitate generating visual representations of concepts, including paper and pencil, pens or colored markers, computer graphics software, and concept mapping software such as Inspiration or Microsoft Visio. Constructivist theorists indicate that computers can be "cognitive tools" that support learners' thinking processes (Jonassen, 2000; Lajoie, 1993). Namely, computers can be used as tools for "analyzing the world, accessing information, interpreting and organizing individuals' personal knowledge, and representing what they know to others" (Jonassen & Reeves, 1996, p. 694). Anderson-Inman (1996) noted that the creation of visualizations on computers made the learning process more accessible to students and helped mitigate the frustration and confusion felt by students while constructing concept maps with the paper-and-pencil approach. The use of computers for constructing visual representations of concepts offers several advantages. Students can manipulate graphical representations with ease, and can perform subsequent revision and dynamic linkage of concepts. In addition, computers support learners' production of sophisticated-looking graphics regardless of their level of artistic skill (Cifuentes & Hsieh, 2004a; Anderson-Inman & Ditson, 1999; Anderson-Inman & Zeitz, 1993; Lanzing, 1998).

Researchers such as Anderson-Inman (1992) and Anderson-Inman, Knox-Quinn, & Horney (1996) found that the use of computer-based concept mapping tools as a vehicle to synthesize information during study time had a positive impact on learning. Computer graphics software, such as AppleWorks and PhotoShop, allows students to access a variety of tools that help them draw and paint objects to visually organize and represent what they know. Jonassen (1996) indicated that drawing and painting programs are powerful expressive tools for learners to visually articulate what they know and to make their knowledge structures more easily interpretable by other viewers. Skilled learners can use drawing and painting tools for image generation to facilitate making sense of their ideas and producing an organized knowledge base (Jonassen, 1999). When drawing and painting programs are used as "cognitive tools" for mindful engagement, learners' thinking processes can be supported, guided, and extended (Derry & Lajoie, 1993).

Nevertheless, there are several concerns regarding the use of computers during study time. Downes (2000) noted that children between 5 and 12 years old usually perceive the computer as both a tool and a toy. When children are thinking of and using the computer as a toy, it is "playable." Such a conception of the computer leads children to complete their work-related tasks through more playful means. Without adult guidance, children who use computers during study time are more likely to spend their time fiddling around instead of concentrating on purposeful tasks. Computer-related distractions, such as technological problems, multimedia, or software features, also have an impact on students' effective use of computers during study time.

Generally, research studies indicate that visualization on computers has improved student learning. In studies in which computers were a distraction, the researchers concluded that computers would have been beneficial given controlled elements in the school setting, explicit instruction in when, what, and how to visualize concepts on computers, selection of an appropriate computer tool that students are highly competent in using, and frequent feedback regarding the accuracy of students' visualizations. In this study, further investigation into the effect of computer-based drawing and painting tools on learning was conducted. Each of the concerns regarding the use of computers during study time was addressed in the student visualization training.

Research Questions

This study explored the use of visualization as a study strategy for middle school science concept learning. The relative effectiveness of student-created visualization on paper and on computers was investigated. The research questions addressed through the quantitative method included: (1) Did students trained to visualize on paper perform better on a comprehension test than those not trained to do so? (2) Did students trained to visualize on computers (using Microsoft™ drawing and painting tools) perform better on a comprehension test than those not trained to do so? (3) Was there any difference in comprehension posttest scores between students trained to visualize on computers (using Microsoft™ drawing and painting tools) and those trained to visualize on paper?


Other questions addressed through the qualitative method were the following: (4) What was the difference among groups in their level of engagement during study time? (5) What were the elements in the school setting and learners' behaviors that might contribute to the effectiveness or lack of effectiveness of the use of Microsoft™ drawing and painting tools to generate visualizations during study time? (6) What were students' attitudes toward generating visualizations during study time? (7) Had students ever been exposed to the visualization skills that were identified in the provided visualization workshops?

Based on previous studies, we expected that students who received visualization training on either paper or computers would outperform students who did not receive training. We provided visualization training to help the participants develop the abilities to organize, monitor, and control their generative processes. The Microsoft™ drawing and painting program was selected as the visualization tool because the students were competent in using those tools.

Methods

This study used quantitative and qualitative methods to investigate the research topic. The participants were 8th graders (N=92) in six intact science classes at a rural, public junior high school in southern Texas. Two classes were randomly assigned to be the control group, two classes the visualization/paper group, and two classes the visualization/computer group. All participants received the given treatment as part of their curricular activity in regular science class periods.

Because of the lack of random selection and random assignment of individuals to groups, evidence of the groups' equivalence was especially important. According to the chi-square test results, the three groups did not differ in age or ethnicity, but did differ in gender: the visualization/computer group had significantly more male students than the other two groups (p < .05). Further, in order to determine whether the three groups differed in their perception of the five biological topics (Energy, Cells, Homeostasis, Coordination, and Transport in Plants and Animals), students were asked after the treatment to report the extent (I knew none/some/a lot of/all of the information) to which they had previously been exposed to the information presented in the studied essays. The frequency counts of the extent to which the students had been exposed to the content of the studied essays were shown in a 5 x 4 chi-square contingency table. The results of five Pearson chi-square tests indicated no significant differences among the proportions of students in the three groups who checked each level of prior exposure to the five biological topics. Accordingly, the students in each group did not differ in their prior knowledge of the content of the studied essays, and it is unlikely that the participants' prior knowledge of the content in the science essays confounded their comprehension test scores.
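To make these equivalence checks concrete, the sketch below runs a Pearson chi-square test with scipy on a gender-by-group contingency table. The counts are hypothetical (the paper reports only the group sizes of 28, 30, and 34, not the raw frequency tables), so the sketch illustrates the procedure rather than reproducing the study's numbers:

    import scipy.stats as stats

    # Hypothetical counts; rows: male, female;
    # columns: control, visualization/paper, visualization/computer.
    observed = [[12, 13, 24],
                [16, 17, 10]]

    chi2, p, dof, expected = stats.chi2_contingency(observed)
    print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")
    # p < .05 would indicate a gender imbalance across groups,
    # as the authors found for the visualization/computer group.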

In addition, in order to gain a better understanding of the participants' computer use at school and at home, a "Computer Use Survey" was administered prior to the treatment. The results indicated that the three groups did not differ significantly in the frequency of using a computer at school and at home, the amount of time spent using a computer at school and at home, the number of computer courses taken in the past, their performance in previous computer course(s), the average amount of time using a computer at home each session, or the frequency of using computer tools to support various learning tasks.

Procedures

A posttest-only control group design was used. The control group attended a 100-minute non-visualization workshop. Each day, students watched a 25-minute science-related videotape in their regular classroom. Videos were carefully selected to ensure that the content did not include concepts to be covered in the experimental studying materials. On the first, second, third, and fourth days after the workshop and on days five and six, students were given five different science essays for unguided and independent study. When students finished their studying, they handed in their study notes to the test administrator, and they were given a test covering comprehension of the science essays.

As for the visualization/paper group, students attended a 100-minute paper-form visualization workshop, which lasted 25 minutes per day. On days one, two, three, and four, students were instructed in how to identify each of three types of text structures (cause-effect, sequence, and comparison-contrast) and in how to use visual conventions (fishbone diagram, flowchart, matrix/table) to represent these structures on paper. On the first, second, third, and fourth days after the workshop and on days five and six, students were given the same science essays to study as the control group. Students had to use their learned visualization skills to graphically represent the interrelationships among concepts on paper. When students finished their studying, they handed in their study notes to the test administrator, and they were given the same comprehension test as the control group.

The visualization/computer group attended a 100-minute visualization workshop, which likewise lasted 25 minutes per day. During the workshop, they were asked to perform their visualizing work on computers, using the Microsoft™ drawing and painting tools. The visualization workshop was delivered in the school computer lab. The content and structure of the workshop were the same as for the visualization/paper group, except that the teacher modeled visualization on computers rather than on paper and students generated their visualizations on computers. On the first, second, third, and fourth days after the workshop and on days five and six, students were given the same science essays to study as the control group. Students had to use the learned visualization skills to graphically represent the interrelationships among concepts on computers. When students finished their studying, they handed in their study notes to the test administrator, and they were given the same comprehension test as the control group.

After the completion of the comprehension posttest, all participants were asked to verbally describe the study strategy they used to prepare for the comprehension posttest. The visualization/paper and visualization/computer groups were further asked to respond to two questions: (1) How did you feel about the generation of visualizations during study time? Did it help you learn the content? and (2) Have you ever learned the visualization strategy before?

During the entire study, the participants' science teacher delivered the treatments. During the workshop, one of the researchers helped this teacher guide students in how to properly represent what they learned. The science teacher and the researchers also provided feedback to the visualization groups regarding the appropriateness of their visualizations during their study time. Following the treatments, the researchers administered the comprehension posttest for all groups.

Essays studied by groups and the comprehension posttest

The five science essays for students to study were excerpted from an American biology textbook, Biology: The Web of Life (Strauss & Lisowski, 1998), adopted in the grade 9-12 Texas science curriculum. The illustrations were eliminated in order to create a text-based document. Higher-level reading passages were used to assure a high level of difficulty and a lack of student exposure to the content.

On the first day after the treatment video-watching or visualization workshop, all participants were asked to spend a maximum of 20 minutes studying an essay on the science concept "Energy", and then they took a paper-and-pencil comprehension test containing five multiple-choice items. On the second day, after the video-watching or visualization workshop, all participants were asked to spend a maximum of 20 minutes studying an essay on "Cells". After that, a paper-and-pencil comprehension test containing two multiple-choice items was administered.

Likewise, on day three, after the video-watching or visualization workshop, all participants were asked to spend a maximum of 20 minutes studying an essay on "Homeostasis", and then they took a paper-and-pencil comprehension test containing eight multiple-choice items. On the fourth day, after the video-watching or visualization workshop, all participants were asked to spend a maximum of 20 minutes studying an essay on "Coordination". After students studied the written verbal science text, a paper-and-pencil comprehension test containing six multiple-choice items was administered.

On day five, all participants were asked to study an essay on "Transport in Plants and Animals." They had a maximum of 50 minutes to study on the fifth day. When the study period was over, the researchers collected all students' essays. On the sixth day, the researcher handed the same essay back to the students, and they had a maximum of 35 minutes to complete their studying. After that, students were given a comprehension test containing nine multiple-choice items to measure their understanding of the reading materials that they had spent time studying since day five.

The test items administered each day were combined to form a 30-item multiple-choice comprehension test. Students did not receive the results of their comprehension posttest each day; instead, the researchers graded each participant's complete set of test items to form a final score on the last day of the study. The test items were derived from the Taiwan Educational Department Test Bank for Middle School Biology (Taiwan National Institute for Compilation and Translation, 2000), which evaluated students' higher-level thinking skills and required them to apply comprehension, analysis, and synthesis skills (Bloom & Krathwohl, 1956) to answer questions. The test items were constructed and validated by three content experts to be appropriate for this study.

For the purpose of scoring students' responses to the comprehension test items, 3.5 points were given for a correct answer and no credit was given for incorrect or unanswered questions. The reason for assigning 3.5 points to a correct answer was for the test to have face validity without any detrimental effect on the total score. The comprehension test had a reliability coefficient of 0.71 (coefficient alpha).
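Since the raw item responses are not published, the following sketch only illustrates the scoring rule (30 items at 3.5 points each, for a 105-point maximum) and the conventional computation of coefficient alpha; the 0/1 response matrix is hypothetical:

    import numpy as np

    POINTS_PER_ITEM = 3.5                 # 30 items x 3.5 points = 105-point maximum
    rng = np.random.default_rng(0)
    responses = rng.integers(0, 2, size=(92, 30))   # hypothetical 0/1 item matrix

    total_scores = responses.sum(axis=1) * POINTS_PER_ITEM  # no credit for wrong/blank

    def cronbach_alpha(items):
        # alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
        k = items.shape[1]
        item_vars = items.var(axis=0, ddof=1).sum()
        total_var = items.sum(axis=1).var(ddof=1)
        return k / (k - 1) * (1 - item_vars / total_var)

    # Random data yields alpha near zero; the study's real responses gave 0.71.
    print(total_scores.max() <= 105, round(cronbach_alpha(responses), 2))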

Results

One-way analysis of variance indicated that there was a significant difference among the three treatment groups, F(2, 89) = 20.363, p < .05.
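An analysis of this kind can be sketched with scipy's one-way ANOVA. The score arrays below are placeholders (the study's raw posttest scores are not published), but the group sizes match those reported in the Methods section, which is what fixes the degrees of freedom at (2, 89):

    import numpy as np
    from scipy.stats import f_oneway

    rng = np.random.default_rng(1)
    control = rng.normal(50, 15, size=28)    # n = 28
    paper = rng.normal(70, 15, size=30)      # n = 30
    computer = rng.normal(70, 15, size=34)   # n = 34

    f_stat, p_value = f_oneway(control, paper, computer)
    # Degrees of freedom: between = 3 - 1 = 2; within = 92 - 3 = 89.
    print(f"F(2, 89) = {f_stat:.3f}, p = {p_value:.4g}")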


The control group. On the first day, students were told that those who worked very hard on the learning tasks and behaved well would get small gifts, and that points would be deducted from grades related to the activity for those students who did not work hard. Approximately 12 students engaged in serious studying on the following days. However, eight male students kept visiting with their friends and making a fuss during the prescribed study time on days two, four, and five. These students did not feel time pressure to complete the assigned task and paid little attention to the instructor. Nevertheless, most female students were able to sit quietly in the classroom and read the textual materials while highlighting and underlining. On the third day, when asked to study the provided science essay, most of the students started making noise and expressed no interest in doing the assigned work. After a couple of minutes of goofing around with their classmates, some of those students went back to their work, but some still did not care much about it. None of the students asked to take the test before the prescribed study time was over, and when it was over, no students requested more time to study.

The visualization/paper group. Students were more attentive to the learning tasks than the control group. At the beginning of the first day, students were informed that they would get small gifts if they studied the reading materials quietly and created visualizations during study time. As with the control group, students were told that points would be deducted from grades related to the activity for those who did not work hard. Two-thirds of the students were very cooperative in the following days. Only one student kept talking to others for the entire study time on days one, two, and three. The teacher thus deducted points from that student's conduct grade, and he started to behave himself on days four, five, and six. Some students asked the teacher each day whether their visualization work and test would be graded; the teacher responded that their visualization work and test score would be graded and used for the teacher's reference. Many female students studied diligently and worked hard on generating their visualizations throughout the six days. The teacher walked around the classroom all the time, provided necessary help to those students who had difficulty visually representing the related concepts in the texts, and gave corrective feedback on students' visualizations. Four male students were fond of having conversations with each other during study time, and the teacher had to regulate their behavior, but generally this group was engaged in studying.

The visualization/computer group. Again, students were informed that they would get small gifts if they studied the reading materials quietly and created visualizations during study time, and that points would be deducted from grades related to the activity for those who did not work hard. About half of the students were able to concentrate on drawing and painting graphics relevant to the reading content on computers. One student, who made a racket in the computer lab on the first day, started to behave himself very well on days two to six in order to get more gifts from the teacher. Further, one boy expressed his inability to study independently and chose to visit with his friends in the computer lab throughout the six days; he simply copied a nearby classmate's work on the computer in the last five minutes of the prescribed study time. Two girls were found to have opened another computer window to draw something irrelevant to the assigned reading texts, and they did not study at all. Two boys liked to make fun of each other while constructing visualizations on computers during study time. Two additional male students felt uncomfortable seated in their chairs at the computers and often chose to scuffle down the aisle or make scary movements to surprise the teacher and classmates. The teacher deducted conduct points from the grades of some of these students and asked them to study quietly.

During study time, some students requested help because of difficulties in adjusting the position of lines, circles, squares, and arrows, or in making tables on computers. Students' difficulties with the features of the Microsoft™ drawing and painting tools diminished as they grew progressively more familiar with the tools' functional features each day.

Further, three or four students complained about their slow typing speed and asked someone else to type for them. One male student worked very hard on the computer, but he needed much help from the teacher in order to understand the content of the textual information due to his limited English proficiency. He was quite frustrated by his inability to spell English words accurately and by his slow typing speed on computers. Six or seven students spent much time playing with the functions of the computer drawing and painting tools during study time; apparently, computers continued to distract these students' studying process. They were not able to finish their studying by the end of the prescribed study time on days one, two, three, and four.



Elements that contributed to the effectiveness and lack of effectiveness of the use of computers to generate visualizations in the workshops and during study time

The participants in the visualization/computer group were taken to the computer lab in a separate building near their science classroom for six days. There were 21 PC computers in the lab, each connected to the Internet. Every student had his or her own I.D. and password for logging in to the computer system. However, none of the computers was networked to a printer, so saving all files for printing was impractical. All students were asked to save their work in folders on the computers at the end of study time.

On the second day, only 19 computers were available for 22 students because of technological problems, and six students were paired with partners to share a computer during the workshop and study time. The teacher delivered the visualization workshop using Microsoft™ PowerPoint slides, which were presented on a big TV screen at the front of the computer lab. Students had to arrange their seats in order to see the TV screen during the workshop. For students sitting in the back of the room, the words presented on the TV screen were too small to read, and they were told to read their slide handouts during the workshop. The teacher wrote the complete steps for opening the Microsoft™ drawing and painting tools on the blackboard located at the back of the computer lab and continuously walked around the lab to make sure every student followed directions. Some students did not bring paper or pencil to the computer lab on the first day, so the researcher provided those students with pencils and pens.

During the workshop, when asked to practice visualizing sequential relationships found in the text on computers, all the students were busy drawing circles or rectangles with connecting lines. Several spent much time figuring out how to type words into the created circles or rectangles. Two or three students chose to draw on paper first before constructing visualizations on computers. Many students typed using only the forefinger of their right or left hand, and this slow typing process made them dislike what they had to do on computers. Three or four students tended to focus on beautifying their matrices with lots of colors and totally neglected the accuracy of the visual representations of concepts. Several students, probably three or four, looked around the room a lot and often expected the teacher to come to their seats and instruct them on what to do.

During the essay study time, one girl worked very diligently throughout the six days and constantly asked the teacher for immediate feedback on her visualizations. Male students tended to be highly distracted by the computer software and liked to play computer games. Two or three students drew graphics that were totally irrelevant to the science concepts being explored on days five and six, and the teacher instructed them to get back to their assigned task. Six students studied continuously and drew quietly at the computer regardless of the noise created by other students in the computer lab.

Students' attitudes toward generating visualizations during study time

For those students who visualized on paper during study time, 7 of the 30 expressed that generating visualizations showing interrelationships among concepts helped them learn, and that they could better understand what they read through the use of this strategy. One of the students even mentioned that the visualizing process helped her learn because she was a visual learner. Five students did not think that visualizations could be helpful to their studies, and they stated that they would rather use the study strategies they were used to, such as memorization or asking themselves questions during study time. One of those students disliked the use of visualization as a study strategy because he felt it took too much effort and time; he thought it was quite boring.

Students who received the computer-based visualization workshop training and who constructed graphics on computers during study time showed more willingness to include visualization as part of their regular study technique. Twelve of the 34 students indicated that visualization was helpful to their studying of the assigned textual information. One of them expressed that visualizing the relationships among concepts on computers made the learning "pretty fun." Another student said that graphing the reading content on her computer improved her understanding of the concepts. Four students mentioned that the use of computer tools to create visual representations was great because it helped them answer test questions better. However, five students did not think that visualization was a useful study strategy because they did not know what to do on the computers without the teacher's guidance. One of those students stated that she needed to spend more time learning how to create graphics that represent relationships among concepts on computers before she could use the strategy comfortably during independent study time. Some students were interested in using computers to draw the relationships among concepts although they had no or limited prior experience using the computer drawing and painting tools.

Students’ prior exposure to the visualization skills identified in the visualization workshops

Twenty-three of the 30 students in the visualization/paper group reported that they had not previously learned how to generate graphical visualizations showing interrelationships among science concepts. Five students in the visualization/paper group reported that they had learned a similar visualization skill at school, probably in grade five, six, or seven. However, only 2 of the 34 students in the visualization/computer group had learned how to construct visual representations of concepts before, in their seventh-grade classes.

Summary of results

The results of the statistical analyses conducted to answer the first three research questions indicated that the 8th graders who were trained to visualize on paper or on computers during study time scored higher on a comprehension posttest than the students who attended the non-visualization workshops and applied an unguided study strategy. However, no significant difference was found between the mean scores of the visualization/paper group and the visualization/computer group.

The control group showed the lowest level of engagement during study time, and its members did not necessarily use the entire study time to master the content being studied. Over the week, students in the control group became less engaged. In contrast, students in the visualization/paper group were highly engaged in generating visualizations associated with the concepts in the study essays. With the teacher’s guidance and the researcher’s assistance, most students in the visualization/computer group were also able to engage in creating meaningful visualizations on computers during the workshop and during study time. However, computers distracted some male students. Generally, female students in all three groups studied quietly for the entire study time, whereas the male students needed more teacher supervision. Over the week, students in the visualization/paper and visualization/computer groups became more engaged.

Elements in the school setting that contributed to instructional effectiveness in the computer-based visualization training included adults’ assistance with how to visualize on computers and the teacher’s feedback regarding the accuracy of students’ visualizations. Elements that interfered with instructional effectiveness in the computer-based visualization training, and that may explain the lack of difference between computer-based and paper-and-pencil-based visualization, were: (a) inordinate time spent exploring the features of the tools, (b) students’ slow typing speed, (c) students’ inefficient use of study time at the computer, and (d) computer distractions, such as multimedia programs or computer games. These elements might have negatively affected the test performance of male students in the visualization/computer group in particular.

Approximately 23% of students in the visualization/paper group and 35% of students in the visualization/computer group thought that visualization as a study strategy was helpful for their understanding of science concepts. Those students showed positive attitudes toward the use of visualization as a study strategy for learning science textual materials. About 16% of students in the visualization/paper group and 15% of students in the visualization/computer group felt negatively about the use of visualization. They claimed that visualization demanded too much time and effort, and that they needed a teacher’s guidance in how to visualize on computers and how to make use of the features of drawing and painting tools to generate visual representations of concepts.
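These percentages follow directly from the counts reported above (7 of 30, 12 of 34, 5 of 30, and 5 of 34). A minimal sketch in Python that reproduces the rounding; the counts are transcribed from this section, while the labels and code are ours:

```python
# Minimal check of the attitude proportions reported above.
# The counts come from this section; the labels are ours.
groups = {
    "visualization/paper, helpful": (7, 30),
    "visualization/computer, helpful": (12, 34),
    "visualization/paper, negative": (5, 30),
    "visualization/computer, negative": (5, 34),
}

for label, (count, n) in groups.items():
    print(f"{label}: {count}/{n} = {100 * count / n:.1f}%")
# -> 23.3%, 35.3%, 16.7%, 14.7%, matching the rounded figures in the text.
```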

Discussion

Constructivists contend that students should be provided with learning activities that help them negotiate meaning internally and socially, rather than simply receiving explanations of phenomena from those in the know (Fosnot, 1996). Student-generated visualization as a study strategy supports constructivist learning by allowing students to extract concepts in order to construct knowledge, and it enables them to think reflectively while they learn. Our findings provide further evidence that the visualizing process engages learners in meaningful learning. The construction of visual representations during study time can help individuals build internal associations among pieces of information in long-term memory, and external associations between new information and prior knowledge.



Similar to the findings of previous studies (Anderson-Inman & Ditson, 1999; Cifuentes & Hsieh, 2004b, 2004c), we found that students who were instructed in the structures of text, and who subsequently used visual conventions to spatially represent concepts on paper or on computers during their study time, performed better on the comprehension test than the groups who studied in the traditional manner of re-reading, underlining, and highlighting. The visualization training provided in this study successfully helped students develop the abilities to organize, monitor, and control their generative processes. The participants were able to generate meaningful visualizations with sufficient practice and with teachers’ feedback during study time regarding the appropriateness of their visualizations. Students who constructed meaningful visualizations deeply processed what they learned, and they felt more confident in their acquired knowledge.

Our study also provided evidence for Perkins and Unger’s (1999) theory that students who are able to construct their own representations of relationships among concepts demonstrate a deep understanding of those concepts. The recognition of structures in science expository texts and the use of visual conventions to spatially represent to-be-learned materials positively affected the way the young participants approached science texts. Student-generated visualization, when created with sufficient cues from adults, was relevant to the content and accurately depicted the interrelated concepts, which successfully improved learners’ understanding of those concepts.

Consistent with the findings of the Cifuentes and Hsieh (2004a) study, we also found that middle schoolers were easily drawn into unpurposeful tasks on computers as a result of being distracted by the perceived computer affordances. Students in the visualization/computer group tended to spend an inordinate amount of time adjusting the display of graphics, changing the color and font of text, and experimenting with different functions on the toolbar when constructing visual representations on computers. Several students set out to make their graphics very “artistic” and “appealing” with no consideration given to the content of their visualizations.

In contrast, the students in the visualization/paper group were able to concentrate on reading the information and finish studying. Compared to the visualization/paper group, students in the visualization/computer group were placed in a more complex environment with intervening elements that easily diverted their attention. Several students in the visualization/computer group were unaware of what they were doing at the computer and appeared to lack metacognitive strategies. Nevertheless, given visualization as a study strategy, students who constructed visualizations on paper or on computers engaged more in the meaning-making process than those who simply read and applied the unguided study strategy to master the content being studied. Visualization offered students a means to reflect on what they learned and to assess their understanding of concepts.

Further, the results of this study indicated no significant difference between the test performance of the visualization/paper group and the visualization/computer group. Contrary to the findings of the Cifuentes and Hsieh (2004a) study, we found that with the instructor’s and the researcher’s guidance and supervision in the computer lab, student-generated visualization on computers was as effective as visualization on paper for facilitating science concept learning. The differences between the findings of this study and those of Cifuentes and Hsieh (2004a) might be due to the teachers’ delivery of the training in visualization, the presence of teachers during students’ study time, the provision of immediate feedback from teachers regarding the appropriateness of student-generated visualizations during study time, and the availability of the researcher’s assistance with the features of the drawing and painting tools on computers. The participants in this study might also have been motivated to learn by external rewards, such as gifts or praise, granted on each day of the study.

The use of computer drawing and painting tools to generate visualizations during study time had a positive impact on middle school science concept learning in this study. The students in the visualization/computer group scored significantly higher on the comprehension posttest than the students who did not attend the computer-based visualization workshop and who applied the unguided study strategy during study time. With appropriate management and control of students’ behaviors in the computer lab, the students in the visualization/computer group were able to engage in generating facilitative visual representations of texts on computers. Computers thus functioned as “cognitive tools” with which students could reflect on, refine, and assess their structural knowledge.

Even though we found that constructing visualizations on paper and on computers during study time benefited middle school science concept learning, most participants in the visualization/paper and visualization/computer groups did not express a wish to use such a strategy as their regular study technique. The process of creating visualizations that show interrelationships among concepts required learners to be attentive to the learning tasks and to be willing to invest time and effort to think deeply about, organize, and process what they learned. Rather than invest in visualization, pupils are more likely to fall back on a quick and easy memorization strategy to master something new. Students rarely have the chance to develop the other useful study strategies necessary to become better learners. The findings of this study urge educators to help young, unsophisticated learners develop expertise in how to visualize so that the process becomes labor-saving and so that they can use that expertise to construct useful knowledge within each subject domain.

Limitations of the Study

The results of this study may not generalize to students of different grade levels or to students from dissimilar school settings. The effect of self-generated visualization as a study strategy on middle school science concept learning may be specific to the participants. The participants’ behaviors were influenced by the learning climate in class, as well as by their teacher’s expectations of them.

Further, the participating middle school had a limited number of computers in its lab; therefore, students’ access to the computers was restricted. Urban middle schools are more likely to be equipped with more and better computers, and students in those schools may be more competent in using computers as a study tool. Additionally, the learning outcomes might differ with the use of different drawing and painting tools on different computer platforms to generate visualizations. Also, the reading materials used in this study were drawn mainly from biology subject matter. It should be noted that the use of visualization as a study strategy for interpreting textual information from other domains might not produce outcomes similar to those of this study.

Conclusion

Visualization is a factor in scientific understanding (Earnshaw & Wiseman, 1992; Peltzer, 1988), and it is also an important part of becoming a creative thinker (De Bono, 1995; Torrance & Safter, 1999). Given the power of visualization as a study strategy and the limited visualization skills among learners, an orientation is essential to prepare learners to use visual techniques to represent interrelationships among concepts. With appropriate scaffolding, young learners can construct meaningful concept representations and progressively internalize visualization as part of their study strategy (Bliss, Askew, & Macrae, 1996).

Teachers need to provide regular feedback about student-generated representations so that students can repeatedly revise their visuals until they match scientific reality. Jonassen (1999) described the kinds of activities in which students can engage for meaning making. Such activities should be incorporated into the curriculum to prepare students for scientific thinking:

    The process for building the ability to model phenomena requires [1] defining the model, [2] using the model to understand some phenomena, [3] creating a model by representing real world phenomena and [4] making connections among its parts, and finally [5] analyzing the model for its ability to represent the world (p. 227).

The more time students spend modeling and practicing how to visually represent the interrelationships among concepts, the more likely they are to learn science concepts through visualization. Middle school students should be encouraged to engage in diverse practice in constructing their own concept representations while receiving expert or teacher feedback regarding the appropriateness of those representations.

Given the positive effect of training students to create visualizations using computer-based drawing and painting tools, students may be instructed to use computers as a study tool to intentionally organize and integrate content ideas. A computer-based visualization tool provides students with a means to depict the underlying structure of the ideas they are studying and to understand how new concepts relate to what they already know, which is the basis for meaning making (Jonassen, Peck, & Wilson, 1999). Training in metacognitive strategies in a computer-based learning environment is important for young learners’ efficient use of study time. When students study with computers, they need to be advised to regard computers as partners that help them construct knowledge rather than as playable toys. Curricula should include computer graphics activities from an early age so that computers can become partners that support learners in their efforts to become better thinkers, rememberers, and problem solvers (Jonassen, 2000). Visualization should be a focus of the science curriculum.



References

Anderson-Inman, L. (1992). Computer-supported studying for students with reading and writing difficulties. Reading and Writing Quarterly: Overcoming Learning Disabilities, 8, 317-319.

Anderson-Inman, L. (1996). Computer-assisted outlining: Information organization made easy. Journal of Adolescent & Adult Literacy, 39 (4), 316-320.

Anderson-Inman, L., & Ditson, L. (1999). Computer-based concept mapping: A tool for negotiating meaning. Learning and Leading with Technology, 26 (8), 6-13.

Anderson-Inman, L., Knox-Quinn, C., & Horney, M. A. (1996). Computer-based study strategies for students with learning disabilities: Individual differences associated with adoption level. Journal of Learning Disabilities, 29 (5), 461-484.

Anderson-Inman, L., & Zeitz, L. (1993). Computer-based concept mapping: Active studying for active learners. The Computing Teacher, 21 (1), 6-8, 10-11.

Bliss, J., Askew, M., & Macrae, S. (1996). Effective teaching and learning: Scaffolding revisited. Oxford Review of Education, 22 (1), 37-61.

Bloom, B. S., & Krathwohl, D. R. (1956). Taxonomy of educational objectives: The classification of educational goals, handbook I: Cognitive domain. New York: Longmans.

Bransford, J. D. (Ed.) (2000). How people learn: Brain, mind, experience, and school. Committee on Developments in the Science of Learning and Committee on Learning Research and Educational Practice, Commission on Behavioral and Social Sciences and Education, National Research Council. Washington, DC.

Cifuentes, L. (1992). The effects of instruction in visualization as a study strategy. Unpublished doctoral dissertation, Chapel Hill, NC: University of North Carolina.

Cifuentes, L., & Hsieh, Y. C. (2004a). Visualization for middle school students' engagement in science learning. Journal of Computers in Mathematics and Science Teaching, 23 (2), 109-137.

Cifuentes, L., & Hsieh, Y. C. (2004b). Visualization for construction of meaning during study time: A quantitative analysis. International Journal of Instructional Media, 30 (3), 263-273.

Cifuentes, L., & Hsieh, Y. C. (2004c). Visualization for construction of meaning during study time: A qualitative analysis. International Journal of Instructional Media, 30 (4), 407-417.

De Bono, E. (1995). Mind power. New York: Dorling Kindersley.

Derry, S. J., & Lajoie, S. P. (1993). A middle camp for (un)intelligent instructional computing: An introduction. In S. P. Lajoie & S. J. Derry (Eds.), Computers as cognitive tools, Hillsdale, NJ: Lawrence Erlbaum Associates, 1-11.

Downes, T. (2000). The computer as a "playable" tool. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA. (ERIC Document Reproduction Service No. ED 442 454).

Driver, R. (1983). The pupil as scientist? Milton Keynes, England: Open University.

Earnshaw, R. A., & Wiseman, N. (1992). An introductory guide to scientific visualization. New York, NY: Springer-Verlag.

Fosnot, C. T. (1996). Constructivism: Theory, perspectives, and practice. New York, NY: Teachers College Press.

Jonassen, D. H. (1996). Computers in the classroom: Mindtools for critical thinking. Englewood Cliffs, NJ: Prentice-Hall.

Jonassen, D. H. (1999). Designing constructivist learning environments. In C. M. Reigeluth (Ed.), Instructional-design theories and models: A new paradigm of instructional theory, Hillsdale, NJ: Lawrence Erlbaum, 215-239.

Jonassen, D. H. (2000). Computers as mindtools for schools: Engaging critical thinking. Upper Saddle River, NJ: Merrill.

Jonassen, D. H., & Reeves, T. C. (1996). Learning with technology: Using computers as cognitive tools. In D. H. Jonassen (Ed.), Handbook of research on educational communications and technology, New York: Macmillan, 693-719.

Jonassen, D. H., Peck, K. L., & Wilson, B. G. (1999). Learning with technology: A constructivist perspective. Upper Saddle River, NJ: Prentice Hall.

Julyan, C., & Duckworth, E. (1996). A constructivist perspective on teaching and learning science. In C. T. Fosnot (Ed.), Constructivism: Theory, perspectives, and practice, New York, NY: Teachers College Press, 55-72.

Kinchin, I. M. (2000). Concept mapping in biology. Journal of Biological Education, 34 (2), 61-68.

Lajoie, S. P. (1993). Computer environments as cognitive tools for enhancing learning. In S. P. Lajoie & S. J. Derry (Eds.), Computers as cognitive tools, New Jersey: Lawrence Erlbaum, 261-288.

Lanzing, J. W. A. (1998). Concept mapping: Tools for echoing the mind's eye. Journal of Visual Literacy, 18 (1), 1-14.

Lee, P-L. H. (1997). Integrating concept mapping and metacognitive methods in a hypermedia environment for learning science. Unpublished doctoral dissertation, West Lafayette: Purdue University.

Novak, J. D. (1998). Learning, creating, and using knowledge: Concept maps as facilitative tools in schools and corporations. New Jersey: Lawrence Erlbaum.

Peltzer, A. (1988). The intellectual factors believed by physicists to be most important to physics students. Journal of Research in Science Teaching, 25 (9), 721-731.

Perkins, D. N., & Unger, C. (1999). Teaching and learning for understanding. In C. M. Reigeluth (Ed.), Instructional-design theories and models: A new paradigm of instructional theory, Hillsdale, NJ: Lawrence Erlbaum, 91-114.

Plotnick, E. (1997). Concept mapping: A graphical system for understanding the relationship between concepts. ERIC Clearinghouse on Information and Technology, Syracuse University, Syracuse, NY (ERIC Document Reproduction Service No. ED 407 938).

Strauss, E., & Lisowski, M. (1998). Biology: The web of life. Menlo Park, CA: Scott Foresman-Addison Wesley.

Taiwan National Institute for Compilation and Translation (2000). Middle school biology: Volume one. Taipei: National Institute for Compilation and Translation.

Torrance, E. P., & Safter, H. T. (1999). Making the creative leap beyond… Buffalo, NY: Creative Education Foundation Press.

Wileman, R. E. (1993). Visual communicating. Englewood Cliffs, NJ: Educational Technology Publications.

Wittrock, M. C. (1990). Generative processes of comprehension. Educational Psychologist, 24 (4), 345-376.



Rafi, A., Samsudin, K. A., & Ismail, A. (2006). On Improving Spatial Ability Through Computer-Mediated Engineering Drawing Instruction. Educational Technology & Society, 9 (3), 149-159.

On Improving Spatial Ability Through Computer-Mediated Engineering Drawing Instruction

Ahmad Rafi
Faculty of Creative Multimedia, Multimedia University, Cyberjaya, 63100 Selangor, Malaysia
Tel: +603-8312 5555
Fax: +603-8312 5554
ahmadrafi.eshaq@mmu.edu.my

Khairul Anuar Samsudin and Azniah Ismail
Fakulti Teknologi dan Komunikasi, Universiti Pendidikan Sultan Idris, Tanjung Malim, 35900 Perak, Malaysia
Tel: +605-450 5021/5041/5050
Fax: +605-458 2615
kanuar@upsi.edu.my
azniah@upsi.edu.my

ABSTRACT

This study investigates the effectiveness of computer-mediated Engineering Drawing instruction in improving spatial ability, namely spatial visualisation and mental rotation. A multifactorial quasi-experimental study was employed, involving a cohort of 138 undergraduates with an average age of 20. Three interventional treatments were administered: the Interactive Engineering Drawing Trainer (EDwgT), conventional instruction using printed materials enhanced with digital video clips, and conventional instruction using printed materials only. Using a 3 x 2 x 2 factorial design, the research found statistically significant improvements in spatial visualisation tasks. There appears to be no improvement in reaction times for mental rotation. The authors also investigated gender differences and the influence of prior spatial experience.

Keywords

Spatial ability, Spatial visualisation, Mental rotation, Computer-mediated Engineering Drawing

Introduction to spatial ability

Spatial ability is widely perceived by the scientific community to be a vital trait of human intelligence. It involves the cognitive processes by which the brain perceives and interprets certain types of incoming information, namely visual and spatial information. A simple example of processing this visual-spatial information is one's ability to remember how to move around in a house. The skill becomes evident when individuals engage in activities rich in visual and spatial content, such as Engineering, Science, Architecture and Technology. Though deemed vital, training and practice in spatial skills seldom receive the desired attention; historically, other forms of intelligence, mathematical ability in particular, have been given more emphasis, despite recent evidence suggesting that spatial ability is of similar importance among human mental capabilities. According to Smith (1964), many spatial ability tests were prone to assess practical and mechanical abilities that are useful in predicting success in technical fields, but not to serve as measures of abstract reasoning abilities.

Several definitions have been proposed; all have different nomenclatures, but in essence they indicate this ability to be a non-unitary construct of human intelligence deemed very important in life. Mayer and Sims (1994) explained spatial ability as the ability to rotate or fold objects in two or three dimensions and to imagine the changing configurations. According to Sternberg (1990), one's spatial ability pertains to the visualisation of shapes, the rotation of objects, and how pieces of a puzzle would fit together. Likewise, Linn and Petersen (1985) indicated that the skills of representing, transforming, generating and recalling symbolic and nonlinguistic information are associated with this ability. Spatial ability is also viewed as multifaceted, comprising several sub-skills. A meta-analysis of studies conducted between 1974 and 1982 by Linn and Petersen (1985) suggested that spatial ability is composed of three distinct components, namely spatial visualisation, spatial perception and mental rotation.

Factors influencing the development of spatial ability

Factors influencing the development of spatial ability, such as gender, age and spatial-related experience, have been identified in several psychological studies (Miller, 1996). Results of meta-analyses (Linn & Petersen, 1985; Voyer et al., 1995) show that gender differences in favor of males exist in spatial abilities. The magnitude of these gender differences is slight, moderate and pronounced in spatial visualisation, spatial perception and mental rotation respectively. Mental rotation comprises the sequential stages of perceiving, rotating, and comparing; men are arguably faster at the second stage, where women are relatively slower (Kail et al., 1979; Petrusic et al., 1978), contributing to this disparity. Explanations from biological and non-biological perspectives, namely socio-cultural factors, have been offered for women's relatively lower performance in spatial tasks. Several theories, such as male cerebral lateralisation (Gur et al., 2000; Vogel et al., 2003), sex hormones (Hier & Crowley, 1982), and x-linked genetic theory (Harris, 1978), explicate the cause of gender differences from a biological perspective.

Gender differences in spatial aptitude may also be caused by several socio-cultural factors, such as gender-typed socialisation. Children's conceptions of how to behave are shaped by gender-linked beliefs about them resulting from their parents' notions of gender-related abilities (Swim, 1996). Girls are inadvertently not encouraged to pursue an interest in masculine domains, which are heavily loaded with spatial reasoning aspects. Females may also underestimate their potential in spatial domains because parental beliefs affect their self-perception of capability (Fouad & Smith, 1996). Male children have more opportunities to engage in spatial activities at an early age through gender-biased activities that promote spatial skills development (Baenniger & Newcombe, 1989). The situation is exacerbated by girls' lack of practice, which engenders low motivation, discouraging engagement in spatial tasks or increasing the likelihood of failure (Stericker & LeVesconte, 1982).

Age-related differences in spatial ability were investigated by Piaget and Inhelder (1971), who suggested that an individual's cognitive development determines the potential of what one could achieve. At the initial stage, a child exhibits a purely egocentric view of the world, which continues into the second stage. At the third stage, in the range of 7 to 12 years old, a child develops spatial thoughts independent of images but still requires the actual presence of the object being manipulated. This stage of development concurs with Ben-Chaim et al.'s (1989) proposition that the optimal age for acquiring spatial skills through instruction is between 11 and 12 years old. The capability of exploring mental manipulation involving infinite spatial possibilities and complex mathematical concepts is attainable at the formal operational stage, from the age of 13 onwards. The maximum potential is reached at the age of 17, implying that students in high schools or post-secondary education are formal operational thinkers. Individuals' spatial ability seems to reach maturation at adolescence and gradually declines in the late twenties in the general population due to aging effects, even among individuals who use these abilities in their profession (Salthouse et al., 1990).

Prior to Maccoby and Jacklin's (1974) study, there were many hypotheses debating whether spatial ability was inherited or learned behavior. Findings of research after this time provide evidence that spatial skills can be taught (Alington et al., 1992; Bishop, 1989; Petrusic et al., 1978). In 1987, the National Council of Teachers of Mathematics recommended that individuals must have experience with geometric properties and relationships to help develop this ability. The extent of improvement and the level of performance gain for both sexes are very important considerations in studies dealing with spatial skills training. Practice effects have also been detected in previous research on spatial training, for both general and specific skills. Empirical evidence shows that females improved through practice at a significantly higher rate because, according to Baenniger and Newcombe (1989), men might be operating closer to their maximum potential. On the other hand, Sorby's (1999) study suggests that there is significant room for improvement of spatial skill for both men and women, depending on the difficulty of the task. Of late, various instructional intervention programs using computer and multimedia technologies have been carried out to address spatial skills acquisition (Ahmad Rafi et al., 2005; Khairul & Azniah, 2004; Lehmann, 2000; Turos & Ervin, 2000), and their impact for both genders reveals promising results.

Purpose of Study

Spatial skills such as spatial visualisation and mental rotation are important abilities in tertiary education, particularly in fields that rely heavily on spatial and visual understanding, such as Engineering, Computer Graphics, Architecture and other scientific courses. Investigating how to develop these cognitive skills among undergraduates is vital to assist students lacking these abilities. Comparing the effectiveness of training under different instructional methods will help educators choose approaches that maximise the training impact. Demographic factors such as gender and spatial experience also contribute to the overall training outcomes. Considering these points, three research questions were formulated to articulate the study, as follows:



a) Will learners' spatial visualisation and mental rotation abilities improve after performing Engineering Drawing tasks?
b) Are there substantial differences in improvement attributable to the methods of instruction used for Engineering Drawing tasks?
c) Do demographic factors such as gender and spatial experience substantially influence training outcomes?

In this study, three experimental conditions involving Engineering Drawing exercises were administered, namely the computer-mediated Interactive Engineering Drawing Trainer (EDwgT), conventional instruction using printed materials enhanced with digital video clips, and conventional instruction using printed materials only. The conventional instructional method was operationalised as a learning strategy that employs the same Engineering Drawing training materials in printed form with tutor supervision. Several predictions were made in order to answer the research questions with regard to spatial skills improvement, the effectiveness of training methods and the influence of demographic factors on training outcomes. These are:

a) learners' performance gains in the spatial ability measures, namely spatial visualisation, mental rotation accuracy and mental rotation speed, will be greatest with the computer-mediated EDwgT, followed by video-enhanced conventional instruction, and finally conventional instruction;
b) male learners will achieve greater performance gains in the spatial ability measures than female learners;
c) high spatial experience learners will achieve greater performance gains than low spatial experience learners.

Methods

Participants

A cohort of one hundred and thirty-eight (138) undergraduate students with an average age of 20, comprising equal numbers of females and males, participated in the experiment. The sample was drawn from an intact Computer-Aided Design class of students majoring in Information Technology and Communication (ITC) at Universiti Pendidikan Sultan Idris, Malaysia. Convenience sampling was used, and the participants were randomly assigned to three treatment groups, forming two experimental groups and one control group. The treatments for the two experimental conditions were computer-mediated instruction and video-enhanced conventional instruction; the control group received conventional instruction only. All participants were given course credits for their participation.

Instructional Materials and Apparatus

The EDwgT was used by each participant on a personal computer with a pre-installed multimedia program freely downloaded from the Visualisation and Spatial Reasoning website (Blasko & Holiday-Darr, 2000). It was developed to replicate typical Engineering Drawing tasks comprising multiple 2D orthographic views together with isometric representations. The program allowed the user to learn the fundamentals of constructing orthographic projections from an isometric view. Several utility buttons were also embedded in the interface for user guidance while exploring the problem-solving activities.

Individual differences in spatial visualisation and mental rotation were measured using the Spatial Visualisation Test (Middle Grades Mathematics Project, 1983) and an online mental rotation test developed by Chay (2000) based on Vandenberg and Kuse (1978), respectively. The categorisation of participants into high and low spatial experience levels was adopted from the Spatial Experience Questionnaire (SEQ) designed by McDaniel et al. (1978).

The first instrument comprises 32 multiple-choice items with five response choices for each item. Items consisted of views of one-, two- and three-dimensional figures shown in line drawings, to be completed within 25 to 35 minutes. The second instrument, the electronic test version, comprises 30 evaluation items, each composed of a target and a comparison figure positioned at various degrees of orientation. Scoring involved assigning one point for each correct response, and all questions were randomly presented to avoid order effects.
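The administration logic just described (30 items, random presentation order, one point per correct response, reaction times recorded) can be sketched as follows. This is an illustrative sketch only, not the actual program by Chay (2000); the function name and data layout are our assumptions:

```python
import random
import time

def administer_mental_rotation_test(items, get_response):
    """Sketch of the described test logic, not Chay's (2000) actual program.

    items: list of (target, comparison, is_match) tuples, e.g. 30 of them.
    get_response: callable returning the participant's True/False judgement.
    Returns the accuracy score (max = len(items)) and total reaction time.
    """
    random.shuffle(items)                      # random order avoids order effects
    score, total_time = 0, 0.0
    for target, comparison, is_match in items:
        start = time.monotonic()
        answer = get_response(target, comparison)
        total_time += time.monotonic() - start  # response time in seconds
        if answer == is_match:
            score += 1                          # one point per correct response
    return score, total_time
```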



Procedure

This study employed a factorial pre-test post-test quasi-experimental (3 x 2 x 2) design comprising three independent variables, namely treatment group (3 levels: experimental 1, experimental 2 and control), gender (2 levels: female and male) and prior spatial experience (2 levels: high and low). The dependent variables were the mean scores of the spatial visualisation and mental rotation tests, which served as both pre-test and post-test measures. The SEQ is a 25-item self-report questionnaire consisting of a list of spatial activities such as drawing, map reading, and playing chess. Each item has a four-point scale ranging between two extremes (i.e. very often and never) to be rated by the participants. The two levels of spatial experience were categorised based on the total score of the questionnaire, such that low experience was 42 and below, and high experience was 43 and above. For the mental rotation test, it was explicitly explained to all participants that their performance depended on both accuracy and speed, and they were allowed three minutes to complete the test. A practice session on two to three items of the online mental rotation test was given prior to the pre-test measurements to allow participants to familiarise themselves with the interactive instructions and interfaces.
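The SEQ categorisation described above reduces to a simple threshold on the questionnaire total. A minimal sketch, assuming the four-point scale is coded 1-4 (the function name and sample ratings are ours):

```python
# Sketch of the SEQ-based grouping: 25 items on a four-point scale
# (assumed coded 1-4); a total of 43 or above = high spatial experience,
# 42 or below = low, as described in the text.
def spatial_experience_level(ratings):
    assert len(ratings) == 25 and all(1 <= r <= 4 for r in ratings)
    return "high" if sum(ratings) >= 43 else "low"

print(spatial_experience_level([2] * 25))  # total 50 -> "high"
print(spatial_experience_level([1] * 25))  # total 25 -> "low"
```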

Data containing the number of correct attempts and response times in seconds were automatically generated and retrieved for analysis in a statistical program. The spatial visualisation test was administered with manual instructions, and students were given 25 to 30 minutes to complete the tasks. During the intervention, training with EDwgT to improve performance in spatial tasks was applied to the first experimental group in the computer lab.
Figure 1: The Interactive Engineering Drawing Trainer (EDwgT) window

The second experimental group received similar instructional materials in printed form, augmented with video materials. The digital video clips were produced using TechSmith's Camtasia Studio 2 screen motion capture software to record screen activities into avi files. Participants were advised to use the clips to assist with problem solving.

Figure 2: A snapshot of a video clip launched by users



The control group received only printed instructional materials consisting of Engineering Drawing exercises scheduled in five sessions, two hours per week for five weeks. All groups received the same training materials, time allocation and time of day of training to avoid external threats confounding the treatment. This ensured that any differences in performance after treatment could be attributed solely to the effectiveness of the instructional approaches used. The authors were the facilitators throughout the training sessions, providing guidelines and directions to ensure participants could follow the treatment programs with ease. The post-test was administered in the fifth week using the same tests as in the pre-testing stage, to reveal any performance gain resulting from the spatial training. The five-week gap between pre-testing and post-testing helps ensure that any improvement detected is not attributable to a test-retest effect but is due to the intended effect of the spatial training.

Data Analysis

Table 1 summarises the means and standard deviations of the pre-test and post-test measurements. It is important to ensure that the groups were statistically equivalent prior to treatment. Three independent-sample t-tests on each pre-test measure of spatial ability revealed no significant group differences at p > 0.05. Similar tests conducted to detect differences due to gender and spatial experience showed that the groups were statistically equivalent, except on the spatial visualisation pre-test measure, where high spatial experience participants were more accurate than their low spatial experience counterparts. The absence of gender differences on both measures of spatial ability, especially mental rotation, was surprising, as most of the literature reports otherwise.
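A pre-test equivalence check of this kind can be sketched with an independent-samples t-test; the scores below are hypothetical stand-ins for the actual pre-test data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical pre-test scores for two of the three groups (n = 46 each,
# i.e. 138 participants split across three groups).
group_a = rng.normal(14, 3.5, 46)  # e.g. experimental 1
group_b = rng.normal(14, 3.5, 46)  # e.g. experimental 2

t, p = stats.ttest_ind(group_a, group_b)
print(f"t = {t:.2f}, p = {p:.3f}")  # p > .05 -> no evidence of a group difference
```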

Table 1: Means and standard deviations for the spatial ability measures (means, with standard deviations in parentheses)

Conditions                   Spatial Visualisation (max 32)   Mental Rotation accuracy (max 30)   Mental Rotation reaction time (secs)
                             Pre            Post              Pre            Post                 Pre              Post
Experimental 1
  Females, high exp.         13.70 (4.69)   19.40 (2.63)      24.00 (2.71)   27.10 (1.20)         124.88 (42.81)   100.67 (32.38)
  Females, low exp.          12.46 (2.66)   16.15 (2.64)      19.77 (1.79)   22.54 (2.50)         118.43 (40.51)   126.90 (32.00)
  Males, high exp.           15.60 (4.03)   20.10 (3.96)      24.60 (2.37)   27.00 (1.25)         113.31 (25.68)    94.70 (24.41)
  Males, low exp.            14.23 (3.92)   19.15 (3.83)      19.92 (1.15)   22.15 (2.11)         113.98 (42.27)   119.46 (32.70)
  Total (females + males)    13.91 (3.86)   18.57 (3.57)      21.78 (2.97)   24.39 (3.01)         119.10 (34.87)    97.68 (28.08)
Experimental 2
  Females, high exp.         14.00 (4.04)   17.88 (2.75)      22.13 (2.36)   24.00 (2.92)         142.32 (34.06)   119.56 (30.16)
  Females, low exp.          12.93 (2.63)   15.07 (2.02)      21.07 (3.56)   22.67 (3.41)         124.58 (38.07)   117.87 (31.90)
  Males, high exp.           15.67 (4.00)   20.00 (2.87)      22.11 (3.62)   24.11 (3.57)         115.80 (39.86)   105.58 (43.79)
  Males, low exp.            13.36 (3.43)   17.50 (2.07)      20.00 (3.68)   23.14 (3.61)         137.96 (39.53)   129.61 (28.43)
  Total (females + males)    13.78 (3.46)   17.26 (2.89)      21.13 (3.44)   23.33 (3.37)         130.03 (38.24)   119.34 (33.23)
Control
  Females, high exp.         14.83 (4.31)   18.67 (2.07)      19.83 (3.54)   21.00 (3.80)         138.52 (29.80)   117.97 (27.36)
  Females, low exp.          12.82 (2.74)   14.59 (3.02)      19.82 (3.50)   22.35 (3.32)         141.85 (38.21)   127.74 (36.85)
  Males, high exp.           15.70 (3.77)   18.00 (4.11)      19.60 (3.37)   21.80 (3.68)         137.79 (40.33)   124.29 (40.83)
  Males, low exp.            14.31 (3.17)   16.92 (2.81)      20.69 (3.04)   21.85 (3.58)         150.28 (24.25)   131.92 (25.32)
  Total (females + males)    14.13 (3.40)   16.52 (3.43)      20.02 (3.27)   21.91 (3.44)         142.92 (33.54)   126.90 (33.02)

Spatial Visualisation

The analysis of variance found that the three-way interaction between method of instruction, gender and level of spatial experience was not significant, F(2,126) = 0.58, p = .56. Similarly, none of the two-way interactions among the three independent variables was significant. The main effect for method of instruction was significant, F(2,126) = 3.47, p = .03, showing better performance for participants receiving training with EDwgT compared to the other two methods. The gender factor produced a significant main effect, F(1,126) = 9.91, p = .002, indicating that males were more accurate in spatial visualisation tasks than their female counterparts. A highly significant main effect of spatial experience was detected, F(1,126) = 21.60, p = .0005, revealing better spatial visualisation performance for high spatial experience participants compared to their low spatial experience counterparts.

Table 2 summarises the results of the main effects and interactions for the spatial visualisation post-test.

Table 2: ANOVA for the Spatial Visualisation Test

Source                                   df    Sum of Squares    Mean Square    F Value
Method                                    2             60.96          30.48      3.47*
Gender                                    1             87.16          87.16      9.91**
Spatial Experience                        1            190.01         190.01     21.60***
Method x Gender                           2             11.28           5.64        .64
Method x Spatial Experience               2              2.01           1.01        .11
Gender x Spatial Experience               1             27.84          27.84       3.17
Method x Gender x Spatial Experience      2             10.22           5.11        .58
Error                                   126           1108.37           8.80

*p < .05, **p < .01, ***p < .001
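An ANOVA of this form (three main effects plus all two- and three-way interactions of the 3 x 2 x 2 design) can be sketched as follows; the data frame is a hypothetical stand-in for the study's post-test scores and factors:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(1)
n = 138
# Hypothetical stand-in for the study's post-test scores and factors.
df = pd.DataFrame({
    "score": rng.normal(18, 3, n),
    "method": rng.choice(["EDwgT", "video", "conventional"], n),
    "gender": rng.choice(["F", "M"], n),
    "experience": rng.choice(["high", "low"], n),
})

# Full factorial model: three main effects, three two-way interactions,
# and the three-way interaction, analogous to Table 2.
model = smf.ols("score ~ C(method) * C(gender) * C(experience)", data=df).fit()
print(anova_lm(model, typ=2))
```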


Mental Rotation (Accuracy)

The main effect for method of instruction was highly significant, F(2,126) = 9.90, p = .0005, demonstrating greater mental rotation accuracy for participants receiving training with EDwgT compared to the other two instructional approaches. The spatial experience factor produced a significant main effect, F(1,126) = 10.08, p = .002, indicating that participants with high spatial experience were more accurate in mental rotation tasks than their low spatial experience counterparts. The main effect for gender was not significant, F(1,126) = 0.02, p = .90.

A simple main effects analysis for the high spatial experience participants indicated a significant difference between the mean mental rotation accuracy scores across methods of instruction, F(2,50) = 17.94, p = .0005. A follow-up comparison using Tukey's procedure (alpha = .05) revealed that the mean mental rotation accuracy of the EDwgT group was significantly higher than that of the second experimental group, which received video-enhanced conventional instruction. Likewise, the mean mental rotation accuracy of this second experimental group was significantly higher than that of the control group. A simple main effects analysis for the low spatial experience participants did not show any significant differences among the three groups, F(2,82) = .47, p = .63.
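A follow-up comparison of this kind can be sketched with Tukey's procedure; the scores and group sizes below are hypothetical (the text reports F(2,50), implying roughly 53 high-experience participants across the three groups):

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(2)
# Hypothetical mental rotation accuracy scores for high spatial experience
# participants in the three instructional groups.
scores = np.concatenate([
    rng.normal(27, 1.5, 19),  # EDwgT
    rng.normal(24, 3.0, 17),  # video-enhanced conventional
    rng.normal(21, 3.5, 17),  # conventional (control)
])
groups = ["EDwgT"] * 19 + ["video"] * 17 + ["control"] * 17

# Pairwise comparisons with Tukey's procedure at alpha = .05, as in the text.
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))
```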

Mental Rotation (Reaction Time/Speed)

The three-way interaction between method of instruction, gender and level of spatial experience was not significant, F(2,126) = 0.63, p = .53. Likewise, the three two-way interactions among the independent variables were not significant in the same analysis. The main effect for method of instruction was not significant, F(2,126) = 2.25, p = .11, showing that all the groups made statistically similar performance gains. The gender factor did not produce a significant main effect, F(1,126) = 0.02, p = .88, indicating that female and male participants had similar reaction times in completing the mental rotation tasks. However, there was a significant main effect for the spatial experience factor, F(1,126) = 6.82, p = .01, indicating that high spatial experience participants were faster in mental rotation tasks than participants with low spatial experience after performing the Engineering Drawing tasks.

Table 4 summarises the results of the main effects and interactions for the mental rotation speed post-test.

Table 4: ANOVA for the Mental Rotation Test (Reaction Time)

Source                                   df    Sum of Squares    Mean Square    F Value
Method                                    2           4805.66        2402.83       2.25
Gender                                    1             23.58          23.58        .02
Spatial Experience                        1           7282.26        7282.26      6.82*
Method x Gender                           2            755.81         377.91        .35
Method x Spatial Experience               2           1791.32         895.66        .84
Gender x Spatial Experience               1            432.01         432.01        .41
Method x Gender x Spatial Experience      2           1346.08         673.04        .63
Error                                   126         134556.78        1067.91

*p < .05
*p


Discussion

Spatial Visualisation

Despite the absence of any significant gender difference prior to treatment, male participants were found to be relatively superior to females on the spatial visualisation measure, as anticipated in the second prediction. Males may have developed a greater understanding of spatial properties and relationships, enabling them to transfer this skill to spatial visualisation tasks. The relative reduction in female performance in handling Engineering Drawing tasks may be partly attributed to the belief that it is a male-biased activity seldom encountered by females, in line with Spencer et al.'s (1999) theory of stereotype threat. The widening of the difference in performance between the high and low spatial experience groups after training suggests that the level of spatial experience can produce different training outcomes. A significant difference attributable to spatial experience was detected favoring high spatial experience participants, thus supporting the third prediction. The five-week spatial training certainly strengthened the former's spatial visualisation skill, making them more effective in managing spatial strategy. The absence of any interactions among the three variables suggests that engagement in Engineering Drawing tasks can improve spatial visualisation skill irrespective of gender and level of spatial experience.

Mental Rotation Accuracy

A highly significant performance gain in mental rotation accuracy was also found favoring participants who received computer-mediated instruction compared to the other groups. The performance gain of the second experimental group was significantly higher than that of the control group, hence confirming the first prediction of the study for this measure. The second prediction, of males outperforming females in mental rotation accuracy, did not materialise, as both achieved equivalent improvements in correct responses on the mental rotation measure after training. The significantly greater performance gain of the high spatial experience group over those with low spatial experience supported the third prediction for mental rotation accuracy. The presence of an interaction between the instructional method and spatial experience factors influenced the overall spatial training. High spatial experience participants who trained with EDwgT showed higher mental rotation accuracy than the other groups. Participants of similar spatial experience level trained with the video-enhanced approach attained better mental rotation accuracy than high spatial experience participants in the control group. By contrast, low spatial experience participants performed equally well regardless of the instructional method employed.

Therefore, any training program to enhance mental rotation accuracy involving learners with high spatial experience must take into consideration the type of instructional media employed. The instructional approach adopted will determine the level of achievement, as demonstrated in this study, where computer-mediated spatial training produced the best outcome. However, the method of instruction used for spatial training in mental rotation accuracy may not be as critical for low spatial experience learners. One plausible explanation is that the effectiveness of such an educational tool was not actualised, since the low spatial experience participants may not possess sufficient experience or exposure (Reed & Overbaugh, 1993) and proficiency in using computer-mediated instruction, which entails a considerable degree of skill in human-computer interaction. This is supported by the fact that several items in the SEQ cover computer software and games experience. Another contributing factor was the duration of the course, which may not have suited this group well; they may require a longer training period to gain confidence and ease in using novel but unfamiliar methods.

Mental Rotation Speed

The methods of training to enhance participants' reaction time in mental rotation tasks did not support the first prediction for this spatial skill measure. The ability of the three groups to perform mental rotation tasks as quickly as possible was found to improve slightly through training, irrespective of the instructional approach used; performing the Engineering Drawing exercises produced an equivalent impact on reaction time regardless of instructional method. The instructional method may not be the critical determinant; rather, the nature of the Engineering Drawing activities themselves needs to be analysed. The Engineering Drawing problems used mainly involved completing incomplete orthographic and isometric views, requiring learners to visualise three-dimensional objects transforming from two-dimensional planar faces.

This spatial information processing is cognitively more related to spatial visualisation than to mental rotation; although the latter cognitive function is involved, its speed factor is not much emphasised when performing Engineering Drawing tasks. No gender difference was detected to support the second prediction for mental rotation speed. Both females and males made similar gains, becoming slightly quicker in mental rotation after training. This implies that engaging in Engineering Drawing activities can be quite beneficial for both sexes. A significant difference attributable to the spatial experience factor was found, confirming the third prediction for mental rotation speed. High spatial experience participants were faster than the low spatial experience group. Their relatively greater spatial experience may have helped them mentally rotate objects much more quickly through the enhanced mental rotation skills they possessed. The absence of any interactions among the variables suggests that engagement in Engineering Drawing tasks in the three different treatment conditions can slightly improve mental rotation speed irrespective of gender.

Conclusion and Recommendations

Computer-mediated instruction for spatial training with Engineering Drawing exercises substantially improved spatial visualisation; however, the improvement in mental rotation was marginal. Individual performance gains depended on the instructional method, particularly its high degree of interaction and interactive simulation feedback, which enhanced learners' understanding and comprehension. Performance improved through fewer errors and reduced reaction times as training progressed. This effectiveness was to a certain extent replicated for improving mental rotation accuracy, but not for the speed factor; in fact, all three methods employed produced an equivalent impact on this speed measure of mental rotation. Engineering Drawing activities seem to tap spatial visualisation skill more than mental rotation skill. One plausible explanation is that training using Engineering Drawing activities employs learning content and a presentation strategy that are not suited to tapping the mental operations associated with mental rotation speed. Specific mental rotation training is more effective when it uses learning content such as object-matching exercises and spatial exercises presented with time limits (Khairul & Azniah, 2004; Turos & Ervin, 2000).

Training programs also need to weigh mediating factors such as gender and spatial experience. The level of<br />

spatial experience has been found to be a critical factor recurring significantly in the analysis of spatial<br />

visualisation and mental rotation abilities. Gender was also found to be a factor that may impact training<br />

outcomes using different methods of training in spatial visualisation skill. Emphasis should be given to spatial<br />

experience and gender when implementing interventional or remedial program of sort. This will ensure<br />

appropriate utilisation of instructional method for a particular group to maximise training efficacy. More studies<br />

employing stringent experimental design methodology with randomised samples should be undertaken for future<br />

research to establish evidence for the general students population. Experimental conditions can be manipulated<br />

to include latest computer technologies such as multimedia, virtual reality (VR), collaborative environment, and<br />

time-emphasis practice sessions. Student characterictics such as learning styles can be also studied to examine its<br />

impact on spatial skill training. This is vital since solving spatial tasks of spatial visualisation and mental rotation<br />

may involve different adoption of cognitive strategies among diverse student groups.<br />

References

Anuar, K., & Ismail, A. (2004). The improvement of mental rotation through computer-based multimedia tutor. Malaysian Online Journal of Instructional Technology, 1 (2), retrieved April 30, 2006, from http://pppjj.usm.my/mojit/articles/V1N2-final/MOJIT-Khairul.htm.

Alington, D. E., Leaf, R. C., & Monaghan, J. R. (1992). Effects of stimulus color, pattern, and practice on sex differences in mental rotations task performance. Journal of Psychology, 126 (5), 539-553.

Baenninger, M., & Newcombe, N. (1989). The role of experience in spatial test performance: A meta-analysis. Sex Roles: A Journal of Research, 20 (5-6), 327-334.

Ben-Chaim, D., Lappan, G., & Houang, R. T. (1989). Adolescents' ability to communicate spatial information: Analyzing and effecting students' performance. Educational Studies in Mathematics, 20 (2), 121-146.

Bishop, A. J. (1989). Review of research on visualisation in mathematics education. Focus on Learning Problems in Mathematics, 11 (1), 7-15.

Blasko, D., & Holiday-Darr, K. (2000). The visualisation assessment and training homepage, retrieved April 30, 2006, from http://viz.bd.psu.edu/viz/.

Chay, J. C. (2000). 3D Mental rotation test, retrieved April 30, 2006, from http://www.uwm.edu/People/johnchay/mrp.htm.

Ellis, T. (2001). Animating to improve learning: A model for studying multimedia effectiveness. Proceedings of the 31st ASEE/IEEE Frontiers in Education Conference, 10-13 October, Reno, Nevada, USA.

Fouad, N. A., & Smith, P. L. (1996). A test of a social cognitive model for middle school students: Math and Science. Journal of Counseling Psychology, 43 (3), 338-346.

Gur, R., Alsop, D., Glahn, D., Petty, R., Swanson, C., Maldjian, J., Turetsky, B., & Detre, J. (2000). An fMRI study of sex differences in regional activation to a verbal and spatial task. Brain and Language, 74 (2), 157-170.

Harris, L. J. (1978). Sex differences in spatial ability: Possible environmental, genetic, and neurological factors. In Kinsbourne, M. (Ed.), Asymmetrical Function of the Brain, Cambridge: Cambridge University Press, 405-522.

Hier, D. B., & Crowley Jr., W. F. (1982). Spatial ability in androgen-deficient men. New England Journal of Medicine, 306 (20), 1202-1205.

Kail, R., Carter, P., & Pellegrino, J. (1979). The locus of sex differences in spatial ability. Perception and Psychophysics, 26, 182-186.

Lehmann, W. (2000). Group differences in mental rotation. Magdeburger Arbeiten zur Psychologie, 2, 1-18.

Linn, M. C., & Petersen, A. C. (1985). Emergence and characterisation of sex differences in spatial ability: A meta-analysis. Child Development, 56, 1479-1498.

Maccoby, E., & Jacklin, C. N. (1974). The psychology of sex differences, Stanford: Stanford University Press.

Mayer, R. E., & Sims, V. K. (1994). For whom is a picture worth a thousand words? Extensions of a dual-coding theory of multimedia learning. Journal of Educational Psychology, 86 (3), 389-401.

McDaniel, E., Guay, R., Ball, L., & Kolloff, M. (1978). A spatial experience questionnaire and some preliminary findings. Paper presented at the Annual Meeting of the American Psychological Association, August 1978, Toronto, Canada.

Middle Grades Mathematics Project (1983). Spatial visualisation test, Department of Mathematics, Michigan State University, USA.

Miller, C. L. (1996). A historical review of applied and theoretical spatial visualisation publications in engineering graphics. The Engineering Design Graphics Journal, 60 (3), 12-33.

National Council of Teachers of Mathematics (1987). Mathematics for Language Minority Students. NCTM News Bulletin, May 1987.

Petrusic, W. M., Varro, L., & Jamieson, D. G. (1978). Mental rotation validation of two spatial ability tests. Psychological Research, 40 (2), 139-148.

Piaget, J., & Inhelder, B. (1971). The child's conception of space, London: Routledge and Kegan Paul.

Rafi, A., Anuar, K., Samad, A., Hayati, M., & Mahadzir, M. (2005). Improving spatial ability using a Web-based Virtual Environment (WbVE). Automation in Construction, 14 (6), 707-715.

Reed, W. M., & Overbaugh, R. C. (1993). The effects of prior experience and instructional format on teacher education students' computer anxiety and performance. Computers in the Schools, 9 (2/3), 75-89.

Salthouse, T. A., Babcock, R. L., Skovronek, E., Mitchell, D. R. D., & Palmon, R. (1990). Age and experience effects in spatial visualisation. Developmental Psychology, 26, 128-136.

Smith, I. M. (1964). Spatial ability: Its educational and social significance, London: University of London.

Sorby, S. A. (1999). Developing 3-D visualisation skills. The Engineering Design Graphics Journal, 63 (2), 21-32.

Spencer, S. J., Steele, C. M., & Quinn, D. (1999). Stereotype threat and women's math performance. Journal of Experimental Social Psychology, 35, 4-28.

Stericker, A., & LeVesconte, S. (1982). Effect of brief training on sex-related differences in visual-spatial skill. Journal of Personality and Social Psychology, 43, 1018-1029.

Sternberg, R. J. (1990). Metaphors of mind: Conceptions of the nature of intelligence, Cambridge: Cambridge University Press.

Swim, J. K. (1996). Perceived versus meta-analytic effect sizes: Assessment of the accuracy of gender stereotypes. Journal of Personality and Social Psychology, 66, 21-36.

Turos, J., & Ervin, A. (2000). Training and gender differences on a web-based mental rotation task. The Penn State Behrend Psychology Journal, 4 (2), 3-12.

Vandenberg, S. G., & Kuse, A. R. (1978). Mental rotation, a group test of three-dimensional spatial visualisation. Perceptual and Motor Skills, 47, 599-604.

Vogel, J. J., Bowers, C. A., & Vogel, D. S. (2003). Cerebral lateralisation of spatial abilities: A meta-analysis. Brain and Cognition, 52 (2), 197-204.

Voyer, D., Voyer, S., & Bryden, P. (1995). Magnitude of sex differences in spatial abilities: A meta-analysis and consideration of critical variables. Psychological Bulletin, 117 (2), 250-270.


Delwiche, A. (2006). Massively multiplayer online games (MMOs) in the new media classroom. Educational Technology & Society, 9 (3), 160-172.

Massively multiplayer online games (MMOs) in the new media classroom

Aaron Delwiche
Department of Communication, Trinity University, San Antonio, TX, USA
Tel: +1 210 999-8153
adelwich@trinity.edu

ABSTRACT
Recent research demonstrates that videogames enhance literacy, attention, reaction time, and higher-level thinking. Several scholars have suggested that massively multiplayer online games (MMOs) such as Everquest and Second Life have educational potential, but we have little data about what happens when such tools are introduced in the classroom. This paper reports findings from two MMO-based courses in the context of situated learning theory. The first course, focused on the ethnography of on-line games, used the game Everquest as a vehicle for teaching research methods to 36 students in an undergraduate communication course. The second course used the game Second Life to teach the fundamentals of video-game design and criticism. Synthesizing comments from student web logs with data collected from follow-up surveys, the paper highlights key findings and offers concrete suggestions for instructors contemplating the use of multiplayer games in their own courses. It recommends that potential virtual environments be selected on the basis of genre, accessibility, and extensibility, and suggests that game-based assignments are most effective when they build bridges between the domain of the game world and an overlapping domain of professional practice.

Keywords
Virtual environments, Video-games - learning, Educational technology, Game design, Situated learning

In April 2003, thirty-seven Halflings of mixed gender congregated at a large public university in the Pacific Northwest. The first meeting of the Halfling Ethnographers Guild was in session. Known throughout the land for their good-natured hospitality, these hobbit-like creatures were well-suited to the task of collecting qualitative data about their fellow citizens. There was just one problem: this group of eager social scientists had quite literally been "born yesterday." Before they could undertake any sort of research, they would first have to learn how to talk, how to move, and how to avoid being killed by diseased rats.

The convergence of high-speed Internet connections, sophisticated graphics cards, and powerful microprocessors has paved the way for immersive virtual environments populated by thousands of users simultaneously. These environments have been referred to as persistent worlds, multi-user virtual environments (MUVEs), and massively multiplayer on-line games (MMOs). Gamers enter these worlds by creating highly personalized digital personae called "avatars." Many of these games are highly addictive, and some users spend more waking time with friends in the digital world than with human beings in their physical environment. As these worlds have matured, they have developed many characteristics of physical communities, such as specialized language, political structures, complex social rituals, and shared history (Steinkuehler, 2004).

In recent years, several theorists have highlighted the educational potential of multiplayer games. Noting that these cultural objects raise important questions about identity, community, and the influence of technology in our daily lives, some believe that virtual environments facilitate a "psychosocial moratorium" that has profound therapeutic and educational benefits (Turkle, 1995). Others draw attention to the way these distributed environments facilitate networked learning that transcends the game itself (Herz, 2002). Foreman (2003) predicts that shared graphical worlds are "the learning environments of the future" (p. 14).

While optimistic about the potential of MMOs, Foreman (2004) asks if they can "simulate within the computer-enhanced classroom the kind of 'situated learning' experienced by students in the real world" (High Cost section, para. 3). He also wonders if students can apply knowledge obtained from these virtual environments once the game has ended. Although Foreman is concerned with the use of MMOs in courses across the liberal arts curriculum, his questions are highly relevant to those who teach courses on the theoretical implications of new media. Instructors working in this area commonly lament the difficulty of explaining cyberculture through the non-interactive medium of the textbook. The Cybercultures Reader (Bell & Kennedy, 2000), Society Online: The Internet in Context (Howard & Jones, 2004), and Media and Society in the Digital Age (Kawamoto, 2002) are excellent works, but many instructors feel that a traditional textbook is incapable of capturing the interactive, hypertextual and dynamic nature of on-line life.



Synthesizing findings from two MMO-based courses, this article argues that MMOs are living, breathing textbooks that provide students with first-hand exposure to critical theory and professional practice. The first course, which focused on the ethnography of on-line games, used Everquest as a vehicle for teaching research methods to 36 students in an undergraduate communication course. The second course used Second Life to teach the fundamentals of video-game design and criticism.

After briefly reviewing current research on the educational promise of video-games, I explain the objectives and design of the two courses. Since this paper is intended as a preliminary review of classroom experience with massively multiplayer environments, a formal and detailed classification of student data is not incorporated in this analysis. However, synthesizing comments from student web logs with data collected from follow-up surveys, I highlight key findings and offer concrete suggestions for instructors who may be contemplating the use of multiplayer games in their own courses. Although both of the case studies incorporated MMOs in games-related courses, my findings suggest that these virtual worlds have broader relevance to the new media classroom.

Gaming in the classroom

During the 1980s, early studies on video games focused primarily on dangerous content, with the Surgeon General warning that young people were addicted "body and soul" to dangerous machines. However, in recent years, a growing body of work makes more positive claims about the educational benefits of games. Researchers have argued that video-games enhance computer literacy (Benedict, 1990), visual attention (Bavelier & Green, 2003), and reaction time (Orosy-Fildes & Allan, 1989). Meanwhile, in works written for popular audiences, Gee (2003) and Johnson (2005) have suggested that video-games teach players to become problem solvers.

In well-designed games, players can only advance to higher levels by testing a range of strategies. Many games allow users to modify difficulty levels and other environmental variables, thus fostering insight into the constructed nature of all simulations. Games such as The Sims and Grand Theft Auto teach principles of representation, semiotics, and simulation in situated, experiential ways that cursory exposure to the works of Baudrillard might not. As Gee (2003) notes, good video-games teach users "to solve problems and reflect on the intricacies of the design of imagined worlds and the design of both real and imagined social relationships and identities in the modern world" (p. 48).

Of course, some games are more likely to promote these goals than others. Even the controversial Grand Theft Auto series is more likely to promote "higher-level thinking" than reflex-based titles such as Pong or Pac-Man. Indeed, claims about the educational value of video-games tend to be based on more esteemed genres such as simulations, puzzles, strategy, adventure, and role-playing games (RPGs).

Researchers continue to document the educational potential of games, but there have been few attempts to explain their effectiveness in the context of an overarching theoretical perspective. The work of Steinkuehler (2004, 2006, in press) and Gee (2003) is a notable exception. Both argue that the instructional potential of games can only be realized through attention to the fact that learning is a social practice.

In their landmark elaboration of situated learning theory, Lave and Wenger (1991) challenge the conventional view of education as a process by which an individual internalizes culturally given knowledge. They argue that learning, thinking and knowing emerge from a world that is socially constructed. Meaning is contextual, and learning is what happens when individuals become increasingly involved as participants in social communities of practice. Examining networks of midwives, tailors, and recovering alcoholics, the authors offer apprenticeship as an example of situated learning in which knowledge is obtained through a process of participation within a community.

This is particularly relevant to MMOs, for these environments are complex discursive communities characterized by a "full range of social and material practices" (Steinkuehler, 2004, p. 9). When newcomers enter such spaces, they are gradually introduced to a complex social framework through the tutelage of other community members. These social spaces encourage information sharing and collaboration both within and beyond game parameters. In external message boards, fans discuss strategies, recommend improvements to game software, and share chunks of software code ("mods") that transform rules and content. The distributed learning evident in such forums is similar to that found in other knowledge communities, from discussion forums dedicated to sewing to mailing lists focused on programming techniques.



Gee (2003) argues that all knowledge communities function as semiotic domains that give meaning to the social and physical practices that exist within them. He notes that learners enter such spaces as newcomers, unfamiliar with domain-specific meanings. Through active participation, they learn to experience the world in new ways and they affiliate with others who understand community practices and concerns. Gradually, the new participant develops problem-solving strategies, some of which might be applicable in other arenas. This aspect of Gee's analysis is closely related to the apprenticeship process that Lave and Wenger (1991) refer to as "legitimate, peripheral participation." However, Gee takes this perspective one step further, suggesting that true critical learning occurs when the learner engages in a process of meta-level reflection, "thinking about the relationships of the semiotic domain being learned to other semiotic domains" (p. 50).

Of course, situated learning is hardly unique to MMOs. From the headquarters of the school newspaper to the halls of student government, college students have long been apprenticed in communities of practice. Yet MMOs are intriguing because social interaction, cooperation, and knowledge sharing are central to their enjoyment. Herz (2002) observes that these networked games "fully leverage technology to facilitate 'edge' activities -- the interaction that happens through and around games as players critique, rebuild, and add on to them, teaching each other in the process. Players learn through active engagement not only with the software but with each other" (p. 173). This stands in marked contrast to dominant pedagogical approaches. Learning is often viewed as a highly individualized activity that stops at the classroom door. Even today, many teachers discourage students from collaborating on homework assignments, viewing such behavior as "cheating."

In addition to their social character, MMOs are uniquely engaging environments. In a recent study of 30,000 players, Yee (2006) found that 70% had spent at least 10 continuous hours in a virtual world at one sitting. Many researchers have observed that players intensely involved with video-games display characteristics of the psychological state that Csikszentmihalyi (1990) has termed the "flow experience." Hoffman and Novak (1996) explain that the flow state is characterized by user confidence, exploratory behaviors, enjoyment, distorted time perception, and greater learning. Drawing on Heeter's (1992) work on telepresence, Chou and Ting (2003) argue that on-line games such as MMOs are uniquely likely to evoke all of these effects among users.

This level of engagement is exciting because it helps develop "the motivation for an extended engagement" that is crucial to mastering a complex body of knowledge (Gee, 2004, p. 4). Peng (2004) notes that "students learn in a flow state where they are not just passive recipients of knowledge, but active learners who are in control of the learning activity and are challenged to reach a certain goal" (pp. 10-11). Garris, Ahlers, and Driskell (2002) agree, pointing out that "motivated learners more readily choose to engage in target activities, they pursue those activities more vigorously, and they persist longer at those activities than do less motivated learners" (p. 454).

Finally, MMOs also encourage role-playing behaviors that have educational promise. The literature is packed with examples of role-playing techniques being successfully deployed in the college classroom. At Barnard College, history students role-play key moments of the French Revolution and collectively enact the imperial politics of 16th-century China (Fogg, 2001). In Australia, college students are prodded to consider the environmental and social effects of building a tourist resort on indigenous territory (Cutler & Hay, 2000). Some of these exercises have been conducted on-line, while others were designed for face-to-face interaction. All have been highly successful.

Luff (2000) notes that successful role identification helps students escape the grip of contemporary norms and beliefs. Whether they project themselves into the role of an Athenian politician or a patient suffering from chronic pain, they are forced to shift perspective and imagine the world through different eyes. Luff refers to role-playing as "the ultimate empathy exercise." Bell (2001) makes a similar argument, noting that the ability of role-playing techniques to affect attitudes and behavior has been widely demonstrated. Bender (2005) points out that the on-line environment is particularly well suited to role-playing activities.

MMOs have instructional promise because they immerse students in complex communities of practice, because their immersive nature invites extended engagement with course material, and because they encourage role-playing. So, how can these environments be used in the classroom? In their elaboration of situated learning theory, Lave and Wenger (1991) emphasize that their approach "is not in itself an educational form, much less a pedagogical strategy or a teaching technique." Despite the disclaimer, this theoretical perspective does offer some guidance in the classroom. As Quay (2003) notes, citing Wenger (1998), the instructor's role is to implement forms of practice that open up participation to newcomers.

The ability of students to become actively involved in communities of practice within the game world has been well documented. The next step is to link student engagement in the game world to engagement in an overlapping knowledge community that is connected to the theoretical concerns of the course in which the MMO is being used. Once students are highly engaged in the process of role-playing and information seeking, it is relatively easy to convince them to role-play as apprentice participants within a higher-level theoretical community. The students can also be encouraged to make the critical leap into meta-reflection about the similar learning processes embedded in both domains. In the following pages, I describe two courses that melded game activities with participation in other knowledge communities.

Course Description

Course 1. Ethnography of On-line Games

In April 2003, I taught a course entitled "Ethnography of Massively Multiplayer On-line Role-playing Games" to a group of 36 undergraduates at a large public university. In this course, students explored the behaviors, cultural practices, symbolic expressions and motivations of MMO players. In lieu of a textbook, they purchased a three-month subscription to the game Everquest. Arguably the most successful virtual world in North America at the time the course was offered, Everquest is a fantasy-themed MMO which encourages players to assume virtual identities such as "dwarf warrior" or "gnome magician." Over the course of weeks, months, and years, players gradually build up the skill level, capabilities, and wealth of their game character. Along the way, they compete for scarce virtual resources and join vast in-game social networks known as guilds. At last count, more than 420,000 players paid $15 per month to immerse themselves in the game's virtual world. At peak moments, more than 98,000 players interact with one another on the game servers simultaneously – a population roughly that of Dearborn, Michigan. Virtual items (e.g. swords, wands, armor, characters) are regularly sold on eBay for US dollars, leading one economist to calculate that the typical player earns $3.42 in real-world dollars during each hour of gameplay. He extrapolates that Everquest's per-capita GNP is $2,266 – making it the 77th richest "nation" in the world (Castronova, 2001).
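The extrapolation behind these figures is simple arithmetic: an hourly rate of value creation multiplied by total play time. The sketch below (Python, purely illustrative) makes that step explicit; the assumed weekly play time is mine, chosen so the arithmetic reproduces the quoted numbers, and is not taken from Castronova's survey data.

```python
# Minimal sketch of the extrapolation behind the per-capita GNP figure.
# ASSUMPTION: weekly_play_hours is an illustrative constant chosen so the
# arithmetic reproduces the quoted numbers; Castronova derived his estimate
# from survey data, not from this constant.

hourly_earnings = 3.42     # quoted USD of real-world value created per hour of play
weekly_play_hours = 12.75  # assumed average play time per player (illustrative)
weeks_per_year = 52

per_capita_gnp = hourly_earnings * weekly_play_hours * weeks_per_year
print(f"Implied per-capita GNP: ${per_capita_gnp:,.0f}")  # ~$2,267, close to the quoted $2,266
```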

The mind-boggling complexity of this virtual world made it an ideal vehicle for exploring social and philosophical issues related to new media. Along with the game Everquest, a comprehensive course packet synthesized articles on gaming, virtual community, and social-science research methods. Students were assigned approximately 80 pages of reading each week. The course packet included articles specifically analyzing Everquest (Yee, 2001; Castronova, 2001; Griffiths, Davies, & Chappell, 2003), the cultural roots of role-playing games (Fine, 1983; Mackay, 2001), the psychological significance of role playing (Douse & McManus, 1993; Hughes, 1998), and identity construction in virtual worlds (Turkle, 1995; Stone, 1995). These theoretical and substantive works were accompanied by methods pieces that described interview techniques and ethnographic approaches (Emerson, Fretz & Shaw, 1995; Fetterman, 1989; Mann & Stewart, 2000).

Ethnography is a qualitative research method in which the investigator strives toward an insider's perspective on a particular culture. Originally used to study indigenous cultures in remote locations, the method has been widely applied to other cultural groupings within industrialized nations. Ethnographic research combines participant observation with qualitative interviews and analysis of related cultural artifacts. Because the academic quarter is only ten weeks long, it was impossible for students to gain a fully emic perspective on Everquest players. While recognizing these limitations, the class gave the students a trial run of an ethnographic research project. In class, we jokingly referred to their abbreviated methods as "ethnography lite."

Approximately half of the class time was spent in the virtual world, and students were asked to log at least five hours in the game-world each week. Role-playing the part of ethnographers, students crafted research questions and set out to collect data through a process of participant observation. They documented their field notes and reactions to class assignments in publicly accessible web logs. At the end of the quarter, students delivered conference-style presentations of their findings.

The course objectives were two-fold. The primary goal was for the class to investigate collectively the activities of on-line gamers in the context of what Silver (2000) terms "critical cyberculture studies." Along the way, students were exposed to the fundamental principles of social science research. By the time students finished the course, I hoped they would understand the relationship between research questions and data collection, the necessity of obtaining informed consent, sampling considerations, the way interview techniques and question design can shape results, and the difference between qualitative and quantitative methods.

Class exercises varied, depending on the reading. In one session, after discussing the concept of on-line identity, four students created avatars of varying genders. Within the game, the remainder of the class asked questions and attempted to guess which students were virtually cross-dressing. This paved the way for a thoughtful discussion of Goffman's (1959) theories on the social performance of identity. Students were not graded on their activities within the game world, but they were asked to reflect on the discussion in a subsequent web log posting. In a second exercise, students traveled to different continents in the game and used the built-in chat tools to talk with other players. This activity was followed by a spirited conversation about on-line interview techniques and the ethical considerations of ethnographic research. It also prompted students to reflect on their experiences of being instructed by other players.

During the quarter, students were active participants in two semiotic domains. Initially, as residents of the game world, they participated in a distributed network of information and ideas related to game dynamics. They participated in on-line forums, talked with other players, and shared strategies with each other. However, throughout this process, they also participated in a community of practice as communication researchers. They discussed research ethics, compared interview techniques and evaluated each other's methodologies. They also participated in web sites and discussion forums populated by game researchers. The two domains (player and researcher) overlapped, and students became highly engaged with both identities.

While I realized most students had no intention of becoming communication scholars after the course was finished, this class attempted to make students more critical consumers of research findings by providing first-hand exposure to data collection and analysis. This is analogous to the way media literacy educators promote critical analysis of representations by giving students hands-on experience with video cameras and editing decks (Van Buren & Christ, 2000).

Course 2. Game design for the web

In January 2004, I taught a course on game design to 15 communication students at a small liberal arts college in the Southwest. This course covered the social dynamics of virtual worlds, the fundamentals of game design, and video-game aesthetics. The class had two primary learning objectives. As with the ethnography course, a primary goal was to explore critical themes of cyberculture studies through sustained interaction with other gamers. A second objective was to experiment with the mechanics of game design by creating games that could be played by other residents of the game Second Life. Like Everquest, Second Life makes it possible for thousands of users to interact simultaneously in a graphical virtual world. This is where the similarities end. While Everquest (along with most MMOs) is characterized by a fantastic narrative, clearly defined rules, and competitive goals, Second Life is an open-ended environment in which players themselves design the world, its objects and their behaviors. Incorporating sophisticated three-dimensional modeling tools and a powerful scripting language, the game invites players to freely unleash their imaginations. Rather than struggling to acquire a gold piece or an enchanted Sword of the Bloodsworn in Everquest, players in Second Life derive pleasure from displaying their own creations and admiring those of others. If Everquest was a good place to begin investigating key ideas in cyberculture studies, Second Life was clearly an ideal environment in which students could create and analyze their own games.
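In-world scripting of this kind is event-driven: a script attached to an object defines handlers that the world invokes when something happens to that object. The fragment below is a rough Python analogue of that pattern, not actual Second Life code; the object and handler names are invented for illustration.

```python
# Rough analogue (in Python, not Linden Scripting Language) of the event-driven
# pattern used by in-world scripts: the world calls handlers on an object when
# events occur. All names here are illustrative.

class ScriptedObject:
    def __init__(self, name: str):
        self.name = name

    def on_rez(self) -> None:
        # Fired once when the object first appears in the world.
        print(f"{self.name} materialises.")

    def on_touch(self, avatar: str) -> None:
        # Fired each time an avatar clicks the object.
        print(f"{self.name} squawks: Hello, {avatar}!")

parrot = ScriptedObject("Talking Parrot")
parrot.on_rez()
parrot.on_touch("Halfling Ethnographer")
```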

In the game design course, students created games within the virtual world of Second Life. The class was split into three development teams of five students each. Some acted as project managers, while others focused on narrative development, three-dimensional modeling, avatar customization, and scripting. As with the first course, I had no illusions that students in this course would ultimately pursue a career in game design. However, in the process of designing their own games, students would become further sensitized to the aesthetic characteristics of video-games while understanding more about the underlying structures of all games.

By the end of the course, I hoped that students would be able to wield a substantial vocabulary for game criticism that incorporated equally the language of game designers and critical game theorists. In addition to the on-line simulations, students were assigned approximately 50 pages of reading each week. The course packet in the game design course included articles about the history of multiplayer games (Chick, 2003), identity formation in virtual worlds (Stephenson, 1993; Turkle, 1995), the presumed motivations of gamers (Crawford, 1982), the nature of play, rules, and cheating (Salen & Zimmerman, 2004), aesthetic criteria for analyzing games (Pearce, 2001; Wolf, 2001; Jenkins & Squire, 2002; King & Krzywinska, 2002), and sociological debates about violence, gender, and sexuality in video-games (Robins, 1994; Relph, 2002; Consalvo, 2003).

Class exercises were closely matched to the content of the reading. One day, after spending several weeks discussing the mechanics of game design, we engaged in a face-to-face version of a popular parlor game called "Mafia." Next, students attempted to replicate the game within the confines of Second Life. They immediately realized that the virtual world made it almost impossible to play according to the traditional rules. Hashing out design solutions in their web logs, students developed a second version of the game that was quite successful in the on-line environment.
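For readers unfamiliar with it, "Mafia" turns on secretly assigned roles: a moderator privately tells each player whether they belong to the hidden mafia or the uninformed majority. The sketch below (Python, illustrative only; the player names and role counts are invented, and this is not the students' design) captures that core mechanic and hints at why the port was hard: the on-line version needs a private channel to deliver each secret.

```python
# Minimal sketch of the hidden-role mechanic at the heart of "Mafia".
# All names and counts are illustrative.
import random

def deal_roles(players, n_mafia=2):
    """Secretly assign each player either 'mafia' or 'villager'."""
    roles = ["mafia"] * n_mafia + ["villager"] * (len(players) - n_mafia)
    random.shuffle(roles)
    return dict(zip(players, roles))

players = ["Ana", "Ben", "Chi", "Dev", "Eli", "Fay"]
for player, role in deal_roles(players).items():
    # Face-to-face, a moderator whispers this; in a virtual world, some private
    # messaging channel has to carry it -- one of the design problems the
    # students had to solve when rebuilding the game inside Second Life.
    print(f"(privately to {player}) you are a {role}")
```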

Once again, this course was premised on the potential of situated learning. Students were active participants in the quirky world of Second Life, and they came to think of themselves as community residents. They learned how to use the game's scripting and object creation tools by participating in tutorials and activities organized by other residents. Throughout this process, they also participated in a community of practice as game designers. They discussed game mechanics, deconstructed the strengths and weaknesses of the Second Life environment, and participated in professional forums maintained by game designers. Because participating in the creative process is the main attraction of Second Life, the two semiotic domains (game player and game designer) were tightly integrated.

Findings

Game accessibility is crucial to learning

Two weeks into the ethnography course, I realized many students were having difficulty mastering the mechanics of Everquest. Most students had experienced games on home consoles as children, but only a few were able to quickly grasp the game's complicated interface. In the second week of class, each student created a Halfling druid as his or her avatar, and all 36 players simultaneously entered a region of the world known as the Misty Thicket. Completely unfamiliar with the controls, students ran in all directions, bumping into hostile monsters and accidentally triggering attacks on the part of computer-controlled non-player characters. In less than an hour, each player had been killed many times. However, due to a quirk in the game's design, the bodies of dead avatars remain visible in the virtual world for twenty-four hours. By the end of the class session, the Misty Thicket was littered with scores of Halfling corpses. The sight was especially disturbing to other gamers who just happened to be wandering through the area at that moment.

Students recorded their frustration with the steep learning curve in their personal web logs:

I feel like all I am doing is running around like a chicken with my head cut off. (Sue, ethnography student)

I played Everquest in class for about 10 minutes today, and I got killed by a giant bat, a giant spider, a Halfling named Beardo or something, and I drowned. Sometimes, I stay up at night wondering how I got so far in real life. (Eric, ethnography student)

At first, I was troubled by reports of frustration and frequent death. But failure is not necessarily a bad thing. Jones (1997) notes that making mistakes is a crucial element of games and of education: "One can be told countless times, but making the mistake and the proper adjustment creates deeper connections with the content than simply trying to remember." In response to this initial trouble, I worked with one of the more experienced students to design in-class exercises that would make the learning curve less painful while also allowing students to make mistakes. Only a handful gained full mastery of the game, but every student was able to conduct in-game interviews.

A year later, when planning the game design course, I found several MMOs that would be easier to master. Game developers had made fantastic strides in usability since Everquest was introduced in 1998. Second Life is a non-structured virtual environment with extensive tutorials and instructional support for new users. It also offers powerful tools for creating three-dimensional objects and for scripting the behavior of those objects within the game world. The class quickly mastered the Second Life fundamentals, and they were able to use the game to create custom avatars, buildings, and objects when implementing their games.

Students preferred to play the game with others

In both courses, I asked students to invest five hours a week in solo play outside of class. This requirement was intended to increase their familiarity with the culture of the virtual world while helping students achieve greater competence. In both courses, many students resisted this requirement. It is possible that a lack of sufficient computing power made it difficult to play the game at home, but students could access the software on lab computers. There might also have been psychological barriers to viewing a game as homework. The most likely explanation is that students found the game much more enjoyable when other people were playing in their immediate environment.

In their analysis of political formations in virtual environments, Jakobsson and Taylor (2003) argue that social networks and social capital are crucial to the immersive appeal of virtual environments. Furthermore, as noted earlier, social interaction is a crucial component of situated learning. The absence of social networks can be a barrier to newcomers. Once again, comments in the web logs reinforced the importance of the social dimension.

Someone I met on the Sanya server told me that Everquest is just a game until you talk to someone. He was right. Interaction with others expands the enjoyment of the game exponentially. (Joyce, ethnography student)

I think that the reason the in-class exercises were popular was because everyone was able to see that their classmates were as clueless as themselves about the game. It's always better to be lost with a buddy, than lost alone. (Carlos, ethnography student)

Unless they are genuinely hooked, it is unrealistic to expect students to spend much extra time in a virtual world on their own. Yet, devoting additional class time to game play detracts from discussion time. One way of solving this problem is to set up weekly "gaming sessions" outside of class hours. For example, one might tell students that they need to meet in the lab between 6:30 and 9:00 p.m. on a recurring weeknight to explore the game world together. A similar approach is often used in film studies courses, when instructors arrange evening or weekend viewing times. As long as they are warned in advance, students seem willing to make the time commitment. The more relaxed and informal nature of the gaming sessions also reinforces class camaraderie, feeding back into the quality of participation during more formal discussions.

MMOs are safe learning environments

Castronova (2001) identifies three defining features of virtual worlds: interactivity, physicality, and persistence. To this, I would add a fourth characteristic. Virtual worlds are safe. The player's avatar may be exposed to an array of in-game dangers, but the human being is never at risk of physical harm. Furthermore, in most massively multiplayer games, the characters themselves do not experience permanent death. The character may lose experience points or a modest amount of wealth but, as Grimmelmann (2003) points out, virtual death "doesn't really seem very deadly." He notes that this dimension of safety is what makes virtual reality an effective therapy for agoraphobia and other anxiety-related disorders (Vincelli et al., 2003). It is also an important component of education.

Safety is crucial to any learning environment. When students feel threatened, they clam up. Diamantes & Williams (1999) cite research suggesting that the uppermost levels of the human brain function best in supportive, non-threatening environments. Conversely, dangerous environments are more likely to provoke fight-or-flight responses in the lower brain stem. In the ethnography course, students were asked to do something that can provoke as much anxiety as public speaking: they were asked to interview other people. In the game design course, students were asked to undertake ambitious projects using unfamiliar technology tools. In both cases, the safety of virtual environments, combined with a mood of playful intellectual freedom, made it easier for students to throw themselves into the roles of inquisitive social scientists and game developers.

Outcomes

Students reported that learning occurred

In both classes, students were asked to reflect on course design and their experiences throughout the term. Many mentioned that they had initially expected a video-game-themed course to be a "joke" or a "blow-off class." Fortunately, the vast majority reported that the course was both informative and enjoyable.

I learned a great deal about ways to make the research repeatable and as closed to individual interpretation as possible. I'm actually looking forward to the next time I study people and behaviors. (Joyce, ethnography student)

This class, without exaggeration, was one of the best learning experiences I had during my eight-year college career. Through play, I was able to learn on a transparent level. I learned without the pain. (Tim, ethnography student)

The availability of knowledge [in Second Life] provided a learning environment unmatched by any other class I have taken. We found ourselves in an enjoyable environment where we could learn from the professor, each other, and countless others whom we have never even met. (Alex, game design student)

This is probably the best course I have taken thus far at this university. While I loved playing in Second Life, I found the theory equally interesting, and the feeling of exploring new and unexplored media quite exciting. (Melissa, game design student)

Such evidence is anecdotal, and there are many reasons to be suspicious of self-reports when assessing curriculum effectiveness. Nevertheless, it is interesting to note that these students explicitly articulated the underlying premise of the courses. They commented on the successful melding of communities of practice ("learning on a transparent level"), and reported their satisfaction with the ability to learn from other members of the game community.

Students produced quality research

Another way of assessing course effectiveness is to look at the quality of student work. The depth of analysis in student web logs, along with strong scores on written exams, suggested that students were grasping and applying the material. This was particularly evident in both sets of final projects: research papers in the ethnography course and a collectively designed game in the game development course.

In the ethnography course, students refined their research questions and headed out into Everquest to interview other gamers. Students selected their own topics, often raising issues that had not been considered by other researchers. A conference-style series of presentations was scheduled for the final exam period, and it took more than three hours for all of the students to summarize their work. Although we significantly exceeded the allotted time, students willingly stayed late to pepper their classmates with methodological and theoretical questions.

Students explored a range of topics, including: the characteristics and motivations of Everquest players, the appeal of supernatural themes, gaming addiction, the therapeutic effects of role-playing, the slippery relationship between the virtual and the real, construction of on-line identity, the depth of on-line friendships, gender-bending, effects on off-line relationships, the dreams of Everquest players, the relationship between introversion and avatar attractiveness, the use of mind-altering substances by gamers, the impact of gender and race on player interaction, attitudes of gay, lesbian and bisexual MMO players, and class reactions to the game itself.

In the game design course, as described earlier, students worked in three five-person development teams to create games within Second Life. Considering that few students had prior experience with multimedia development, the final projects were highly creative. From a pirate-themed scavenger hunt complete with talking parrot to a three-dimensional hedge maze populated by Pac-Man-like creatures, the games were well conceived and implemented. On the final day of the course, students turned in papers that dissected their games using the aesthetic and critical vocabulary that had been developed in the course readings. In terms of both theory and production, students had clearly mastered the course material.

Recommendations

Warn students about the potential for addiction

Two months after the ethnography course ended, an anonymous net user posted a link to the on-line syllabus on a popular site called Fark.com. The link was quickly forwarded to game-themed mailing lists and MMO discussion forums. During the next few weeks, I received almost 100 e-mail messages from gamers around the world. As one respondent wryly commented, "it appears that the players are as interested in your students as your students are in them."



On the whole, the gaming community's response to the course was positive and good-natured. Players of all ages wanted to know if they could take the course via distance education, and some were willing to move to the Pacific Northwest if necessary. Several high-level players courteously introduced themselves and offered to answer all of the students' questions. However, I was also contacted by players who described themselves as former Everquest addicts. While acknowledging that the sociological aspects of the game are intriguing, these players worried that the assignments might inadvertently cause students to become addicted.

[Y]ou could potentially get people addicted and lost in this world. I was addicted to the game for 3 years and it IS a very, very powerful addiction. I strongly urge you to explain to everyone in advance that if they have strong addictive-type personalities not to force them to do this. . . [I] could not be any more serious. (Karen, recovering addict)

One man in his late 30s asked me to think carefully about the potential risks, citing many stories of students whose lives had been ruined by Everquest addiction.

The very nature of this game, which makes it such an intriguing subject to build a college class around, is inherently alluring to those whose lives and priorities are lacking in meaning and substance. (Nick, recovering addict)

Fortunately, although a handful of students (and the instructor) developed a temporary fixation on Everquest that could be compared to an addiction, this behavior subsided when the class ended. Nevertheless, any instructor who contemplates introducing an MMO in the classroom should consider the fact that extreme immersion can lead to addiction. Ethical teachers will warn students about this possibility on the first day of class, and will periodically check in with students to ensure that they are not spending too much time in the game environment.

Evaluate several MMOs before selecting one for your course

Not all virtual environments are created equal. Everquest would not have worked for the game design course because it lacks tools for creating user-generated content. Second Life would not have lent itself to the same range of research questions explored by the ethnography students.

When selecting an MMO, key issues include accessibility, genre, and extensibility. As noted earlier, the learning curve for Everquest is very steep, and it is difficult for beginners to understand game dynamics without investing huge amounts of time. A more usable environment would allow students to spend more time making connections between in-game occurrences and specific theoretical concepts. Genre also makes a difference. Swords-and-sorcery themes have a proud tradition in the world of gaming, but they don't appeal to everyone. Sims Online and Second Life are two games that break this mold. Also, depending on the context, it may be important to find an environment that allows the instructor and students to extend the world and design new scenarios. Second Life proved quite promising in this regard.

Tightly integrate semiotic domains

For an MMO-themed class to be effective, learning objectives should be identified at the outset. Along with the macroscopic theoretical goals, students should be given a series of smaller objectives or "baby steps" that are related to game mechanics. In the Everquest course, the broader research objective kept the class on track. However, more clearly identified low-level goals would have increased students' mastery of the game during the first few weeks. A handful of in-class exercises gave students immediate objectives on which to focus (e.g. the fundamentals of movement or how to transcribe on-line conversations), but there could have been more of these in-class activities.

These assignments should encourage parallel activities in both communities of practice. For example, once they have mastered conversational basics, students might be encouraged to seek out help from another player within the game world. They could document their reactions to this experience in an on-line web log. Several weeks later, they might be encouraged to post specific questions in on-line forums or mailing lists devoted to professional practice. In both cases, the exercises would reinforce the value of learning by asking questions of more experienced community members.



Conclusion

This paper has focused on the use of MMOs to teach research methods, game design and cyberculture, but they can be used in other situations. For example, Bradley and Froomkin (2003) suggest that these environments could be an effective realm in which to assess the effects of simple changes to legal rules. This application could easily be extended to courses in media law and telecommunication policy as a parallel to the effect of court decisions and federal rule making. The interview techniques used by the ethnography class could easily be applied by reporting students looking for news stories. In fact, news stories about activities in the virtual world would be an excellent and safe introduction to news values, source reliability, and the "five Ws" (who, what, when, where, and why) stressed in many US journalism courses. Some MMOs allow players to market real-world products, and these virtual worlds could be used to teach advertising students such concepts as audience segmentation and product targeting.

Other theorists have noted that virtual worlds shed light on economic transactions and the formation of political<br />

communities (Castronova, 2001; Grimmelmann, 2003). Global organizations – from transnational corporations

to NGOs – could use these tools to promote intercultural communication skills among members in far-flung

locations (see Cox, 1999; Hofstede, 1999). Conklin (2005) proposes a variety of ways that Second Life can be<br />

used to explore concepts in political theory such as cooperation, reputation, self-governance and class<br />

relationships.<br />

Yet, an MMO-based curriculum should be more than a gimmick. As elaborated above,

these virtual worlds are used most effectively as a bridge between overlapping communities of practice. During<br />

the final weeks of class, students can be encouraged to reflect on their participation within the domain of the<br />

game world and within the domain of professional practice. In articulating the similarities and differences<br />

between the two experiences, students have the opportunity to engage in meta-level awareness of their own<br />

learning process.<br />

However, we should not uncritically embrace MMOs. The same factors that promote immersion and engagement<br />

can lead to addictive gaming behaviors that displace other activities such as exercise, real-world social<br />

interaction, and homework. Some theorists speculate that excessive involvement in virtual worlds negatively<br />

impacts interpersonal relationships, scholarship and family life (Messerly, 2004). Yet, similar charges were made<br />

against such media as film, comic books, and television when they first arrived, and each of these<br />

communication technologies has demonstrated its instructional effectiveness. In the absence of solid data<br />

supporting negative claims about MMOs, we should cautiously proceed with efforts to incorporate virtual<br />

environments into the classroom.<br />

As this paper has repeatedly argued, context is crucial. In many classroom situations, face-to-face discussion<br />

makes more sense than on-line interaction. Furthermore, MMOs need not be the centerpiece of the course in<br />

order for learning to take place. As we explore the potential of virtual worlds, we should remind ourselves that<br />

they are not a panacea. In many situations, traditional methods of instruction will work just fine. Yet, we should<br />

not be afraid to experiment. Experimentation, like play itself, is ripe with possibility.<br />

References<br />

Bavelier, D., & Green, C. (2003). Action video game modifies visual selective attention. Nature, 423, 534-537.

Bell, D., & Kennedy, M. (2000). The Cybercultures Reader, New York, USA: Routledge.

Bell, M. (2001). Online role-play: Anonymity, engagement and risk. Educational Media International, 38 (4),

251-260.<br />

Bender, T. (2005). Role playing in on-line education: A teaching tool to enhance student engagement and<br />

sustained learning. Innovate, 1 (4), retrieved June 30, 2006, from

http://www.innovateonline.info/index.php?view=article&id=57.<br />

Benedict, J. (1990). A course in the psychology of video and educational games. Teaching of Psychology, 17 (3),<br />

206-208.<br />



Bradley, C., & Froomkin, M. (2003). Virtual worlds, real rules. New York Law School Law Review, 49 (1), 103-<br />

146.<br />

Castronova, E. (2001). Virtual worlds: A first-hand account of market and society on the Cyberian frontier. The

Gruter Institute Working Papers on Law, Economics, and Evolutionary Biology, 2 (1).<br />

Chick, T. (2003). Building whole societies. Gamespy, retrieved March 16, 2006, from

http://archive.gamespy.com/amdmmog/week5/.<br />

Chou, T., & Ting, C. (2003). The role of flow experience in cyber-game addiction. Cyberpsychology and<br />

Behavior, 6 (6), 663-676.<br />

Conklin, M. (2005). 101 uses for Second Life in the college classroom. Paper presented at Games, Learning and<br />

Society 1.0, Madison, Wisconsin, retrieved March 16, 2006, from

http://trumpy.cs.elon.edu/metaverse/gst364Win2005/handout.html.<br />

Consalvo, M. (2003). Hot dates and fairy-tale romances: Studying sexuality in video games (Chapter 8). In<br />

Wolf, M. J. P. & Perron, B. (Eds.), The Video Game Theory Reader, London: Routledge, 171-194.<br />

Cox, B. (1999). Achieving intercultural communication through computerized business simulation/games.<br />

Simulation & Gaming, 30 (1), 38-50.<br />

Crawford, C. (1982). Why do people play games? Art of Computer Game Design, Berkeley: McGraw-Hill.<br />

Csikszentmihalyi, M. (1990). Flow: The Psychology of Optimal Experience, New York, USA: Harper and Row.

Cutler, C., & Hay, I. (2000). 'Club Dread': Applying and refining an issues-based role play on environment,<br />

economy, and culture. Journal of Geography in Higher Education, 24 (2), 179-198.<br />

Diamantes, T., & Williams, M. (1999). Using a game format to teach national education standards to aspiring

school administrators. Education, 119 (4), 727-730.<br />

Douse, N., & McManus, I. (1993). The personality of fantasy gamers. British Journal of Psychology, 84 (4),<br />

505-509.<br />

Emerson, R., Fretz, R., & Shaw, L. (1995). Writing Ethnographic Fieldnotes, Chicago, USA: University of<br />

Chicago Press.<br />

Fetterman, D. (1989). Ethnography: Step by step, London: Sage.<br />

Fine, G. (1983). Shared Fantasy: Role-playing Games as Social Worlds, Chicago: University of Chicago Press.<br />

Fogg, P. (2001). A history professor engages students by giving them a role in the action. Chronicle of Higher<br />

Education, 48 (12), 12-14.<br />

Foreman, J. (2003). Next-generation: Educational technology versus the lecture. Educause Review, July/August,

12-22.<br />

Foreman, J. (2004). Video game studies and the emerging instructional revolution. Innovate, 1 (1), retrieved<br />

March 16, 2006, from http://www.innovateonline.info/index.php?view=article&id=2.

Garris, R., Ahlers, R., & Driskell, J. (2002). Games, motivation, and learning: A research and practice model.<br />

Simulation and Gaming, 33 (4), 441-467.<br />

Gee, J. P. (2003). What Video Games Have to Teach Us about Learning and Literacy. New York, USA: Palgrave

Macmillan.<br />

Gee, J. P. (2004). Learning about Learning from a Video Game: Rise of Nations, retrieved March 16, 2006,

from http://distlearn.man.ac.uk/download/RiseOfNations.pdf.



Goffman, E. (1959). The Presentation of Self in Everyday Life, New York: Anchor Books, Doubleday.<br />

Griffiths, M., Davies, M., & Chappell, D. (2003). Breaking the stereotype: The case of on-line gaming.<br />

Cyberpsychology and Behavior, 6 (1), 81-91.<br />

Grimmelmann, J. (2003). The state of play: Life, death and democracy online. Posting to Lawmeme, retrieved

March 16, 2006, from http://research.yale.edu/lawmeme/modules.php?name=News&file=article&sid=1239.

Heeter, C. (1992). Being there: the subjective experience of presence. Presence, 1, 262-271.<br />

Hertz, J. C. (2002). Gaming the system: What higher education can learn from multiplayer on-line worlds. In<br />

Devlin, M., Larson, R. & Meyerson, J. (Eds.), Internet and the University: 2001 Forum, Cambridge, USA:<br />

Educause, 169-191.<br />

Hoffman, D. L., & Novak, T. P. (1996). Marketing in hypermedia computer-mediated environments: conceptual<br />

foundations. Journal of Marketing, 60, 50-68.<br />

Hofstede, G. J., & Pedersen, P. (1999). Synthetic cultures: Intercultural learning through simulation games.<br />

Simulation & Gaming, 30 (4), 415-440.<br />

Howard, P. & Jones, S. (2004). Society Online: The Internet in Context, Thousand Oaks, CA: Sage.<br />

Hughes, J. (1998). Therapy Is Fantasy: Roleplaying, Healing and the Construction of Symbolic Order. Paper<br />

presented in Anthropology IV Honours, Medical Anthropology Seminar, retrieved March 16, 2006, from

http://www.rpgstudies.net/hughes/therapy_is_fantasy.html.<br />

Jakobsson, M., & Taylor, T. L. (2003). The Sopranos meets Everquest: Social Networking in Massively<br />

Multiplayer Online Games. Paper presented at the 5th International Digital Arts and Culture Conference,

Melbourne, retrieved March 16, 2006, from hypertext.rmit.edu.au/dac/papers/Jakobsson.pdf.

Jenkins, H., & Squire, K. (2002). Art of contested spaces. In King, L. (Ed.), Game On: The History and Culture

of Video Games, New York, USA: Universe Publishing, 64-75.<br />

Johnson, S. (2005). Everything Bad Is Good for You: How Today’s Popular Culture Is Actually Making Us<br />

Smarter, New York, USA: Riverhead Books.<br />

Jones, M. G. (1997, February). Learning to Play, Playing to Learn: Lessons Learned from Computer Games.<br />

Paper presented at the Association for Educational Communications and Technology, Albuquerque, New<br />

Mexico.<br />

Kawamoto, K. (2002). Media and Society in the Digital Age, Boston, USA: Allyn & Bacon.<br />

King, G., & Krzywinska, T. (2002). Introduction: Cinema/videogames/interfaces. In King, G., & Krzywinska, T.<br />

(Eds.), Screenplay: Cinema/videogames/interfaces, London: Wallflower Press, 1-32.<br />

Lave, J., & Wenger, E. (1991). Situated Learning: Legitimate Peripheral Participation, Cambridge: Cambridge<br />

University Press.<br />

Luff, I. (2000). "I've been in the Reichstag": Rethinking roleplay. Teaching History, August, 8-18.<br />

Mackay, D. (2001). The Fantasy Role-Playing Game: A New Performing Art, Jefferson, NC: McFarland.<br />

Mann, C., & Stewart, F. (2000). Internet Communication and Qualitative Research: A Handbook for<br />

Researching On-line, London: Sage.<br />

Messerly, J. G. (2004). How computer games affect CS (and other) students’ school performance.<br />

Communications of the ACM, 47, 28-31.<br />

Orosy-Fildes, C., & Allan, R. (1989). Videogame play: Human reaction time to visual stimuli. Perceptual and<br />

Motor Skills, 69 (1), 243-248.<br />



Pearce, C. (2001). Story as play space. In King, L. (Ed.), Game On: The History and Culture of Video Games,

New York, USA: Universe Publishing, 112-119.<br />

Peng, W. (2004). Is playing games all bad? Positive effects of computer and video games in learning. Paper<br />

presented at the 54th Annual Meeting of the International Communication Association, New Orleans, USA, May 27-

31.<br />

Quay, J. (2003). Experience and participation: Relating theories of learning. The Journal of Experiential

Education, 26 (2), 105-116.<br />

Relph, J. (2002). Broads, a bitch, never the snitch. In King, L. (Ed.), Game On: The History and Culture of<br />

Video Games, New York, USA: Universe Publishing, 58-61.<br />

Robins, K. (1994). The haunted screen. In Bender, G. & Druckrey, T. (Eds.), Culture on the Brink: Ideologies of<br />

Technology, Seattle: Bay Press, 305-315.<br />

Salen, K., & Zimmerman, E. (2004). Rules of Play: Game Design Fundamentals, Cambridge: MIT Press.

Silver, D. (2000). Looking backwards, looking forwards. Cyberculture studies 1990-2000. In Gauntlett, D. (Ed.),<br />

Web.Studies: Rewiring Media Studies for the Digital Age, Oxford: Oxford University Press, 19-30.<br />

Steinkuehler, C. A. (2004). Learning in massively multiplayer online games. In Kafai, Y. B., Sandoval, W. A.,<br />

Enyedy, N., Nixon, A. S. & Herrera, F. (Eds.), Proceedings of the 6th International Conference of the Learning

Sciences, Mahwah, NJ, USA: Erlbaum, 521-528.<br />

Steinkuehler, C. A. (2006). Why game (culture) studies now? Games & Culture, 1 (1), 1-6.

Steinkuehler, C. A. (in press). Cognition and literacy in massively multiplayer online games. In Leu, D., Coiro,<br />

J., Lankshear, C. & Knobel, K. (Eds.), Handbook of Research on New Literacies. Mahwah NJ, USA: Erlbaum.<br />

Stephenson, N. (1993). Snow Crash, New York, USA: Bantam.

Stone, A. (1995). The War of Desire and Technology at the Close of the Mechanical Age, Cambridge: MIT<br />

Press.<br />

Turkle, S. (1995). Life on the Screen: Identity in the Age of the Internet, New York, USA: Simon & Schuster.<br />

Van Buren, C., & Christ, W. (2000). The case for eliminating the production/theory dichotomy in curricular<br />

decision-making. Feedback, 41 (3), 41-50.<br />

Vincelli, F., Anolli, L., Bouchard, S., Wiederhold, B. K., Zurloni, V., & Riva, G. (2003). Experiential cognitive<br />

therapy in the treatment of panic disorders with agoraphobia: A controlled study. Cyberpsychology and<br />

Behavior, 6 (3), 321-329.<br />

Wenger, E. (1998). Communities of Practice: Learning, Meaning and Identity, Cambridge, England: Cambridge<br />

University Press.<br />

Wolf, M. J. P. (2001). Space in the video game. In Wolf, M. J. P. (Ed.), The Medium of the Videogame, Austin,<br />

TX: University of Texas Press, 51-75.<br />

Yee, N. (2001). The Norrathian Scrolls: A Study of Everquest (version 2.5), retrieved March 16, 2006, from

http://www.nickyee.com/eqt/home.html.<br />

Yee, N. (2006). The Psychology of MMORPGs: Emotional Investment, Motivations, Relationship Formation,

and Problematic Usage. In R. Schroeder & A. Axelsson (Eds.), Avatars at Work and Play: Collaboration and<br />

Interaction in Shared Virtual Environments, London: Springer-Verlag, 187-207.<br />



Capus, L., Curvat, F., Leclair, O., & Tourigny, N. (2006). A Web environment to encourage students to do exercises outside

the classroom: A case study. Educational Technology & Society, 9 (3), 173-181.<br />

A Web environment to encourage students to do exercises outside the<br />

classroom: A case study<br />

Laurence Capus, Frédéric Curvat, Olivier Leclair and Nicole Tourigny<br />

ERICAE, Département d’informatique et de génie logiciel, Université Laval, Canada, G1K 7P4<br />

laurence.capus@ift.ulaval.ca<br />

nicole.tourigny@ift.ulaval.ca<br />

ABSTRACT<br />

For the past five years, our students have been spending less and less time preparing for lectures and exams.

To encourage them to do more exercises, a pedagogical activity was offered outside the classroom. With<br />

the goal of making students more active during the problem-solving process, an innovative online<br />

environment, Sphinx, was developed. Sphinx proposes a set of exercises with their solutions, and invites<br />

students to explain them. Sphinx also gives students the opportunity to exchange ideas about an exercise<br />

and its solution, and gives the teacher the ability to observe if students take part in the problem-solving<br />

process. The originality of Sphinx lies in its ability to support students between lectures with a common<br />

learning strategy: learning-by-example. An experiment was conducted with 137 students during the 2003<br />

fall session. The solved exercises were consulted most often before the two exams. However, only a few<br />

students participated actively in the experiment by intensively using the Sphinx environment; they obtained<br />

a better average score than the rest of the class. Thus, participation and collaboration appeared to have a positive effect

on the students’ marks. Tools should be added to motivate more students to take part in the explaining and<br />

collaborating process.<br />

Keywords<br />

E-learning, Self-explanation, Learning-by-example, Collaborative learning, Case study<br />

Introduction<br />

For the past five years, we have observed that our students generally have been spending less and less time

preparing for lectures. Only a few have been reading theory and doing exercises. We also noted that they were<br />

not prepared for their exams. We decided to offer a pedagogical activity outside the classroom to encourage<br />

them to do more exercises.<br />

Usually, the teachers on our campus use WebCT® or Learning Space® as online environments to support<br />

distance learning. Such environments exploit information and communication technologies to offer relevant<br />

tools: e-mail, discussion forum, chat, etc. However, they do not include specific tools for helping students learn<br />

the problem-solving process by solving exercises within these environments (Cronjé, 2001). For courses given in<br />

the classroom, teachers generally use Web sites with classic elements such as electronic slides, descriptions of<br />

assigned projects (or class projects), marks, exercises to be solved, etc. However, these elements are static. If<br />

students want to ask questions, they have to send a message to their teacher. This can be time-consuming for the

teacher and is not flexible for the students.<br />

Since the 2002 fall session, we have proposed an online environment – more interactive than those usually<br />

employed on our campus – with examples of solved exercises that students are invited to self-explain (Capus et<br />

al., 2003). This environment has been improved in response to the various users’ comments (students and<br />

teaching team). Our teaching method is original because we combine classroom instruction with an online<br />

environment to support learning outside the classroom. The purpose of the online environment was to make<br />

students more active during the problem-solving process and to motivate them to work more outside the<br />

classroom by giving them feedback within a virtual classroom between lecture sessions. It also allows the<br />

teacher to follow the progress of each student using the environment.<br />

In this paper, we describe a case study conducted with 137 students during the 2003 fall session in a university<br />

course on artificial intelligence. Generally, the students consulted exercises, but they did not often explain them.<br />

The group of students who used the Sphinx environment most frequently obtained a better average score than<br />

that of the class as a whole. The objective of this paper is to show how the functional architecture of the Sphinx<br />

environment supports learning and teaching outside the classroom.<br />

The next section describes how the Sphinx environment supports learning outside the classroom. This is followed by a description of the experiment we conducted, as well as an analysis and discussion of the results obtained.



How to encourage students to do exercises outside the classroom?<br />

To encourage students to do exercises outside the classroom, we developed Sphinx, an online environment that<br />

presents a set of exercises with their solutions, available at any time on any platform. Sphinx allows students

access to worked examples and derived problems and uses a collaborative model for learner advancement in<br />

problem-solving skills. It also allows the teacher to observe if students take part in solving exercises. This<br />

section describes the pedagogical principles on which Sphinx is based, the needs that Sphinx should help satisfy<br />

to support learning and teaching efficiently, the functionalities added to fulfil these needs, and finally Sphinx’s architecture.

The pedagogical principles of the Sphinx environment<br />

The Sphinx environment combines learning-by-example and collaborative learning. Several authors, such as VanLehn (1998), have noted that studying by means of examples is a common and natural way of learning. Sphinx

also integrates self-explanation, because research has shown that the best problem-solvers self-explain more than<br />

other students (Chi, 2000; Chi et al., 1989). VanLehn and Jones (1993) conducted an experiment on the ability of<br />

students to solve problems after studying examples. Their hypothesis was that good problem-solvers are able to<br />

encode knowledge contained in an example and then explain implicit information related to this example; thus,<br />

they should obtain a better score when they solve new problems. Their results showed that the ones who learned<br />

the most had given the greatest number of explanations while studying examples.<br />

To encourage students to self-explain a chosen problem in Sphinx, they are invited to write their explanations<br />

within the environment in order to formalize them more systematically. The collaboration allows students to<br />

learn with peers (Oliver and Omari, 2001). All students can look at the explanations and comment on them or<br />

can respond to another student’s questions. The teacher waits for some time before commenting or answering in

order to let the collaboration process take place. Global feedback can also be given in the classroom so teachers<br />

can avoid having to intervene frequently to answer students’ questions. This aspect is valuable because it is<br />

important that the use of the environment does not increase the teacher’s workload.

Students can find different kinds of examples in the Sphinx environment – questions/answers, problems with<br />

several solving steps, etc. – with different levels of difficulty. In fact, the teacher adds these examples according<br />

to their relevance to the course. Generally, the examples in the Sphinx environment are taken from an exercise<br />

book used in the course during previous terms. Table 1 presents an example from the Sphinx environment on<br />

resolution refutation proof, based on the “dead dog” problem (Luger, 2005). The solution is deliberately a simple

list of steps without explanations. The students have to reason through the problem in order to understand how<br />

the solution was obtained.<br />

Table 1. An example from the Sphinx environment<br />

Problem: With the following premises:

P1: ¬dog(X) ∨ animal(X)<br />

P2: dog(fido)<br />

P3: ¬animal(Y) ∨ die(Y)<br />

Peter has to prove that Fido will die.<br />

Solution: S1: ¬dog(X) ∨ animal(X)

S2: dog(fido)<br />

S3: ¬animal(X) ∨ die(X)<br />

S4: ¬die(fido)<br />

S5: ¬animal(fido)<br />

S6: ¬dog(fido)<br />

S7: {}<br />

Peter proved that Fido will die.<br />

Students can use self-explanation or explanation to carry out this reasoning. A self-explanation can contain an

inference that refers to a piece of knowledge; in this case, the self-explanation specifies the application<br />

conditions of an action, the consequences of an action, the relation between an action and a goal or the relation to<br />

principles illustrated by the example (Chi et al., 1989; Khan and Yip, 1995). For the example contained in Table<br />

1, a good explanation for step S4 can be: “To make a resolution refutation proof, add the negation of what is to<br />



be proved (Fido will die), in clause form, to the set of axioms.” In other words, the expected explanations

correspond to the justifications for the reasoning used to solve the problem. A self-explanation can also be a<br />

paraphrase, a self-check or even a textual fragment that makes no sense, but in all cases it is good for learning (Chi

et al., 1989). Consequently, the most important goal is to motivate the students to explain and discuss solutions.<br />

A good explanation can then be obtained at the end of a discussion between users (the students and eventually<br />

the teaching team).<br />
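By way of illustration (this annotation is ours and was not part of the exercise as presented to students), the remaining steps of Table 1 can be justified in the same manner: S4 adds the negated goal to the clause set; S5, ¬animal(fido), follows by resolving S3 with S4 under the substitution X = fido; S6, ¬dog(fido), follows by resolving S1 with S5; and S7, the empty clause, follows by resolving S2 with S6, which completes the refutation and establishes that Fido will die.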

How does the Sphinx environment support learning?<br />

A recent faculty study showed that students worked only about one hour per week per course instead of the six<br />

hours advised by teachers. We noted our students were less confident about their exams and the class average

had decreased. We needed a way of encouraging our students to work more outside the classroom. The Sphinx<br />

environment is a tool for attracting students into a learning community, and then creating enthusiasm among them. This environment also allows the teacher to observe the work of each student logged on, and to get

some information about her/his difficulties. However, the development of an educational environment is not<br />

easy. It is rather a complex task because of the knowledge required from various fields: user interface

engineering, cognitive sciences, educational sciences, and knowledge engineering (Tchounikine, 2002). After a<br />

preliminary experiment in 2002 (Capus et al., 2003), we collected users’ comments to improve the online<br />

environment. Then, we identified special needs and added particular functionalities to fulfil these needs in order<br />

to support the learning process in the Sphinx environment.<br />

Needs to be considered<br />

As for any software, the user interface should be appropriate to the users' needs. This is doubly important in an

educational context where the main objective is to help the student acquire new knowledge. Thus, the interface<br />

should not hinder this knowledge-acquisition process by discouraging the user with ineffectiveness or problems of

coherency. The graphic interface should be ergonomic and should provide all the information necessary to the<br />

user, who should be able to personalize it. A user model must be provided to save user preferences about media<br />

types, colour parameters and the position of the interface. It is also important to provide simple contextual help<br />

that is available when needed.<br />

More particularly, several means of facilitating learning can be proposed based on the principle of help in real<br />

time. We suppose that help given to the learner during the use of the environment effectively encourages self-explanation.

For each category of explanation, it is important to give the user effective and adapted advice. Then,<br />

adapted semantics can be employed to allow the student to clarify her/his thought. To help the learner structure<br />

her/his thought, one can highlight special argumentative keywords contained in her/his explanation. This<br />

emphasis on the logical structure of the sentence should make re-reading easier. A way of helping the learner to

construct her/his explanation is to begin with a description of the action implemented in the step of the problem-solving

process to be explained. The action is characterized by an action verb, a complement and parameters. A<br />

list of words from an educational dictionary (Legendre, 1993) can be proposed for each part of the action. An<br />

immediate automatic feedback is then possible if the right response is stored in the environment. It is important

to construct a learner model (Baker, 2000) in order to advise the student on the examples to be studied and the<br />

concepts to be revised. The profile of the learner should integrate her/his knowledge about the domain. To build<br />

such a model, one can use various methods: based on the course, based on results, or by self-assessment<br />

(Delestre, 2000).<br />
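As a rough illustration of the keyword highlighting described above, the following Java sketch (Java being the language in which Sphinx is implemented) wraps argumentative keywords in emphasis tags. The keyword list is our own assumption; the paper draws its vocabulary from an educational dictionary (Legendre, 1993).

    class KeywordHighlighter {
        // Illustrative list of argumentative keywords; the actual vocabulary
        // would come from the educational dictionary cited in the text.
        private static final String[] KEYWORDS =
                {"because", "therefore", "in order to", "so that"};

        // Wraps each keyword in <b> tags so that the logical structure of the
        // student's explanation stands out when it is re-read.
        static String highlight(String explanation) {
            for (String keyword : KEYWORDS) {
                explanation = explanation.replace(keyword, "<b>" + keyword + "</b>");
            }
            return explanation;
        }
    }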

Functionalities to support the learning process<br />

We added some particular functionalities to the online environment in order to meet the above needs. These<br />

functionalities should more efficiently support the learning process.<br />

Different navigation modes<br />

Students can navigate among the examples as they prefer. Examples are classified according to the structure of

the course, which is represented by a tree in the user interface. Each week of the course (or leaf of the tree)<br />

proposes a list of examples. Our objective is to propose another way of navigating to help students who are field-dependent. This is an important tool (Triantafillou et al., 2003) for advising them during the revision

phase. A list of keywords, not visible to users, is combined with each example. These keywords are used to find<br />

examples related to predefined subjects. To guide the student, the example links are coloured. We defined three categories of links, i.e. three colours: ‘not visited links’, ‘visited links’ and ‘links to be visited’ (belonging to the set of suggested links). This principle is simple, and similar to the one used on most Web sites.

Thus, students can navigate freely or by using a more personalized revision mode.<br />
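A minimal Java sketch of this three-way classification is given below; the paper specifies the three categories but not the colours, so the colour choices here are assumptions.

    class LinkColours {
        // Maps the status of an example link to a display colour. The three
        // categories come from the text; the concrete colours are assumed.
        static String colourFor(boolean visited, boolean suggested) {
            if (suggested && !visited) return "red";   // link to be visited
            if (visited) return "purple";              // visited link
            return "blue";                             // not visited link
        }
    }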

Information about messages<br />

We based this functionality on the principles of discussion forums. For each example, the user interface indicates<br />

the number of messages posted. This functionality allows the student to quickly see if comments were made on<br />

her/his explanation or if new messages were added since her/his last logon.<br />

Modifications of individual preferences<br />

The student can also modify her/his individual preferences concerning nickname, password, email, colours and<br />

links, etc. This functionality allows the student to personalize her/his own interface, making it more familiar.

Saving users’ actions<br />

Users’ actions are saved as text files in a special folder. These files have a specific format that is easy to parse. Each line of the files is a record composed of a series of values corresponding to a relevant

event in a user’s action. Each record contains the event type, for instance ‘write an explanation’, ‘read an<br />

explanation’, or ‘display an example’. If the event is ‘write an explanation’, the time taken to write the explanation and the

number of significant words in the written explanation are added to the record. For each event type, other<br />

information is saved such as the number of positive (e.g. ‘I agree with this explanation’) and negative (e.g. ‘I<br />

disagree with this explanation’) students’ messages as well as the number of positive and negative teacher’s<br />

messages. The list of keywords for the example consulted and its difficulty level, as well as the number of words<br />

in the written explanation are also added to the record. All this information allows the environment to update the<br />

learner model.<br />
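The exact file layout is not published in the paper; the following Java sketch therefore assumes a semicolon-separated record with illustrative field names, only to show how such a track file could be parsed.

    import java.util.Arrays;
    import java.util.List;

    public class TrackRecordSketch {
        public static void main(String[] args) {
            // Hypothetical record: event type, nickname, example id,
            // seconds spent, number of significant words written.
            String line = "write_explanation;Jflal7;week5_ex2;312;47";
            List<String> fields = Arrays.asList(line.split(";"));
            if ("write_explanation".equals(fields.get(0))) {
                int secondsSpent = Integer.parseInt(fields.get(3));
                int significantWords = Integer.parseInt(fields.get(4));
                System.out.printf("%s wrote %d significant words in %d s%n",
                        fields.get(1), significantWords, secondsSpent);
            }
        }
    }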

Learner model<br />

The learner model indicates the knowledge level of each student and is quite simple. A network of relevant<br />

concepts studied during the course describes this model. The concepts can take the following values: ‘not seen’,<br />

‘seen’, ‘known’, ‘learned’. At the beginning of the session, each concept has the value ‘not seen’. According to<br />

the students’ actions, the learner model is updated by modifying the value of each concept implied by the

actions. In fact, the information gathered by the preceding functionality allows the environment to modify the<br />

new value of a concept. Assessment of the learner’s knowledge using only this information is not accurate because

other elements are needed. For instance, students can learn concepts in the classroom or with their handbook. For<br />

the time being, the learner model does not take this outside learning into account. However, we have found a<br />

way of making it work for our context. Sphinx uses a reference table specifying the number of points for each of<br />

the user’s actions according to its importance and its a priori impact on learning. For instance, if a student<br />

explains an example, the number of points is greater than if this example is only read. The rationale behind this<br />

approach is that explaining an example is more effective for understanding a concept than merely reading it. The

revision navigation mode uses the learner model to propose links to relevant examples. Indeed, the environment<br />

suggests to students that they revise specific examples that integrate concepts not sufficiently understood. It<br />

automatically gives messages to students advising them on what they should do. The creation of these messages<br />

is also based on the learner model.<br />
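The following Java sketch illustrates the point-based update described above; the point values and the thresholds that map accumulated points onto the four concept states are our assumptions, since the paper does not publish its reference table.

    import java.util.HashMap;
    import java.util.Map;

    public class LearnerModelSketch {
        enum State { NOT_SEEN, SEEN, KNOWN, LEARNED }

        private final Map<String, Integer> points = new HashMap<>();

        // Hypothetical reference table: explaining an example earns more
        // points than merely displaying it, as the text suggests.
        void record(String concept, String eventType) {
            int delta = 0;
            if ("display_example".equals(eventType)) delta = 1;
            else if ("read_explanation".equals(eventType)) delta = 2;
            else if ("write_explanation".equals(eventType)) delta = 5;
            points.merge(concept, delta, Integer::sum);
        }

        // Assumed thresholds translating points into a concept state.
        State stateOf(String concept) {
            int p = points.getOrDefault(concept, 0);
            if (p == 0) return State.NOT_SEEN;
            if (p < 5) return State.SEEN;
            if (p < 10) return State.KNOWN;
            return State.LEARNED;
        }
    }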

Other functionalities, such as browse bars allowing forward and backward movement within the environment, improve the user interface. The content of an example can be printed. The home page of the Sphinx environment

displays information about its utility and its use. A help menu is available to introduce the environment and give<br />

details on its use. Contextual help offers the student advice and tips.



Architecture of the Sphinx environment<br />

Figure 1 presents the architecture of the Sphinx environment in a simplified way. The student enters the<br />

environment by using the Student Module. This is a Web interface that allows the student to have access to the<br />

examples of solved exercises, and to all explanations and discussions. The student can modify her/his<br />

preferences for this interface. The Feedback Module is activated when the student chooses words for describing<br />

the action made in a step of the problem-solving process. It also allows the student to know if her/his response<br />

was right or wrong.<br />

Figure 1. A simplified architecture of the proposed environment<br />

The Server Module is the central point of the environment. It manages access to and the saving of<br />

data/knowledge (examples of solved exercises, users’ explanations and actions, and user preferences). This is the<br />

link between the different modules: Teacher Module, Student Module, and Feedback Module. The teacher has<br />

access to the environment by using the Teacher Module, which is an acquisition module for adding examples of<br />

solved exercises. This aspect is often forgotten in educational environments, but it is very important for<br />

facilitating use by teachers, because it eliminates the need for the teacher to adapt to the programming

language used to build the system. The teacher can also have access to the environment by using the Student<br />

Module and can interact with all the students by observing explanations and discussions, or by adding new<br />

explanations or comments.<br />
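Expressed as Java interfaces, the decomposition of Figure 1 might look roughly as follows; the interface and method names are ours, not taken from the Sphinx source code.

    interface ServerModule {
        String loadExample(String exampleId);                     // XML content
        void saveExplanation(String user, String exampleId, String text);
        void logAction(String user, String eventType);            // track folder
    }

    interface FeedbackModule {
        // Checks the action words a student chose for one solving step.
        boolean isRightResponse(String exampleId, int step, String chosenWords);
    }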

Figure 2. A snapshot of the user interface<br />

The online environment, Sphinx, runs on any platform; both Windows and Linux are available for our

students. It does not limit the number of connections. The interface is developed in the Java language, mySQL<br />

was chosen as a database system, and data and knowledge are represented in an XML format. All users have<br />

access to the environment by any Web browser that supports the Java Servlet technology. To enter the<br />

environment, they have to validate their login and password. Next, they can use the general user interface. Other<br />

particular interfaces are proposed to observe examples of solved exercises, to write messages, change<br />

preferences or consult the help menu. The database system allows access to users’ messages and user data<br />



(preferences, history, etc.). The users’ actions are saved in text files in a special folder, the track folder. This<br />

function is important because these files allow us to evaluate how the students used the environment. To<br />

conclude this section, Figure 2 presents a snapshot of the user interface.<br />

The left-hand column presents the list of examples for each week. The browser arrows and a help message are

shown below this. Tools like printing, modifying personal features, help, etc., are at the top of this snapshot. The<br />

main window shows an example with details on each step of the solving process. For each step, an ‘explain’<br />

button (to the right of each solving step) allows the user to post a message by using the grey shaded window at the

bottom right of the window. The numbers below the ‘explain’ button indicate the number of messages: total, not<br />

read, new, replied. All the discussions about this step can be seen at the left of the blue shaded window,<br />

structured as a discussion forum.<br />
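Since access is gated by login and password through the Java Servlet technology mentioned above, a minimal sketch of that gate is shown below; the parameter names, redirect path and validation logic are assumptions, not the actual Sphinx code.

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class LoginServletSketch extends HttpServlet {
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            String login = req.getParameter("login");
            String password = req.getParameter("password");
            // Sphinx would check the credentials against its mySQL database;
            // here we only sketch the control flow.
            if (isValid(login, password)) {
                req.getSession(true).setAttribute("user", login);
                resp.sendRedirect("examples");  // the general user interface
            } else {
                resp.sendError(HttpServletResponse.SC_UNAUTHORIZED);
            }
        }

        private boolean isValid(String login, String password) {
            return login != null && password != null && !login.isEmpty();
        }
    }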

Experiment, results and discussion<br />

During the entire 2003 fall session, this environment was made available as additional pedagogical material to<br />

137 students in our department's course on artificial intelligence. Most of the students were in their second year<br />

of university studies, and this course is compulsory for their bachelor program. The term is fifteen weeks long,<br />

with two exams, one at the middle of the term and the other at the end of the term, usually in the fifteenth week.<br />

The students had to use a login and password; thus, only the students enrolled in this course could log onto the

environment. Each week, the teacher added between five and ten examples corresponding to the content seen in the

classroom; the examples were listed according to their difficulty level. Based on the text files saved in the track<br />

folder, we evaluated the participation and collaboration of the students. We also compared the students’ marks in<br />

order to find a correlation between the use of the environment and the results obtained. Finally, we questioned<br />

the students on the use of the new functionalities such as different navigation modes, information about<br />

messages and modifications of individual preferences. The purpose of our evaluation was to determine if the<br />

Sphinx environment was effective in increasing students’ marks and if its functionalities were useful or not.<br />

Participation<br />

We recorded 506 student logons, an average of 3.7 per student. In fact, only 113 students out of the 137 used<br />

Sphinx. If we consider only the students who actively participated in the experiment, the average number of<br />

logons per student was 4.47. The average logon time was about one hour. We noted that most of the students who

entered the environment did not add any explanation or message. However, they consulted solved exercises,<br />

explanations and messages of the other students. Why did the students participate so little in the discussions? We<br />

have no definite answers at the present time. One might hypothesize that they did not know how to explain the

solutions to problems or they did not want to participate in discussions because they were afraid of making<br />

mistakes. Further investigation is needed to clarify this point. However, it seems that the students found it useful<br />

to consult the information available in the Sphinx environment.<br />

We also noted that the Sphinx environment was not used regularly during the session. Figure 3 shows the<br />

number of logons per week. Most of the students logged onto the environment during the exam weeks (Weeks #7 and

#14), so we can suppose that the Sphinx environment was useful for students as a revision tool before each<br />

exam.<br />

Figure 3. Number of logons per week



Collaboration<br />

Table 2 presents three extracts of discussions (translated into English) posted by students during this

experiment. Discussion #1 is about the best path found in a search tree by using the alpha-beta technique. The<br />

student Jflal7 explained why the node R was chosen. This was a good explanation and another student agreed<br />

with it. The students’ messages were not only explanations. Sometimes, students asked questions on the solved<br />

exercises because they did not understand them. For instance, in discussion #2, the student MaTur2 asked a<br />

question about a natural language processing program that builds sentences with a set of words and grammar<br />

rules. This program had no semantic verification process. Then, MaTur2 asked why a certain sentence was not<br />

given as a solution. During this discussion, another student, Gucot59, explained to MaTur2 that the solution proposed in the exercise was not complete and missed twelve other sentences. From the teacher’s viewpoint,

collaboration is very important because students can exchange ideas and learn from each other. This discussion<br />

illustrates how collaboration can be exploited in a learning context.<br />

Table 2. Three extracts of discussions between students<br />

Discussion | Title of message | Nickname | Content of message
#1 | Reason | Jflal7 | The value R is 2 and the one of Q is 1. R is chosen because one must choose the maximum.
#1 | Correct! | Stleg19 | It’s correct!
#2 | And the cupboard? | MaTur2 | Should this sentence be here: A cupboard builds a joiner?
#2 | Partial Solution | Gucot59 | I think this program produces 12 other different sentences.
#3 | A lot of negatives! | Jofou33 | Is it possible to delete the negatives in the sub clauses?

In discussion #3, the student Jofou33 wondered why many negatives were used in an exercise on knowledge<br />

representation using conceptual graphs. The other students did not give any response. In this case, there was no<br />

collaboration, or one can suppose that the other students did not know how to answer the question. Because we

think it is important to give correct feedback to our students, a teaching assistant posted a response (not given in<br />

the table) to explain the representation.<br />

Students’ marks<br />

We compared the marks of the whole class to those of the students who used the Sphinx environment most<br />

frequently. We think that for Sphinx to be useful in the learning process, the student must use it actively. If a student

only enters Sphinx and prints some examples, s/he is not getting any more than s/he would get from a book, for<br />

instance. Thus, we considered only the students who added messages, logged onto the environment for more<br />

than four hours and participated in several discussions. When we analysed the logon time, this group of students<br />

clearly stood out from the others. Their average score was 70.84% whereas the average score of the class was<br />

67.44%. One might think that the best students used the Sphinx environment most frequently. However, upon<br />

looking at the students’ general marks, we noted that this hypothesis was wrong, thus indicating that Sphinx may<br />

be having a positive effect on the students’ marks. This increase is an encouraging result and we will continue<br />

improving the Sphinx environment to help students learn and succeed.<br />

Users’ comments<br />

At the end of the session, the students filled out a questionnaire. The analysis of the questionnaires showed that<br />

63% of the students used the new functionalities (different navigation modes, information about messages and<br />

modifications of individual preferences), 61% wrote that Sphinx had helped them and 70% appreciated the<br />

possibility of revising concepts with which they had difficulties. We can conclude that the Sphinx environment<br />

is useful to students, especially because of the different modes of navigation. The degree of satisfaction

concerning the messages is lower. Only 48% appreciated this functionality. The students have different learning<br />

styles. Some prefer to be directed, others do not. For the moment, these messages are quite basic and need to be<br />

more precise and personalized, and perhaps should even use a pedagogical agent. These two facts could explain<br />

the low degree of satisfaction.<br />

The teaching team thinks that the new functionalities of the Sphinx environment improve the learning context<br />

and should help students use this environment more easily and more often. According to the team, the messages<br />

could be improved by explaining why they are given. This would help the students to better understand what is



expected from them. The teaching team considers that encouraging messages should be given even when<br />

participation is satisfactory. Finally, the team would like to have another functionality with which to propose specific advice, for instance to underline the importance of an example.

Benefits of the Sphinx environment<br />

Through using the Sphinx environment we learned that participation and collaboration can have a positive effect<br />

on the students’ marks. If we consider only the students who really participated in the experiment by adding<br />

several messages, logging onto the environment for more than four hours and taking part in several discussions,<br />

we can conclude that the use of the Sphinx environment improved the learning process. Moreover, the results on<br />

learning-by-example showed the importance of the explaining process. Thus, in the next phase in the<br />

development of Sphinx we will add elements that hopefully will encourage more students to take part in the<br />

processes of explaining and collaborating.<br />

Future work<br />

We are planning a new experiment to measure improvement in each individual's learning. In the present version<br />

of Sphinx, at the beginning of the session, each concept in the learner model has the value ‘not seen’. According<br />

to her/his actions, the student’s model is updated by modifying the values of the concepts implied by the

actions. At the end of the session, each learner model can be consulted in order to evaluate the knowledge level<br />

of concepts. However, such an evaluation is not entirely accurate because our students can use pedagogical material

other than the Sphinx environment to learn. If a student does not explain specific examples, this does not always<br />

mean that s/he does not know the concept illustrated by these examples. S/he may have learned it in the classroom,

by doing exercises, or by some other means. The improvement in learning can be verified from a global viewpoint, but it is more difficult to evaluate on an individual basis, because the Sphinx environment is only an

online educational support for classroom instruction. Students can decide whether or not to use it. Above all, we want

to encourage students to work more outside the classroom. We are now working on a better representation of the<br />

learner model, which would allow us to measure individual learning improvement more efficiently. We would<br />

also like to adapt the Sphinx environment to distance learning (Tourigny et al., 2005), an instruction mode that<br />

our university is becoming more and more interested in.<br />

Conclusion<br />

This paper presented a case study on an online environment, Sphinx, available on a continuous basis to 137<br />

students during the 2003 fall session. This environment is a tool for helping students outside the classroom and<br />

for encouraging them to do exercises. A set of solved exercises was proposed each week during the course. A<br />

student could consult an exercise and its solution. She/he could try to do the exercise, self-explain a step of the solving process for her/himself, or add an explanation in the environment. She/he could share her/his explanation

with the other students or comment on an explanation of another student. Or she/he could simply read the<br />

discussion of an example. Sphinx is based on self-explanation (Chi, 2000; Chi et al., 1989) and collaboration<br />

(Oliver and Omari, 2001), two activities that allow students to build and/or revise their domain knowledge and<br />

become more active when they learn. We noted that the students who used the Sphinx environment most<br />

frequently obtained a better average score. Even if it is difficult to interpret this result, it encourages us to

continue using the Sphinx environment and we can show this result to our future students in order to encourage<br />

them to use Sphinx more frequently. We are improving our methodology for studying individual learning<br />

effectiveness and are planning a bigger experiment to evaluate how and why students learned with the Sphinx<br />

environment.<br />

The analysis of this experiment also allowed us to identify some aspects of the environment that should be<br />

improved. Particular attention will be paid to ways of motivating the students to take part in the explaining and<br />

collaborating process. We continue to improve the present version of the Sphinx environment and our final goal<br />

is to build a shell adaptable to any course and any kind of training. Teachers should be able to easily store<br />

examples of their course and students should be able to consult examples at any time, explain them, comment on

explanations of peers, and receive feedback within a virtual classroom between lecture sessions.<br />



Acknowledgements<br />

The authors would like to thank the Institute for Robotics and Intelligent Systems (IRIS-Precarn Inc.) for its<br />

financial support. The authors would also like to thank the students, Jalil Emmanuel, Maxime Garceau-Brodeur,<br />

Huu Loc Nguyen and Benoît Potvin, for their participation in the development of the Sphinx environment as<br />

well as the students registered in the course IFT-17586 Artificial Intelligence I for their collaboration. Finally,<br />

the authors would like to thank the two anonymous reviewers for their constructive comments.<br />

References<br />

Baker, M. (2000). The roles of models in Artificial Intelligence and Education research: a prospective view.<br />

International Journal of Artificial Intelligence in Education, 11, 122-143.<br />

Capus, L., Potvin, B., & Tourigny, N. (2003). An e-learning framework to extend lecture courses. Paper<br />

presented at the International Conference on Advances in Infrastructure for e-Business, e-Education, e-Science,<br />

e-Medicine, and Mobile Technologies on the Internet, July 28 - August 3, L’Aquila, Italy.

Chi, M. T. H. (2000). Self-Explaining Expository Texts: The Dual Processes of Generating Inferences and<br />

Repairing Mental Models. In Glaser, R. (Ed.), Advances in Instructional Psychology, Mahwah, NJ: Lawrence<br />

Erlbaum, 161-238.<br />

Chi, M. T. H., Bassok, M., Lewis, M. W., Reimann, P., & Glaser, R. (1989). Self-Explanations: How Students<br />

Study and Use Examples in Learning to Solve Problems. Cognitive Science, 13, 145-182.<br />

Cronjé, J. C. (2001). Metaphors and models in Internet-based learning. Computers & Education, 37 (3-4), 241-<br />

256.<br />

Delestre, N. (2000). La construction automatique de cours hypermédia adaptés à l'apprenant par agencement de<br />

briques élémentaires, In Bourigault, D., Charlet, J., Kassel, G. & Zacklad M. (Eds.), Ingénierie des<br />

connaissances 2000, Paris: Eyrolles, 35-46.<br />

Khan, T., & Yip, Y. J. (1995). CBTII - Case-Based computer-aided instruction: survey of principles,<br />

applications and issues. The Knowledge Engineering Review, 10 (3), 235-268.<br />

Legendre, R. (1993). Dictionnaire actuel de l’éducation, Second Edition, Montréal: Guérin.<br />

Luger, G. F. (2005). Artificial Intelligence, Structures and Strategies for Complex Problem Solving (5th Ed.),

New York: Addison Wesley.<br />

Oliver, R., & Omari, A. (2001). Exploring Student Responses to Collaborating and Learning in a Web-Based<br />

Environment. Journal of Computer Assisted Learning, 17 (1), 34-47.<br />

Tchounikine, P. (2002). Pour une ingénierie des Environnements Informatiques pour l'Apprentissage Humain.<br />

Revue I3, 2 (1), 59-95.<br />

Tourigny, N., Capus, L., & Gamache, A. (2005). Using Self-Explanations with Information and Communication<br />

Technologies to Improve Distance Learning in Computer Science. In Tiginyanu, I. M. & Sontea, V. (Eds.),<br />

Proceedings of the 4th International Conference on Microelectronics and Computer Science, Chisinau, Moldova,

September 15-17, Vol. II, 461-464.<br />

Triantafillou, E., Pomportsis, A., & Demetriadis, S. (2003). The design and the formative evaluation of an<br />

adaptive educational system based on cognitive styles. Computers & Education, 41 (1), 87-103.<br />

VanLehn, K. (1998). Analogy Events: How Examples are Used During Problem Solving. Cognitive Science, 22

(3), 347-388.<br />

VanLehn, K., & Jones, R. M. (1993). Learning by explaining examples to oneself: A computational model. In

Chipman & Meyrowitz (Eds.), Foundations of Knowledge Acquisition: Cognitive models of complex learning,<br />

Boston, MA: Kluwer, 25-82.<br />



Hussain, S., Lindh, J., & Shukur, G. (2006). The effect of LEGO Training on Pupils’ School Performance in Mathematics,

Problem Solving Ability and Attitude: Swedish Data. Educational Technology & Society, 9 (3), 182-194.<br />

The effect of LEGO Training on Pupils’ School Performance in<br />

Mathematics, Problem Solving Ability and Attitude: Swedish Data<br />

Shakir Hussain<br />

Department of Primary Care & General Practice, University of Birmingham, UK<br />

s.hussain@bham.ac.uk<br />

Jörgen Lindh<br />

Jönköping International Business School, Sweden<br />

lijo@jibs.hj.se<br />

Ghazi Shukur<br />

Jönköping International Business School and Centre for Labour Market Policy (CAFO), Department of<br />

Economics and Statistics, Växjö University, Sweden<br />

shgh@jibs.hj.se<br />

ghazi.shukur@vxu.se<br />

ABSTRACT<br />

The purpose of this study is to investigate the effect of one year of regular “LEGO” training on pupils’ performance in school. The underlying pedagogical perspective is the constructivist theory, where the

main idea is that knowledge is constructed in the mind of the pupil by active learning.<br />

The investigation has been made in two steps. The first step was before the training and the second after the

training. For both cases we have constructed and included control groups. A logistic model is applied to the<br />

data under consideration to investigate whether the LEGO training improves pupils’ performance in school. To analyse the opinion studies, GLM for matched-pair models and quasi-symmetry

methods have been used. Preliminary results show better performances in mathematics for the trained group<br />

in grade five, and pupils who are good at mathematics tend to be more engaged and seem to be more<br />

successful when working with LEGO.<br />

The study has also shown that pupils have different learning styles in their approach to LEGO training. The<br />

role of the teacher, as a mediator of knowledge and skills, was crucial for coping with problems related to<br />

this kind of technology. The teacher must be able to support the pupils and to make them understand the<br />

LEGO Dacta material on a deeper level.<br />

Keywords<br />

Robotic Toys, Problem Solving, Constructivism, Multilevel Modelling<br />

Introduction<br />

This research project, called “Programmable construction material in the teaching situation”, aims at studying the

pedagogical effects caused by the application of LEGO Dacta materials in some schools in the area of middle<br />

Sweden. For that purpose, we formed two study groups in order to make comparisons. One of them, the control group (CG), is formed by schools that have not been subjected to the experimental work with LEGO Dacta; the other, the experimental group (EG), is represented by schools that have worked pedagogically with the LEGO Dacta proposals and technological materials.

The starting point was a former project called INFOESCUELA, a pilot project that began in Peru in

1996 (Iturrizaga, 2000). It was initiated by the Peruvian Ministry of Education, with the objective of introducing<br />

technology to primary schools through the use of LEGO Dacta materials. Over the course of three years, from<br />

1996 to 1998, the project was expanded to cover 130 schools throughout the country of Peru. In general terms,<br />

the result of the Peruvian project showed empirical evidences that the pupils from the experimental group<br />

successfully fulfilled their school activities due to the use of LEGO Dacta material. In the referred case, the EG<br />

gained better percentages of achievement than the CG in the tests of mathematics, technology, Spanish and eye<br />

to hand coordination. The results of this research were so encouraging that LEGO Dacta was interested in<br />

studying the effects of using their products in Sweden, which represented a different school context than in Peru.<br />

We were conscious of many of the dissimilarities between the school contexts of Peru and Sweden, and especially of the more complex evaluation process in the Swedish school environment. In Peru the chances were good of finding control groups of pupils who had never been in touch with computers. Due to an overall higher



technological level in Sweden, we could not expect to find pupils who were totally inexperienced with computers, because most children come into contact with computers at an early age in their families and elsewhere in the community. The Peruvian research project was in that respect of a simpler nature, and its results could not easily be applied to other contexts; the circumstances of the Swedish case were more subtle.

Theoretical background

The theoretical foundation for our research project is both the theory of constructivism, deriving from Piaget's view of learning (Piaget & Garcia, 1991), and the theory of situated cognition (Lave, 1988). Constructivism in Piaget's terms is normally called individual constructivism, i.e. the view that individual persons construct knowledge through interaction with their environment. Papert's collaboration with Piaget led him to consider using mathematics in order to understand how children learn and think. Papert believed that the computer was a tool that could allow children to explore mathematics and other curricular subjects in ways that support constructionist learning, i.e. an educational method based on constructivist learning theory. This method stresses the role that concrete objects play in the complex process of knowledge construction (Papert, 1980, 1993). Relating this to pupils' interaction with the programmable construction material, it means that pupils' individual knowledge develops continuously as they build constructions of LEGO material. In situated cognition, the importance of the situation or context of learning is emphasized. Individuals participate and become absorbed in social and cultural activities. Learning is a part of these activities and contributes meaningful knowledge to its carrier. Communication is a key concept since it mediates between the individual and her context (social situation).

Our work with the programmable material in the school classes was performed as group work. This directed a lot of "attention to what kind of things that surrounds the individual; especially relations between individuals, groups, communities, situations, customs, language, culture and society" (Marton & Booth, 2000, p. 28). The way of working with the LEGO Dacta material in the schools pointed to knowledge development that, to a great extent, seemed to happen between individuals and between groups. LEGO and LOGO rely on the same pedagogical principles, and LEGO can be considered an extension of LOGO. Papert wanted to design a computer language suitable for children, which had the power of professional programming languages but was easy enough for non-mathematical beginners. The name LOGO was chosen for the new language to suggest that it is primarily symbolic and only secondarily quantitative (Papert, 1980).

During the same period, Marvin Minsky was encouraging children in the new MIT Children's Laboratory to learn to control a small robot which moved on the floor and could draw a record of its movement with a pen. It seemed to Papert and Minsky a logical step to integrate control of this robotic turtle into LOGO. The floor turtle was later replaced by a screen turtle, but the language remained true to Papert's ideal that it should be a tool for learning concepts like planning, problem solving, and experimentation.

Mitchel Resnick, Professor at the lab and Director of the Lifelong Kindergarten research group, compared and contrasted several projects chosen by workshop participants which featured the construction of physical objects, often built from LEGO bricks, that were controlled by the computer (Resnick, 1991). For Resnick, each project was rich in personal meaning, as it symbolized something for each of the project leaders; secondly, there was great diversity among these projects. Diversity of themes showed itself through the variety of projects, none of them "typical". He concludes that while both LEGO and LOGO encourage open-ended construction, their use alone does not lead to diversity. Participants were encouraged to forget constraints when choosing a project theme.

Most of the reported LEGO projects are small, in terms of both the number of individuals sampled and the length of the study period. Criticism has been raised towards these kinds of studies as being only stories rather than research. Becker (1987) emphasizes the importance of testable consequences in research about LOGO in his discussion of the advantages and disadvantages of two types of research methodologies used to study the effect of LOGO in classroom settings: the treatment methodology and computer criticism. Several researchers, including Becker (1987), have stressed that the main evidence showing that LOGO can produce measurable learning when used in "discovery" classes has been obtained in situations close to individual tutoring. In normal-sized classes, the evidence clearly shows the need for direct instruction in the concepts and skills to be learned from LOGO, as well as further direct instruction to enable students to generalize what they have learned and transfer it to other situations. This is in complete opposition to Papert's conception of the discovery approach to LOGO. Pea (1985) asks what contribution the computer makes to our mental functioning: whether the computer is only an amplifier of cognition or a reorganizer of our minds.



Moore & Ray (1999) argue for more advanced formal statistical methods for sensitivity and performance analysis in computer experiments; they suggest, for example, multivariate methods for the analysis of thorough investigations. Our research project aims to combine a wide range of different kinds of methods in a one-year study.

Research questions and hypotheses

Questions about the pupils' learning:
How do the pupils learn the LEGO Dacta material?
What kinds of skills and abilities are supposed to be developed when pupils work with the material?
Are there any differences in the pupils' way of handling the material, and their ability to do so, related to age or gender?

Questions about the learning context/classroom environment:
In what way should the context/environment be organised to best stimulate the pupils' learning of the LEGO material?
How can the material be integrated so that it becomes a natural part of the pupils' everyday work?

Question about the role of the teacher:
What kind of role is feasible for the teacher/the pedagogy to support the pupils' learning with the LEGO material?

Hypotheses

For pupils trained with our specific program, the study aimed to quantify the contribution of robotic toy (LEGO) training to mathematics, problem solving ability and attitude (MPA) towards different issues. Formulating a proper model, interpreting the parameters and making a quantitative assessment will be attempted. In the project, two hypotheses were set up:

By using LEGO construction kits, sensors and programming tools, pupils will develop better knowledge in mathematics than pupils who do not work with the material. (1)

By using LEGO construction kits, sensors and programming tools, pupils will develop better problem solving ability than pupils who do not work with the material. (2)

Furthermore, we were also interested in measuring attitudes towards different issues related to the use of LEGO, or technology in school generally (see Appendix A).

Potentially there are two main approaches to estimating the training impact on MPA. The first approach would be to design a study of trained pupils and compare them with control pupils who, under similar circumstances, did not receive any training. The second approach would be to obtain good information on the pupils' change in performance based on the MPA, with the performance parameters obtained for individual pupils before and after training. Both of these methods were used in this study, but with emphasis on the second one.

The outline of the project

In this section we explain the design and the outline of the LEGO project.

Participating schools and classes

To get a fair comparison we wanted both groups, EG and CG, to be equivalent regarding educational, social and demographic characteristics (Lindh et al., 2003). Different educational tests were applied to these groups. When recruiting "experimental" classes (EG), our ambition was to reach a balanced distribution of schools across varying geographical regions in central Sweden, with representation of schools in small, medium-sized and big municipalities. Our intention was also to get an equal proportion of classes in fourth grade and eighth grade. The class sizes varied between 17 and 43 pupils.

The project was performed in 12 "experimental classes" at different schools in central Sweden. Altogether there were 322 pupils: 193 pupils in the fifth grade (12-13 years old) and 129 pupils in the ninth grade (15-16 years old). There were 12 "control classes" with 374 pupils in total: 169 pupils in the fifth grade and 205 in the ninth grade. The training time was around 2 hours a week over 12 months, from April 2002 to April 2003.

The LEGO Dacta material in the classroom work

The LEGO material, i.e. the construction kit, consisted of a mechanical assembly system; a set of sensors and actuators; a central control unit (the programmable brick); a programming environment, i.e. a computer and software; and working instructions and manuals. The programmable brick is the most noticeable component of the kit, as it provides both control and power to all LEGO constructions.

When the experimental classes worked with the LEGO material, pupils cooperated in smaller groups, generally of 3-4 pupils, which we called working groups. The groups did not follow a set syllabus when working with the LEGO material; the work was rather adjusted to the ordinary school activities and to the local preconditions at each school. To ensure that the classes started in a similar way, the teachers were instructed to begin with the same kind of task.

It was stated in the research plan that the pupils should work with the LEGO material about eight hours a month. This level was decided on to ensure that pupils had similar preconditions. It was planned that the researchers would visit the EG during this period, to follow up on how well the work continued and on what changed during this time.

We noticed that the school classes solved the practical arrangements in different ways. The physical preconditions concerning rooms varied between schools. Some classes could leave the LEGO material out permanently, while other classes took the material out each time before using it. Some had to work in special computer rooms, other classes needed to go to other school rooms to get computers, and others had only a couple of computers in their ordinary classroom, which they used to program the robots the pupils built.

In some classes, especially in grade 8, the teachers were able to integrate the work into their ordinary teaching, in mathematics or in technology. This was not usually the case in grade 4, where the LEGO work was more like a separate part of the school day. The participating teachers in the project were taught how to handle the LEGO Dacta material, as pupils often ask highly intricate questions.

Methods used in the project

In this research project we used both quantitative and qualitative methods, i.e. triangulation (see, for example, Reason & Rowan, 1985).

Qualitative methods

The qualitative methods used in the project were observation, interviews and questionnaires. In addition, every teacher involved in the project had to document their LEGO lessons: what kinds of activities were done during the lessons, how the pupils worked, what kinds of problems they faced, and so on. The researchers made regular visits to the project schools. During these visits they interviewed the pupils and observed the work. The interviews were taped for later analysis. The purpose was to be able to follow the working groups' achievements during the project as well as to discern their attitudes/feelings about the LEGO material. We could also analyse their way of speaking about the material and their understanding of the LEGO concepts.



Quantitative methods

The quantitative methods consisted of different tests in mathematics and problem solving, and also covered questionnaires concerning pupils' attitudes to working with robotics. Before the school classes started their work with the LEGO Dacta material they had to take a test in mathematics, similar to the Swedish national test in mathematics, which was followed up by a test at the end of the project. The test was taken by both EG and CG. The purpose was to compare the achievements of the two groups to see if there had been any changes/improvements. A problem solving test was given to both EG and CG in the same periods for the same reason.

Methods for studying interrelationships

The response of the trained pupils as against the control pupils (which may depend on the MPA) can be described by a regression model with a binary response variable. The usual logistic regression is used to study how the MPA are related to a dichotomous outcome Y = 0 (control pupils) or Y = 1 (trained pupils). Y is the dependent variable in the logistic regression and is a function of a number of explanatory variables with their respective parameters. The maximum likelihood estimates of the parameters are obtained by maximizing the log-likelihood function. These estimated parameters are the changes in the logarithm of the odds ratio:

$$\log\left[\frac{\Pr(Y = 1 \mid \mathrm{MPA})}{1 - \Pr(Y = 1 \mid \mathrm{MPA})}\right].$$
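As an illustration only, the following minimal sketch shows how a logistic regression of this form could be fitted in Python with statsmodels; the file name and column names (Y, math, problem_solving, attitude) are hypothetical stand-ins for the study's variables, not the authors' actual code.

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data set: one row per pupil, Y = 1 for trained pupils
# and Y = 0 for control pupils; the MPA measures are the regressors.
data = pd.read_csv("lego_pupils.csv")

fit = smf.logit("Y ~ math + problem_solving + attitude", data=data).fit()
print(fit.summary())
# Each fitted coefficient is the change in the log odds of belonging
# to the trained group per unit change in that explanatory variable.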

In this paper, we study issues related to the model above from a multilevel perspective, i.e. by applying multilevel logistic regression to our data. We have a clear case of a system in which individual pupils are subject to the influence of being grouped into classes for learning. The units in such a system lie at two different levels of a hierarchy. This hierarchy should be taken into account: standard models and estimation methods are not appropriate in this case and will give biased and misleading results. A typical suitable multilevel model of this system (Goldstein, 1995) would assign pupils to level one and classes to level two. Units at level one are recognized as being grouped, or nested, within units at the next higher level, so the data have a clear hierarchical structure. From the point of view of our model, what matters is how this structure affects the measurements of interest. Thus, when measuring educational achievement, it is known that average achievement varies from one class to another. This means that pupils within a group will be more alike, on average, than pupils from different groups.

The point of multilevel modeling is that a statistical model should explicitly recognize a hierarchical structure where one is present; if this is not done, we need to be aware of the consequences, namely inefficient modeling. It is inefficient because it involves estimating many more coefficients than the multilevel procedure, does not treat groups as a random sample, and provides no useful quantification of the variation among classes or groups in the population more generally. Data with more than one level of structure (as in our study) are therefore usually analysed using a multilevel model. The multilevel model we use in this paper helps us investigate many important issues related to the study. Patterns of variability, in particular the nested form of variability, are studied with respect to different variables of interest. The fixed effect and the random effects are part of the outcome of the multilevel model and help us decide the significance and direction of each effect.

Very often it makes sense to use probability models to represent the variability within and between groups, i.e. to treat the unexplained variation within groups and the unexplained variation between groups as random variability. For a study of pupils within groups, for example, this means that not only unexplained variation between pupils, but also unexplained variation between groups or classes, is regarded as random variability. This can be expressed by statistical models with so-called random coefficients. The hierarchical model is such a random coefficient model for multilevel data and is by now the main tool for multilevel analysis.

If we consider any outcome point in our study, we may notice that it comes from a sampling design called a multi-stage sample: a simple random sample is selected at the first stage, followed by a second-stage random sample nested within the first. Take, for example, the pupils' scores in mathematics. A cost-efficient strategy for evaluating schools, and pupils within schools, is to take a random sample of schools and then, at a second stage, a random sample of pupils from each school. This method of sampling represents a two-stage design, and the mathematics score variance accordingly consists of two components: the variance between pupils within any school and the variance between schools. The calculations in this study were performed using the statistical program packages MLwiN version 1.1 and S-PLUS version 6.1.
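To make the two-stage structure concrete, here is a minimal sketch, assuming hypothetical column names (math_score for the pupil-level outcome, class_id for the level-two grouping, q5/q6/q12 for attitude items), of the kind of two-level random-intercept regression reported later in Table 1; it uses Python's statsmodels rather than the MLwiN/S-PLUS software actually used in the study.

import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("lego_pupils.csv")  # hypothetical data set

# Stage 1: "empty" model with only an intercept. The random intercept
# per class partitions the unexplained variance into a between-class
# (level 2) and a between-pupil (level 1) component.
empty = smf.mixedlm("math_score ~ 1", data, groups=data["class_id"]).fit()

# Stage 2: add the significant attitude questions as fixed effects.
full = smf.mixedlm("math_score ~ q5 + q6 + q12", data,
                   groups=data["class_id"]).fit()

print(empty.summary())
print(full.summary())
# Comparing the two model deviances (as in the difference of 13.8
# reported for Table 1) gauges how much the attitude questions add.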

Methods for opinion studies

Pupils in the LEGO study are classified into categories of options that represent their opinion on a given question; we call the pupils in this classification "raters". The questionnaire includes 17 questions, each with 4 response categories (see the Appendix), to investigate the training effects after the pupils' participation in the training program. At this stage, we are mainly interested in investigating the following:

1. Is the distribution over the 4 categories the same before and after training?
2. Is the after-training choice independent of the before-training choice for the pupils who changed their minds?
3. Is there any symmetric association pattern around the original answers?

Many social studies involve responses taken on an ordinal scale and may involve subjective opinion; in particular, an intermediate category will often be subject to more misclassification than an extreme category, because there are two directions in which to err from the middle but only one from the extremes. A square table can be used to display the joint ratings on the two occasions (before and after).

Square tables with equal rows and columns and a dependence structure raise two issues. The first, important issue is the difference in the marginal distributions: the question we would like to answer is whether the classification after training tends to be higher than before training. The second issue is the extent of main-diagonal occurrence within the joint distribution of the ratings: here one may ask whether the agreement between the two occasions holds or leaks off the diagonal. Perfect agreement occurs when the total probability of agreement equals 1. The pre- and post-training ratings are analysed using Generalized Linear Models (GLM), with the data in square table format. Matched-pairs models are used to analyse agreement between the two occasions. Moreover, the structure of agreement and the fit of different models are also investigated.

Now we consider the pupils' opinions before and after LEGO training. For any pupil, indexed by s, we suppose that the log probabilities for the k response categories are

$$\lambda_{s1}, \ldots, \lambda_{sk} \quad \text{before LEGO training, and} \quad \lambda_{s1} + \tau_1, \ldots, \lambda_{sk} + \tau_k \quad \text{after LEGO training,}$$

so that $(\tau_1, \ldots, \tau_k)$ is the training effect, assumed to be the same for all individual pupils. To introduce the modelling of square tables, we outline the main types of models used in the analysis. The first is the symmetry model, where we test the null hypothesis

$$H_0: \pi_{ij} = \pi_{ji},$$

that is, we test whether the cell probabilities on one side of the main diagonal are a mirror image of those on the other side. For the expected frequencies, the log-linear form is

$$\log \mu_{ij} = \lambda + \lambda_i + \lambda_j + \lambda_{ij}. \qquad (1)$$

The quasi-symmetry model implies some association and allows the main effect terms to differ, so that

$$\log \mu_{ij} = \lambda + \lambda_i^X + \lambda_j^Y + \lambda_{ij} \;\neq\; \log \mu_{ji} = \lambda + \lambda_j^X + \lambda_i^Y + \lambda_{ij}. \qquad (2)$$

The main effects in (2) differ for rows and columns, and the resulting estimates are the differences $\{\lambda_j^Y - \lambda_j^X\}$ for $j = 1, 2, \ldots, k$.

The marginal homogeneity model compares the marginal distributions of the square table. It is known that a table satisfying both quasi-symmetry and marginal homogeneity also satisfies symmetry, so one can test marginal homogeneity by comparing goodness-of-fit statistics for the symmetry and quasi-symmetry models (Bishop et al., 1975).

Conditional symmetry models may be used when the categories are ordered; this model estimates an extra parameter τ for the off-diagonal structure of agreement, the objective of which is to quantify the effect on the structure of agreement. The symmetry model is the special case τ = 0. The following model represents a generalization covering the situation where symmetry does not hold with ordered categories:

$$\log \mu_{ij} = \lambda + \lambda_i + \lambda_j + \lambda_{ij} + \tau I(i < j), \qquad (3)$$

where $I(\cdot)$ is the indicator function.

Results

In this section we present the most important qualitative and quantitative results of this study.

Qualitative results

Different strategies for learning the material

We have found that pupils learn the material in various ways, which is in line with other research (Kafai, 1996). One way children learn is by a trial-and-error method. Another way is more cooperative: asking their fellow group members, or alternatively asking another pupil in the class who is considered to know the material much better than oneself. Less common are situations where pupils read the written instructions from the teacher or from a book. One reason may be the typical mood in which pupils constructed the robots, which often seemed to be one of "joyful stress", though sometimes they took help from instructive pictures. We could also observe that boys were often less willing to follow the instructions, whereas girls seemed to be more concentrated and intent on following the written task.

Pupils' learning

Even though our study did not show significant contributions as far as logical skills are concerned, we could still discover other interesting things. Of particular interest was the pupils' learning to cooperate in working groups. When interviewing pupils we often heard how the work with LEGO had enhanced the feeling of community. Another positive outcome was the pupils' better understanding of how to program the computers and how to load different programs into the robots via an infrared beam. Gradually the pupils became more conscious of how they could steer the robots by making and loading complex programs which made the robots perform and interact in advanced ways. We did not see any significant difference between the younger and older pupils concerning the ability to build, program and more generally handle the LEGO material. The same holds for gender: there was no general difference between girls and boys.

Learning context

From our observations we could draw several conclusions about how the environment should be organized to best stimulate pupils' learning of the LEGO material. First, there needs to be a large space for the pupils to work in: they must be able to spread the LEGO material on the floor, to "play around", and to test different kinds of solutions for each project they face. The working groups should not be too big (a maximum of 2-3 pupils per LEGO box). The tasks given to the pupils must be concrete, relevant and realistic to solve. It is very important that the pupils can relate the material to their ordinary school work and their different subjects.



The role of the teacher

The role of the teacher, as a mediator of knowledge and skills, was crucial for coping with problems related to this kind of technology (Chioccariello et al., 2001). The teacher must be able to support the pupils and help them understand the LEGO Dacta material on a deeper level. We have seen from the study the advantage of a well-prepared teacher who can meet even advanced problems from eager pupils. Sometimes tricky problems had to be resolved before the pupils could continue their work with new constructions. A teacher with a background in physics or natural science could discuss LEGO technology in comparison with other technologies and give a broader perspective on the available technology.

One opinion often heard among the teachers concerned the advantage of two teachers working together in the classroom. This made it possible for one teacher to stop and discuss situations thoroughly with a working group of pupils, while the other kept control of the classroom situation. Teachers or other adults sometimes also played another important role, as resolvers of conflicts. Even though we generally found collaboration in the working groups to be good, conflicts sometimes occurred. A trained supervising teacher could easily help resolve any arguments by explaining to the pupils the opportunities which cooperation opens to them (Knoop, 2002).

Quantitative results

Comparing control and trained groups in the fifth grade

In this part of the study we start by investigating any possible differences between the control and trained groups for the pupils in the fifth grade (i.e. after the LEGO training experiment). We first apply the binomial second-order marginal quasi-likelihood (MQL) approach (for more information about this approach, see Hox, 2002 and Snijders & Bosker, 2002).

Our analysis reveals that the vast majority of unexplained variation in the probability of LEGO training is, as expected, between classes, which supports our design procedure of dividing these classes into two groups (trained and control). Moreover, we find that pupils who strongly support the idea that "pupils can co-operate better in working groups" belong with high probability to the trained group. When using the multilevel regression with results in mathematics as the dependent variable, we find common agreement among the pupils who are good at mathematics regarding the attitude to question 6, namely that "Working with LEGO material helps us learn things like order and method". On the other hand, we find a negative coefficient for question 5, that "LEGO-material should not be used in the school". Note that there is a negation in this question, which means in this case that the pupils do not support the statement (most of them totally disagreed with it). Moreover, we also find a negative coefficient for question 12, that "Girls and boys learn equally well when working with LEGO robots", indicating that the boys and girls in fact do not believe that they learn equally well.

Table 1. Results from the multilevel regression in two stages

Fixed effect                                                   Model Stage 1 (SE)   Model Stage 2 (SE)
Intercept                                                      29.08 (0.476)        26.5 (1.683)
Question 5: LEGO-material should not be used in the school.    -                    -0.601 (0.314)
Question 6: Working with LEGO material helps us learn
things like order and method.                                  -                    0.811 (0.363)
Question 12: Girls and boys learn equally well when
working with LEGO robots.                                      -                    -0.809 (0.400)

Random effect
Level 2 variance                                               2.235 (1.272)        1.9 (1.1)
Level 1 variance                                               25.188 (2.039)       24.4 (2.0)
Log likelihood                                                 1961.4               1947.6

The sample size consists of 165 pupils from different schools.

Table 1 summarises the multilevel estimates of the parameters together with their standard errors for the fixed effects, first using the empty model with only a constant term included and then using the full model with the attitude questions included (only questions 5, 6 and 12 showed statistically significant effects). The second part of the table, i.e. the random effects part, shows the level-one and level-two variance estimates together with the log likelihood, which indicates that the second model is more appropriate than the first. By adding the fixed effects, i.e. questions 5, 6 and 12, to the model, we explain more of the variation in the mathematics outcome (the difference in deviance is 13.8 = 1961.4 - 1947.6). The level-one variance (24.4) with its respective standard error (2.0) shows significant differences between pupils when mathematics is the dependent variable; in other words, the vast majority of unexplained variation in mathematics comes from differences between pupils. On the other hand, looking at the level-two variance (1.9) together with its standard error (1.1), we find the class effect not to be significant in this case.

When looking at achievements in mathematics for this group of pupils before and after the training using the standard two-sample t-test, we find a positive shift in the mean from 0.711 to 0.817 with p-value = 0.000 (significant at all conventional levels), indicating better performance in mathematics for the trained group. For problem solving, on the other hand, we find a slight shift in the opposite direction, from 0.696 to 0.649, with p-value = 0.023, which is also significant at the 5% level.
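A minimal sketch of this kind of before/after comparison follows, using synthetic scores drawn around the reported means; the real data are not reproduced here, and the spread and sample size are assumptions.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical scores: only the two means (0.711 and 0.817) come from
# the paper; the standard deviation and n are illustrative guesses.
before = rng.normal(0.711, 0.15, 165)
after = rng.normal(0.817, 0.15, 165)

t, p = stats.ttest_ind(after, before)
print(f"t = {t:.2f}, p = {p:.4f}")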

When looking at the pupils in the trained group in the fifth grade as evaluated by their teachers, with level of engagement as the dependent variable (see Table 2), we find clear differences between pupils, and the overall level of engagement (3.37) is above the scale midpoint (2.5). On the other hand, we do not find any significant differences between the teachers from the different classes (when evaluating their pupils). By adding the fixed effect, i.e. mathematics, to the model we explain more of the variation in the engagement outcome (the difference in deviance is 12.9 = 472.5 - 459.6). Moreover, we find pupils with greater ability in mathematics to be more engaged in the learning process, as shown in the following table:

Table 2. Results from the multilevel regression in two stages

Fixed effect        Model Stage 1 (SE)   Model Stage 2 (SE)
Intercept           3.36 (0.11)          1.73 (0.45)
Mathematics         -                    0.06 (0.02)

Random effect
Level 2 variance    0.04 (0.05)          0.02 (0.03)
Level 1 variance    0.99 (0.11)          0.94 (0.10)
Log likelihood      472.5                459.6

The sample size consists of 165 pupils from different schools.

When adding the attitude variables to the model, our results (not shown here) reveal that the vast majority of the pupils agree with question 5, "LEGO material ought to be used already in pre-school", meaning that pupils who strongly support this statement tend to be more engaged in the learning process. We have the same result for question 7, which indicates that working with LEGO material can give the pupils more self-confidence. Moreover, looking at the level-one variance (0.94) with its standard error (0.10), we find significant differences between pupils when evaluated by their teachers with level of engagement as the dependent variable. The level-two variance (0.02) with its respective standard error (0.03) does not indicate any significant difference between teachers' evaluations of their pupils in different classes with the engagement level as the dependent variable.

Comparing control and trained groups in the ninth grade

Here, we investigate any possible differences between the control and trained groups for the pupils in the ninth grade (i.e. after the LEGO training experiment). When applying the MQL approach we could not find any significant differences between the control and trained groups regarding mathematics and problem solving. However, we find common agreement in rejecting the statement "Teaching about LEGO material does not require any computer knowledge"; in particular, pupils who strongly disagree are good at mathematics.

When looking at achievements in mathematics for this group of pupils before and after the training using the same standard two-sample t-test as for grade 5, we did not find any significant shifts in the mean with regard to mathematics or problem solving.



When looking at the pupils in the trained group in the ninth grade as evaluated by their teachers, with level of engagement as the dependent variable, we find clear differences between pupils, and the overall level of engagement (3.13) is above the scale midpoint (2.5). The classes show no significant effect on the level of engagement but, as mentioned, the pupils differ in their level of engagement. When including mathematics as an explanatory variable, the result shows that this variable is significant with a positive sign, indicating that pupils with high ability in mathematics tend to be more engaged. More variation is explained when including this variable in the model; the difference in deviance is 18.2 (= 375.2 - 356.97). See Table 3 below:

Table 3. Results from the multilevel regression in two stages

Fixed effect        Model Stage 1 (SE)   Model Stage 2 (SE)
Intercept           3.13 (0.22)          1.80 (0.39)
Mathematics         -                    0.04 (0.01)

Random effect
Level 2 variance    0.22 (0.18)          0.30 (0.21)
Level 1 variance    1.46 (0.20)          1.22 (0.17)
Log likelihood      375.2                356.97

The sample size consists of 114 pupils from different schools.

As in the case of the grade 5 pupils, we find attitude questions 5 and 7 to show the greatest agreement among the pupils.

Analysing the opinion studies

Here we analyse the differences in responses between the two occasions, before and after training. We used factor analysis to select the most important questions from the list of 17 (see the Appendix). For the trained group in the fifth grade, the factor analysis shows questions 4, 13, 16 and 17 to have the greatest factor loadings (i.e., to be the most important variables) both before and after training. For the trained group in the ninth grade, we find questions 4, 6, 9 and 12 to be most important. Accordingly, we analyse these questions, regarding possible opinion changes among the pupils, by means of the GLM matched-pairs models described in the previous section. The quasi-symmetry model is shown to have a better fit than the symmetry model, and hence we analyse only the results obtained using the quasi-symmetry method (results supporting this are excluded to save space but are available upon request from the authors).
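As a toy illustration of this selection step, the sketch below runs a single-factor analysis over a hypothetical 17-item response matrix and ranks items by absolute loading; the data, sample size and one-factor choice are assumptions, not the authors' procedure.

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
# Hypothetical responses: 165 pupils x 17 items on a 4-point scale.
X = rng.integers(1, 5, size=(165, 17)).astype(float)

fa = FactorAnalysis(n_components=1).fit(X)
loadings = fa.components_[0]
# The four items with the greatest absolute loadings, 1-based numbering.
top4 = np.argsort(np.abs(loadings))[::-1][:4] + 1
print("Most important questions:", sorted(top4))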

Generally, the results regarding questions 4, 13, 16 and 17 for the fifth grade and questions 4, 6, 9 and 12 for the ninth grade indicate that pupils with a high-level opinion (i.e., those who totally agree) before training changed their responses to these questions to a lower level (i.e., partly agree) after training. This happened more often than the reverse. The results also indicate that pupils' option choices after training are independent of their option choices before training. Moreover, the results reveal that there is a symmetric association pattern, and any lack of symmetry seems to reflect slight marginal heterogeneity.

Concluding Discussion

There has been an intense debate in pedagogical circles about the outcomes of LOGO/LEGO, which is mainly part of an even broader discussion about justifying the use of computers in education. A good overview of this discussion is the paper by Harvey (2001), "Harmful to children?", in which he analyses why the idea of educational computing is under attack.

Within the LEGO research group, reasonable arguments have been given for the use of what they call the new generation of computationally enhanced manipulative materials, or "digital manipulatives" for short. These researchers argue that the traditional field of instructional design is of little help: it focuses on strategies and materials to help teachers instruct. Instead, the interest must be in developing strategies and materials to help children construct. They call this effort "constructional design" (Resnick, 1996; Resnick, Bruckman, & Martin, 1996). Constructional design is a type of meta-design: it involves the design of new tools and activities to support children in their own design activities. In short, constructional design involves designing for designers.

A lot of research has been performed to find out whether or not there are any substantial benefits of using LEGO in educational contexts. Most of this research (as shown in the research background of this article) has weaknesses due to small samples and short time frames. This study is much bigger than most other LEGO studies, except for the initially cited Peruvian study, which in our opinion has serious shortcomings. This project was run in a complex context: schools. Certainly there is a risk of some minor bias because of the pupils' use of LEGO in their spare time, though we consider this bias statistically negligible.

The study was built on both qualitative and quantitative research perspectives. It was based on two stated hypotheses, which are dealt with in order below. The results show better performance in mathematics for the trained group in grade 5. When looking at achievements in mathematics for pupils in grade 9 before and after the training, we did not find any significant shifts in the mean. For problem solving, there is no significant improvement either; this seems to be true for both grades 5 and 9. An interesting result of the study is that pupils with higher ability in mathematics tend to be more engaged.

There is not a general positive attitude towards LEGO among the pupils. We have seen in our study that a certain category of pupils, i.e. high-performing pupils in mathematics in grade 5, have a positive attitude towards the LEGO material; the same is also true for grade 9. GLM matched-pairs models and the quasi-symmetry model were used in the analysis of the opinion studies among the pupils. The results reveal that, on some of the questions, the pupils changed their opinion responses from a high-level opinion before training to a lower level after training. This happened more often than the reverse.

Summarising, this field test shows a complex variation in the data concerning the different measured variables. We would like to point out that a lot of data were also collected using methods other than the quantitative ones: several qualitative methods were used, including interviews and observations. Briefly, these other methods confirm the results above. The qualitative and quantitative results are in alignment; the qualitative data have been a necessary complement, adding insight into learning approaches that is not evident from the quantitative data.

After reviewing research in this area, we conclude that it is difficult to confirm the hypothesis that LEGO, generally, has positive effects on cognitive development. However, our study indicates that certain positive effects can be shown for particular groups/categories of pupils. Pedersen (1998) argues that the teacher's role in the positive results is important. It is likely that the teacher also has considerable influence over the way in which these tools are received by the pupils, boys as well as girls. In this regard, parallel arguments can surely be made for an interest in technology.

More follow-up studies, especially longitudinal ones, need to be conducted to answer the question of whether a pupil's ability to solve logical problems can really be raised by LEGO training with LEGO Mindstorms for Schools. The question of the extent to which LEGO Mindstorms for Schools activities can generally raise the level of problem solving is indeed complex, and will surely need to be examined in further studies.

References

Becker, H. J. (1987). The importance of a methodology that maximizes falsifiability: Its applicability to research about Logo. Educational Researcher, 16 (5), 11-16.

Bishop, Y. V., Fienberg, S. E., & Holland, P. W. (1975). Discrete Multivariate Analysis, Cambridge: MIT Press.

Chioccariello, A., Manca, M., & Sarti, L. (2001). Children's playful learning with a robotic construction kit, Genova: Istituto per le Tecnologie Didattiche - C.N.R.

Goldstein, H. (1995). Multilevel Statistical Models (2nd Ed.), London: Edward Arnold.

Harvey, B. (2001). Harmful to children? The Alliance for Children Report, Berkeley: University of California.

Hox, J. J. (2002). Multilevel Analysis - Techniques and Applications, Mahwah: Lawrence Erlbaum.

Iturrizaga, I. M. (2000). Study of Educational Impact of the LEGO Dacta Materials - INFOESCUELA - MED. Final Report, retrieved June 30, 2006, from http://www.lego.com/education/download/infoescuela.pdf.

Kafai, Y. (1996). Learning Design by Making Games: Children's Development of Design Strategies in the Creation of a Complex Computational Artifact. In Kafai, Y., & Resnick, M. (Eds.), Constructionism in practice: Designing, thinking, and learning in a digital world, Mahwah, NJ, USA: Lawrence Erlbaum.

Knoop, H. K. (2002). Play, Learning & Creativity - Why Happy Children are better Learners, Copenhagen: Aschehoug Forlag.

Lave, J. (1988). Cognition in practice, Cambridge, UK: Cambridge University Press.

Lindh, J., Carlsson, S., Kilbrink, N., & Segolsson, M. (2003). Programmable Construction Kits in an Educational Environment, final report, Jönköping: School of Education and Communication.

Marton, F., & Booth, S. (2000). Om lärande (About learning), Lund: Studentlitteratur.

Moore, L. M., & Ray, B. K. (1999). Statistical Methods for Sensitivity and Performance Analysis in Computer Experiments. In Farrington, P. A., Nembhard, H. B., Sturrock, D. T., & Evans, G. W. (Eds.), Proceedings of the 1999 Winter Simulation Conference, 486-491, retrieved June 30, 2006, from http://www.informs-cs.org/wsc99papers/068.PDF.

Papert, S. (1980). Mindstorms - Children, Computers, and Powerful Ideas, New York, USA: Basic Books.

Papert, S. (1993). The Children's Machine: Rethinking School in the Age of the Computer, New York: Basic Books.

Pea, R. D. (1985). Beyond Amplification: Using the Computer to Reorganize Mental Functioning. Educational Psychologist, 20 (4), 167-182.

Pedersen, J. (1998). Informationstekniken i skolan: En forskningsöversikt (98:343), Stockholm: Skolverket, Liber distribution.

Piaget, J., & Garcia, R. (1991). Toward a logic of meanings, Hillsdale, NJ, USA: Lawrence Erlbaum.

Reason, P., & Rowan, J. (1985). Human inquiry - A Sourcebook of new paradigm research, New York, USA: John Wiley & Sons.

Resnick, M. (1991). Xylophones, Hamsters, and Fireworks: The Role of Diversity in Constructionist Activities. In Harel, I., & Papert, S. (Eds.), Constructionism: Research Reports and Essays, 1985-1990 by the Epistemology and Learning Research Group, Norwood, NJ, USA: Ablex Publishing.

Resnick, M. (1996). Towards a Practice of Constructional Design. In Schauble, L., & Glaser, R. (Eds.), Innovations in Learning: New Environments for Education, Mahwah, NJ: Lawrence Erlbaum.

Resnick, M., Bruckman, A., & Martin, F. (1996). Pianos Not Stereos: Creating Computational Construction Kits. Interactions, 3 (6), 41-50.

Snijders, T. A. B., & Bosker, R. J. (2002). Multilevel Analysis: An introduction to basic and advanced multilevel modeling, London: Sage Publications.



Appendix A

How far do you agree with these statements? (Response categories: Totally agree / Partly agree / Partly disagree / Totally disagree)

1. LEGO material ought to be used already in pre-school.
2. LEGO robots are well suited for technology lessons.
3. LEGO material enables pupils to cooperate better in groups.
4. Many of the pieces in the LEGO package get lost in the school.
5. LEGO material should not be used in the school.
6. Working with LEGO material helps us learn things like order and method.
7. Working with LEGO robots makes us more self-confident.
8. We learn a lot when we build and program LEGO robots.
9. Learning about technology helps us develop in many ways.
10. You cannot build LEGO robots in language lessons.
11. Boys like to compete with LEGO robots more than girls do.
12. Girls and boys learn equally well when working with LEGO robots.
13. LEGO robots are only toys or a game.
14. Teaching about LEGO material does not require any computer knowledge.
15. Boys know more about technology than girls.
16. Group work tends to be better when working with LEGO.
17. We learn best by using LEGO robots in games or competitions.


Randolph, J. J., Virnes, M., Jormanainen, I., & Eronen, P. J. (2006). The Effects of a Computer-Assisted Interview Tool on Data Quality. Educational Technology & Society, 9 (3), 195-205.

The Effects of a Computer-Assisted Interview Tool on Data Quality

Justus J. Randolph, Marjo Virnes, Ilkka Jormanainen and Pasi J. Eronen
Department of Computer Science, University of Joensuu, PO BOX 111, 80101, Joensuu, Finland
Tel: +358 13 251 7929
Fax: +358 13 251 7955
justus.randolph@cs.joensuu.fi

ABSTRACT

Although computer-assisted interview tools have much potential, little empirical evidence on the quality and quantity of data generated by these tools has been collected. In this study we compared the effects of using Virre, a computer-assisted self-interview tool, with the effects of using other data collection methods, such as written responding and face-to-face interviewing, on data quantity and quality. An intra-sample, counter-balanced, repeated-measures experiment was used with 20 students enrolled in a computer science education program. It was found that there were significantly more words and unique clauses per response in the Virre condition than in the written response condition; however, the odds of avoiding in-depth responding were much greater in the Virre condition than in the written response condition. The effect sizes observed in this study indicated that there was a greater quantity of data and a higher rate of response avoidance during Virre data collection than would have been expected, based on previous literature, during face-to-face data collection. Although these results statistically generalize to our sample only, they indicate that audio-visual, computer-assisted self-interviewing can yield data of as high, or higher, quality as other commonly used data collection methods.

Keywords
Computer-assisted interviewing, Data collection, Data quality

Introduction

Face-to-face interviewing is considered by many to be superior to other types of interview data-collection techniques, such as telephone interviewing and paper-and-pencil interviewing (Dillman, 1978; Rossi, Wright, & Anderson, 1983; Smith, 1987). According to De Leeuw (1993), compared to paper-and-pencil and telephone interviews, face-to-face interviews have more flexibility, higher response rates, and greater potential for studying the general population. However, face-to-face interviews also require more resources to carry out (De Leeuw, 1993). Much research has been collected over the past two decades on the feasibility of using computer-assisted data collection methods as low-cost supplements or alternatives to face-to-face data collection (see Saris, 1989; 1991). Most of the computer-assisted data collection research concentrates on interviewers' and respondents' attitudes towards computer-assisted interviewing; however, it does not address the quality of data that is collected in computer-assisted interviews. De Leeuw and Nichols (1996), in a review of research on computer-assisted data collection methods, concluded that:

Computer-assisted data collection has a high potential regarding increased timeliness of results, improved data quality, and cost reduction in large surveys. However, for most of these potential advantages the empirical evidence is still limited. . . . At present there is still little empirical research into the effect of computerized interviewing on data quality. Most studies have investigated the acceptance of the new techniques by interviewers and respondents. (p. 7.1-7.2)

Ten years later, there is still high potential for computer-assisted personal interviewing, but there is also still a paucity of empirical evidence on the quality of data generated by those methods. Because of this lack of needed empirical evidence, we examined the quantity and quality of data generated by computer-assisted interviewing. Specifically, we investigated the use of an innovative tool called Virre, which is a combination of a computerized self-administered questionnaire and a computerized self-interviewing tool, designed to collect the responses of primary-school-aged interviewees. (We classify Virre as a self-interviewing tool because only the respondent and Virre are necessary for an interview to be conducted.)
respondent and Virre are necessary for an interview to be conducted.)<br />

Several stakeholder groups stand to gain from this research. There has been much recent discussion in the field of research and evaluation about harnessing technologies to improve practice (Gay & Bennington, 2000; Love, 2004; Means, Roschelle, Penuel, Sabelli, & Haertel, 2003). Evaluators and researchers may be interested in this article's findings because we investigate the hypothesis that Virre, and tools like Virre, can be used as low-cost



alternatives to high-quality, but resource-intensive, data collection strategies, such as face-to-face interviewing. Educational practitioners in general may be interested in the findings of this research: Virre was originally designed as an educational tool to motivate students to 'think about their thinking' aloud, a strategy which has known benefits for learning (see Pressley & McCormick, 1995). Computer science educators may be particularly interested in this research because Virre was created by, and for, computer science educators in K-12 computer science education programs. Although our study design only allows us to statistically generalize to the population of responses made by the students involved in this study, it is reasonable to hypothesize that our findings would generalize most accurately to populations of similar students in comparable K-12 computer science education programs.

In the section that follows, we describe a conceptual model of data collection effects and use that model to point out the theoretical similarities between Virre and face-to-face data collection. In the section titled ‘Expected Results,’ we describe and justify the point estimate and range of values that we expected in our study. In the procedures section, we operationalize our variables, describe Virre in more detail, and document the data collection and data analysis procedures that we used. In the results section, we present the findings of our research in terms of the comparative effects of Virre and written responding on number of words, number of unique clauses, and presence of response-avoidance phrases. In the discussion section, we revisit our research hypotheses and comment on the implications of our findings.

Literature Review

De Leeuw’s (1993) conceptual model of data collection effects on data quality, as illustrated in Figure 1, is composed of three types of factors: media-related factors, information transmission factors, and interviewer impact factors. According to De Leeuw (1993), media-related factors concern “the social conventions associated with the medium of communication” (p. 14), information transmission factors concern “the more technical aspects of information transmission, not social customs” (p. 16), and interviewer impact factors concern the extent to which the effects of the interviewer’s presence and behaviors are restricted.

[Figure 1 shows three boxes feeding into data quality: media-related factors (familiarity; locus of control; conventions about silence and sincerity), information transmission factors (available channels; presentation of stimuli; regulation of communication flow), and interviewer impact factors (presence of the interviewer; effect of specific interviewer behaviors).]

Figure 1. De Leeuw’s Conceptual Model of Data Collection Effects on Data Quality [From De Leeuw, 1993, p. 20. Reprinted with permission]

In terms of media-related factors, face-to-face interviewing and Virre methods differ on several counts. Face-to-face interviewing is obviously more familiar to respondents than computer-assisted interviewing. We assume that respondents will have different conventions about silence and will evaluate the sincerity of the data collection endeavor differently in the face-to-face and Virre conditions. Nevertheless, several studies (e.g., Zandan & Frost, 1989; Witt & Bernstein, 1992) have found that “respondents generally like [computer-assisted interviewing]; they find it interesting, easy to use, and amusing” (De Leeuw & Nicholls, 1996, p. 6.3). We suppose that the locus of control in face-to-face interviewing and Virre is comparable.



In terms of interviewer factors, on one hand, Virre is not able to do many of the tasks that a human interviewer can do well, such as motivating respondents, answering respondents’ questions, or asking respondents to clarify their answers. On the other hand, research has shown that participants are much more likely to disclose sensitive information in computer-assisted interviews than in face-to-face interviews, although this effect has been found to diminish over time (Weisband & Kiesler, 1996). Also, because computer-assisted interviews can be standardized, instrumentation threats to internal validity can be controlled for.

Because of the theoretical similarities between Virre and face-to-face data collection, especially in terms of factors related to information transmission, we hypothesized that the effect sizes of Virre would be similar to the effect sizes found in the research on face-to-face data collection. (Those effect sizes are reviewed in the following section.)

Expected Results

Because of the lack of previous research on the effects of computer-assisted data collection on data quality, we had to estimate the expected results by proxy; we looked to the research comparing mail, telephone, and face-to-face data-collection methods to put the effect sizes of the current study into context. From a meta-analysis of the effects of mail, telephone, and face-to-face data collection methods on data quality (De Leeuw, 1993), the two effect sizes, and their ranges, presented below are particularly useful for putting the results of our current study into perspective.

Although there is no estimate of the comparative effects of face-to-face surveys and mail surveys on the number of responses to open statements, an estimate does exist between face-to-face surveys and telephone surveys. De Leeuw (1993) found that face-to-face surveys are slightly superior to telephone surveys in terms of the average number of statements to open questions. In a meta-analysis of four studies that examined number of statements to open questions in face-to-face surveys and telephone surveys, De Leeuw reported that the average, weighted r (i.e., the Pearson product-moment correlation) across studies was .04, with 95% lower and upper bounds of .02 and .07. (An r of .04 in the context of De Leeuw’s study is statistically equivalent to 52% of cases having had a greater number of open statements in face-to-face interviews than in telephone interviews. See Rosenthal, Rosnow, & Rubin, 2000, or Randolph & Edmondson, 2005.) Based on De Leeuw’s (1993) theoretical model, we assume that the effect size between face-to-face surveys and mail surveys would be higher than the effect size between face-to-face surveys and telephone surveys for the number of statements to open questions.

Another important finding from De Leeuw’s (1993) meta-analysis is that there is a slightly lower rate of nonresponse in face-to-face surveys than in mail surveys. In that meta-analysis, the average, weighted value of r over 8 studies was .03, with 95% lower and upper confidence bounds of .01 and .05. (An r of .03 in the context of De Leeuw’s study is statistically equivalent to 51.5% of cases having had a lower rate of nonresponse in face-to-face surveys than in mail surveys. See Rosenthal, Rosnow, & Rubin, 2000, or Randolph & Edmondson, 2005.)
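
For readers who want to reproduce these percentage interpretations, the binomial effect size display (BESD) arithmetic is straightforward. The following sketch assumes the BESD’s 50/50 margins; the function name is ours, not Rosenthal et al.’s:

    # Under the BESD's 50/50 margins, a correlation r corresponds to
    # "success" rates of 50% + r/2 versus 50% - r/2 (Rosenthal et al., 2000).
    def besd_rates(r):
        return 0.5 + r / 2, 0.5 - r / 2

    print(besd_rates(0.04))  # (0.52, 0.48): the 52% figure given above
    print(besd_rates(0.03))  # (0.515, 0.485): the 51.5% figure given above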

Because of the similarities between Virre data collection and face-to-face data collection, and because face-to-face data collection has been shown to lead to more statements and less nonresponse than written data collection techniques, we drew the following general and specific hypotheses for this study:

1. The Virre responses of students will have more statements and fewer incidences of nonresponse than the written responses of those same students.

2. The magnitude of effects attributed to Virre, in terms of number of statements and incidence of nonresponse, will be near the plausible range of magnitude of effects attributed to face-to-face interviewing for number of statements (i.e., .02 < r < .07) and nonresponse [hereafter response avoidance] (i.e., .01 < r < .05). (Stated another way, the difference in the number of statements between the Virre condition and the written response condition is expected to be approximately the same as the differences between face-to-face interviews and telephone interviews found in the previous literature. Also, the difference in the number of response avoidances between the Virre condition and the written response condition is expected to be approximately the same as the differences in nonresponse between face-to-face interviews and mail surveys found in the previous literature.)

To be able to compare the magnitude of effects of Virre responding and written responding to the magnitude of effects of face-to-face responding and mail responding requires several, admittedly tenuous, assumptions. The first is that mail responding is to face-to-face responding as written responding is to Virre responding. We justify this assumption based on the theoretical similarities between face-to-face and Virre responding. Second, in previous research, two often-used data quality variables were number of statements made and nonresponse. Because number of statements made is not operationalized in the previous research, we assume that number of words per response and number of distinct ideas per response are valid measures of the construct ‘number of statements made.’ Since students were instructed to respond to every item in our study, we assume response avoidance terms such as ‘I don’t know’ or ‘no comment’ to be analogous to nonresponse in the previous research. (Admittedly, number of statements made and nonresponse are quantitative in nature; however, following the terminology used in previous data collection research [e.g., in De Leeuw, 1993], we use the word quality to refer to these characteristics of collected data.)

Methods

Instruments

The Virre environment contains four parts: the application itself, a webcam, a microphone, and a computer that runs the application. The application, written in Delphi, handles question sets, which can be predefined by a Virre operator. For each question in a set, the application plays an audio file in which the question is spoken; it also presents the question as text to the user. While the user responds to the question, the application records audio and video to a file so that the user can later watch and comment on the answer file. All data, such as question sets and comments, are saved to files in XML format.
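
Although the paper does not publish Virre’s file schema, a question-set file of this kind might look like the output of the following sketch, which uses Python’s standard xml.etree module; the element and attribute names are hypothetical:

    import xml.etree.ElementTree as ET

    # Build a hypothetical question set of the kind Virre might store.
    question_set = ET.Element("questionset", {"id": "lego-club-reflection"})
    questions = [
        ("What did you do in Lego-robot club today?", "q1.wav"),
        ("In your opinion, what was hard about today's lesson?", "q2.wav"),
    ]
    for order, (text, audio) in enumerate(questions, start=1):
        q = ET.SubElement(question_set, "question",
                          {"order": str(order), "audio": audio})
        q.text = text
    ET.ElementTree(question_set).write("questionset.xml", encoding="utf-8")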

Virre needs external software for recording and playing question files and for playing back answer files. Since Microsoft’s DirectX interface is used to capture and compress Virre’s video stream, it is possible to use any DirectX-compatible webcam.

Currently, Virre’s components are housed inside a bear with a webcam in its nose and a video screen in its abdomen; however, the Virre components could be installed within different types of exteriors to match different contexts. See Eronen, Jormanainen, and Virnes (2003) for more details. Figure 2 shows a photograph of Virre in its stuffed-bear version, which is designed for elementary-age students.

Participants and Setting

This investigation took place in the context of a computer science education program, called Kids’ Club, located at a university in eastern Finland (see Eronen, Sutinen, Vesisenaho, & Virnes, 2002a, 2002b). According to the Kids’ Club website (Kids’ Club, n.d.):

Kids’ Club is a collaborative research laboratory, where children of age 10-15 work together with university students and researchers of Computer Science and Education. To children, Kids’ Club appears as a technology club with an opportunity to study skills of their interests in a playful, non-school-like environment, where there is room for innovative ideation and alternative approaches. (n. p.)

Typical activities in Kids’ Club include designing and programming robots and carrying out technology design projects.

The 20 participants in this study self-selected into the Kids’ Club program, which is held at a learning laboratory of a department of computer science. Eight of the participants came from a university practice school; twelve of the students came from a local, public elementary school. The ages of the students ranged from 10 to 12, and their grade levels ranged from 4th grade to 6th grade. As is typical in computer science education programs, the number of female students who self-selected into the program was much lower than the number of males (Galpin, 2002; Teague, 1997); only 2 of the 20 participants in this study were female.

Procedure

In this study, the independent variable is response condition (i.e., Virre responding or written responding); the dependent variables are number of words per response, number of unique clauses per response, and the presence of one or more response avoidance phrases as the sole elements of a response. Oral language markers like “mmm . . . ,” “hmmm . . . ,” and “oh . . .” were removed from the responses before analyzing number of words.
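
As an illustration of this preprocessing step, a minimal sketch follows; the tokenization and marker list are simplified stand-ins for the authors’ actual (unpublished) coding procedure:

    import re

    # Oral language markers ("mmm...", "hmmm...", "oh...") are stripped
    # before counting words; elongated variants are matched too.
    MARKER = re.compile(r"^(m+|hm+|oh+)[.,!?]*$", re.IGNORECASE)

    def count_words(response):
        return sum(1 for token in response.split()
                   if not MARKER.match(token))

    print(count_words("Mmm... I built robots today"))  # 4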



Figure 2. A Student Interacting with an Ursidae Version of Virre

A clause was defined as a subject-predicate combination or implied subject-predicate combination. For example, if a student, in response to the question “What did you do today?”, had responded with “Built robots, I built robots today,” both of those statements would have been counted as clauses. A unique clause was defined as a clause that did not repeat information already reported in an earlier clause in response to the same question. For example, if a student, in response to the question “What did you do today?”, had responded with “Built robots, I built robots today,” that response would have been considered to have only one unique clause, since the information in the two clauses was essentially the same.

We dichotomously categorized a response as a response avoidance if the student used one or more of the following terms or phrases, translated from Finnish, without commenting further: “nothing,” “nothing special,” “everything,” “can’t remember,” “nothing new,” “all kinds of things,” “anything,” “something,” or “it doesn’t matter.” For example, if a student, in response to the question “What did you do today?”, answered “All kinds of things,” that response would have been counted as a response avoidance; however, if the student had responded with “All kinds of things. We learned programming. We built robots,” that response would not have been counted as a response avoidance, since the student elaborated on what had been done.
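
A sketch of this dichotomous coding rule follows, using the English translations above as stand-ins for the Finnish originals; the rule is simplified to the single-phrase case:

    # Phrases (translated from Finnish) that count as response avoidance
    # when they are the sole content of a response.
    AVOIDANCE = {"nothing", "nothing special", "everything", "can't remember",
                 "nothing new", "all kinds of things", "anything", "something",
                 "it doesn't matter"}

    def is_response_avoidance(response):
        return response.lower().strip().rstrip(".!") in AVOIDANCE

    print(is_response_avoidance("All kinds of things."))   # True
    print(is_response_avoidance(
        "All kinds of things. We learned programming."))   # False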

Students’ responses were collected on four occasions, twice in the Virre condition and twice in the written condition. Prior to this investigation, students had received instruction on how to design, build, and test robots. During all measurement sessions students did the same activities: they built, programmed, and tested their robots. During the last part of the class, students were asked to respond to the following questions, translated from Finnish:

1. What did you do in Lego-robot club today?
2. In your opinion, what was hard about today’s lesson?
3. In your opinion, what was easy about today’s lesson?
4. What did you learn today?

The researchers randomly selected which groups of students would respond using Virre and which would respond in writing. There were no time limits on how long students could take to respond in either condition.

If a student was instructed to use Virre, the student would individually go to Virre, hit the start button, and would then see the question on the screen and hear the question being spoken by Virre. Then the student would record his or her answer, in private, by speaking to Virre. When the student was done speaking, he or she would tap the start button again to end the recording and move on to the next question, until the student had answered all four questions. If the student was instructed to respond in writing, he or she would be given a form which had the four questions on it. The forms were filled out individually.

The Virre and written response conditions were purposively counterbalanced (i.e., half of the students responded using Virre in one measurement session and responded in writing in another measurement session, while the other half did the opposite). After the data had been collected, the responses in the Virre sessions were transcribed. One rater coded the number of words and number of unique clauses per response. To establish estimates of inter-rater reliability, a second rater coded a random sample of 30 (about 10%) of the responses.

Data Analysis

We took an intra-sample statistics approach (see Shaffer & Serlin, 2004) to data analysis because this was a preliminary study in which we were primarily interested in making inferences concerning the universe of responses that the participants in our sample might have made. Since we took an intra-sample approach, we treated participants as fixed effects, used the response as the unit of analysis, and used a factorial, orthogonal, between-subjects univariate design for the dependent variables number of words and unique clauses. (Had we taken the traditional, extra-sample statistics approach, we would have treated the participant as the unit of analysis, used a repeated-measures, two-within-factor design in which the factors would have been type of treatment and measurement, and made an inference to the universe of participants.) A two-way mixed-effects model, single-measure intraclass correlation coefficient (consistency definition) was used as the estimate of inter-rater reliability. An analysis of how well the data for this study met parametric assumptions and a rationale for using multiple univariate analyses can be found in an online appendix; see Randolph (2006).
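
This reliability coefficient is commonly labeled ICC(3,1) in Shrout and Fleiss’s terminology; the following sketch computes it from the two-way ANOVA mean squares (the ratings array is illustrative, not the study’s data):

    import numpy as np

    def icc_3_1(ratings):
        # Two-way mixed-effects, consistency, single-measure ICC:
        # (MS_targets - MS_error) / (MS_targets + (k - 1) * MS_error).
        x = np.asarray(ratings, dtype=float)   # shape: (targets, raters)
        n, k = x.shape
        grand = x.mean()
        ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()
        ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()
        ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
        ms_rows = ss_rows / (n - 1)
        ms_err = ss_err / ((n - 1) * (k - 1))
        return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err)

    # Two raters' word counts for five responses (illustrative data).
    print(icc_3_1([[7, 8], [4, 4], [12, 10], [3, 5], [9, 9]]))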

We used logistic regression to estimate an odds ratio for the binary dependent variable, response avoidance. For model checking we used the Hosmer-Lemeshow test (Hosmer & Lemeshow, 1989) and examined classification tables. In these analyses, we used the variables response condition (i.e., Virre responding or written responding) and participant as covariates. To convert the odds ratio to the r metric, via Rosenthal et al.’s equations (2000, p. 24), we used the odds ratio to create a binary effect size display and then calculated r_besd. The confidence intervals for r_besd were estimated using the variance equations in Shadish and Haddock (1994).
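
The odds-ratio-to-r conversion can be reproduced in a few lines; the helper below follows the standard BESD construction with both margins fixed at 50% (the function name is ours):

    import math

    def r_besd_from_odds_ratio(odds_ratio):
        # In a 2 x 2 BESD with 50/50 margins, if p is the larger cell
        # proportion, the table's odds ratio is (p / (1 - p)) ** 2, so
        # p = sqrt(OR) / (1 + sqrt(OR)) and r = 2 * p - 1.
        root = math.sqrt(odds_ratio)
        return 2 * root / (1 + root) - 1

    print(round(r_besd_from_odds_ratio(4.26), 2))  # 0.35, as reported below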

Equation 3.2 of Rosenthal et al. (2000) was used to calculate r_contrast for the effect size of words per response and unique clauses per response; confidence intervals for r_contrast were calculated based on information given in Shadish and Haddock (1994, Table 16.3). SPSS 11.0.1 was used to conduct the statistical analyses.
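For a single-degree-of-freedom contrast, this calculation reduces to r = sqrt(F / (F + df_error)); the sketch below reproduces the effect sizes reported in the results section from the ANOVA statistics alone:

    import math

    def r_contrast(f, df_error):
        # Single-df contrast: r = sqrt(F / (F + df_error)).
        return math.sqrt(f / (f + df_error))

    print(round(r_contrast(39.01, 280), 2))  # 0.35, words per response
    print(round(r_contrast(27.91, 280), 2))  # 0.30, unique clauses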

Results

Complications

In the Virre condition, 23 student responses were missing because of a faulty design aspect in Virre. (There were 320 responses in the written response condition and 297 responses in the Virre condition.) We found that the students would sometimes repeatedly push the start button and, as a result, would skip one or more questions. We assume that these errors are nonsystematic, because we are led to believe that the students pushed the start button repeatedly by accident rather than as a way to avoid responding. Because this fault has since been corrected, we do not conceptualize it as part of the treatment’s construct.

We ran analyses with and without replacement of missing data. (When we replaced missing data, we replaced a missing data point with the corresponding data point for the same condition, question, and participant from a different measurement.) Since there were only minor differences in the results with and without replacement (see the sensitivity analyses section below), and since orthogonal designs are more robust than nonorthogonal designs, we did our parametric analyses on the data with missing values replaced. However, for nonparametric analyses (i.e., for the analyses of response avoidance and number of types of words) the data were not replaced. We also examined the mean differences when response avoidances were treated as missing data.
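
A sketch of this replacement rule follows; the keying of responses by participant, question, condition, and measurement session is our own illustrative layout, not the study’s actual data structure:

    # Replace a missing response with the same participant's response to the
    # same question, in the same condition, from the other measurement session.
    def replace_missing(responses):
        filled = dict(responses)
        for (participant, question, condition, session), value in responses.items():
            if value is None:
                twin = (participant, question, condition, 3 - session)
                filled[(participant, question, condition, session)] = responses.get(twin)
        return filled

    responses = {("p1", 1, "virre", 1): None, ("p1", 1, "virre", 2): 7}
    print(replace_missing(responses))  # the session-1 gap is filled with 7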

Inter-rater Reliability

The intraclass correlation coefficient between the raters was .79, with a 95% confidence interval of .60 to .89.



Number of Words

Table 1 shows that students in the Virre condition responded with 3.39 more words per response than in the written response condition. The ANOVA results in Table 2 indicate that this difference (response condition) was statistically significant, F(1, 280) = 39.01, p < .001. For the difference between conditions, r_contrast was .35; the confidence interval (see Table 2) indicates that the plausible values of the parameter lie between .22 and .48.

Number of Unique Clauses

In this sample, there were, on average, 39 more unique clauses per 100 responses in the Virre condition than in the written response condition (see Table 1). The ANOVA results in Table 3 indicate that this difference (response condition) was statistically significant, F(1, 280) = 27.91, p < .001. For the contrast between response conditions, r_contrast was .30, with a 95% confidence interval of .17 to .43.

Response Avoidance

Regardless of whether participants were included as fixed factors, the odds of a student avoiding a response were significantly greater in the Virre condition than in the written response condition. Table 4 shows the crosstabulation of response avoidance by response condition. Not using participants as a fixed factor, the odds of a student avoiding a response were 2.56 times greater in the Virre condition than in the written response condition.

Taking a logistic regression approach and including participants as a fixed factor, the odds ratio was 4.26, with 99% lower and upper confidence bounds of 1.47 and 12.66. This odds ratio corresponds to an r_besd of .35, in the written response direction, with 99% confidence bounds of .22 and .48.

Table 1. Descriptive Statistics for Number of Words and Unique Clauses per Response Condition

    Dependent variable   Response condition   Mean   SD     Std. error   99% CI (lower, upper)
    Words (a)            Virre                7.53   7.62   0.05         6.53, 8.52
                         Written              4.14   4.24   0.05         3.14, 5.13
    Unique clauses (b)   Virre                1.58   0.95   0.39         1.45, 1.70
                         Written              1.22   0.47   0.39         1.10, 1.34

a. Difference of the means between response conditions (99% confidence interval of difference): 3.39 (1.98, 4.79).
b. Difference of the means between response conditions (99% confidence interval of difference): 0.39 (0.18, 0.53).

Table 2. ANOVA Table for Words per Response Condition

    Source                   Type III sum of squares   df    Mean square   F        p
    Corrected model (a)      6,413                     39    164.45        6.99     < .001
    Intercept                10,881                    1     10,881.11     462.36   < .001
    Participant              3,520                     19    185.24        7.87     < .001
    Response condition (b)   918                       1     918.01        39.01    < .001
    Interaction (c)          1,976                     19    103.99        4.42     < .001
    Error                    6,590                     280   23.53
    Total                    23,844                    320
    Corrected total          13,003                    319

a. r² = .49 (adjusted r² = .42). b. r_contrast = .35 (99% confidence interval of contrast = .22, .48). c. The interaction is between response condition and participant.

Table 3. ANOVA Table for Unique Clauses per Response Condition

    Source                   Type III sum of squares   df    Mean square   F          p
    Corrected model (a)      86.72                     39    2.22          6.11       < .001
    Intercept                624.40                    1     624.40        1,716.15   < .001
    Participant              45.16                     19    2.38          6.50       < .001
    Response condition (b)   10.15                     1     10.15         27.91      < .001
    Interaction (c)          31.41                     19    1.65          4.54       < .001
    Error                    101.88                    280   0.36
    Total                    813.00                    320
    Corrected total          188.60                    319

a. r² = .46 (adjusted r² = .38). b. r_contrast = .30 (95% confidence interval of contrast = .17, .43). c. The interaction is between response condition and participant.

Table 4. Cross Tabulation of Response Avoidance and Response Condition

                                     Response condition
                                     Virre   Written   Total
    Response avoidance   Yes         27      14        41
                         No          110     146       256
    Total                            137     160       297

Note. The odds ratio for this table is 2.56, with Mantel-Haenszel 95% lower and upper bounds of 1.28 and 5.11. Participants are not included as a fixed factor in this estimate. Missing data were not replaced in this analysis.

The logistic regression model’s overall prediction accuracy was 88.9%. The model’s prediction accuracy given that there was no response avoidance was 98.0%; however, prediction accuracy given response avoidance was only 31.7%. The Hosmer-Lemeshow test (with a statistically nonsignificant χ² of 5.3 with 7 df) indicated that the model was appropriate.

Sensitivity Analysis

Table 5 shows the mean differences when no data manipulation was done (i.e., when no data were replaced), when missing data were replaced, and when response avoidances were also treated as missing data. Table 5 indicates that the differences between the data manipulation methods used are negligible. Regardless of the type of data manipulation, r_contrast was .35 for number of words and .30 for number of clauses.

Table 5. Mean Differences under a Variety of Data Manipulations

    Data manipulation                        Variable   Mean difference   99% CI lower   99% CI upper
    No replacement                           Words      3.39              1.98           4.79
    Replace missing                          Words      3.39              2.32           4.46
    Replace missing and response avoidance   Words      3.97              2.49           5.44
    No replacement                           Clauses    0.39              0.18           0.53
    Replace missing                          Clauses    0.36              0.22           0.49
    Replace missing and response avoidance   Clauses    0.42              0.24           0.60

Note. Mean differences are positive in the Virre direction and negative in the written response direction.

Discussion

In summary, these 20 students used markedly more words and more unique clauses in the Virre condition than in the written response condition. There were also significantly more instances of students avoiding responding in the Virre condition than in the written response condition. However, Table 5 shows that even when avoided responses are treated as missing data, the number of words and clauses per response in the Virre condition was still significantly higher than in the written condition.

In terms of our first research hypothesis, we were correct that there would be more words and unique clauses in the Virre condition than in the written response condition. However, contrary to our hypothesis, there were more incidences of response avoidance in the Virre condition than in the written response condition.


In terms of our second research hypothesis, we were wrong on both counts. Our observed results were far outside the range of what we had expected. Figure 3 shows the predicted and observed effect sizes (in terms of r).

[Figure 3 plots effect size estimates, with lower and upper interval bounds, on a vertical axis of r values ranging from -.6 to .6, for five dependent variables: statements (predicted), words (observed), clauses (observed), nonresponse (predicted), and response avoidance (observed).]

Figure 3. Differences Between Predicted and Observed Effect Sizes

Note. pred. = predicted; obs. = observed; Resp. avoid. = response avoidance. For words, clauses, and response avoidance, values of r are positive in the Virre direction. For statements and nonresponse, values of r are positive in the face-to-face interview direction. 95% confidence intervals are shown for predicted variables; 99% confidence intervals are shown for observed variables.

We have several alternative hypotheses for the discrepancy between the expected and observed values:

- The variables that we chose do not correspond directly with the variables in the previous literature. For example, number of words and number of unique ideas may not measure the same construct, in the same way, as number of statements to open-ended questions.
- The analogy that mail responding (or telephone responding in the case of number of statements) is to face-to-face responding as written responding is to Virre responding is strained.
- The discrepancy is an artifact of our self-selecting sample compared to the supposedly random samples selected in the previous literature.
- The discrepancy is due to systematic error in either this study or the meta-analysis of the previous literature, or to a nonsystematic meta-analytic error as discussed in Briggs (2005).
- In terms of the higher number of response avoidances in the Virre condition, we suspect that there was a tacit academic convention that questions must not be avoided in the written response condition, and that no such convention had been established in the Virre condition. Equally, the greater number of words and unique clauses in the Virre condition may have been due to an experimenter effect.

In actuality, we believe that the discrepancy is probably due to a combination of these hypotheses.

Since we took an intra-sample approach in this study, these results do not statistically generalize to the population of students; they generalize to the population of responses that these students could have made. However, the results do at least provide some support for a revised hypothesis that students using Virre will produce more words and unique clauses, but will also tend to ‘hedge’ more on their answers, than when written responding is used. The prevalence of response avoidances in the Virre condition, however, does not drastically change the fact that there were significantly more words and clauses in the Virre condition than in the written response condition.

Assuming that future Virre studies affirm and generalize our finding that students respond with more words and unique clauses, controlling for response avoidances, there are several practical implications for researchers. If Virre can indeed be used as an alternative to face-to-face interviewing, researchers, evaluators, and educators in similar programs can benefit, because Virre, and other computer-assisted self-interviewing tools, are more cost-effective in the long run than face-to-face interviewing (De Leeuw & Nicholls, 1996) and provide more data than written responses, as was shown in this study.

Conclusion

In this study we compared the effects of a computer-assisted self-interviewing tool, called Virre, and written responding on three variables: number of words, number of unique clauses, and response avoidance. Taking an intra-sample approach, we found that students who self-selected into a computer science education program responded to a series of reflection questions with significantly more words and more unique clauses during Virre responding than during written responding. The students were much more likely to use a response avoidance term during Virre responding than in written responding; however, when answers categorized as response avoidances were treated as missing data, the number of words and clauses per response was still much higher in the Virre condition. Although these results do not generalize to the population of K-12 students, they do give preliminary evidence that Virre, and computer-assisted self-interviewing tools like Virre, could yield more data than written responding or face-to-face interviewing. If this finding is supported by further studies, the implication is that researchers, evaluators, and educators could use Virre or Virre-like tools as a supplement or substitute for face-to-face interviewing, which is more expensive and resource-intensive, particularly when large numbers of respondents must be interviewed.

Acknowledgement

This research was supported in part by a grant from the Association for Computing Machinery’s Special Interest Group on Computer Science Education (SIGCSE).

References

Briggs, D. C. (2005). Meta-analysis: A case study. Evaluation Review, 29 (2), 87-127.

De Leeuw, E. D. (1993). Data quality in mail, telephone and face-to-face surveys, Amsterdam: TT-Publikaties (ERIC Document Reproduction Service No. ED 374136).

De Leeuw, E. D., & Nicholls, W. (1996). Technological innovations in data collection: Acceptance, data quality and costs. Sociological Research Online, 1 (4), Retrieved May 4, 2006, from http://www.socresonline.org.uk/cgi-bin/abstract.pl?1/4/leeuw.html.

Dillman, D. A. (1978). Mail and telephone surveys: The total design method, New York: Wiley.

Eronen, P. J., Jormanainen, I., & Virnes, M. (2003). Virtual reflecting tool - Virre. In Proceedings of the Third Finnish/Baltic Sea Conference on Computer Science Education, Finland: Helsinki University of Technology Press, 42-47.

Eronen, P. J., Sutinen, E., Vesisenaho, M., & Virnes, M. (2002a). Kids’ Club as an ICT-based learning laboratory. Informatics in Education, 1 (1), 61-72.

Eronen, P. J., Sutinen, E., Vesisenaho, M., & Virnes, M. (2002b). Kids’ Club - information technology research with kids. In Kommers, P., Petrushkin, V., Kinshuk, & Galeev, I. (Eds.), Proceedings of the IEEE International Conference on Advanced Learning Technologies, Los Alamitos, CA: IEEE Press, 331-333.

Galpin, V. (2002). Women in computing around the world. ACM SIGCSE Bulletin, 34 (2), 94-100.

Gay, G., & Bennington, T. L. (2000). Information technologies in evaluation: Social, moral, epistemological, and practical implications. New Directions for Evaluation, New York: Jossey-Bass.

Hosmer, D. W., & Lemeshow, S. (1989). Applied logistic regression, New York: Wiley.

Kids’ Club (n. d.). Welcome to Kids’ Club web pages, Retrieved May 4, 2006, from http://cs.joensuu.fi/~kidsclub/.

Love, A. (Ed.). (2004). Harnessing technology for evaluation. The Evaluation Exchange, 10 (3), Retrieved May 4, 2006, from http://www.gse.harvard.edu/hfrp/content/eval/issue27/fall2004.pdf.

Means, B., Roschelle, J., Penuel, W., Sabelli, N., & Haertel, G. (2003). Technology’s contribution to teaching and policy: Efficiency, standardization, or transformation? Review of Research in Education, 27, 159-181.

Pressley, M., & McCormick, C. B. (1995). Advanced educational psychology for educators, research, and policymakers, New York: Harper Collins.

Randolph, J. J. (2006). Online appendix to “The Effects of a Computer-Assisted Interview Tool on Data Quality”, Retrieved May 4, 2006, from http://geocities.com/justusrandolph/virre_appendix.pdf.

Randolph, J. J., & Edmondson, R. S. (2005). Using the binomial effect size display (BESD) to present the magnitude of effect sizes to the evaluation audience. Practical Assessment Research & Evaluation, 10 (14), Retrieved May 4, 2006, from http://pareonline.net/getvn.asp?v=10&n=14.

Rosenthal, R., Rosnow, R. L., & Rubin, D. B. (2000). Contrasts and effect sizes in behavioral research: A correlational approach, Cambridge, United Kingdom: Cambridge University Press.

Rossi, P. H., Wright, J. D., & Anderson, A. B. (1983). Sample surveys: History, current practice, and future prospects. In Rossi, P. H., Wright, J. D., & Anderson, A. B. (Eds.), Handbook of survey research, San Diego: Academic Press, 1-20.

Saris, W. E. (1989). A technological revolution in data collection. Quality and Quantity, 23, 333-350.

Saris, W. E. (1991). Computer assisted interviewing (Quantitative applications in the social sciences, no. 80), Newbury Park: Sage.

Shadish, W. R., & Haddock, C. K. (1994). Combining estimates of effect size. In Cooper, H., & Hedges, L. V. (Eds.), The handbook of research synthesis, New York: Russell Sage Foundation, 261-281.

Shaffer, D. W., & Serlin, R. C. (2004). What good are statistics that don’t generalize? Educational Researcher, 33 (9), 14-25.

Smith, T. W. (1987). The art of asking questions - 1936-1985. Public Opinion Quarterly, 51, 95-108.

Teague, J. (1997). A structured review of reasons for the underrepresentation of women in computing. In Proceedings of the 2nd Australian Conference on Computer Science Education, New York: ACM Press, 91-98.

Weisband, S., & Kiesler, S. (1996). Self disclosure on computer forms: Meta-analysis and implications. In Tauber, J. M. (Ed.), Proceedings of the SIGCHI Conference on Human Factors in Computing Systems: Common Ground, New York: ACM Press, 3-10.

Witt, K. J., & Bernstein, S. (1992). Best practices in disk-by-mail surveys. Sawtooth Software Conference Proceedings, Evanston: Sawtooth Software.

Zandan, P., & Frost, L. (1989). Customer satisfaction research using disk-by-mail. Sawtooth Software Conference Proceedings, Evanston: Sawtooth Software.



Breiter, A., & Light, D. (2006). Data for school improvement: Factors for designing effective information systems to support decision-making in schools. Educational Technology & Society, 9 (3), 206-217.

Data for School Improvement: Factors for designing effective information systems to support decision-making in schools

Andreas Breiter
Institute for Information Management Bremen, Am Fallturm 1, 28359 Bremen, Germany
Tel: +49 421 218 7525
Fax: +49 421 218 4894
abreiter@ifib.de

Daniel Light
EDC/Center for Children and Technology, 96 Morton St, New York, NY 10014, USA
Tel: +1 (718) 237-5116
dlight@edc.org

ABSTRACT

National legislation that increased the role of accountability testing has created pressure to use testing data, along with other data, for instructional decision-making. Connected to this push for data-driven decision-making is an increased interest in data delivery systems or Management Information Systems (MIS) in education. But before administrators rush to build data and information systems, we argue for a careful review of existing knowledge about information systems in the education sector in light of what business and organizational research already knows about information systems. We draw on the considerable body of business and organizational research on MIS and a recent educational case study in New York City to introduce a theoretical framework that describes the process from data to decision-making in schools. Our exploration of how schools use information focuses on the potential of new technologies and new ways of analysis to meet the information needs of educators across different levels of the system. We conclude with a discussion of critical factors for the development and implementation of effective information systems for schools: 1) build from the real needs of classroom and building educators; 2) recognize teachers’ wealth of tacit knowledge as a starting point; 3) select appropriate data to include in the information system; 4) effective testing requires close alignment between standards, teaching and testing; 5) educators need professional development on instructional decision-making that considers the role of data; 6) educators need expanded repertoires of instructional strategies; and 7) further research on effective instructional decision-making and IS support is needed.

Keywords

Standards-Based Testing, Decision Support Systems, Data-Driven Decision-Making, Management Information Systems, Assessment Information Systems

Introduction

The shift in the funding and regulatory environment caused by the No Child Left Behind Act (NCLB) has prompted many district and school administrators to think differently about the potential that assessment data and information systems may have to inform instruction and decision-making aimed at raising student achievement. Increasingly, the exploration of how data can inform instructional decisions is a main topic of educational policy (Salpeter, 2004; Secada, 2001), and building Management Information Systems (MIS) is a central concern for many administrators. But before administrators rush to build data and information systems, we argue that educational decision-makers could benefit from a review of relevant work being done in business research and its application to an analysis of early experiences using technology to provide test data to classroom teachers. We draw on the considerable body of business and organizational research on MIS and a recent educational case study in New York City to explore the question of how schools use information systems and the information needs of end users across different levels of the system. The case study examines the implementation of a web-based information system for student standardized test results intended to assist educators’ daily practice. Finally, the paper concludes with a discussion of critical factors for the development and implementation of information systems for schools and the meaning the data can have for school administration.

Support for Decision-Making

Research about using test data to support classroom-level decisions or building-level planning to improve learning is just beginning to emerge. In the U.S., research from different design sites that are piloting educational data systems can be seen in the Quality School Portfolio (QSP) developed at CRESST (Mitchell & Lee, 1998), and in the IBM Reinventing Education data projects in Broward County, Florida (Spielvogel et al., 2001), the Texas Education Agency, and the South Carolina Department of Education (Spielvogel & Pasnik, 1999). Research on the role of data systems and applications in practice is also being done in Minneapolis (Heistad & Spicuzza, 2003), Boston (Sharkey & Murnane, 2003), and San Francisco (Symonds, 2003), and on the implementation of QSP in Milwaukee (Thorn, 2002; Webb, 2002).

Additionally, there is a body of empirical research on the use of school information systems in other countries: a case study in New Zealand (Nolan, Brown, & Graves, 2001), Visscher and Bloemen’s (1999) examination of data systems in Dutch schools, and examinations of experiences with widely used school information systems in Great Britain (Wild, Smith, & Walker, 2001) and in Hong Kong (Fung & Ledesma, 2001). In Germany, most studies focus primarily on test administration and feedback mechanisms (for an overview see, e.g., Kohler & Schrader, 2004; Weinert, 2001). Nevertheless, most studies mainly focus on administrative data for school management; less emphasis is given to the role of data in teachers’ instructional decision-making.
teachers in their instructional decision-making.<br />

Although research on data-support systems is just beginning in the education field, such systems, known either as Management Information Systems (MIS) or Decision Support Systems (DSS), have been a focus of organization and management research since the early 1970s. Except for the work done by Thorn in Milwaukee (Thorn, 2001, 2002, 2003), a study of high schools’ continuous improvement processes (Ingram et al., 2004), and Petrides and Guiney’s (2002) approach to knowledge management in schools, most of the above educational studies use case study methods and do not offer a theoretical model of data-driven decision-making. Thorn used MIS theory to help understand data-support systems in schools. In this paper we follow his lead and present a review of MIS research, using our own research on a New York City educational data system to highlight the application of an MIS framework to education.

Taking into account the increased significance of “information” as a prime resource in management contexts and its importance for the support of decision-making processes, various approaches to information management were developed in the early 1970s. MIS are based on the assumption that the availability of relevant information is a necessary condition for decisions. Simon (1977) suggested three phases of decision-making: intelligence (review the environment, analyze goals, collect data, identify the problem, categorize the problem, assess ownership and responsibility), design (develop alternative courses of action, analyze potential solutions, create a model, test for feasibility, validate results), and choice (acceptability of the solution, building normative models). In all three phases, information has to be provided and/or searched for in different forms and at different levels of aggregation. Essentially, management decisions can be understood as information processing, in which information takes on a strategically important significance for the organization’s development.

In the early 1970s, Gorry and Scott Morton (1971) recounted, in an empirical study, that the first MIS failed mainly because they were based on a flawed understanding of managerial work and a fundamental misunderstanding regarding the necessary information. The predominant perspective at that time, which is still common today among many system developers, was based on a naïve understanding of decision-making processes: it was thought that decisions could be fully rationalized, and it took some years to understand the “bounded rationality” of decision-makers (Simon, 1977). Built on this conception, company-wide projects collected data from every department and sub-department to be stored in a central data bank, which managers could use to make the “right” decisions. Through extensive case studies, Gorry and Scott Morton (1971) showed that only a small portion of the collected data was relevant for decision-making and that most of the data was worthless for decision-making.

Ackoff (1989) elaborates on the fundamentally flawed assumptions that accompanied the development of MIS. The preconceptions underlying the design of early MIS presupposed that the crucial need of management was the availability of all relevant information. The early designers failed to realize that good management requires the reduction of irrelevant information in order to focus on relevant information when making a decision. The early MIS overloaded decision-makers with extraneous and irrelevant information, which merely complicated their task of selecting out the relevant data. This mistaken assumption was exacerbated by actual managers who, when asked in the abstract, would usually request all possible information and then get lost in data overload. It is important to bear in mind, however, that defining relevant information is often hard for managers, as they usually act with unclear information. Feldman and March (1988), too, doubt that the provision of requested information is sufficient to make good decisions. Their assumption is that, often, the necessary information can only be identified when the decisions have already been made. Even when essential information is already available to decision-makers, it is frequently ignored during the decision-making process.



Today, the MIS literature has moved forward to what is called “business intelligence.” The underlying assumption remains that sufficiently powerful computers would remove the problems of user-friendly data analysis. The latest approaches, such as data warehousing, integrate multiple databases and promise that special algorithms (“data mining”) will uncover hitherto unknown connections in big databases (see, e.g., Laudon & Laudon, 2002; Marakas, 2003). But even these technologically complex systems do not meet the demands of decision-makers who expect more than simple predefined reports. At the end of the 1990s, the hardware was still not powerful enough to carry out the analyses, and longitudinal analyses were difficult because only limited historical data were available. OLAP (“On-Line Analytical Processing”) brought the expectation that complex, fast, and user-friendly database queries could now be performed (see, e.g., Connolly & Begg, 2002). Unlike earlier technologies, OLAP facilitates more effective and efficient access to the core data, and the graphic visualization and the user interface have been improved.

In this brief review of research, we have identified a few key points as we move on to think about school information systems:

- Decision-making is a highly complex individual cognitive process influenced by various environmental factors. The classroom may be the example par excellence of an inter-subjective decision-making environment: teachers constantly make decisions which may affect 20 or more children.
- Decision-makers often are not fully cognizant of the specific data they rely on for each decision. Identifying the information needs of the decision-maker is a crucial step in designing an effective information system, although identifying the appropriate information to put into the system is a complex and time-consuming process. When making instructional decisions, teachers may need to consider their students’ intellectual and physical abilities, developmental stages, personalities, and interpersonal skills.
- An effective information system needs to incorporate the logistical elements of time, quantity, quality, and access. Schools are challenging organizations to work with: they lack professional staff for data processing and distribution, and teachers are often isolated in their classrooms, unable to leave the classroom to retrieve information.

Typology of School Information Systems

School information systems constitute a clear sub-group of management information systems that are used in educational organizations. In schools, distinct information systems support different types of decisions: administrative information systems, learning management systems, and assessment information systems. In principle, we must distinguish between systems that directly support the teaching and learning process and systems that serve administration and instructional decisions (see Figure 1). An often-cited definition of a school information system is given by Visscher: “[A]n information system based on one or more computers, consisting of a data bank and one or more computer applications which altogether enable the computer-supported storage, manipulation, retrieval, and distribution of data to support school management” (Visscher, 2001, p. 4). But this emphasizes only the aspect of administrative support.

Figure 1: Typology of school information systems



Administrative information systems span the whole area of basic data – from addresses to scheduling and timetables, to accounting and financial planning. Usually, school management must work with several systems for different purposes that are compatible only in limited ways.

Learning management systems, or learning content management systems, aim at the direct support of the learning and teaching process (e-learning). Here, either learning processes are controlled individually or material is made available to students and teachers.

Independent of the content management systems are assessment information systems (AIS), which make data about student test performance available. The assessment information can come from standardized tests, classroom-based assessments (i.e., teacher designed), or from portfolios of student work. The systematic analysis and use of AIS for instruction and school development will be a defining topic for schools and their management in the coming years. Stronger accountability policies as well as school improvement processes will require the collection, analysis, and reporting of data. Many systems are being combined into “data warehouses,” since the database integration of multiple sources allows for multidimensional analyses and reduces the expense of inquiry and maintenance.

We can learn from the empirical research on MIS that decision-makers at different levels of the school system have different information needs, which first have to be identified and analyzed. The aspect of information logistics in the design of MIS raises the question of how accurate information can be provided to the right persons at the right time. In the school system, four action levels with specific actor groups can be distinguished, each with different information needs (see Table 1).

Table 1: Model of levels of information needs in schools

Level | Stakeholders | Information needs
Classroom | Teachers, students | Disaggregated student data; grades and test scores / portfolios; tracking of attendance / suspensions
School | Principal, administrators | Aggregated longitudinal student data (i.e., by class, by subject); grades and test scores; tracking of attendance / suspensions; aggregated longitudinal administrative data; coordination of class scheduling; special education and special programs scheduling; allocation of human resources; professional development; finance and budgeting
District | Superintendent, administrators | Aggregated longitudinal student data (i.e., by building, by grade); aggregated longitudinal administrative data (i.e., by building, by grade); external data reporting requirements
School environment | Parents, local community | Disaggregated student data (parents); aggregated administrative data (local community)
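A schematic sketch of Table 1 as a routing structure (purely illustrative; the level and need labels are taken from the table, everything else is a hypothetical stand-in) shows how an information system could direct the right information to the right level:

```python
# Illustrative sketch only: Table 1 as a lookup structure that a report
# generator could use to route information to each action level.
INFORMATION_NEEDS = {
    "classroom": {
        "stakeholders": ["teachers", "students"],
        "needs": ["disaggregated student data", "grades and test scores",
                  "attendance / suspension tracking"],
    },
    "school": {
        "stakeholders": ["principal", "administrators"],
        "needs": ["aggregated longitudinal student data",
                  "class scheduling", "finance and budgeting"],
    },
    "district": {
        "stakeholders": ["superintendent", "administrators"],
        "needs": ["aggregated longitudinal data", "external reporting"],
    },
    "school environment": {
        "stakeholders": ["parents", "local community"],
        "needs": ["disaggregated student data",
                  "aggregated administrative data"],
    },
}

def needs_for(level: str) -> list[str]:
    """Return the information needs registered for a given action level."""
    return INFORMATION_NEEDS[level]["needs"]

print(needs_for("district"))
```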

Data-to-Knowledge Process Model

Most theories of information management draw distinctions among data, information, and knowledge. For example, knowledge is regarded in the management literature as being embedded in people, and knowledge creation occurs in the process of social interaction about information (e.g., Brown & Duguid, 2000; Sveiby, 1997). Decision support systems have to be embedded in a larger context, which is often described as knowledge management (e.g., Nonaka & Takeuchi, 1995) or organizational learning (Argyris & Schoen, 1978; Bhatt & Zaveri, 2001). The terms 'information' and 'knowledge' are often used as though they were interchangeable, when in practice their management requires very different processes. At the core of knowledge management is the idea of systematizing and categorizing the variety of data and providing the resulting product in an appropriate form using information technology. This, then, supports organizational learning (e.g., Alavi & Leidner, 2001; Earl, 2001). This perspective is supported by Nonaka and Takeuchi (1995): "information is a flow of messages, while knowledge is created by that very flow of information anchored in the beliefs and commitment of its holder. This […] emphasizes that knowledge is essentially related to human action". Likewise, Drucker (1989) claims that "[…] knowledge is information that changes something or somebody - either by becoming grounds for actions, or by making an individual (or an institution) capable of different or more effective action". Therefore, data, prior to becoming information, is in a raw state and is not connected in a meaningful way to a context or situation.

Borrowing from Ackoff's (1989) work in the field of organization and management theory, we adapted a simplified version of Ackoff's conceptual framework linking data, information, and knowledge (Breiter & Light, 2004a, 2004b). Within the framework, there are three "phases" of a continuum that begins with raw data and ends with meaningful knowledge that is used to make decisions:

Data exist in a raw state. They do not have meaning in themselves and can therefore exist in any form, usable or not. Whether or not data become information depends on the understanding of the person looking at the data.

Information is data that is given meaning when connected to a context. It is data used to comprehend and organize our environment, unveiling an understanding of relations between data and context. Alone, however, it does not carry any implications for future action.

Knowledge is the collection of information deemed useful, and eventually used to guide action. Knowledge is created through a sequential process. In relation to test information, the teacher's ability to see connections between students' scores on different item-skill analyses and her classroom instruction, and then to act on them, represents knowledge.

Figure 2: The process of transforming data into knowledge (see also Light, Wexlar, and Heinze, 2004)

The literature identifies six broad steps (see Figure 2) through which a person goes in order to transform data into knowledge (Ackoff, 1989; Drucker, 1989). The process entails collecting and organizing data, along with summarizing, analyzing, and synthesizing information prior to acting (making a decision). This is the process through which raw data are made meaningful by being related to the context or situation that produced them; consequently, human action underlies all decision-making. This sequential process underlies our understanding of how teachers interact with data. A schematic sketch of the six steps follows.
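As an illustration only (not the authors' implementation), the six steps can be read as a pipeline of operations on raw scores; the step names come from Figure 2, and everything else is a hypothetical stand-in:

```python
# Illustrative sketch: the six data-to-knowledge steps as a pipeline.
from statistics import mean

def collect(sources):            # 1. collect raw data from several sources
    return [score for source in sources for score in source]

def organize(scores):            # 2. organize the data (here: sort)
    return sorted(scores)

def summarize(scores):           # 3. summarize the data into information
    return {"n": len(scores), "mean": mean(scores), "min": min(scores)}

def analyze(summary, cutoff):    # 4. analyze against a context (a cut score)
    return {"below_cutoff": summary["min"] < cutoff, **summary}

def synthesize(analysis):        # 5. synthesize a candidate judgment
    return "reteach skill" if analysis["below_cutoff"] else "proceed"

def decide(judgment):            # 6. act / make the decision
    print(f"Decision: {judgment}")

raw = [[655, 702], [610]]        # hypothetical scores from two sources
decide(synthesize(analyze(summarize(organize(collect(raw))), cutoff=650)))
```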

We can use the example of test scores to illustrate the theoretical assumptions that distinguish data, information, and knowledge as presented above. The number 655 by itself means little. Test data become information with the realization that the number indicates a measure of student test performance and the level of that performance. A scale score of 655 on the third-grade Language Arts test places that child below proficiency, at Level 2 of the New York State Performance Levels. In relation to test scores, knowledge comes when the information can guide practice. The school must help the child achieve proficiency; therefore, a Level 2 student might be enrolled in a supplemental program or placed in cooperative groups with higher-level students (Level 3 or 4). Furthermore, 655 is right at the cut-off to Level 3, which provides even more knowledge about the significance of this child's score for the school's overall accountability process. Since schools must show a certain percentage of students moving from Level 2 up to Level 3, a child whose score is right below the cut-off is easier to move up than students scoring far below it. Achieving a certain level of knowledge about the test results is thus an important step in the decision-making process (a schematic sketch of this classification follows).
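A minimal sketch of the worked example, not the actual New York State scoring tables: the cut scores below are hypothetical and chosen only so that 655 falls in Level 2, just below the Level 3 cut-off, as described above.

```python
# Hypothetical cut scores for illustration only.
LEVEL_CUTOFFS = [(730, 4), (658, 3), (600, 2)]

def performance_level(scale_score: int) -> int:
    """Map a scale score to a performance level (Level 1 is the floor)."""
    for cutoff, level in LEVEL_CUTOFFS:
        if scale_score >= cutoff:
            return level
    return 1

def near_next_cutoff(scale_score: int, margin: int = 5) -> bool:
    """Flag students scoring within `margin` points below a higher level."""
    return any(0 < cutoff - scale_score <= margin for cutoff, _ in LEVEL_CUTOFFS)

print(performance_level(655))   # -> 2 (below proficiency)
print(near_next_cutoff(655))    # -> True: a priority for moving up to Level 3
```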



Case study: New York City Department of Education and the Grow Network Data Reports

This paper draws on a research project on the implementation of an assessment information system in a large U.S. city. The research project Linking Data and Learning was funded by the Carnegie Corporation (Light et al., 2005). The New York City Department of Education (NYCDOE) is the largest education system in the USA, with over a million students and 80,000 teachers. In 2001, NYCDOE contracted with the Grow Network to provide a data-driven decision-making tool at the third- through eighth-grade levels, where there are 30,000 teachers, 5,000 district and school instructional leaders, and 1,200 schools serving approximately 400,000 students. This represented an unprecedented effort to use standardized assessment data linked with supporting teaching resources to improve the quality of educational decision-making across multiple levels of the school system.

A number of factors in the larger context of New York City are often perceived as obstacles to improving student achievement. Among the total student population there are 423,694 children supported by public assistance, 153,134 receiving special education services, and 127,099 English Language Learners (for all statistics see http://nycenet.edu/Offices/Stats). Other key complicating factors include:
• High levels of concentrated poverty, homelessness, isolation, and addiction.
• High rates of teacher turnover, creating a constant cycle of novice teachers in need of training, mentoring, and support regarding content knowledge.
• High turnover of school leaders: 50% of school leaders hold their position for less than three years.
• Cultural disconnect: given the diversity of the school population and the difficulty of communicating educational philosophies effectively, addressing the learning needs of many different students and families is a challenge.
• Basic issues of ongoing maintenance and technical support are still concerns; Internet connectivity was still an obstacle in some districts.

NYCDOE introduced a system-wide data-support tool for its schools with the help of the Grow Network company in 2001. The goal of Grow Network's NYCDOE Data Reports was to use paper and online reports to present relevant standardized test results to teachers, principals, and parents, with specific recommendations for responsive action. Targeting students in grades 3-8, the objective was to use standardized assessment data, coupled with supporting resources and professional development, to improve the quality of instructional practice and student outcomes. By the end of the study, the Grow Network delivered four different Data Reports reflecting different views for different target groups (see Table 2).

Table 2: Views of Grow Network Data Reports

View | Information reported | Target group
School | Aggregated data, divided by subjects and grades | Principal, district, local community
Class | Students' test results, divided by subjects and by item-skills | Teachers, staff developers
Subject | Aggregated data, divided by grades | Teachers, subject coordinators
Students | Test data, divided by subjects | Students, teachers, parents

The Grow Network does a substantial amount of cleaning and manipulation to prepare the data for the Reports. Students are grouped by their current class, grade, and school. They are grouped not just by score, but also according to the New York State standards across four levels, ranging from Far Below Standards (Level 1) to Far Above Standards (Level 4). Furthermore, the Grow Reports also present educators with student results on sub-sections of the test. For example, the teacher's Grow Report groups students in accordance with the state performance standards and provides an overview of class-wide priorities. For administrators, the reports provide an overview of the school and present class- and teacher-level data. For parents, the reports explain the goals of the test, how their child performed, and what parents can do to help their child improve their score. (A schematic sketch of this grouping follows.)
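Purely as an illustration of the grouping described above (the column names and values are hypothetical, not the Grow Network's actual data model):

```python
# Illustrative sketch: grouping student results by performance level and
# by tested sub-skill, as the Grow Reports present them to teachers.
import pandas as pd

results = pd.DataFrame({
    "student": ["Ana", "Ben", "Cleo", "Dev"],
    "class":   ["3A", "3A", "3B", "3B"],
    "level":   [2, 3, 1, 4],           # NY State performance levels 1-4
    "skill":   ["inference", "inference", "vocabulary", "vocabulary"],
    "correct_pct": [48, 71, 35, 92],
})

# Teacher view: which students sit at which performance level.
by_level = results.groupby("level")["student"].apply(list)

# Class-wide priorities: the sub-skills with the weakest average results.
priorities = (results.groupby("skill")["correct_pct"]
                     .mean()
                     .sort_values())

print(by_level)
print(priorities.head(1))  # weakest skill first
```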

Soon after the beginning of the school year, educators receive a paper report with disaggregated data about their current students, and they get a password-protected online account with access to complete test data for every student, broken down by test areas. The online information system allows teachers to find and compare individual student data. Their class is ranked and grouped according to each sub-test and skill item.

In addition to the test reports, the Grow Network website supports teachers' analysis and instructional decision-making with two additional features. First, the website provides information on the state standards and definitions of the tested skills and concepts. Administrators and teachers often cited these materials as an important component of the information provided. Second, the reported data is also linked to instructional materials and resources for teachers and administrators that suggest activities and teaching strategies to promote standards-based learning in the classroom. The reports also link to external resources approved by NYCDOE. The goal is to help teachers collect more course material and rethink their classroom organization.

While most parts of the system rely on existing data from other sources, the analytical component offers aggregates of data (by subject and by item-skills). Additionally, the system suggests different models for classroom organization (e.g., heterogeneous or homogeneous grouping, additional material according to standards).

Research Methodology

The research project used a mixture of qualitative and quantitative methodologies. At the building and district level, the qualitative components focused on understanding how educators (from classroom teachers to district administrators) understood the data presented in the Grow Reports and how they used that information to generate knowledge that could inform their decision-making. At the highest levels of the system, the findings are based on structured interviews with 47 educational leaders, including central office stakeholders, superintendents, deputy superintendents, math coordinators, English Language Arts (ELA) coordinators, staff developers, district liaisons, technology coordinators, and directors of research and curriculum. Additional interviews were conducted with representatives from non-governmental organizations working closely with the New York City schools on issues such as educational reform and professional development.

The research team also conducted ethnographic research in 15 schools across four school districts in New York City that represented various neighborhoods, student populations, and overall performance levels. This methodology is increasingly used in information systems research because hidden expectations, beliefs, and usability concerns can be identified (e.g., Myers, 1999; Trauth, 2001). Research in the 15 schools produced 45 semi-structured and open-ended interviews with principals, assistant principals, staff developers, and teachers, and observations of ten grade-wide meetings and/or professional development workshops. To further explore the ways in which teachers think about using assessment information, the team conducted structured interviews using sample Grow Reports with 31 teachers in grades four, six, and eight. The qualitative data was used to design two separate surveys, for teachers and for administrators, exploring how educators interpret data and conceptualize the use of the Grow Reports for instructional planning, and the types of supports needed to fully leverage the use of data to improve instruction.

Use by Administrators

Linking Data and Learning found that administrators' uses of the Grow Reports could be grouped into four main categories. Depending upon the administrator's position (e.g., superintendent for curriculum, district math coordinator, school principal, or staff developer), however, they generated knowledge from the data organized by the Grow Reports and implemented their decisions in the school or district in slightly different ways.

• Identifying areas of need and targeting resources: Administrators explained that the Grow Reports helped them to identify class-, grade-, and school-wide strengths and weaknesses that could then be used to make decisions about planning, shaping professional development activities, and examining student performance and demographics. As one superintendent explained, "Grow allows you to combine test results and longitudinal analysis to diagnose a school's strengths. This helps make decisions about professional development and resource allocation" (Light et al., 2004, p. 41).
• Planning: Once administrators identified which students, teachers, and resources they wanted to target, the data helped them to focus school or district planning activities. Administrators explained that they used the data in the Grow Reports to set school and district priorities and to plan instructional programs. However, administrators do not look to test data exclusively to make decisions, because it is based on a single assessment. A superintendent reflected that "You have to take into consideration how valid or current the data is when you are using the previous year's data. This is why you have to compare Grow with all of the other data sources" (Light et al., 2004, p. 42).
• Supporting conversations: Administrators considered that the reports and test data helped frame important conversations across the school system about student learning, professional development needs, and school-wide challenges. The reports provoked discussion about the role of testing in teaching and learning, as well as the (mis)alignments among standards, teaching, and assessment. Specifically, the reports raised issues regarding standardized tests as diagnostics. Some respondents spoke of using standardized test data to move principals', teachers', and parents' conversations about the determining factors of student achievement away from issues of socio-economic status or student behavior and toward a specific focus on students' strengths and weaknesses.
• Shaping professional development: The Grow Reports were used both as the focus of professional development activities (typically in a workshop format) and to make decisions about other professional development activities, such as helping teachers to create differentiated instructional activities or to learn about school- or district-wide standards and goals through their close alignment to the Grow Reports.

Use by Teachers

Generally, the teachers considered the Grow Reports to be easy to read and informative; they felt the reports were self-explanatory. They compared them very favorably to all prior reporting formats they had seen. In particular, those teachers who had already been using data as a planning tool felt the reports were a substantial improvement and a time-saving tool. The interviews with teachers also uncovered a core of "practitioner knowledge" about using data, which is what enabled them to use the Grow Reports to inform their instructional decisions. Teachers do not often have a strong background in psychometrics, but they touched on many measurement and assessment issues and raised a number of concerns, although not phrased in psychometric terms. Some teachers raised concerns about the ways the data might have been transformed as it was turned into the Grow Report. Nearly all teachers touched on issues of validity and reliability in some way. Around validity, teachers had concerns about students who test poorly, or who were having a "bad day" when they took the test. Teachers also raised concerns about students who test well (whose higher scores do not truly reflect their ability) and who would be denied needed academic support because of it. Teachers also had reliability concerns linked to the difference between what they see as life skills and deeper learning, and the skills measured by the discrete, de-contextualized items on the test.

Teachers' concerns about reliability and validity were mediated by two closely interconnected factors: the level at which decisions were being made, and the other "information" available. First, when referring to classroom-level decisions made by teachers, the interviewed teachers expressed less concern. At the classroom level, teachers understood that the test result, as a single measure, was insufficient for decision-making; therefore, they always balanced the test data with other perspectives. Teachers have multiple sources of information about their students: teacher assessments, authentic assessments, observations, conversations, and shared experience. Teachers have a wealth of what sociologists of knowledge have called "tacit knowledge" (Polanyi, 1966) that they use in conjunction with test data. Teachers' concerns about the validity of the test data increased when they spoke about decisions made at higher levels of the education system. They felt it was unwise to base student graduation or promotion decisions on one score. Their concerns were connected to the lack of other relevant information (i.e., insufficient data) available to decision-makers at higher levels.

The teachers' synthesis of information from the Grow Reports into their understanding of the classroom offered a springboard into a conversation about instructional decision-making. In the interviews, teachers reported using the Grow Reports in myriad ways to meet their own varied pedagogical needs, as well as their diverse students' academic needs. We grouped those 'areas of instructional practice' into the following five categories:

• Targeting instruction: When asked in interviews, several teachers and school administrators reported that they used the Grow Reports and instructional resources when doing broad-level planning, such as setting class priorities, creating a pacing calendar, or doing weekly or yearly lesson plans. Teachers reported that the Grow Reports might help them decide which standards to address and which skills to teach in daily lesson plans, mini-lessons, and even year-long pacing calendars. Many said that analyzing the information presented in the Grow Reports helped show them where their overall class's strengths and weaknesses lay.
• Meeting the needs of diverse learners: Most teachers agreed that because the data represented on the Grow Reports reveal that individual students perform at different levels, the tool helped them to differentiate instruction. Noting that the Grow Reports provided them with more information about student test performance than they had access to previously, teachers said that they looked at the data mostly to know "where their students are." Classroom teachers commonly interpreted the test results in relative terms of their students' strengths and weaknesses. Teachers reported different uses of the Data Report depending on the differentiation strategies they sought to implement. Teachers said that ideas for individualized instruction to meet student needs were sometimes derived from the reports, e.g., by modifying lesson plans, by providing different materials so that students have multiple entry points into the content, by varying homework and assignments, and/or by teaching in small groups or one-on-one.
• Supporting conversations: Most of the teachers interviewed said the Grow Reports helped bridge discussions about student learning. The teachers talked about using the Grow Reports in conversations with teachers, parents, administrators, and students, both as a starting point for conversations and as something "concrete" to show parents, administrators, other teachers, or the students themselves when discussing where the class or the student stood in terms of his or her learning and where he or she needed to go.
• Shaping teachers' professional development: In interviews and surveys, teachers indicated that they used the opportunity to analyze the Grow Reports and their classes' strengths and weaknesses to also reflect upon their own teaching practice. Teachers explained that seeing, for example, that the majority of students scored low on a skill would cause them to reflect upon how they taught that specific skill. Some teachers reported that by looking at the Reports they realized that they weren't even teaching some of the standards and skills on which the students were tested (Light et al., 2004, p. 37). The reports helped teachers align their teaching with what the state standards expect children to know and be able to do.
• Encouraging self-directed learning: Another interesting use that emerged during the fieldwork was the dissemination of the data to students as a way to encourage them to take ownership of their own learning. A small but sizeable group of teachers talked about sharing the Reports (or the data from them) with their students, so that students were not only aware of their performance on the test but were also encouraged to take responsibility for their own academic progress (Light et al., 2004, p. 38).

Rethinking the design of MIS for education

The findings from the research on the Grow Reports offer an illuminating example for other districts that are starting to use test data for accountability and policy-making. From the classroom-level perspective, the Grow Reports can be regarded as a useful data tool, since most teachers interviewed felt the data presented was clear and useful. Despite a noted lack of psychometric sophistication, teachers clearly demonstrated that they could make sense of and use the data in the reports. Educators at the two levels of the school system closest to the students (i.e., the classroom and building levels) felt that student performance results on a single standardized test were not sufficient data to inform their instructional decision-making. The teachers and administrators who were using this data incorporated additional data from other sources to inform their decisions. In particular, teachers relied heavily on information gleaned from their daily interactions with their students. It was this tacit knowledge that enabled educators to use the MIS data effectively. Information technology has become widely available, and its power and capacity are continuously increasing. This school case highlights the following seven factors in how data systems can support educators in making informed decisions:

• Build from the real needs of classroom and building educators. Most systems are built top-down, gathering as much data as possible and only later considering which data constitute relevant information; user-centered approaches would help. The Grow Network built the Grow Reports working closely with the target audience, teachers; so much so that the tool may be more closely tailored to teachers' needs than to administrators'.
• Recognize teachers' wealth of tacit knowledge as a starting point. Teachers using the Reports talked a great deal about their holistic understanding of students' needs and abilities, emphasizing their tacit knowledge as educators and how the data fit into their prior knowledge.
• Select appropriate data to include in the information system. Information systems are often flooded with data, offering more than decision-makers can effectively synthesize and use. The Grow Reports present one specific and crucial data point: the standardized test scores. Because of the demanding accountability context in the United States, this is significant data for most teachers and administrators, regardless of their concerns about the data's reliability and validity.
• Effective testing requires close alignment between state standards, teaching, and testing. Alignment between what state standards expect students to know, what teachers are teaching, and what the tests measure is often missing. The data tool can support a discussion about alignment by making the connections (or disconnections) explicit.
• Educators need professional development on instructional decision-making that considers the role of data. As U.S. teachers have long experience with standardized testing and state standards, they did not need specific professional development on standardized tests and standards. However, teachers working in systems that have only recently introduced standardized tests may need professional development on understanding the tests themselves.
• Educators need expanded repertoires of instructional strategies. Parallel to the need for professional development, it is also important to provide access to resources, teaching materials, and information about good practices. Although the data help teachers identify the need to differentiate instruction, teachers often need an expanded repertoire of strategies and practices to respond to the needs of individual children or groups of students.



• Further research on effective instructional decision-making is needed. Even if information system support for decision-making is highly advanced, we will not be able to tell whether the decisions made are better than before; this depends on the context and is not reproducible. Neither the impact of the Grow Reports nor the extent of their use in the district has yet been evaluated.

Conclusion

We started this article by connecting research on MIS in corporate organizations to data-driven decision-making in schools, and we have found that research on the Grow Reports offers a school case that can illuminate MIS theory. Data-driven decision-making will be an important task for school administrators and teachers in the future, and more countries will follow the U.S. example. Our research on the Grow Reports suggests that the process of designing decision support systems has to be turned upside down, from decision-making to data selection. Instead of starting with the available data, designers need to start from the source of the data: the students and their learning needs. The design needs to start from the information needs for decision support of different stakeholders, which are defined by their different relationships to the students. For example, the information needs of a third-grade teacher planning a reading project are quite different from those of a school principal deciding which professional development programs to offer her teachers. From that premise, MIS designers should then consider which presentation formats and representations of data are relevant to meet those different needs. From this knowledge, an information system can then be built that houses the decision-relevant data that will help educators more effectively guide students' learning.

References

Ackoff, R. L. (1989). From Data to Wisdom. Journal of Applied Systems Analysis, 16, 3-9.

Alavi, M., & Leidner, D. E. (2001). Knowledge Management and Knowledge Management Systems: Conceptual Foundations and Research Issues. Management Information Systems Quarterly, 25(1), 107-136.

Argyris, C., & Schoen, D. A. (1978). Organizational Learning: A Theory of Action Perspective. Reading, MA: Addison-Wesley.

Bhatt, G. D., & Zaveri, J. (2001). The enabling role of decision support systems in organizational learning. Decision Support Systems, 32, 297-309.

Breiter, A., & Light, D. (2004a). Decision Support Systems in Schools – from Data Collection to Decision Making. Paper presented at the Americas Conference on Information Systems (AMCIS), New York, NY.

Breiter, A., & Light, D. (2004b). Teachers in the "intelligent" school – From data collection to decision-making. Paper presented at ED-MEDIA, Lugano, Switzerland.

Brown, J. S., & Duguid, P. (2000). The Social Life of Information. Boston, MA: Harvard Business School Press.

Connolly, T., & Begg, C. (2002). Database Systems: A Practical Approach to Design, Implementation and Management (3rd ed.). Reading, MA: Addison-Wesley.

Drucker, P. F. (1989). The New Realities: In Government and Politics, in Economics and Business, in Society and World View. New York: Harper & Row.

Earl, M. J. (2001). Knowledge Management Strategies: Towards a Taxonomy. Journal of Management Information Systems, 18(1), 215-233.

Feldman, M. S., & March, J. G. (1988). Information in Organizations as Signal and Symbol. In March, J. G. (Ed.), Decisions and Organizations, Oxford: Basil Blackwell, 409-428.

Fung, A. C. W., & Ledesma, J. (2001). SAMS in Hong Kong Schools: A Centrally Developed SIS for Primary and Secondary Schools. In Visscher, A. J., Wild, P., & Fung, A. C. W. (Eds.), Information Technology in Educational Management: Synthesis of Experience, Research and Future Perspectives on Computer-Assisted School Information Systems, Heidelberg: Springer, 39-53.

Gorry, G. A., & Scott Morton, M. S. (1971). A Framework for Management Information Systems. Sloan Management Review, 13(1), 55-70.

Heistad, D., & Spicuzza, R. (2003). Beyond Zip code analyses: What good measurement has to offer and how it can enhance the instructional delivery to all students. Paper presented at the AERA Conference, April, Chicago.

Ingram, D., Seashore Louis, K., & Schroeder, R. R. (2004). Accountability Policies and Teacher Decision Making: Barriers to the Use of Data to Improve Practice. Teachers College Record, 106(6), 1258-1287.

Kohler, B., & Schrader, F.-W. (2004). Ergebnisrückmeldung und Rezeption. Von der externen Evaluation zur Entwicklung von Schule und Unterricht. Themenheft Empirische Pädagogik, 18(1), Landau: Verlag Empirische Pädagogik.

Laudon, K., & Laudon, J. P. (2002). Management Information Systems: Managing the Digital Firm (7th ed.). Englewood Cliffs, NJ: Prentice-Hall.

Light, D., Honey, M., Heinze, J., Brunner, C., Wexlar, D., Mandinach, E., & Fasca, C. (2005). Linking Data and Learning – The Grow Network Study – Summary Report. New York: EDC's Center for Children and Technology.

Light, D., Wexlar, D., & Heinze, J. (2004). How Practitioners Interpret and Link Data to Instruction: Research Findings on New York City Schools' Implementation of the Grow Network. Paper presented at the American Educational Research Association, San Diego, CA.

Marakas, G. M. (2003). Modern Data Warehousing, Mining, and Visualization: Core Concepts. Upper Saddle River, NJ: Pearson.

Mitchell, D., & Lee, J. (1998). Quality school portfolio: Reporting on school goals and student achievement. Paper presented at the CRESST Conference, Los Angeles, CA.

Myers, M. D. (1999). Investigating Information Systems with Ethnographic Research. Communications of the Association for Information Systems, 2(23), 2-17.

Nolan, C. J. P., Brown, M. A., & Graves, B. (2001). MUSAC in New Zealand: From Grass Roots to System-Wide in a Decade. In Visscher, A. J., Wild, P., & Fung, A. C. W. (Eds.), Information Technology in Educational Management: Synthesis of Experience, Research and Future Perspectives on Computer-Assisted School Information Systems, Heidelberg: Springer, 55-75.

Nonaka, I., & Takeuchi, H. (1995). The Knowledge Creating Company. Oxford: Oxford University Press.

Petrides, L. A., & Guiney, S. Z. (2002). Knowledge Management for School Leaders: An Ecological Framework for Thinking Schools. Teachers College Record, 104(8), 1702-1717.

Polanyi, M. (1966). The Tacit Dimension. London: Routledge and Kegan Paul.

Salpeter, J. (2004). Data: Mining with a Mission. Technology and Learning, 24(8), 30-37.

Secada, W. (2001). From the Director. Newsletter of the Comprehensive Center-Region VI, 6, 1-2.

Sharkey, N., & Murnane, R. (2003). Helping K-12 Educators Learn from Student Assessment Data. Paper presented at the AERA, Chicago.

Simon, H. A. (1977). The New Science of Management Decisions (rev. ed.). Englewood Cliffs, NJ: Prentice-Hall.

Spielvogel, B., Brunner, C., Pasnik, S., Keane, J. T., Friedman, W., & Jeffers, L. (2001). IBM Reinventing Education Grant Partnership Initiative: Individual Site Reports. New York: EDC/Center for Children and Technology.

Spielvogel, B., & Pasnik, S. (1999). From the School Room to the State House: Data Warehouse Solutions for Informed Decision-Making in Education. New York: EDC/Center for Children and Technology.

Sveiby, K. E. (1997). The New Organizational Wealth: Managing and Measuring Knowledge-Based Assets. San Francisco, CA: Berrett-Koehler.

Symonds, K. W. (2003). After the Test: How Schools Are Using Data to Close the Achievement Gap. San Francisco, CA: Bay Area School Reform Collaborative.

Thorn, C. A. (2001). Knowledge Management for Educational Information Systems: What Is the State of the Field? Education Policy Analysis Archives, 9(47), retrieved June 30, 2006, from http://epaa.asu.edu/epaa/v9n47/.

Thorn, C. A. (2002). Data Use in the Classroom: The Challenges of Implementing Data-based Decision-making at the School Level. Paper presented at the American Education Research Association Convention, New Orleans, Louisiana.

Thorn, C. A. (2003). Making Decision Support Systems Useful in the Classroom: Designing a Needs Assessment Process. In Selwood, I., Fung, A., & Paturi, T. (Eds.), Information Technology in Educational Management, Amsterdam: Kluwer Academic Publishers.

Trauth, E. M. (2001). Qualitative Research in IS: Issues and Trends. Hershey, PA: IDEA.

Visscher, A. J. (2001). Computer-Assisted School Information Systems: The Concepts, Intended Benefits, and Stages of Development. In Visscher, A. J., Wild, P., & Fung, A. C. W. (Eds.), Information Technology in Educational Management: Synthesis of Experience, Research and Future Perspectives on Computer-Assisted School Information Systems, Norwell, MA: Kluwer, 3-18.

Visscher, A. J., & Bloemen, P. P. M. (1999). Evaluation of the Use of Computer-Assisted Management Information Systems in Dutch Schools. Journal of Research on Computing in Education, 32(1), 172-188.

Webb, N. (2002). Assessment Literacy in a Standards-Based Urban Education Setting. Paper presented at the AERA, New Orleans.

Weinert, F. E. (2001). Leistungsmessung in Schulen. Weinheim: Beltz.

Wild, P., Smith, D., & Walker, J. (2001). Has a Decade of Computerization Made a Difference in School Management? In Nolan, C. J. P., Fung, A. C. W., & Brown, M. A. (Eds.), Pathways to Institutional Improvement with Information Technology in Educational Management, Norwell, MA: Kluwer, 99-120.


Deng, Y.-C., Lin, T., Kinshuk, & Chan, T.-W. (2006). Component Exchange Community: A model of utilizing research components to foster international collaboration. Educational Technology & Society, 9(3), 218-231.

Component Exchange Community: A model of utilizing research components to foster international collaboration

Yi-Chan Deng
Department of Computer Science and Information Engineering, National Central University, Taiwan
ycdeng@cl.ncu.edu.tw

Taiyu Lin and Kinshuk
Advanced Learning Technologies Research Centre, Massey University, Palmerston North, New Zealand
t.lin@massey.ac.nz
kinshuk@ieee.org

Tak-Wai Chan
Research Center of Science and Technology for Learning, National Central University, Taiwan
chan@cl.ncu.edu.tw

ABSTRACT

One-to-one technology-enhanced learning research refers to the design and investigation of learning environments and learning activities where every learner is equipped with at least one portable computing device with wireless capability. G1:1 is an international research community coordinated by a network of laboratories conducting one-to-one technology-enhanced learning research. The concept of a component exchange community emerged as a means of realizing one of the missions of G1:1: speeding up research through the exchange of research components. Possible types of research components include software, subject matter content (learning objects), and methodologies. The community aims at building a platform for fostering international collaboration, providing a novel way for the research work of individuals or laboratories to be accessed by the wider research community and users and, in return, increasing the research impact of that work. A component exchange community also motivates the maintenance of good-quality documentation. This paper describes the concept and model of the component exchange community. Related models are compared and, as an illustration, a scenario of using the component exchange community is given.

KEYWORDS
Component exchange, One-to-one learning technology, G1:1, eBay-like model

Background

One-to-one (1:1) technology-enhanced learning (TEL) is defined as the design and investigation of environments and learning activities in which every learner is equipped with at least one computing device capable of offering a learning experience similar to that of one-teacher-to-one-student interaction. G1:1, pronounced "G one one", is a research community (http://www.g1to1.org) that coordinates research teams conducting significant 1:1 TEL projects. In order to achieve this goal, the scope of the G1:1 initiative is broad; it includes research and development in fields that have the potential to contribute to the realisation of the goal. Currently, six broad fields are identified: hardware, software, digital materials, theories for learning and teaching, activity models, and methodologies.

In order to harness the benefits of introducing mobility into the learning environment (see Hsi, 2003), particular attention is paid to wireless-enabled hardware in the G1:1 initiative. G1:1 was in fact originally initiated by the organizers and some key participants of the first few IEEE International Workshops on Wireless and Mobile Technologies in Education (WMTE).

The hardware devices can be cellular phones, personal digital assistants, or laptops (which already have wireless communication capability), or graphing calculators, electronic dictionaries, Gameboys, and the like, which might be enabled with wireless communication capability one day. Over the next 10 years, it is anticipated that the pace of establishing 1:1 experimental sites, such as 1:1 classrooms at all educational levels or 1:1 informal learning locations such as 1:1 museums, will accelerate. Some of them, such as the One Laptop Per Child project (http://laptop.media.mit.edu) by the Media Lab of the Massachusetts Institute of Technology, are deployment projects.

With the support of wireless-enabled hardware, one-to-one TEL research brings about the concept of a seamless learning space. It indicates the potential that one may learn in various learning scenarios (environments and learning activities). These learning scenarios vary over three dimensions: (1) places: classrooms, campus, outdoors, museums, forests, home, trains, and so forth; (2) the number of people to learn with: none (learning alone), a small group, a whole class and the teacher, or a large online community; (3) learning activity models (or pedagogical models): intelligent tutoring, learning with cognitive tools (e.g., Lego), computer-supported collaborative learning, digital game-based learning, and so forth. "Seamless" refers to the ease and quickness of switching from one learning scenario to another while maintaining the continuity of one's learning. "Space" refers to the potentially huge number of possible learning scenarios generated by varying the three dimensions mentioned above (Chan et al., 2006).

G1:1 is informal, yet it connects to formal organizations such as Kaleidoscope (http://www.noe-kaleidoscope.org), the International Society of the Learning Sciences (ISLS, http://www.isls.org), the Asia-Pacific Society for Computers in Education (APSCE, http://www.apsce.net), Artificial Intelligence in Education (AIED, http://www.ijaied.org), and potentially other organizations in the future, through overlapping membership and shared events. It has evolved over a series of events. Three consecutive workshops were held, one each year from 2002 to 2004. Then, in 2005, there were three G1:1 workshops and six panels at international meetings discussing G1:1. Except for the first one, all G1:1 workshops were held just prior to international meetings.

At the core of G1:1 is a group of researchers who have been working in various research labs and centers in sub-areas of science and technology for learning. They perceive that the implications of the advent of wireless, mobile, and ubiquitous technologies go beyond the investigation of just another genus of new learning technologies: they represent an upcoming wave of technological innovations in both formal and informal learning, which no nation can avoid and in which none can yet claim to lead in research, technology, development, practice, deployment, and so forth (Roschelle & Pea, 2002; Hsi, 2003; Hoppe et al., 2003; Sharples & Beale, 2003; Norris & Soloway, 2004; Roschelle et al., 2005; Liang et al., 2005; Chan et al., 2006). There is an urgent need to put together complementary strengths and combine the insights of different researchers as rapidly as possible to make a greater impact in the global context.

What does G1:1 do? Based on the belief that high-quality research is the basis of every intention and action for impacting 1:1 TEL, G1:1 intends to stimulate and promote high-quality 1:1 research, at least to the extent of actively debating and critiquing what high-quality research looks like, setting (or attempting to set) high standards for research outcomes, and supporting the advancement of understanding.

G1:1 tries to inform advancements of 1:1 TEL, to identify the next set of globally important learning challenges, to establish an international research agenda, to anticipate emerging drivers of educational change created by advanced learning technologies and their wider socio-cultural impact, and to make the impact of research more visible and effective by aligning momentum in the same direction.

G1:1 also attempts to increase international sharing by serving as the key incubator for international collaborative research on designs, methods, strategies, software, and findings, as well as by exploring learning with similar technologies across multiple cultures. In addition, G1:1 is setting up a worldwide network of test beds so that a particular test bed can "see the universe" from local and specific perspectives. This paper is an endeavor towards this mission of G1:1.

The preliminary concept of the Component Exchange Community (CEC) was part of the vision statement of G1:1 documented in late 2003, and a simplified version of that concept, the Component Inventory, was proposed by Mike Sharples and Sherry Hsi at the second G1:1 workshop, held at National Central University two days before the IEEE International Workshop on Wireless and Mobile Technologies in Education (WMTE2004). The concept was implemented in a G1:1 website by Sharples and graduate students at National Central University.

The Component Inventory serves as a portal and collects descriptions of projects conducting 1:1 TEL research. This information-serving role, irrefutably important, is however limited by the similarity of its content to individual projects' websites. The authors believe that the research community can be even better served by a sharing/collaborative model similar to the success-proven eBay model in e-commerce. The Component Exchange Community (CEC) is an embodiment of such a collaborative model. The establishment of CEC also signals that the G1:1 initiative is difficult, if not impossible, to achieve by an individual or a single research centre; it requires the entire research community as a whole.

Objectives and Motivations

The objectives of CEC are the following:
1. Exchanging 1:1 research results, such as 1:1 software, so that they are accessible to the wider G1:1 community;
2. Being the platform to foster 1:1 international collaborations; and
3. Encouraging high-quality and collaborative documentation of useful 1:1 research components.

There are two underlying assumptions behind this effort. The first assumption is that researchers, in the academic world in general and in G1:1 in particular, are eager to share their research results, ideas, and discoveries beyond academic publications. Spreading research results is, in essence, a mission of academia. Traditionally, this is done by publishing books or papers in conferences and journals. The results of learning technology research do not only include ideas and discoveries, but can also be software systems or components. Once the intellectual property has been recognised, the software, or part of it, is ready to be publicized. The software could become an opportunity for collaboration, and hence further its research impact, if it can be used for research purposes by other 1:1 laboratories.

The second assumption is that there is a demand for reusable and/or modifiable research results in the advanced learning technology research community in general and in G1:1 in particular, especially among graduate students. There are several advantages to this endeavour. First, it may reduce the unnecessary effort of re-inventing the wheel, which cannot earn any credit anyway. A laboratory can instead make its efforts more valuable by building upon what other laboratories have already done, or by developing alternative approaches. These research results are possibly the most up-to-date learning materials for other researchers, especially those who have just begun 1:1 research.
begun 1:1 research.<br />

Although many researchers are eager to share their research outcomes, such as software systems, implementation documentation is often poor or even non-existent, because proof of concept has higher priority than continual development or commercialization. The lack of good-quality documentation hinders others from using research components that already exist. Furthermore, many countries/regions, owing to shortages of talent or resources, are in need of better or innovative products to improve their standard of education or living. In addition to this humanitarian benefit, it is a long- and well-established belief, explicitly pointed out by Florida (2005), that "when talented people come together, their collective creativity is not just additive; rather, their interactions multiply and enhance individual productivity".

The rapid growth of weblogs suggests that individual laboratories may be able to publicize their research results, especially those, such as software, that cannot be published in the form of research articles or books. We can imagine an analogous situation in which every 1:1 laboratory is a "factory" producing research products. That factory can "sell" its products through a "window" in the cyber world. "Sell" here may mean "call for using our software", "call for testing our software", "call for collaborating with us", and the like. Thus, unlike most software-sharing websites, which follow an Amazon-like model, CEC adopts an eBay-like model.

The Amazon-like and eBay-like models are discussed subsequently in detail. The eBay-like model is further investigated from three different views: the dimension of the provider, the dimension of the user, and the dimension of the community, as shown in the section "Three dimensions of view to the eBay-like model". Graphical representations depicting the relations among different components, component versions, and component contributors are employed to ease the search for components and to attribute due credit to contributors. These graphs, called the contribution-graph and the pedigree-graph, are then discussed. A section is also presented on the prototype developed for CEC, and an exemplary scenario showing how two different researchers can use CEC to their own benefit is depicted before the Conclusion and Future Work.

Related Works

There have been some centralizing efforts towards global-scale research collaboration based on what we call the Amazon-like model. These include: the National High-Performance Computing and Communications (HPCC) Software Exchange (NHSE), an Internet-accessible resource that promotes the exchange of software and information among those involved with high-performance computing and communications (Browne et al., 1995, 1998); SourceForge.net (http://sourceforge.net), the largest Open Source software development web site (Weiss, 2005); EduForge (https://eduforge.org), an open access environment designed for the sharing of ideas, research outcomes, open content, and open source software for education; and Download.com (http://www.download.com), a popular site for freeware and shareware. All are examples of the Amazon-like model.



Table 1 shows a comparison of these related works. NHSE targets the research community specifically, whereas SourceForge.net and Download.com are general-purpose, and EduForge is aimed at the field of education. NHSE and Download.com collect software on their websites, while SourceForge and EduForge maintain projects in a centralized manner. Most of them use the GPL (General Public License), the LGPL (Lesser General Public License), or freeware/shareware classifications to limit usage. They all follow a centralized model of product or project management.

Table 1: Comparison of related works

System | Target | Methodology | Usage limitation | Model
NHSE | HPCC research purpose | Collect products in central site | GPL, freeware, shareware, … | Centralized product management
SourceForge | General purpose | Register projects in central site | GPL, LGPL, … | Centralized project management
Download.com | General purpose | Collect products in central site | Freeware, shareware, … | Centralized product management
EduForge | Educational purpose | Register projects in central site | GPL, LGPL, … | Centralized project management

Definition of Component

It should be noted that the term "component" as defined here is much less technically specific than in the software engineering literature (Booch, 1994; Szyperski, 1998; Sweller, 1998). In the area of learning technology, a 'component' usually refers to a piece of software, a unit of digital content material (in short, material), or perhaps a piece of hardware (probably together with some communication protocols). The type of component on which CEC focuses differs somewhat from that of industry. In CEC, the emphasis is placed on research components: components that bear value for the research community. Therefore, in addition to the usual types of component, i.e., software, digital material, and hardware, three more types are defined in CEC, namely theory, activity model, and methodology. These six types of components are defined in the following list:

Components of theory: A theory is a logically self-consistent model or framework devised to explain a group of facts or phenomena, especially one that has been repeatedly tested or is widely accepted and can be used to make predictions about natural or social phenomena. In education and psychology, (learning) theories help us understand the process of learning. An example of a component in this category is Social Development Theory (Vygotsky, 1978).

Components of activity models: An activity model is a schematic description of actors and their actions via tools. A schematic description can be less rigorous than one expressed in a formal language such as Educational Modelling Language (Koper & Manderveld, 2004). Examples of activity models include the scaffolding and fading model in cognitive apprenticeship (Collins et al., 1989), the Jigsaw model of cooperative learning (Slavin, 1980), and the reciprocal teaching model of peer tutoring (Palincsar & Brown, 1984; Palincsar et al., 1987).

Components of methodologies: A methodology is a set of working methods consisting of practices, procedures, and rules used by those who work in a discipline or engage in an inquiry. In other words, a methodology is a specific way of performing an operation that implies precise deliverables at the end of each stage of an activity model. An example would be a set of principles and procedures for the design of a scaffolding tool to assist the learning of physics at senior high school level (Pea, 2004).

Components of digital materials: Digital materials are digital files which can be launched by software components. These files are electronic representations of media, such as text, images, sound, assessment objects or any other piece of data that can be rendered by a software component and presented to the learner. More than one digital file can be combined to build other digital files.

Software components: Software components include but are not limited to:
1. Software applications, which are complete and stand-alone programs that can be installed and executed on one or a set of computational hardware devices.
2. Software objects, which are complete source codes or software libraries that implement one or a set of functions and can be reused in developing another program. Software objects can be written in compiled languages, such as object-oriented languages, or in interpreted languages.

Hardware devices: Hardware devices include but are not limited to:



1. Learning devices, which are hardware equipment used to facilitate participation in learning activities. Learning devices can be purpose-specific devices such as response pads, graphic calculators, electronic English dictionaries, and pocket game machines; others can be general-purpose devices such as cellular phones, personal digital assistants (PDAs), WebPads, notebooks and Tablet PCs.
2. Supporting devices, which are instructional equipment used to sustain learning activities, such as EduCart (Deng et al., 2004), digital whiteboards, projectors, learning management servers, and so on.

A one-to-one learning scenario can be composed of a theory, an activity model, a methodology, a set of software components, a set of digital materials, and a genre of hardware devices, as shown in figure 1. Collaborating research teams can contribute different types of components to form a new one-to-one learning scenario.

Figure 1: An example of composing the six types of components

eBay-like Model

In this section, we propose an eBay-like model for component exchange; in other words, a central bulletin and peer contract model. As figure 2 shows, CEC takes the role of a broker to trigger and promote peer-to-peer collaboration. Research teams can submit their research components to CEC; they can also search CEC for collaboration candidates and request the creation of a Component Usage Agreement. The two parties to a Component Usage Agreement are called the Requestor and the Requested.

Figure 2: eBay-like model



Once a candidate component is found, the Requestor submits a request to the Requested, expressing the intention to create a Component Usage Agreement so that the Requestor can use one of the components provided by the Requested in CEC.

The Component Usage Agreement consists of zero or more clauses. Each clause represents a condition that either the Requestor or the Requested wishes for. Both parties are entitled to initiate (add) or modify clauses before the agreement is finalized. Some default clauses are initially available but optional, including the following:

• The Component Owner agrees to provide any missing information or objects that were unintentionally not incorporated in the original Component.
• The Component Owner and the Component User agree to jointly own the copyright of the research data resulting from the use of the Component.
• The Component Owner and the Component User agree to grant each other the right to initiate publication of the research results from the use of the Component.
• The Component Owner and the Component User agree to include the representative(s) from the other party in the author list if the research results from the use of the Component are published.
• At the close of the study, the Component User will provide the Component Owner with a final and complete dataset gathered in the research in which the Component was involved.

The change of terms, from Requestor to Component User and from Requested to Component Owner, takes place when the Component Usage Agreement is finalized.

Depending on which party initiated or last modified it, each clause is categorized as either a Requestor-initiated/modified clause (Requestor-clause) or a Requested-initiated/modified clause (Requested-clause). Each clause has to be agreed to by both parties before the agreement can be finalized; for example, when the Requestor agrees with all Requested-clauses, he/she can finalize the agreement, and the same applies to the Requested with all Requestor-clauses.

An example illustrating some of these clauses is given in the collaboration scenario later in this paper.

Three Dimensions of Viewing CEC

Members of CEC are of two types: providers and users. A provider of a component is a research team that offers the component to the community. A user of a component is either a collaborator (a research team collaborating with the team that provides the component) or a consumer (a teacher, learner, or administrator who uses the component in teaching or learning activities).

This section discusses three dimensions of viewing the eBay-like model: the Dimension of Provider, the Dimension of User and the Dimension of Research Community. Each dimension has three levels of sharing, as described below, giving providers and users in the educational computing research community more choices in how they share or collaborate with others.

Three Levels of Each Dimension

There are three levels in each dimension. The first level is simple and requires the least effort, whereas the third level is the most complex and requires the most effort for sharing and using. There are inevitable conflicts between the needs of providers and users: users always want more resources, but providers may not have the time or resources to supply everything users ask for. Different levels provide more options for both sides.

However, quality control is always an issue for sharing. A contribution labelling and management mechanism has therefore been developed, tailored to the particular needs and nature of the research community. Contributions are labelled by level: level 1 is the simplest and level 3 the most complete.

The component contributor’s self-assessment is the easiest and earliest quality label that can be obtained. Anyone who wishes to share research components with CEC and the rest of the research community can simply refer to table 3 and table 4 to label the provided components. In this manner, the component provider is not required to provide documentation, user support, and so on to the fullest level, which may in fact not be needed by the users.

The rest of the community can also have their say about the quality label of components: a voting mechanism allows users and others to assess the authenticity of the initial data input by the providers.
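As a rough illustration of how such a label could be computed, the hypothetical function below combines the provider's self-assessment with community votes; the function name, the 1-3 level scale, and the majority-vote policy are assumptions for this sketch, not the mechanism actually deployed in CEC:

```php
<?php
// Hypothetical sketch: a component's quality label starts as the provider's
// self-assessment and is revised by community votes (majority wins).
function communityLabel(int $selfAssessedLevel, array $votes): int {
    if (count($votes) === 0) {
        return $selfAssessedLevel; // no votes yet: trust the self-assessment
    }
    $counts = array_count_values($votes); // tally votes per level (1-3)
    arsort($counts);                      // most frequent level first
    return (int) array_key_first($counts);
}

// Example: the provider claims level 3, but most voters rate it level 2.
echo communityLabel(3, [2, 2, 3, 2]); // prints 2
```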

Dimension of Provider

From the viewpoint of the provider, a simple three-tier classification with basic attributes is used to describe the different levels of components and providers (Table 2).

Table 2: Check-list of provider

| | Level 1 | Level 2 | Level 3 |
|---|---|---|---|
| User manual | primitive | complete | complete with case studies |
| Open source code | none | a little | many |
| Consultant | asynchronous (e-mail) | synchronous (telephone, video conferencing) | face-to-face (student/developer exchange) |
| Development document | none | thesis / technical report | detailed |
| Central info updating | none | sometimes | synchronous |
| Release constraints | just one time | by contract | no limitation |

Table 3 gives examples of basic provider requirements for two component types. A correction-type component is at its testing stage (beta version) and requires testing, debugging and improvement. Its provider will supply at least the user manual and central info updating attributes at level 3 (i.e. a comprehensive user manual and full access to updates/patches) to allow users to test the component and obtain the latest component information; the other attributes might still be at level 1.

Table 3: Examples of component type

| | Correction-Type Component | Discovery-Type Component |
|---|---|---|
| User manual | complete with case studies | complete |
| Open source code | none | many |
| Consultant | asynchronous (e-mail) | face-to-face (student/developer exchange) |
| Development document | none | detailed |
| Central info updating | synchronous | sometimes |
| Release constraints | just one time | no limitation |

If the research component is of discovery-type, it is stable, but the provider needs to see how it works and discover its potential applications. The provider will offer the user manual and central info updating attributes at level 2. A level 2 user manual is not expected to include case-study examples, so the user is not confined to the examples in the manual and may discover more creative uses. On the other hand, as the component is quite stable, central info updating does not need to be done frequently. The other attributes, such as open source code, consultant, development document and release constraints, can be set to level 3 to attract more suitable users for collaboration.

From these examples it can be seen that the provider can select different levels for different attributes, depending on the development stage of the component, the effort involved in producing each attribute, and the expected results of his/her research component.

Dimension of User

Similar to the dimension of provider, a simple three-tier classification (Table 4) with basic attributes is used to describe different kinds of users (Table 5). For example, a use-only user might need all attributes at level 1 (none of the attributes is needed). The component will of course be quite limited, but will suffice for the needs of the use-only user.

Table 4: Check-list of user

| | Level 1 | Level 2 | Level 3 |
|---|---|---|---|
| Document | none | error correcting | localization and new case study |
| Code development | none | modification | integration |
| Development human resource | none | student | development team |
| Feedback | none | suggestion | paper |

A feedback-type user, who agrees to provide feedback after using the component, is required to have the feedback and document attributes at level 3 and the other attributes at level 1. A developer-type user, who wishes to enter a co-development relationship with the provider, will need the code development and development human resource attributes at level 3 and the feedback and document attributes at level 2.

Table 5: Examples of user type

| | Use-Only User | Feedback-Type User | Developer-Type User |
|---|---|---|---|
| Document | Level 1 | Level 3 | Level 2 |
| Code development | Level 1 | Level 1 | Level 3 |
| Development human resource | Level 1 | Level 1 | Level 3 |
| Feedback | Level 1 | Level 3 | Level 2 |

These examples show how users can classify themselves and how providers can find their potential collaborators.

Dimension of Research Community

The Dimension of Research Community is the identifying feature of the eBay-like model. Through the community, different kinds of events can be organised to promote research collaboration and further participation among existing members and to attract new members. One of the important roles of the community is as a broker of sharing and exchange. If one uses Google to find a collaborator, one needs to laboriously search, filter, identify, and request permission. The research community, by contrast, offers a formal channel in which users can search for components, advertise their need for certain components, and publicize their own components, all in a contextualised manner.

As opposed to once-off collaboration between two research teams for a single task, CEC employs a contribution-graph and a pedigree-graph to allow multi-team and multi-version collaboration to take place. A detailed discussion of these two graphs is presented in a later section.

The community, as a monitoring body, can prevent aberrant or illegal behaviour by its members, such as virus spreading or commercial advertising; as a managing body, it can provide and manage membership registration and identification; and as a marketing body, it can work with other international agencies or events to promote the community itself.

The Research Community is also an arena for achievement recognition. Excellent components could be selected periodically by experts and a Seal of Recognition given to the contributing research team. This can foster an atmosphere of constructive competition in which the development of higher quality components is encouraged.

The community itself is also self-reinforcing. The more members and appealing components it has, the more new member registrations and component submissions will increase; and the higher the quality of its components, the more likely they are to draw in further high quality research components. This positive snowball effect is the unique feature expected to differentiate a community-based approach from randomly occurring research collaboration in the long run.



eBay-like Model vs. Amazon-like Model

A comparison with the Amazon-like model can help in understanding the eBay-like model further; Table 6 shows the differences between the two. As a centralized model, the Amazon-like model has to face all users and is expected to have its components attain a certain level of quality before they can be exchanged and shared. As a peer-to-peer model, the eBay-like model only needs to stress a clear explanation of the current status of a research product to its user, who is usually a single user at a time, rather than the quality of the product; if the other researcher or research team accepts that quality, collaboration or exchange can proceed. With regard to collaboration, there is little collaboration between Provider and User in the Amazon-like model, whereas in the eBay-like model there is much more collaboration, both between Providers and between Provider and User. The Amazon-like model usually follows various standards in order to appeal to a bigger audience; in the eBay-like model, if the two parties come to an agreement and follow some simple rules, they can exchange and collaborate. The Amazon-like model therefore has many more issues and policies to consider, for example which standard to follow or which development platform to use, whereas the eBay-like model is more concerned with issues such as obtaining the maximum benefit for the two collaborating parties.

Table 6: Comparison of the Amazon-like and eBay-like models

| Amazon-like model | eBay-like model |
|---|---|
| Good quality control | Easy to contribute |
| Less collaboration between users and providers | More collaboration between users and providers |
| Standard rules | Peer consensus and some basic rules |
| More policy issues | More mutual benefit issues |

Contribution-graph and pedigree-graph

Another issue that researchers are generally concerned about is credit, that is, whether the user can recognize and acknowledge the provider. We therefore use a contribution-graph and a pedigree-graph to record the data of every component. Each node on a graph depicts a component, and a link records the change between two components. The contribution-graph records the entire history of a component from its initial node; the pedigree-graph outputs all the links and nodes related to a particular component. A detailed description of the two types of graph follows.

Contribution-graph

The contribution-graph is designed for tracing the history of a component and giving credit to authors. It depicts all the links between components that have a contribution relationship.

An example contribution-graph is shown in figure 3. In the figure, Research Team 1 (RT1) revises component A to make another component, B, and registers it in CEC. RT2 needs component B for use on a PDA, so they modify component B to make another component, C; figure 3(a) shows the resulting contribution-graph for component C. Figure 3(b) describes the contributions of two research teams, RT1 and RT3, who jointly integrate components B and D to make a new component, E, bringing new benefits to both teams.

Pedigree-graph

The pedigree-graph is designed to show the whole picture of a component. It shows not only the contribution links of older generations, but all the links and nodes of the previous, current and following generations. This is expected to help users understand the initial and subsequent developments of a particular component and to find a suitable component or collaborator.

Figure 4 shows the pedigree-graph of component B, discussed in the previous section.
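To make the two graphs concrete, the sketch below encodes the figure 3 scenario in PHP; the array layout and function names are illustrative assumptions rather than the CEC database schema:

```php
<?php
// Illustrative encoding of figure 3: each node is a component; each link
// records which component(s) a new component was derived from and by whom.
$links = [
    ['from' => ['A'],      'to' => 'B', 'by' => 'RT1',     'change' => 'revision'],
    ['from' => ['B'],      'to' => 'C', 'by' => 'RT2',     'change' => 'modified for PDA'],
    ['from' => ['B', 'D'], 'to' => 'E', 'by' => 'RT1+RT3', 'change' => 'integration'],
];

// Contribution-graph view: trace a component's history back to its origins.
function history($component, array $links) {
    $trail = [];
    foreach ($links as $link) {
        if ($link['to'] === $component) {
            $trail[] = $link;
            foreach ($link['from'] as $parent) {
                $trail = array_merge($trail, history($parent, $links));
            }
        }
    }
    return $trail;
}

// Pedigree-graph view: also collect every descendant of a component.
function descendants($component, array $links) {
    $found = [];
    foreach ($links as $link) {
        if (in_array($component, $link['from'], true)) {
            $found[] = $link['to'];
            $found = array_merge($found, descendants($link['to'], $links));
        }
    }
    return array_unique($found);
}

print_r(history('C', $links));     // C's ancestry: B (by RT2), then A (by RT1)
print_r(descendants('B', $links)); // B's descendants: C and E
```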



Figure 3: Examples of contribution-graph

Figure 4: Example of pedigree-graph

Implementation of CEC

A prototype CEC has been implemented and deployed at http://cec.g1to1.org/cec. Technically, the CEC website is developed using PHP and MySQL. The website (Figure 5) embodies the ideas and provides the services presented in this paper.

It should be pointed out that, unlike the software download sites introduced earlier, CEC is membership-based. Membership is granted only to those who have contributed one or more components to CEC, or who are willing to contribute to the improvement of existing components, for example by offering to evaluate a component in a new context. Only members can request to create Component Usage Agreements with other members. This mechanism is designed particularly for the research community: CEC membership, like the credibility of any researcher, is to be valued and built on honest contributions.

Strict protocols are also in place, as conditions of using the services provided, to ensure the quality of operation in CEC.



Figure 5: Website of Component Exchange Community (CEC)

A Scenario of Collaboration

The Center of Learning at National Central University (CL@NCU) submits the Digital Classroom Environment Testbed (Huang et al., 2001; Liu et al., 2003; Liang et al., 2005; Deng et al., 2005) as a component in the CEC system.

A research team “S” from Singapore (S@Singapore) finds that the Digital Classroom Environment Testbed component partially meets their particular research requirements. S@Singapore contacts CL@NCU and makes a request for collaboration. The requirements of S@Singapore are as follows:
• To provide the Datalogger WebPAD interface
• Device control (push/pull/control…)
• Group activity (co-construction of a group project report)
• Experiment data collection
• Formative assessment
• To integrate the Science Project-based Learning System and the Digital Classroom Environment (DCE) to create a new product, namely SCL-DCE.

Both parties then modify the template of the Component Usage Agreement in CEC to finalize the contract. The details of the contract are as follows:
1. CL@NCU agrees to provide any missing information or objects that were unintentionally not incorporated in the DCE.
2. CL@NCU and S@Singapore agree to jointly own the copyright of the research data resulting from the use of the SCL-DCE.
3. CL@NCU and S@Singapore agree to grant each other the right to initiate publication of the research results from the use of the SCL-DCE.
4. CL@NCU and S@Singapore agree to include the representative(s) from the other party in the author list if the research results from the use of the SCL-DCE are published.
5. At the close of the study, S@Singapore agrees to provide CL@NCU with a final and complete dataset gathered in the research in which the SCL-DCE was involved.
6. S@Singapore agrees to send two students to CL@NCU for two months to get the training required for integration.
7. Use of the SCL-DCE is limited to S@Singapore and CL@NCU.



8. A new contract is needed for any business use.

After signing the contract, S@Singapore sends two students to CL@NCU to begin the training program. Two months later, these students take the unfinished integrated product, SCL-DCE, back to S@Singapore. S@Singapore keeps in contact with CL@NCU and continues to finish the integration. S@Singapore also updates the user manuals of the SCL-DCE, while CL@NCU updates the DCE development documents and the SCL-DCE contribution-graph in CEC. S@Singapore then starts experiments with the SCL-DCE. Finally, CL@NCU and S@Singapore co-author a paper to report the results.

Discussion

The contribution of a member in CEC can be recognized through providing (a sort of informal ‘publishing’), enhancing (e.g. documenting, extending, etc.), or testing a component. CEC can thus help trigger more collaboration among international research teams. Through the process, each participating team may strengthen their work and gain more widespread influence through joint publications. A further practical incentive for collaborating research teams is that they may apply for international bilateral collaborative research funding.

One may ask: why not use Google to find a collaborator, as most researchers list their work on their homepages these days? Google can be regarded as an ad-hoc mode in this respect: one may search, filter, identify, and then send emails to potential collaborators. A difficulty of this approach is the lack of vocabulary (keywords) expressing collaboration intentions, such as “document my component,” “call for testing my component,” “call for collaborators,” and the like, when searching for components and collaborators. Even if one finds an interesting component, the collaboration intentions of its provider are not known. CEC, by contrast, is a channel in which component providers and adopters advertise their needs, making their intentions for collaboration and the specific requirements or details of their preferred collaboration modes explicit.

There was debate on the priority of the six types of components in CEC, though no consensus was reached. We believe experience and time with a working CEC will tell. Nevertheless, the G1:1 website provides a web-based Component Inventory with contributions from the Centre for Educational Technology and Distance Learning (CETADL), University of Birmingham; the Center for Learning Technology, National Central University, Taiwan; and The University of Tokushima, Japan. CEC can be regarded as an extension of this Component Inventory that adds explicit collaboration intentions and an active community.

Although it is still unknown whether the ambition of the One Laptop Per Child project, initiated by the Media Lab of the Massachusetts Institute of Technology to provide a $100 laptop to millions of schoolchildren in developing countries (http://laptop.media.mit.edu), can be fulfilled, the $100 laptop hardware itself can be regarded as a hardware device component in CEC. Once a critical mass is reached for a virtuous cycle, in which the momentum gathered by an increasing number of components further develops the community’s impact and attracts more contributors, the vision of an abundance of educational components ready to go with the $100 laptop to benefit those in need becomes possible.

Summary

The Component Exchange Community (CEC) has sprung from the vision of G1:1, which aims to support increased international sharing and coordination of one-to-one learning technology. CEC focuses on learning technology components at their early (research) stage and provides a semi-formal mechanism similar to a marketplace, where sellers (component providers) can give specifications of their products (using Table 3, the provider checklist) and buyers (component requestors/experimenters) can make offers to obtain the products (using Table 4, the user checklist). Idiosyncratic requirements can be added as clauses in the Component Usage Agreement, beyond what is already covered by the provider and user checklists. In this manner, greater flexibility is achieved.

It is the hope of the authors that the establishment of CEC can accelerate global collaborative research on learning technology by achieving the following:
• Accessibility: the research work of individual researchers or research teams becomes more accessible to the global research community and to users;



• Quality: researchers, with a higher possibility of sharing and collaboration in mind, can develop better quality components and documentation;
• Synergy: research components can be creatively assembled; and
• Utilisation: users in need can obtain what they require using non-monetary rewards (e.g. carrying out an experiment of a theory, or testing a beta-version component).

CEC is unique in its three-level exchange model, which provides more options for different contributors’ and users’ needs, and in its contribution-graph, which draws on a family tree analogy: it shows mergers (marriages) of components, and contribution acknowledgements remain visible even after several new versions (generations) of a research component. Its pedigree-graph shows the blood-line of each component. This information is particularly beneficial when two researchers wish to enter an agreement to further develop a specific component.

CEC is an ongoing project based on the earlier Component Inventory proposed by Mike Sharples and Sherry Hsi. The CEC website has already been set up to validate this concept through a process of testing and refinement (http://cec.g1to1.org/cec). The authors would like to invite those who are involved and interested in learning technology to contribute to and use CEC.

Acknowledgement

The authors would like to thank the participants of the G1:1 workshop at WMTE 2005, Japan, for their comments and suggestions. The authors would also like to thank Amy Chen, Emily Ching, Rosa Paredes Juarez, Jitti Niramitranon and Nobuji Saito Vazquez for their valuable input to the various organisational specifications of the CEC.

References

Booch, G. (1994). Object-Oriented Analysis and Design with Applications. Reading, MA: Addison-Wesley.

Browne, S., Dongarra, J., Green, S., Moore, K., Rowan, T., Wade, R., Fox, G., Hawick, K., Kennedy, K., Pool, J., Stevens, R., Olson, B., & Disz, T. (1995). The National HPCC Software Exchange. IEEE Computational Science and Engineering, 2 (2), 62-69.

Browne, S., Dongarra, J., Horner, J., McMahan, P., & Wells, S. (1998). The National HPCC Software Exchange (NHSE). D-Lib Magazine, 4 (5), retrieved June 30, 2006, from http://www.dlib.org/dlib/may98/browne/05browne.html.

Chan, T. W., Roschelle, J., Hsi, S., Kinshuk, Sharples, M., Brown, T., Patton, C., Cherniavsky, J., Pea, R., Norris, C., Soloway, E., Balacheff, N., Scardamalia, M., Dillenbourg, P., Looi, C. K., Milrad, M., Hoppe, U., Nussbaum, M., Mizoguchi, R., Ogata, H., McGreal, R., & G1:1 members (2006). One-to-one Technology Enhanced Learning: An Opportunity for Global Research Collaboration. Research and Practice in Technology Enhanced Learning, 1 (1), 3-29.

Collins, A., Brown, J. S., & Newman, S. E. (1989). Cognitive Apprenticeship: Teaching the crafts of reading, writing and mathematics. In Resnick, L. B. (Ed.), Knowing, Learning and Instruction, Hillsdale, NJ: Lawrence Erlbaum, 453-494.

Deng, Y. C., Chang, S. B., Chang, L. J., & Chan, T. W. (2004). EduCart: A Hardware Management System for Supporting Devices in a Classroom Learning Environment. In Proceedings of the 2nd IEEE International Workshop on Wireless and Mobile Technologies in Education (WMTE 2004), Taiwan, 177-181.

Deng, Y. C., Chang, S. B., Chang, B., & Chan, T. W. (2005). DCE: A One-on-One Digital Classroom Environment. In Proceedings of Artificial Intelligence in Education, Amsterdam, 786-789.

Florida, R. (2005). Where it’s at. New Scientist, 29 October 2005, 43.

Huang, C. W., Liang, J. K., & Wang, H. Y. (2001). EduClick: A Computer-supported Formative Evaluation System with Wireless Devices in Ordinary Classroom. In Proceedings of ICCE 2001, 1462-1469.

Hoppe, H. U., Joiner, R., Milrad, M., & Sharples, M. (2003). Guest editorial: Wireless and Mobile Technology in Education. Journal of Computer Assisted Learning, 19 (3), 255-259.

Hsi, S. (2003). The Electronic Guidebook: A Study of User Experiences Mediated by Nomadic Web Content in a Museum Setting. Journal of Computer Assisted Learning, 19 (3), 308-319.

Koper, R., & Manderveld, J. (2004). Educational modelling language: modelling reusable, interoperable, rich and personalised units of learning. British Journal of Educational Technology, 35 (5), 537-551.

Liang, J. K., Liu, T. C., Wang, H. Y., Chang, B., Deng, Y. C., Yang, J. C., Chou, C. Y., Ko, H. W., Yang, S., & Chan, T. W. (2005). A few design perspectives on one-on-one digital classroom. Journal of Computer Assisted Learning, 21 (3), 181-189.

Liu, T. C., Wang, H. Y., Liang, J. K., Chan, T. W., Ko, H. W., & Yang, J. C. (2003). Wireless and mobile technologies to enhance teaching and learning. Journal of Computer Assisted Learning, 19 (3), 371-382.

Norris, C., & Soloway, E. (2004). Envisioning the handheld-centric classroom. Journal of Educational Computing Research, 30 (4), 281-294.

Palincsar, A. S., & Brown, A. L. (1984). Reciprocal Teaching of Comprehension-Fostering and Comprehension-Monitoring Activities. Cognition and Instruction, 1 (2), 117-175.

Palincsar, A. S., Brown, A. L., & Martin, S. M. (1987). Peer Interaction in Reading Comprehension Instruction. Educational Psychologist, 22 (3&4), 231-253.

Pea, R. (2004). The Social and Technological Dimensions of Scaffolding and Related Theoretical Concepts for Learning, Education, and Human Activity. Journal of the Learning Sciences, 13 (3), 423-451.

Roschelle, J., & Pea, R. (2002). A walk on the WILD side: How wireless handhelds may change computer-supported collaborative learning. International Journal of Cognition and Technology, 1 (1), 145-168.

Roschelle, J., Patton, C., Brecht, J., & Chan, T. W. (2005). The Context for WMTE in 2015. In Proceedings of the 3rd IEEE International Workshop on Wireless and Mobile Technologies in Education (WMTE 2005), Japan, 112-119.

Sharples, M., & Beale, R. (2003). A technical review of mobile computational devices. Journal of Computer Assisted Learning, 19 (3), 392-395.

Slavin, R. E. (1980). Cooperative Learning. Review of Educational Research, 50 (2), 315-342.

Sommerville, I. (1998). Software Engineering. Reading, MA: Addison-Wesley.

Szyperski, C. (1998). Component Software. Reading, MA: Addison-Wesley.

Sweller, J. (1988). Cognitive load during problem solving: Effects on learning. Cognitive Science, 12, 257-285.

Vygotsky, L. S. (1978). Mind in Society. Cambridge, MA: Harvard University Press.

Weiss, D. (2005). Quantitative Analysis of Open Source Projects on SourceForge. In Proceedings of the 1st International Conference on Open Source Systems (OSS 2005), Genova, Italy.



Gibbs, W. J., Olexa, V., & Bernas, R. S. (2006). A Visualization Tool for Managing and Studying Online Communications. Educational Technology & Society, 9 (3), 232-243.

A Visualization Tool for Managing and Studying Online Communications

William J. Gibbs
Department of Interactive Media, 316 College Hall, Duquesne University, USA
Tel: +1 412-396-1310
gibbsw@duq.edu

Vladimir Olexa
Duquesne University, Pittsburgh, PA 15203 USA
Olexa677@duq.edu

Ronan S. Bernas
Department of Psychology, Eastern Illinois University, 600 Lincoln Avenue, Charleston, IL 61920 USA
Tel: +1 217-581-6416
Fax: +1 217-581-6764
cfrsb@ux1.cts.eiu.edu

ABSTRACT

Most colleges and universities have adopted course management systems (e.g., Blackboard, WebCT), and faculty and students worldwide use them for class communications and discussions. The discussion tools provided by course management systems, while powerful, often do not offer adequate capabilities to appraise communication patterns, online behaviors, group processes, or critical thinking. This paper discusses a Web-based program that represents temporal data as maps to illustrate the behaviors of online discussants. Depicting discussions in this manner may help instructors and researchers study, moderate, and/or facilitate group discussions. The paper presents the rationale for the program’s development and provides a review of its technical specifications.

Keywords

Computer-mediated communications, Online discussions, Visualizing online discussions, Interaction patterns

Introduction

The paper begins by providing readers with background information about Computer-Mediated Communications (CMC) and a discussion of projects aimed at visualizing CMC. From these works, it introduces a software tool, Mapping Temporal Relations of Discussions Software, that the authors developed to assist instructors and researchers in managing and studying asynchronous online communications. The discussion serves to overview the software’s purpose and functionality. The authors highlight software features and present observations made by users who tested it. Finally, the paper discusses the software and its implications from a pedagogical perspective.

Background

CMC is communication that takes place between human beings by way of computers (Herring, 1996). The growth of CMC has been exponential; at any particular time, thousands of online conversations take place worldwide (Donath, 2002). Its pervasiveness has captured the interest of educators and research scholars from diverse fields of study. As Junge (1999) points out, an expanding literature base investigates CMC’s formal linguistic properties; sociolinguists and ethnographers of communication study issues related to various CMC types; and discourse analyses of CMC modalities have explored, among other things, philosophical dimensions. Werry (1996) suggests that “interactive written discourse provides a fertile ground for analysis since it makes possible interesting forms of social and linguistic interaction and brings into play a unique set of temporal, spatial, and social dimensions” (p. 47).

In educational settings, there has been a growing reliance on CMC to support learning and communications in traditional face-to-face and online classes. Most colleges and universities provide course management systems (e.g., Blackboard, WebCT). These systems supply tools that enable students (and instructors) who are not in immediate, face-to-face contact with each other to use the computer to engage in synchronous and asynchronous discussion of course topics and to analyze related ideas. This form of interaction is used to create a sense of online community among discussants, to engage them, and to have them think deeply and critically



about the concepts and ideas presented by the instructor and other students. It draws on aspects of traditional face-to-face classroom education but also has its own unique characteristics. For example, asynchronous computer conferencing is available 24 hours per day, 7 days per week, which allows contributions to online discussions to take place at different times; the temporal norms that evolve online are often quite different from those of a face-to-face class. CMC gives students time to reflect and to consider the content and wording of a contribution or response, which can be highly beneficial for learning (Palloff & Pratt, 2001). Students may also interact with other students, instructors, scholars, and experts from anywhere, given Internet access. To a great extent, these characteristics engender widespread utilization of and reliance on CMC within K-12, university, and corporate educational and training settings.

Although CMC is popular, there is still much to be learned about the interactions that occur among discussants in online groups (Jeong, 2004) and about how they influence critical thinking, engagement, and the online community. There has been insufficient research to explicate the effects of CMC on communications and learning in online discussions (Jeong, 2003), which “… can be attributed to the lack of methods and tools capable of measuring group interactions and processes” (p. 26). In addition, while popular course management systems are technically advanced, many do not address key pedagogical goals (Duffy, Dueber, & Hawley, 1998) and thus could be improved. Hara, Bonk and Angeli (2000), for example, recommend that computer conferencing tools provide discussants with “…graphical displays to indicate the potential links between messages, the depth of those links, and the directionality of communication patterns” (p. 65).

Purpose

Given the proliferation of online educational communications and the development of course management systems designed to support them, there is a need for tools that assist educators, researchers, and students in engaging with, making sense of, and studying the relatively new and evolving area of CMC. This paper discusses a tool, Mapping Temporal Relations of Discussions Software (MTRDS), designed to visually and temporally represent asynchronous online discussions. It provides the rationale for MTRDS’ development and reviews literature related to the temporality and visualization of online discussions, which are foundational to the project. In addition, the paper presents MTRDS and its technical specifications.

Online Discussions and Time

CMC, specifically asynchronous communication, allows discussants to engage in discussion at a time convenient to them, but they are often unaware of, or overlook, temporal norms that can affect online relations. Time cues within messages are important but frequently disregarded elements of CMC. Walther and Tidwell (1995) assert that CMC lacks “…relationally rich nonverbal cues” (p. 356). However, time cues, such as the time stamping of messages, delays in responding, and a discussant’s frequent or infrequent communications over time, still remain.

Chronemics refers to human perception of and reaction to time and how time is used and structured. It relates to time as a nonverbal communicative channel influencing human communications (Canfield). The presence of chronemic cues in CMC, such as the time a message arrives and the duration until a discussant responds, may impact the judgments one ascribes to the message and the message sender. How discussants use time, and variations in time during interaction episodes, convey meaning at several levels. In an interaction episode, the individual who controls wait time is typically in a superior role: the person waiting for a reply is subject to the preferences of the message sender as to when the message gets sent. The time when a message is sent and the delay between messages may evoke favorable or unfavorable reactions from message recipients.

According to Hewitt and Teplovs (1999), “… the growth potential of a message thread may depend, to some degree, on the age of its notes” (p. 235). If a message does not draw a response in the first few days after being posted, it is improbable that it will receive any response (Hewitt & Teplovs, 1999). The delayed responses characteristic of asynchronous communications can adversely impact discussion threads (Jeong, 2004); however, “longer response times produced critiques that elicited higher proportions of replies with elaboration and supporting evidence” (Jeong, 2004, p. 1). Thus, the temporal norms that evolve during CMC may have significant influence on communications, and ignorance of them will likely result in user aggravation and misattribution (Walther & Tidwell, 1995). Tools that enable the observation of temporal patterns may prove useful in helping discussants moderate and effect worthwhile communications.



Visualization of Online Discussions

Many communication technologies focus on textual representations of information, ignoring modalities that emphasize such aspects as pattern recognition (Selvin et al., 2001). A number of researchers have developed systems and interfaces for visualizing online communications. For instance, Donath (2002) discussed a project, PeopleGarden, which employs flowers and a garden metaphor to visually represent message board participation. Babble (Erickson, Halverson, Kellogg, Laff, & Wolf) is an online conversation area that enables visualization of online discussants and their activities: people are shown as dots within a circle, and the position of the dots changes according to the activity level of the person engaged in the conversation. Loom is another project that uses visualization techniques to reveal social patterns in Usenet groups (Donath, Karahalios, & Viégas, 1999; Donath, 2002; Boyd, Lee, Ramage & Donath, 2002). Among other things, it depicts conversational patterns such as individual postings, vocal members of the group, regular and irregular participants, and mood. Boyd et al. state that, given “…the innumerable styles of interaction in Usenet, immense possibilities exist for exploring alternative techniques and approaches to visually convey social interaction” (p. 2). Based on the work of Whittaker (1998) and Smith and Fiore (2001), they developed questions representative of the social characteristics of individuals, conversations, and news groups and categorized them accordingly. Several of these questions appear relevant to learners’ involvement in online conversations and the dynamics of those conversations:

Individual
• How frequently does an individual post? How verbose is s/he?
• What types of tone, language and conversation techniques are often used?

Conversation/thread
• How many participants are active in a conversation?
• What structure or pattern of communication occurs (i.e. back and forth vs. large participation vs. one person dominating)?
• Does the thread fracture into mini-threads?

Group
• How many threads are usually active?
• How many people participate in the group, and how frequently do they post?
• How many people start conversations?
• Do people who initiate threads also tend to reply to others’ messages?
(Boyd, et al., 2002, p. 3)

Conceivably, by representing these data visually and dynamically during online conversations, one can give discussants greater awareness of the communication dynamics and perhaps shape them in productive ways. Moreover, such visualization can improve the conferencing environment, which is often disconcerting to users.

Each of the aforementioned projects emphasizes the social dimensions of online conversations and the design of visualizations that represent online conversational spaces intuitively and meaningfully to users. However, they are more interfaces than educational tools for analyzing online conversations. Educational researchers employ a variety of techniques and tools to represent and study CMC interactions. For example, Howell-Richardson and Mellar (1996) created message maps to identify periods of variable discussion activity over time as well as patterns of interactivity; among other things, the maps helped identify message clustering, concentrations of referential links, and the degree of participant activity. Blake and Rapanotti (2001) made conference interaction maps with nodes representing messages posted to a conference area and arrows defining relationships among nodes; each node includes indices of time, sender, message contribution, and message agreement with prior messages. Hara et al. (2000) used conference activity graphs to visually illustrate transformations of student interactions over time. In the authors’ view, these works, while serving different aims, lend credence to the significance of temporality in CMC and the relevance of visualizing online conversations.

Mapping Temporal Relations of Discussions Software

In the summer of 2004, the authors developed the Mapping Temporal Relations of Discussions Software (MTRDS) to assist them in analyzing the temporal aspects of online educational course discussions. Prior to MTRDS, they created print illustrations similar to figure 1 to visualize conversations, which proved time consuming and caused extended delays between the discussions and the creation of maps. Their effort to depict conversations visually was due, in large part, to the design of computer conferencing systems, which organize conference messages



hierarchically by thread/topic and time. As new messages get added, they become indented under the post that provoked the reply. Such organization makes it difficult to examine discussions, as well as evolving temporal norms, especially as the number of messages increases. Mapping temporal relations provides a more compelling view. For example, the authors created hand-made maps similar to figure 1, which illustrates a discussion that occurred over 15 days. The Y-axis represents time (hour) and the X-axis represents day or date. Each square denotes a discussion post and contains the participant’s initials and a unique numeric identifier (not shown in figure 1); the lines connecting posts indicate replies. From such an illustration, one can observe spatial and temporal dimensions of discussions, including message clustering; the general time and date of discussion activity; and individual patterns of interactivity, such as individuals who contributed consistently over time, those who contributed sporadically, and those who appear on the periphery of discussions. In the authors’ view, such maps provide useful information about communication patterns. It is conceivable that these kinds of data may help educators manage discussions as well as engender collaboration and critical thinking during conversations. For instance, an instructor or moderator who has this information at his/her disposal could prompt a discussant on the periphery of a discussion to engage him/her in more timely interactions.

Figure 1. Hand-made student discussion activity map

MTRDS and Other Approaches to Represent and Study CMC

As mentioned, educational researchers (e.g., Howell-Richardson and Mellar, 1996; Blake and Rapanotti, 2001; Hara et al., 2000) have used a number of approaches to represent and study CMC interactions. The authors believe that many components of MTRDS build upon these works as well as offer unique variations on them. Characteristics that the authors view as unique to MTRDS include the following:
• MTRDS is not a discussion board but a tool that manipulates data from existing discussion boards, and thus it can potentially be used with a wide array of existing discussion tools. It represents data visually and provides an interface for viewing discussions that highlights their temporal and spatial dimensions.
• MTRDS visualizations are dynamic and reflect changes made to the discussion transcript.
• MTRDS is entirely Web based and accessible to users with an Internet connection and a Web browser.
• MTRDS enables users to read discussion messages while, at the same time, examining discussion characteristics such as message clustering, concentration of links, temporal norms, and the degree of participation.
• Because the authors spent a great deal of time creating maps by hand, they designed MTRDS to be easy to use and efficient. Once the user creates a discussion file, he or she only has to open the MTRDS Web site and click a button to upload the file, and the map is created.



MTRDS Design

MTRDS is a Web-based prototype that dynamically maps the temporal aspects of discussions. At present, anyone who uses WebCT (a popular Web-based course environment) can compile discussion board messages into a text file and then upload the file to the software using a Web browser. MTRDS generates a visual representation of discussions by hour and date. Figure 2 represents a drawing of a map rendered by MTRDS. The X and Y axes denote day/date and time, respectively. Each discussion message/post is represented by a color-coded circle (message node) and a line (link) that connects a response node to the originating message. An originating message (i.e., the parent), when replied to, will have one or more child messages. A line with a single arrow pointing backward represents the linkage between a response node (child) and its originating message (parent). From the maps, one can readily determine the ancestry of a particular response node. MTRDS draws arrows indicating which message is being responded to or followed up by a specific message; thus, the arrows point backward. It is important to point out that this kind of map illustrates the pattern of responding. If, on the other hand, one is interested in the flow of discussion of the topic, it may make more sense for the arrows to point forward. This depends on the instructor’s/researcher’s purpose for using the map.
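The mapping from a post's timestamp to a node position on the map is straightforward; the sketch below expresses it in PHP for illustration, although MTRDS performs the actual drawing in ActionScript, and the function name and cell sizes are arbitrary assumptions:

```php
<?php
// Illustrative only: MTRDS draws in ActionScript, but the timestamp-to-
// coordinate mapping it implies is easy to express. X = day of the
// discussion, Y = hour of the day; cell sizes here are arbitrary.
function nodePosition(int $postTime, int $firstDay, int $cellW = 40, int $cellH = 12): array {
    $day  = (int) floor(($postTime - $firstDay) / 86400); // whole days elapsed
    $hour = (int) date('G', $postTime);                   // hour 0-23
    return ['x' => $day * $cellW, 'y' => $hour * $cellH];
}

// A reply is then drawn as a line from its node back to its parent's node,
// with the arrowhead at the parent (arrows point backward, as noted above).
```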

MTRDS color-codes messages by topic, because discussions often contain multiple topics/threads. Color coding also reveals isolated message threads and those that dominate. When the user passes the mouse over a message node, the node highlights and the post time appears in the far left (time) column. If the user clicks the mouse button on a message node, the view expands and presents a scroll box containing the message, the author’s name, and the post time.

Figure 2. Rendering of MTRDS’ map of discussion

Technologies Used in MTRDS

MTRDS employs PHP, Extensible Markup Language (XML), and Flash (MX 2004 Professional) in an Apache or IIS Web server environment as its main underlying technologies. PHP (hypertext preprocessor) is a server-side scripting language used primarily for dynamic Web development. MTRDS uses it to parse discussion forum data and to generate XML, the common data source between PHP and Flash. MTRDS employs Flash ActionScript to visually display the data obtained from the XML file generated by the PHP script.

File Processing Using PHP<br />

Discussion forum messages are obtained from the WebCT forum interface. WebCT compiles selected messages,<br />

which can be downloaded as a text (.txt) file. Text files contain the following key parameters: message ID,<br />

branch from (identification of message and the message it is tied to), author, post date, subject, and body. When<br />

the user uploads a discussion file obtained from WebCT, it is submitted to the central location on the Apache<br />

236


Web server. The PHP parser processes the content of the file and extracts all relevant information. The parser<br />

generates an XML file into which it places all the extracted information. See Appendix A for additional<br />

information about the PHP parser and XML file.<br />
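As a rough illustration of the upload step, a handler along the following lines could receive the file and hand it to the parser. This is a hypothetical sketch, not MTRDS code; the form-field name and uploads path are invented for the example:

    <?php
    // Illustrative upload handler (not MTRDS source): receives the WebCT
    // export and stores it where the parser can read it. The field name
    // "discussion_file" and the uploads/ path are assumptions.
    if (isset($_FILES['discussion_file'])) {
        $tmp  = $_FILES['discussion_file']['tmp_name'];
        $dest = __DIR__ . '/uploads/discussion.txt';
        if (move_uploaded_file($tmp, $dest)) {
            // Hand the stored file to the parser (see Appendix A),
            // which emits messages.xml for the Flash front end.
            $parser = new Parser();
            $parser->processFile($dest, 1);   // file type 1 = WebCT export
        }
    }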

After the XML file is generated, an HTML page containing the Flash movie is called and the map is generated.<br />

When the Flash movie is called, it acquires the XML file and creates an array of Message objects for its internal

processing. A custom Drawing class written in ActionScript contains all the basic tools required for drawing the<br />

map (e.g., drawSquare(), drawLine(), drawArrow(), drawText(), etc.). Another class, weekView, is then used to<br />

perform the actual processing and drawing of the message nodes and links.<br />

The entire Flash movie is constructed using three frames. The first frame processes the XML file and generates<br />

the array of the Message objects. The second frame instantiates the weekView object and calls the<br />

renderMessages() function that draws the actual map. This frame also includes controls to move back and forth<br />

in the timeline. The third frame contains no objects and serves only as a tracking point for moving in the

timeline.<br />

MTRDS’ Functionality: Benefits and Improvements<br />

The authors asked five instructors to evaluate the functionality of MTRDS. The instructors implemented it for<br />

their classes or research. They were asked to state benefits of using the program for their teaching or research<br />

needs and to identify problems they encountered and/or features that need to be incorporated. For clarity, the<br />

instructors who evaluated MTRDS will be hereafter referred to as users. In addition, the authors demonstrated<br />

the program at two academic presentations attended by teaching and research professionals. During the<br />

demonstrations, attendees reviewed MTRDS and offered suggestions about its design and functionality. The

evaluative exercises provided useful information about the program and ways to enhance it. The following<br />

summary highlights several beneficial features of the software that users identified. It also lists problems as well

as features not currently available that users perceived as important improvements.<br />

User Observations: Existing Beneficial Features of MTRDS<br />

Observing communication characteristics: Smith and Fiore (2001) point out that CMC software<br />

environments often “…obscure information about threads’ temporal patterns, population, and structure…”<br />

(p.139). Users found that because MTRDS displays discussion messages temporally and spatially, it readily conveyed information about interactivity; thread dominance and development; message continuity, sequencing, and ancestry; links between messages; and isolated messages, information that is not easily observable in traditional discussion boards (Smith & Fiore, 2001).

Message context: Many popular discussion boards present users a hierarchical arrangement of message titles<br />

grouped by thread. The titles are links to the message content and in some ways serve as an index.<br />

Generally, when the user clicks the link to read a message, the hierarchy of message titles disappears and, to<br />

an extent, removes a context (time, date, placement in index, etc.) with which to understand the message.<br />

The context surrounding messages, such as the time and date of a post and its place in the thread can provide<br />

valuable information to help students understand a message. MTRDS allowed users to see the structure of<br />

the overall conversation while reading messages, which enhanced the message context. As users read a<br />

message, they observed how separated in time and date the message was from other posts within the

entire discussion. Users could determine the thread in which the message existed and its placement in the<br />

message chain. In addition, they could identify where that thread and message existed in the entire<br />

discussion.<br />

Temporal behaviors: Maps helped users identify the temporal behaviors of online discussants and, when<br />

necessary, moderate them to positively affect communications. Users noted that they could easily identify

student participation patterns in the context of a discussion. For instance, when viewing a map after a<br />

discussion occurred, one instructor observed how a discussion leader responded sequentially to each student's

posts over several days. The instructor speculated that the response pattern, while useful for individual<br />

students, may not have engendered group cohesion and dialogue, an aim of the discussion activity.<br />

Moreover, the map illustrated low learner-to-learner interactions and discontinuous conversation. The<br />

instructor reflected that had he shown the map to the discussion leader during the discussion, it would have<br />

conveyed his pattern of responding and perhaps allowed him to see that his communication patterns<br />

potentially inhibited group interaction. Moreover, the map articulated the leader’s pattern of responding<br />



rather than of initiating dialogue, which the map would have made clear to him and could have helped him

alter his communication behavior.<br />

Temporal-spatial representation: When comparing MTRDS’ display of discussions with other discussion<br />

board displays, one user suggested that it is analogous to giving a lost person directions in the form of a geographical map rather than in written form. Whereas many traditional discussion boards

present information textually with messages sequenced linearly with limited temporal or spatial information,<br />

MTRDS, similar to a geographical map, afforded users a spatial (and temporal) dimension of the discussion<br />

that offered insights about how discussions evolved. It also allowed one to efficiently surmise the overall flow

of a discussion.<br />

Indicators of communication dynamics: The communication patterns represented by MTRDS provided<br />

indices of when communications were of a monological or dialogical nature. Figure 3 illustrates a rendering<br />

of a discussion conducted by an online guest lecturer (for clarity, figures 3 and 4 were reduced to black and<br />

white drawings to indicate the overall flow in the discussions). Message nodes expanded horizontally from<br />

the initial node (column 1) as activity increased. The thread initiated by the guest dominated for several<br />

days. Only on the fifth day did several new threads appear. One can see message nodes extend over seven

days, the number of posts each day, the time of each post as well as direct and indirect references to<br />

messages. It also reveals when discussants discontinued threads. One could examine the concluding thread<br />

message in an attempt to ascertain why the topic ended and another began. Overall, the segment of the<br />

discussion presented in figure 3 illustrates consistent thread development and substantial learner-to-learner<br />

interaction. Figure 4 presents a segment of an activity in which students introduced themselves at the<br />

beginning of an online course. Student partners interviewed one another and each student posted the<br />

partner’s introduction. Relative to the discussions depicted in figure 3, the activity illustrated in figure 4 is<br />

minimal and somewhat discontinuous.<br />

Figure 3. Rendering of discussion map: Guest Lecturer

Observations: Suggested Improvements

Users wanted a condensed view of the representations so they could see as much of discussions as possible.<br />

This could be accomplished by incorporating a zoom feature that allows users control over how much of a<br />

discussion displays at one time. Currently, MTRDS presents message nodes by time (Y-axis) and date (X-axis).

Nodes appearing at the top of the Y-axis for a particular date are older than those at the bottom. On<br />

the X-axis, nodes display from oldest to newest, moving from left to right. To display as many messages as<br />

possible at one time, while maintaining readability, several factors had to be considered. Only one week’s<br />

(seven days) worth of messages display at a time. This number can be increased but the clarity of the map<br />

may be affected. The user scrolls to view the message nodes on subsequent days.<br />



Figure 4. Rendering of discussion map: Student Introduction<br />

MTRDS displays only days that contain messages and skips “empty” days. For instance, if February 3<br />

contained six messages, February 4 contained no messages, and February 5 contained ten messages, then<br />

February 3 and February 5 are represented on the map and not February 4. However, users pointed out that<br />

it would be useful to display days when there are no messages because it is informative and makes it easy<br />

for the instructor or researcher to ascertain periods of inactivity. One day of no activity versus several<br />

successive days of no activity is important data to interpret. Showing the “empty” days allows for looking at<br />

patterns of activity and non-activity visually.<br />
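The suggested improvement is straightforward to sketch. The snippet below (illustrative only; MTRDS does not currently behave this way) builds a continuous date axis so that days without messages still occupy a column:

    <?php
    // Illustrative sketch (not current MTRDS behavior): building a continuous
    // date axis so that "empty" days still appear on the map.
    function dateAxis(string $first, string $last): array {
        $axis   = [];
        $period = new DatePeriod(
            new DateTime($first),
            new DateInterval('P1D'),                  // step one day at a time
            (new DateTime($last))->modify('+1 day')   // make the end inclusive
        );
        foreach ($period as $day) {
            $axis[] = $day->format('Y-m-d');          // empty days included
        }
        return $axis;
    }

    // Example: February 3-5 yields three columns even if February 4 has no posts.
    print_r(dateAxis('2006-02-03', '2006-02-05'));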

Users wanted the capability of isolating threads/topics or individuals. Each map presents the entire<br />

discussion including all threads, messages, and links, which can clutter the map when discussions become<br />

highly interactive. Users needed the capability to display selected threads or persons so they could more<br />

easily observe thread development or individual participation patterns.<br />

Researchers wanted to code or place labels on message content for analysis. For instance, as the<br />

representation is being viewed, a user could click on a message node and a drop-down list of codes would be<br />

presented. The code would allow the researcher to label or classify (e.g., evaluation, hypothesis, argument,<br />

etc.) the message. Coded information would be submitted to a database and available for additional analysis.<br />

This feature could allow one to study the semantic relations of messages to gain insights about group<br />

processes and communications.<br />

MTRDS does not distinguish a real response to a message from an unintended one. There may be a message<br />

that follows another but it is actually not a response to that previous message's content. The student merely<br />

used the previous message as a thread to follow. A student may either be genuinely following up the<br />

previous message's thought or may, in reality, be creating a new thread (a new discussion or topic). Those<br />

kinds of distinctions are meaningful and informative for the instructor or researcher, but, at present, the<br />

software is unable to detect them. It will require human interpretation and judgment to do so.<br />

Users wanted the ability to save the representations for use in other applications.

MTRDS: Implications from a Pedagogical Perspective<br />

During face-to-face educational discussions, instructors often moderate and astutely observe student patterns of<br />

communication, such as responsiveness, verbal and non-verbal expressions, topic development, conversational<br />



dominance or apprehension, and participation level. Based on these observations, they use a variety of<br />

techniques (e.g., questioning) to facilitate communication and engage students. The verbal and kinesic signals<br />

available in face-to-face communications are, to a great extent, nonexistent in online discussions (Walther &

Tidwell, 1995). In the absence of such cues, instructors must rely on other means to effectively moderate<br />

communications. MTRDS can aid instructors by highlighting characteristics of online communications that are<br />

difficult to ascertain with traditional discussion boards. For example, Gibbs (2006) analyzed multiple online

discussions with MTRDS, examining the overall topography of maps, the extensiveness of links among<br />

messages, and discussion duration. He found that complex maps with extensive node-link relations suggested learner-to-learner interactivity, and that consistent thread development over several days, with messages clustering around a thread, suggested focused activity about a topic. Conversely, thread

segmentation and isolated messages were indicative of disjointed communications or short question-response<br />

interactions. Finally, color coding highlighted information about thread dominance and segmentation and<br />

allowed him to identify, over the entire discussion, when threads began and terminated. He was also able to<br />

identify specific points where threads diverged and the message content that provoked the divergence. In some<br />

cases, the maps enabled him to follow message chains assessing the content of messages and to isolate points of<br />

departure within conversations. Such a view of multi-threaded discussions will allow instructors to monitor<br />

divergent conversations. It indicates the divergence in the context of other messages and threads and provides<br />

instructors information to help them assess and possibly circumvent splintering or off-task conversations.

Projects completed thus far using MTRDS suggest that the visualizations convey relations among messages as<br />

well as the overall form or structure of interactions that unfold during discussions. They help identify whether<br />

discussions are interactive or one-way monologues as well as who initiates conversations and who responds. The<br />

temporal-spatial display of discussion data allows for identification of individuals who participate consistently<br />

and those participating sporadically. If grading is assigned for participation, the maps present a clear picture of<br />

involvement. Not only can an instructor determine if a student contributed and how frequently but he can assess<br />

the manner in which the student took part, which may serve as an indication of the student’s involvement in a<br />

discussion. For example, in some instances, users of MTRDS generated maps that showed students who<br />

consistently posted at the end of message chains or who posted numerous messages on a few days toward the end

of discussions. Such patterns become evident when rendered with MTRDS and suggest minimal involvement in<br />

a discussion. Such posts frequently do not engender dialogue with other participants, and many of the messages

become isolated.<br />

MTRDS presents an audit of an individual's participation. Discussions often begin with a message prompt, usually posted by the instructor. The content of message nodes appearing in close proximity to the prompt will likely be related to the originating message. Conversely, it is plausible that messages appearing lower in the message chain will be more dissimilar in content to the originating message than those appearing higher in the chain. Thus, as

instructors observe a discussion in MTRDS, they will likely identify student posts that consistently appear lower<br />

in the chain. Such participation should be monitored closely because the student may be discussing issues<br />

noticeably different from those intended by the instructor.<br />

The aforementioned issues are important from a pedagogical perspective because they highlight ways tools such<br />

as MTRDS can help instructors, discussion moderators, and students to moderate online conversation and to<br />

positively affect communications.

Summary and Conclusion<br />

Face-to-face class discussions usually occur during a limited period of class time. Instructors moderate<br />

conversations and observe participant interaction behaviors. They are able to observe the verbal and non-verbal

behaviors of students, which can help them efficiently and effectively manage a conversation. For instance, an<br />

instructor may quickly prompt a participant to engage as soon as the individual stops actively participating,

or the instructor may curtail one who tends to dominate a conversation and exclude others. Because online<br />

conversations lack the visual and verbal cues of face-to-face communication, managing conversations can be<br />

more complex and inefficient. In addition, asynchronous discussions typically occur over several days and are<br />

mediated by computer software, which frequently conceals important information related to interaction patterns.<br />

The MTRDS project originated as an effort to develop a tool that would reveal the temporal aspects of online<br />

discussions and visually represent discussants' posting behaviors to help instructors, researchers, and students<br />

efficiently and effectively study and/or facilitate group discussions. Online learning environments are often<br />

disorienting to students and instructors. An instrument such as MTRDS may help class participants effectively<br />

monitor discussions and aid them in constructing meaningful interactions.<br />



Users emphasized many potential advantages of this software for instruction and research purposes. While MTRDS needs refinement, it appears to be a viable tool to help

educators and researchers monitor, manage, and study communication processes in online environments. As the<br />

authors refine the software they plan to address the issues raised by the user evaluations.<br />

Acknowledgements<br />

This project was supported through funds from a NEH Endowment Competition, Duquesne University,<br />

McAnulty College and Graduate School of Liberal Arts.

References<br />

Blake, C. T., & Rapanotti, L. (2001). Mapping interactions in a computer conferencing environment, retrieved April 3, 2006, from http://iet.open.ac.uk/pp/c.tosunoglu/euro-cscl2001.pdf.

Boyd, D., Lee, H-Y., Ramage, D., & Donath, J. (2002). Developing legible visualizations for online social spaces. In Proceedings of the 35th Hawaii International Conference on System Sciences, retrieved April 3, 2006, from http://csdl.computer.org/comp/proceedings/hicss/2002/1435/04/14350115.pdf.

Canfield, A. (n. d.). Body, identity and interaction: Interpreting nonverbal communication, retrieved April 3, 2006, from http://canfield.etext.net/.

Donath, J., Karahalios, K., & Viegas, F. (1999). Visualizing conversation. Journal of Computer-Mediated Communication, 4 (4), retrieved April 3, 2006, from http://jcmc.indiana.edu/vol4/issue4/donath.html.

Donath, J. (2002). A semantic approach to visualizing online conversations. Communications of the ACM, 45 (4), 45-49.

Duffy, T. M., Dueber, B., & Hawley, C. (1998). Critical thinking in a distributed environment: A pedagogical base for the design of conferencing systems. In Bonk, C. J. & King, K. S. (Eds.), Electronic collaborators: Learner-centered technologies for literacy, apprenticeship, and discourse, Mahwah, NJ: Lawrence Erlbaum, 51-78.

Erickson, T., Halverson, C., Kellogg, W. A., Laff, M., & Wolf, T. (n. d.). Social translucence: Designing social infrastructures that make collective activity visible, IBM T. J. Watson Research Center, retrieved April 3, 2006, from http://www.visi.com/%7Esnowfall/Soc_Infrastructures.html.

Gibbs, W. J. (2006). Visualizing interaction patterns in online discussions and indices of cognitive presence. Journal of Computing in Higher Education, 18 (1), 30-54.

Hara, N., Bonk, C., & Angeli, C. (2000). Content analysis of online discussion in an applied educational psychology course. Instructional Science, 28, 115-152.

Herring, S. C. (1996). Two variants of an electronic message schema. In Herring, S. C. (Ed.), Computer-mediated communication: Linguistic, social, and cross-cultural perspectives, Philadelphia, Pennsylvania: John Benjamins, 81-106.

Hewitt, J., & Teplovs, C. (1999). An analysis of growth patterns in computer conferencing threads. In Hoadley, C. & Roschelle, J. (Eds.), Proceedings of the Computer Support for Collaborative Learning (CSCL) 1999 Conference, Mahwah, NJ: Lawrence Erlbaum, 232-241.

Howell-Richardson, C., & Mellar, H. (1996). A methodology for the analysis of patterns of participation within computer-mediated communication courses. Instructional Science, 24, 47-69.

Jeong, A. C. (2003). The sequential analysis of group interaction and critical thinking in online threaded discussions. The American Journal of Distance Education, 17 (1), 25-43.

Jeong, A. (2004). Methods and tools for the computational analysis of group interaction and argumentation in asynchronous online group discussions, retrieved April 3, 2006, from http://dev22448-01.sp01.fsu.edu/Research/Proposals/AERA2005_CMCResearchMethods(Jeong)_v2.pdf.

Junge, T. (1999). Key concepts and debates in linguistic anthropology: Computer-mediated communication, retrieved April 3, 2006, from http://www.emory.edu/COLLEGE/ANTHROPOLOGY/Linganth/computer.html.

Palloff, R. M., & Pratt, K. (2001). Lessons from the cyberspace classroom: The realities of online teaching, San Francisco: Jossey-Bass.

Selvin, A., Buckingham Shum, S., Sierhuis, M., Conklin, J., Zimmermann, B., Palus, C., Drath, W., Horth, D., Domingue, J., Motta, E., & Li, G. (2001). Compendium: Making meetings into knowledge events. Knowledge Technologies 2001, March 4-7, Austin, TX, retrieved April 3, 2006, from http://www2.gca.org/knowledgetechnologies/2001/proceedings/Conklin&Selvin%20Slides.pdf.

Smith, M., & Fiore, A. (2001). Visualization components for persistent conversations. In Proceedings of CHI 2001, Seattle, WA, March 31 - April 5, retrieved April 3, 2006, from http://uk.builder.com/whitepapers/0,39026692,60010219p-39000909q,00.htm.

Walther, J. B., & Tidwell, L. C. (1995). Nonverbal cues in computer-mediated communications, and the effect of chronemics on relational communication. Journal of Organizational Computing, 5 (4), 355-378.

Werry, C. C. (1996). Linguistic and interactional features of Internet relay chat. In Herring, S. C. (Ed.), Computer-mediated communication: Linguistic, social, and cross-cultural perspectives, Philadelphia, Pennsylvania: John Benjamins, 47-63.

Whittaker, S., Terveen, L., Hill, W., & Cherny, L. (1998). The dynamics of mass interaction. In Proceedings of Conference on Computer Supported Cooperative Work, New York: ACM Press, 257-264.



Appendix A<br />

The parser is designed to be extensible so that new modules can be written to support other formats (e.g.,<br />

Blackboard). The basic mechanism of the parser is presented below. It uses an object called Message and creates<br />

an array of objects. When the array is complete, its content is transformed and transferred into an XML file

(messages.xml) on the server.<br />

class Message {
    var $msg_id;        // unique message ID
    var $branch_from;   // ID of the message this one responds to
    var $author;        // author's name
    var $post_date;     // posting date and time
    var $subject;       // message subject line
    var $body;          // message body text
}

class Parser {
    ...
    function processFile($file_name, $file_type)
    {
        ...
        // Each case delegates to a format-specific module, so new
        // forum formats (e.g., Blackboard) can be added as new cases.
        switch ($file_type)
        {
            case 1: /* module to process file type 1 */ break;
            case 2: /* module to process file type 2 */ break;
            case N: /* module to process file type N */ break;
            ...
        }
        ...
    }
}

PHP parser<br />
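As one hypothetical illustration of such a module (the paper leaves the modules schematic, and the exact layout of the WebCT export is an assumption here), a type-1 processor might split the text file into records and map labeled fields onto Message properties:

    <?php
    // Hypothetical sketch of one processing module. The record separator
    // and "Field: value" layout are assumptions about the WebCT export,
    // not documented in the paper.
    function parseWebCT(string $file_name): array {
        $messages = [];
        foreach (explode("\n\n", file_get_contents($file_name)) as $record) {
            $msg = new Message();
            foreach (explode("\n", $record) as $line) {
                if (strpos($line, ':') === false) continue;
                list($field, $value) = explode(':', $line, 2);
                $value = trim($value);
                switch (trim($field)) {
                    case 'Message no':  $msg->msg_id      = (int)$value; break;
                    case 'Branch from': $msg->branch_from = (int)$value; break;
                    case 'Author':      $msg->author      = $value; break;
                    case 'Date':        $msg->post_date   = $value; break;
                    case 'Subject':     $msg->subject     = $value; break;
                }
            }
            $messages[] = $msg;
        }
        return $messages;
    }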

XML Structure<br />

The XML file generated by the parser has a simple structure for easy processing by Flash and to reflect the basic<br />

data requirements of a message forum.<br />

<messages>
  <message id="MESSAGE_ID">
    <branch_from>BRANCH_FROM_ID</branch_from>
    <author>AUTHOR</author>
    <post_date>YYYY-MM-DD-HH-MM</post_date>
    <subject>SUBJECT</subject>
    <body>BODY</body>
  </message>
</messages>

XML file structure<br />
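The element names shown above follow the Message class fields. A minimal sketch of the generation step, assuming those names (this is not the paper's code), could use PHP's SimpleXML:

    <?php
    // Minimal sketch (not the paper's code) of the XML-generation step:
    // serialize the Message array into messages.xml using SimpleXML.
    function writeXml(array $messages, string $path): void {
        $xml = new SimpleXMLElement('<messages/>');
        foreach ($messages as $m) {
            $node = $xml->addChild('message');
            $node->addAttribute('id', (string)$m->msg_id);
            $node->addChild('branch_from', (string)$m->branch_from);
            $node->addChild('author', htmlspecialchars($m->author));
            $node->addChild('post_date', $m->post_date);
            $node->addChild('subject', htmlspecialchars($m->subject));
            $node->addChild('body', htmlspecialchars($m->body));
        }
        $xml->asXML($path);   // e.g., writeXml($messages, 'messages.xml')
    }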



Ng’ambi, D., & Johnston, K. (2006). An ICT-mediated Constructivist Approach for increasing academic support and teaching critical thinking skills. Educational Technology & Society, 9 (3), 244-253.

An ICT-mediated Constructivist Approach for increasing academic support<br />

and teaching critical thinking skills<br />

Dick Ng’ambi<br />

Centre for Educational Technology, University of Cape Town, South Africa<br />

dngambi@ched.uct.ac.za<br />

Kevin Johnston<br />

Information Systems Department, University of Cape Town, South Africa<br />

kjohnsto@commerce.uct.ac.za<br />

ABSTRACT<br />

South African Universities are tasked with increasing student throughput by offering additional<br />

academic support. A second task is to teach students to challenge and question. One way of attempting to<br />

achieve these tasks is by using Information and Communication Technology (ICT). The focus of this paper<br />

is to examine the effect of using an ICT tool to both increase academic support to students, and to teach<br />

critical thinking skills.<br />

A field study comparing a Project Management course at the University of Cape Town over two successive<br />

years was conducted. In the second year, an ICT-mediated constructivist approach (the DFAQ web site), through which students acquired project management skills, was used to increase support and teach critical thinking

skills. Structuration theory, in particular the notion of practical and discursive consciousness, was used to<br />

inform our understanding of the role of questioning on teaching project management.<br />

The conclusion is that a constructivist approach, mediated by an anonymous web-based consultative environment, the Dynamic Frequently Asked Questions (DFAQ) tool, improved support to students and affected student learning of project management; students also acquired some questioning skills, as evidenced in their examination performance. The efficacy of the approach was evaluated through an interpretive study of DFAQ artefacts and through examination performance.

The paper examines relevant literature, details the research objectives, describes a field survey, and presents the results.

Keywords<br />

Constructivist approach, Critical thinking, Thinking Skills, Academic support, Consultative environment, ICT-mediation

Introduction<br />

The South African government funding of higher education institutions is based on student throughput, as<br />

opposed to intake numbers. The paradox is that increased intake numbers do not translate into increased throughput but rather put huge constraints on the support required to help students. In order to increase student

throughput, institutions need to increase academic support to students. Thus, the immediate challenge institutions<br />

are facing is how to provide increased student support both effectively and efficiently.<br />

In response to this challenge, many institutions have set up bridging programmes such as Academic<br />

Development Programmes (ADP) as a means to increase academic support to students, with the broader aim

of increasing student throughput. Given the volumes of students entering universities in South Africa, huge staff<br />

resources are required for these bridging programmes to have any effective impact on student throughput. While<br />

Academic Development Programmes (ADP) are useful interventions, these programmes are only as successful as their ability to create independent learners and instil critical-mindedness.

Universities in South Africa are also faced with the challenge of how to teach critical thinking skills to students<br />

whose preparation for higher education was based on teachers as transmitters of knowledge. The Department of<br />

Education launched ‘curriculum 2005’ as an attempt to redress this and other problems (Department of<br />

Education, 2005).<br />

Mindful of the difficulty of teaching critical thinking, our approach was to facilitate the acquisition of thinking<br />

skills through using questioning as a learning activity. An IS project management course was appropriate for this<br />



investigation because projects in general are not routine, are goal oriented, require coordination of effort, have a<br />

start and finish time, and are constrained by budgets (Schwalbe, 2004). For this reason, it was important that<br />

students acquired skills that would enable them to think laterally.<br />

In view of this background, this project was conceived to explore ways in which Information and<br />

Communication Technology (ICT) could be used to increase support to students and to teach them to think<br />

laterally.<br />

The main research objectives were thus to examine the effect of using purpose-built ICT to increase academic

support to students, and to teach students critical thinking skills.<br />

This paper discusses a field survey in which these two objectives were tackled using ICT at the University of Cape Town (UCT). A special-purpose ICT application, designed and developed at UCT, was used in the field study.

The paper is organised as follows: It begins with a review of relevant literature, outlines the research objectives,<br />

and describes the field survey. The field survey is divided into three parts: the methodology; findings and<br />

discussion; and the impact on student learning.<br />

Background<br />

Use of ICT<br />

For many years researchers have been suggesting that lecturers should use ICT in new ways, other than simply<br />

‘information giving’ (Nicaise and Crane, 1999).<br />

Khine (2003) stated that lecturers could use ICT to facilitate learning, critical thinking and peer discussions. Pratt<br />

(2002) suggested that it is possible for ICT to facilitate both teaching and learning. It would be naïve to suggest<br />

that ICT are a panacea for under-prepared students, but ICT can be a useful complement to lecturers when<br />

carefully integrated into the curriculum. Winn (1989) suggested that educators should use ICT to develop<br />

alternatives to the teacher-based education model. Dynamic Frequently Asked Questions (DFAQ) is one such<br />

ICT alternative.<br />

DFAQ<br />

The Dynamic Frequently Asked Questions (DFAQ) tool was designed and developed at UCT as a special<br />

purpose question and consultation environment for students (Ng’ambi, 2003; Ng’ambi and Hardman, 2004;<br />

Ng’ambi, 2004). DFAQ provided an anonymous medium through which students consulted one another and the lecturer (Ng’ambi, 2003). DFAQ was an educative, social and communicative space, which dynamically

created a knowledge resource from student consultations. When a student posted a question, the lecturer was<br />

notified via email and if necessary via Short Message Service (SMS) on their mobile phone. When a question<br />

had been responded to, the student was notified via email with an optional SMS notification. The value of the<br />

DFAQ lay in the extent to which it supported learning both on and off-campus, as Toohey (1999) suggested.<br />

DFAQ consultations were anonymous. DFAQ was available 24 hours a day, and allowed students the<br />

opportunity to learn from answers to questions other students had posed. DFAQ thus allowed students to decide<br />

when and where to use it.<br />
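The consultation-and-notification workflow just described can be summarised in code. The sketch below is purely illustrative; DFAQ's actual implementation is not given in this paper, and every name in it (sendSms, questions.txt, the addresses) is an assumption:

    <?php
    // Purely illustrative sketch of the workflow described above;
    // not DFAQ's actual code.
    function sendSms($phone, $text) {
        error_log("SMS to $phone: $text");   // stub for a real SMS gateway
    }

    // A student posts a question: it is stored without the author's identity,
    // and the lecturer is alerted by email and, if necessary, by SMS.
    function postQuestion($text, $lecturerEmail, $lecturerPhone) {
        file_put_contents('questions.txt', $text . "\n", FILE_APPEND);
        mail($lecturerEmail, 'New DFAQ question', $text);
        sendSms($lecturerPhone, 'A new DFAQ question was posted');
    }

    // When a question is answered, the asker is notified by email,
    // with an optional SMS notification.
    function notifyAnswered($studentEmail, $answer, $wantsSms, $studentPhone = '') {
        mail($studentEmail, 'Your DFAQ question was answered', $answer);
        if ($wantsSms) {
            sendSms($studentPhone, 'Your DFAQ question has a response');
        }
    }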

Constructivism<br />

Two learning environments were identified in the literature: the traditional learning environment (teacher-centred model) and the constructivist learning environment (learner-centred model). As DFAQ is learner-centred, this paper focuses on the constructivist environment and views it as an effective complement to the traditional learning environment.

According to Crotty (1998, p58) the term constructivism refers to the epistemological considerations focusing<br />

exclusively on ‘the meaning-making activity of the individual mind’ and constructionism focuses on ‘the<br />

collective generation [and transmission] of meaning’. Constructivism is used in this paper as a theory to guide<br />



understanding of how students acquire critical questioning skills. Karagiorgi and Symeou (2005, p24) asserted,<br />

“...in a world of instant information, constructivism can become a guiding theoretical foundation and provide a<br />

theory of cognitive growth and learning that can be applied to several learning goals.”<br />

In a constructivist learning environment the role of the lecturer shifts from being a source of knowledge to<br />

facilitating learning. Khine (2003) argued that students should not be left to explore alone; rather, lecturers should

provide support, coaching and modelling to the students to make certain learning takes place.<br />

Unlike the teacher-centred model, in which lecturers impart knowledge to students, “knowledge for<br />

constructivism cannot be imposed or transferred intact from the mind of one knower to the mind of another”<br />

(Karagiorgi and Symeou, 2005, p18).<br />

DFAQ aimed to encourage learners to continually question, and in so doing become lifelong learners.

You can teach a student a lesson for a day; but if you can teach him to learn by creating<br />

curiosity about life, he will continue the learning process as long as he lives [CP Bedford].<br />

Integrating ICT & Constructivism<br />

“The Web is where constructivist learning can take place. The web provides access to rich sources of information; encourages meaningful interactions with content; and brings people together to challenge, support, or respond to each other” (Khine, 2003, pp. 22-23). However, merely providing students with access to the web

does not guarantee constructivist learning. The lecturer is required to provide some guidance, or coaching to<br />

allow students to create their own meanings. Russell and Schneiderheinze (2005) argued that a critical factor shaping how lecturers used ICT to develop learning was the lecturer's academic belief in and acceptance of constructivism.

Rickards (2003) observed that a problem for many is developing the skills to effectively utilise ICT in a<br />

meaningful manner. The challenge that faces lecturers when integrating ICT in their teaching is summed up in

the following statement:<br />

When technologies are inserted into the educational environment, they are meant to develop<br />

learning abilities in students. However, these technologies do not function in a vacuum. Instead<br />

they are coupled to existing tools and concepts in the setting. When teachers attempt to<br />

implement a technology innovation in the classroom, they naturally face the complex challenge<br />

of fitting together new ideas with deep-rooted pedagogical beliefs and practices (Russell and<br />

Schneiderheinze, 2005, p39).<br />

DFAQ was developed as a web based instrument in order to integrate ICT and constructivism. The particular<br />

course selected to use DFAQ was a third year Project Management course at UCT.<br />

Project Management<br />

Schwalbe (2004) reported that the Chaos studies stated that only 28% of ICT projects were successful, and that projects on average ran to 163% of their original time estimates and 145% of their original cost estimates. In 2001, the British Computer Society

reviewed 1027 projects, and found 13% were successful (in terms of scope, time, cost & quality) (Coombs,<br />

2003).<br />

Project Management (PM) requires strategic competency and critical thinking skills. Many Project Management<br />

text books (Schwalbe, 2004; Marchewka, 2003; Cadle & Yeates, 2004; Hughes & Cotterell, 2002) emphasise the<br />

need for developing questioning skills. A traditional project management aphorism, “you know you’re a<br />

successful project manager when you survive the project” suggests a need for critical thinking skills in addition<br />

to having knowledge about the subject matter. One of the UCT project management students put it this way:<br />

If project management is taught as thoroughly as it is... are the models misleading us? Why do<br />

the stats tell us that the chances of us completing a project properly are next to nil? Is it the<br />

project team, or the models? or a mix of both? [Anonymous DFAQ posting]<br />



Critical Questioning<br />

Walton and Archer (2004, p185) explained, “Critical literacy is not about oppositional thinking or an alignment<br />

with politically correct positions. Rather, educators should highlight the need for critical evaluation and<br />

acknowledge the role of cultural capital as they help students construct and apply meaningful critical<br />

frameworks.”<br />

According to Schwalbe (2004), Project Management students are expected to actively participate in class by asking questions and sharing personal experiences. Critical thinking is a required competency for successful

project managers.<br />

There is increasing pressure on lecturers to teach critical thinking skills as Hills (1987) pointed out several years<br />

ago. While the problem of teaching critical thinking has been identified, few suggestions on how such skills can<br />

or should be taught were found. Students need to be taught and trained in how to question. Johnston and<br />

Ng’ambi (2004) made several recommendations on using questions to teach critical thinking skills.

An opportunity was identified to develop critical questioning skills by combining problem-based learning, the practical application of project management (e.g. using MS Project), and the scholarly knowledge of project management. The assumption was that, used in a constructivist learning environment, DFAQ would provide opportunities for

students to learn from one another while engaging with the scholarly knowledge of project management and<br />

acquiring critical thinking skills.<br />

Theoretical underpinning<br />

Giddens’ (2000) Structuration Theory (GST) was used in an attempt to understand how students acquire critical

thinking skills in a constructivist environment. In terms of GST, students can be said to be knowledgeable<br />

agents, as they are knowledgeable about their actions, even if they cannot express that knowledge. Giddens<br />

(2000, p35) postulated, “the agent is knowledgeable, with knowledge of most of the actions he or she undertakes.<br />

This wealth of knowledge is expressed primarily as practical consciousness.” In this regard, critical questioning<br />

skills are a form of practical consciousness. It is questioning that shapes and informs decisions though these<br />

questions are not always discursively expressed. This study aimed to get students to be discursive about the<br />

questioning. According to Giddens (2000, p5), “discursive consciousness refers to the understanding or<br />

knowledge which the agent achieves by reflecting upon his or her actions”. In the context of this study, the<br />

reflection process allows students to think about the implications of their actions on the project at hand.<br />

Questions are therefore useful tools to foster that reflection. The use of an ICT-mediated intervention provided

users access to a deluge of questions posted by other users. The ramifications of this were that reflections were<br />

aided by both individual questioning and what others were asking. Giddens put this succinctly, “reflection may<br />

occur with the aid of others or as an autonomous act” (Giddens, 2000, p36). In the context of this project, “aid of<br />

others” was through messages posted in an ICT mediated environment. Thus, students’ discursive consciousness<br />

is achieved through reflection on both their individual messages and those posted by others. Reflection is a<br />

meaning making process hence a constructivist activity. Meaning making is an outcome of interpretation of<br />

messages.<br />

Stohl (1995) categorised messages into three groups: “messages having intentions of senders (what is desired<br />

and intended); ostensive message (what is actually verbalized); and receivers’ interpretations (how the reader<br />

interprets the message)” (p53). Our argument is that questions are special forms of messages in that a question is<br />

an outcome of the sender having reflected on his or her actions. Questions are therefore outcomes of reflection<br />

on action and produce other reflections leading to both intended and unintended actions. Both the process of<br />

responding to questions and access to questions posted by others tend to demand some action and are invitations<br />

for reflection on practical consciousness. Accepting this argument, it seems reasonable to expect that critical<br />

thinking would result from a questioning engagement, thereby suggesting that questioning is a potential

learning approach. Thus, our premise is that critical thinking skills are an outcome of a constructivist activity on<br />

practical consciousness.<br />

To the extent that questions are expressions of intentions, they are ostensive messages serving as conveyor belts<br />

of intentions. As “conveyor belts” questions are indicators of thought processes which harbour intentions.<br />

Therefore access to a collective pool of intentions gives hope that academic support for learners can be designed on constructivist principles, since new meanings are made in the course of reading others’ questions and responses.



A message is not only an expressive vehicle of intention but also is an exercise of power through which<br />

resources are mobilised. A question gives the power of resource mobilisation to its author. In asking a question, the

author mobilises resources in the form of responses and through the phrasing of a question the author shapes the<br />

content of the response. The target audience of questions shapes the phrasing of questions and defines the<br />

possible sources of responses.<br />

Research Objectives<br />

The main research objectives were to examine the effects of using ICT to increase academic support to students,<br />

and to teach students critical thinking skills. The questions asked were analysed and classified. The academic<br />

results of two cohorts of students were examined, the first without the particular ICT support (DFAQ), and the<br />

second cohort with the ICT support.<br />

Field Survey<br />

The choice of the field survey was informed by the need to develop students’ critical thinking or critical literacy skills.

A third year Information Systems (IS) course “Information Technology and Project Management” at UCT was<br />

used as the basis of the field survey. Three types of skills are expected of a project management student:<br />

knowledge of the body of knowledge (content); skills of using project management tools; and critical<br />

questioning skills. While recognising the need for students to have a solid foundation of the body of knowledge,<br />

the focus in this field survey was on equipping students with critical and inquiring skills.<br />

The course was lectured by the same lecturer (Johnston), and used the same text book (i.e. as a source of the<br />

body of knowledge) in 2003 and 2004. The major difference between the two years was the emphasis on<br />

developing questioning skills, and the introduction of awarding marks for asking questions in 2004. This was<br />

supported by the introduction of the DFAQ environment.<br />

The year mark component of the course assessment was modified, as shown in Table 1, to include a mark for participating and questioning, and the essay marks were increased; the final examination mark was unchanged.

Table 1: Course Assessment

Course Assessment                     2003    2004
Essays & Assignments                   10%     16%
Workshop exercises                     10%     10%
April Test                             20%     10%
Class participation & questioning       0%      4%
Final Examination                      60%     60%

A minimum of 45% was required for both the year mark and the final examination.

Once it was decided to incorporate ICT into the Project Management Course to support students and to teach<br />

them to question, various software options were examined. DFAQ, the web-based question environment, was

selected as it offered an anonymous medium for students to ask and review questions. The students were<br />

instructed in class and in handouts on how to ask questions, and how to use DFAQ. DFAQ was thus integrated<br />

into the course. Students were made aware of the fact that 12% of their year mark would be for asking questions,<br />

and of that 4% was linked to their use of DFAQ.<br />

Questions and answers were recorded. The identities of students who asked questions were recorded, but not<br />

linked to the question. Students were thus rewarded for quantity rather than quality of questions asked.<br />

Academic results of the 2004 students were compared with those of the 2003 students who had completed the<br />

course without a questioning mark in the course, and without ICT such as DFAQ.<br />

The 2004 students were encouraged and required to pose questions in their essays. These questions could be<br />

worth up to 0.2% of a student’s year mark. Four percent of the year mark was allocated to class participation and



questioning. This mark was achieved by asking questions in lectures or on the DFAQ environment. A register<br />

was kept of all students who posed questions in class. DFAQ allowed quieter and reserved students to ask and<br />

respond to questions anonymously – although a list of email addresses of students who asked questions was kept<br />

for notification purposes, it was not clear who asked which question. Students were encouraged to ask one-on-one questions of the lecturer, to email questions to the lecturer, and to place questions in the lecturer’s pigeon hole; all instances of questions asked were recorded.

Thirteen percent of the final written examination in 2004 consisted of questions in which students had to pose questions (13% of the 60% examination component, i.e. 7.8% of the final course mark). Thus a total of roughly 12% of the course assessment in 2004 (4% + 7.8%) was for asking questions rather than answering them.

The DFAQ web-site was used daily by students, not only to ask and respond to questions, but to review answers<br />

to previously asked questions. Over 95% of students participated on the DFAQ web site. The DFAQ<br />

environment (Ng’ambi and Hardman, 2004) provided the opportunity for students to practice their critical<br />

thinking skills. Students found the site easy to use; see Figure 1 for a sample of the user interface.

Findings and Discussion<br />

The number of questions in all media from the 140 students in 2004 was as follows:

Anonymous web-based via DFAQ      471
Questions in class                 88
E-mail questions                   15
Face-to-face questions              9
Written questions in pigeon hole    0

This strongly indicates that the students in this class preferred asking anonymous questions via ICT (DFAQ<br />

environment). Furthermore, DFAQ offered support to students 24 hours a day. The DFAQ questions ranged<br />

from simple concept and course-administration questions to complex scenario-based questions.

Figure 1: Part of DFAQ User Interface<br />

There were 561 responses to these 471 questions. Some questions had as many as 6 responses. Of interest are the frequently referenced questions (hits per question). One question, “Should our hypothesis be stated in the introduction and proven in the conclusion?”, was referenced 44 times; a second question was referenced 25 times. In all, 179 questions were referenced, and 116 were referenced more than nine times (see Figure 2).



Figure 2: DFAQ screen showing the number of times questions have been referenced<br />

An analysis of the questions that were posted anonymously suggests that students were becoming critical in their<br />

thinking as this question indicates:<br />

At the start of a project - I mean the VERY BEGINNING - the project manager or whoever<br />

supplies management with an estimated cost and delivery date for the entire project as part of a<br />

business case. How can one know how much something will cost, what staff resources will be<br />

required, how long it will take or what exactly needs to be done (outside the usual recipe<br />

"analysis" "design" "test" "implement" etc.) BEFORE any analysis has been carried out?? And<br />

project success is measured against these "guesses"! In my opinion, a successful PM and their<br />

team guess well. Or they have delivered a very similar project previously and are able to learn<br />

from their mistakes. Any comments? [Anonymous DFAQ posting]<br />

The above question is indicative of integration between subject knowledge of project management and acquisition of critical questioning skills. That the student stated a scenario, posed questions, and answered his or her own questions shows that the student constructed his or her own knowledge, as is expected in constructivist environments.

Karagiorgi and Symeou (2005, p18) argued that “knowledge for constructivism cannot be imposed or transferred<br />

intact from the mind of one knower to the mind of another. Therefore, learning and teaching cannot be<br />

synonymous: we can teach, even well, without having students learn.” One can infer from the above student<br />

question that students were critically engaging with the course material and raising questions which allowed<br />

them to make meanings for themselves. It can be argued that the student worked with the concepts of project<br />

management (e.g. estimated time, delivery time, business case, staff resources etc.) to construct new meaning.<br />

Crotty (1998) explained that in order to generate meaning, people need to base their assumptions on things and<br />

concepts which they already know, or think they know.<br />

Some questions were of an inquisitive type. For example, the question below suggests critical thinking about the concept of McFarlan’s risk assessment:

Regarding McFarlan's risk assessment, why do we get to prioritise Planning? Should planning<br />

always take priority? [Anonymous DFAQ posting]<br />

Several similar critical questions among the 471 questions posted were observed. This was encouraging as it<br />

suggests that the DFAQ environment mediated the teaching of critical questioning skills. Some questions, though, were of a procedural nature, such as this one:

Hello. I just wanted to ask, How do we add a project buffer into our gantt charts in MS project?<br />

[Anonymous DFAQ posting]<br />



The significance of the above question is that the student was attempting to relate the concept of project buffer to<br />

a project management tool, MS Project. Anecdotal evidence based on teaching experience suggests that many students struggled to find the relationship between project management theory and practice. The fact that procedural questions were asked could indicate that students were examining such a relationship critically.

Where is the line drawn in the acceptable skill level for IS professionals? Talking about the real<br />

world now... do we have to multi-skill? i.e. graphics as well as programming... as well as how<br />

many languages? [Anonymous DFAQ posting]<br />

One can infer from this question that there is a gap between teaching students general management, change<br />

management and project management techniques required to manage the development of a small information<br />

systems project, and developing skills to effectively apply the knowledge.<br />

There were questions in which students needed clarity or confirmation such as:<br />

About the mini workshop. The scope of the project is small but with further phases, it would<br />

include invoicing and sales etc. Should we create the application to be open to expansion for<br />

further phases? [Anonymous DFAQ posting]<br />

The above question allowed the lecturer to play the role of a coach, which is consistent with the role of a lecturer<br />

in a constructivist environment (Khine, 2003) and in an online discussion forum (Chiang and Fung, 2004). As a<br />

coach, the lecturer may post questions or responses in a way that fosters engagement. Chiang and Fung (2004,<br />

p312) asserted “In typical online learning discussions, learners often play the role of problem solvers and the<br />

educator often plays the role of tutor or coach. The educator observes how learning progresses. In the role of<br />

coach, the educator may be involved in posting a guiding problem, providing hints or suggestions in guiding a<br />

problem, or interactively joining in the discussion.” In DFAQ an educator may choose to either post messages<br />

anonymously or reveal their identities. DFAQ gives an educator access to student knowledge levels as Ng’ambi<br />

and Hardman (2004, p195) observe:<br />

Questions serve as important indicators of learners’ current knowledge. By recording learners’<br />

questions in a database, we are able to build up a valuable teaching resource about what it is<br />

that the learners think they know and what they struggle with. We can look at the learners’<br />

questions and tailor our materials to meet the needs we see expressed in those questions.<br />

It has already been mentioned that the final written examination in 2004 consisted of questions in which students<br />

had to ask questions. In evaluating the impact on students, the types of questions students asked in the exams were analysed. For the sake of brevity, only four questions are discussed here. Students were asked to list five questions that could be used to determine the Critical Success Factors (CSFs) of ICT (Information and Communications Technology) within Cisco. Below are some of the questions posed:

What are the objectives of ICT for the future within Cisco in terms of business objectives and strategic<br />

focus?<br />

What are the critical factors that will need to be met to ensure successful completion of objectives?<br />

What variables exist within each factor that will affect the outcome of that factor and the objectives as a<br />

whole?<br />

How will these variables be measured, and how often?<br />

The significance of the above questions is that they are indicative of critical thinking skills. Each question is structured to build on the previous one, and all are open-ended. While the DFAQ artefacts (questions and responses) show that students acquired critical thinking skills, the questions students asked in the examination give hope that students could apply the acquired skills. It can be inferred from the above that

DFAQ increased academic support to students, and was used as a medium for acquiring critical thinking skills,<br />

hence meeting the research objectives.<br />

This leads to the question: what was the impact on student results compared to the previous year?

Impact on student results

No clear positive impact on the general academic performance of students was evident; the results were mixed. The percentage of students who obtained first class passes rose sharply from 1% in 2003 to 6% in 2004, but the percentage of second class passes dropped from 49% to 29%. The percentage of third class passes remained constant at 46%. The failure rate increased from 4% to 19%, and the class average dropped slightly from 60% to 57%.

Comparing the results of the same students in another IS course run over the same time period shows the percentage of first class passes rising from 11% in 2003 to 14% in 2004 (1% to 6% in the PM course). The percentage of second class passes dropped slightly from 69% to 60% (49% to 29% in the PM course). The percentage of third class passes remained relatively constant at 17% and 18% respectively (46% in both years in the PM course). The failure rate more than doubled from 3% to 8% (4% to 19% in the PM course). The class average dropped from 66% to 64% (60% to 57% in the PM course).
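Since the two sets of percentages above are easy to misread in prose, the short snippet below simply tabulates the figures reported in the preceding two paragraphs; it is illustrative only and introduces no data beyond what is already stated.

```python
# Illustrative only: tabulates the pass-rate figures reported above so the
# PM course and the comparison IS course can be read side by side.
results = {
    "PM course": {
        "first class":   (1, 6),
        "second class":  (49, 29),
        "third class":   (46, 46),
        "fail":          (4, 19),
        "class average": (60, 57),
    },
    "other IS course": {
        "first class":   (11, 14),
        "second class":  (69, 60),
        "third class":   (17, 18),
        "fail":          (3, 8),
        "class average": (66, 64),
    },
}

for course, rows in results.items():
    print(course)
    for band, (pct_2003, pct_2004) in rows.items():
        print(f"  {band:<14} 2003: {pct_2003:>3}%   2004: {pct_2004:>3}%")
```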

While the gap between teaching and learning seems logical, the discrepancy between student learning and academic performance is rather odd. Further investigation is required into the impact of critical thinking skills on academic performance. Haggis (2003) argues that research findings in the area of academic literacies suggest the need for a shift from a view of success/failure based on ‘ability’ and ‘preparation’ to one that sees study at this level as an apprenticeship into new ways of thinking and expression for students. According to Haggis (2003), new forms of expression take a number of years to develop, and need to be explicitly modelled and explored.

Although the focus of this paper is on using ICTs to teach critical thinking skills, student performance is a matter of subsidiary awareness, as Polanyi and Prosch (1975, p. 33) argued:

A striking feature of knowing a skill is the presence of two different kinds of awareness of the things that we are skilfully handling. When I use a hammer to drive a nail, I attend to both, but quite differently. I watch the effects of my strokes on the nail as I wield the hammer. I do not feel that its handle has struck my palm but that its head has struck the nail. In another sense, of course, I am highly alert to the feelings in my palm and fingers holding the hammer. They guide my handling of it effectively, and the degree of attention that I give to the nail is given to these feelings to the same extent, but in a different way. The difference may be stated by saying that these feelings are not watched in themselves but that I watch something else by keeping aware of them. I know the feelings in the palm of my hand by relying on them for attending to the hammer hitting the nail. I may say that I have a subsidiary awareness of the feelings in my hand which is merged into my focal awareness of my driving the nail. (own highlights)

Using Polanyi and Prosch’s (1975) metaphor, in this project “the hammer” (DFAQ in a constructivist environment) was used to “drive a nail” (critical thinking skills). A field survey was used to watch the “effects of the strokes on the nail as we wield the hammer” and observe the process of acquiring the thinking skills. The impact of the hammer on the nail was assessed in terms of whether the nail “went in”. The subsidiary awareness was student performance, which, although highlighted, was not the focus of the study. Given that performance emerged in “our focal awareness of our driving the nail”, our future work will investigate the impact of ICT-mediated critical thinking skills on student performance.

Conclusion

The paper described a field survey in which DFAQ (an ICT-based constructivist learning environment) was introduced into an IS project management course in 2004 as a scaffolding tool to support students and to facilitate the acquisition of critical thinking skills. The anonymous interaction within DFAQ, coupled with the integration of the tool in the course, seems to have motivated student participation and offered increased support. Despite the anonymity and the constructivist learning approach adopted, all 471 questions posted concerned the theory of project management, the use of project management tools, essay writing, or course administration, for example: I was just wondering, why do the workshops have to be so intense? Why don’t we have less work and concentrate more thoroughly on this?

Critical thinking skills are not taught by giving students more information or even better books. This survey demonstrated that providing students with an anonymous consultative environment, integrated within a course, made the interaction worthwhile. Students acquired critical thinking skills and were able to apply those skills in an examination environment.

References

Cadle, J., & Yeates, D. (2004). Project Management for Information Systems, England: Prentice Hall.

Chiang, C. A., & Fung, P. I. (2004). Redesigning chat forum for critical thinking in a problem-based learning environment. Internet and Higher Education, 7 (4), 311-328.

Coombs, P. (2003). IT Project Estimation, Cambridge: Cambridge University Press.

Crotty, M. (1998). The Foundations of Social Research: Meaning and perspective in the research process, Australia: Allen & Unwin.

Department of Education (2005). Building a bright future: Curriculum 2005, Pretoria: Government Printers.

Giddens, A. (2000). An Introduction to a Social Theorist, Kaspersen, N. L. (Trans.), Oxford: Blackwell.

Haggis, T. (2003). Constructing Images of Ourselves? A Critical Investigation into ‘Approaches to Learning’ Research in Higher Education. British Educational Research Journal, 29 (1), 89-104.

Hills, P. (1987). Education for a Computer Age, London: Croom Helm.

Hughes, B., & Cotterell, M. (2002). Software Project Management, London: McGraw-Hill.

Johnston, K., & Ng’ambi, D. (2004). Every question deserves an answer. In McKay, E. (Ed.), Proceedings of the International Conference on Computers in Education, Melbourne: RMIT Business School.

Karagiorgi, Y., & Symeou, L. (2005). Translating Constructivism into Instructional Design: Potential and Limitations. Educational Technology & Society, 8 (1), 17-27.

Khine, S. M. (2003). Creating a Technology-Rich Constructivist Learning Environment in a classroom management module. In Khine, S. M., & Fisher, D. (Eds.), Technology-Rich Learning Environments, New Jersey: World Scientific, 21-39.

Marchewka, J. T. (2003). Information Technology Project Management, USA: Wiley & Sons.

Ng’ambi, D. (2003). Students helping Students Learn - The Case for a Web-Based Dynamic FAQ Environment. In Proceedings of the International Association of Science and Technology for Development (IASTED) Conference on Information and Knowledge Sharing, Scottsdale, 293-298.

Ng’ambi, D. (2004). Towards an Organizational Intelligent Framework based on Giddens’ Structuration Theory: A case of using an Organizational Memory System - DFAQ. In IFIP World Congress Proceedings of the Symposium on Professional Practice in Artificial Intelligence, Toulouse.

Ng’ambi, D., & Hardman, J. (2004). Towards a knowledge-sharing scaffolding environment based on learners’ questions. British Journal of Educational Technology, 35 (2), 187-196.

Nicaise, M., & Crane, M. (1999). Knowledge Constructing Through HyperMedia Authoring. Educational Technology Research and Development, 47 (1), 29-50.

Polanyi, M., & Prosch, H. (1975). Meaning, Chicago: University of Chicago Press.

Pratt, N. (2002). ‘Using’ or ‘Making Use Of’ ICT. Ways of Knowing Journal, 2 (2), 40-47.

Rickards, T. (2003). Technology-Rich Learning Environments and the role of effective teaching. In Khine, S. M., & Fisher, D. (Eds.), Technology-Rich Learning Environments, New Jersey: World Scientific, 115-132.

Russell, L. D., & Schneiderheinze, A. (2005). Understanding Innovation in Education Using Activity Theory. Educational Technology & Society, 8 (1), 38-53.

Schwalbe, K. (2004). Information Technology Project Management (3rd Ed.), Canada: Thomson.

Stohl, C. (1995). Organizational Communication, Thousand Oaks, CA: Sage Publications.

Toohey, S. (1999). Designing Courses for Higher Education, Buckingham: Open University Press.

Walton, M., & Archer, A. (2004). The Web and information literacy: scaffolding the use of web sources in a project-based curriculum. British Journal of Educational Technology, 35 (2), 173-186.

Winn, W. (1989). Toward a rationale and theoretical basis for educational technology. Educational Technology Research and Development, 37 (1), 35-46.
