CALICO Journal, Volume 9 Number 1 9




Abstract: This paper describes the strategy used in Miniprof, a program designed to provide

"intelligent" instruction on elementary topics in French. In case of an erroneous response from a

student, the program engages him/her in a Socratic dialogue. It asks specific questions about the

sentence involved, thereby leading the student to identify each mistake and its correction. The

systematic error detection and subsequent human-like instruction specific to the student's

response are achieved by using three major functions: parsing, error diagnostics, and tutoring. The

design of the system and the issues involved are discussed.

Keywords: intelligent tutors, parsing, syntax, computer-assisted language instruction.


The goals and potential benefits of intelligent computer assisted instruction in general, and of

intelligent computer assisted language instruction in particular, are by now well recognized (see,

for example, Kearsley, 1987; Polson and Richardson, 1988; Psotka et al., 1988; Wenger, 1987).

Some language programs incorporating a variety of techniques and teaching strategies are

already in use (see, for example, Bailin and Levin, 1989; Bailin and Thomson, 1988).

Miniprof, an intelligent tutor for elementary topics in French, is being developed at Central

Michigan University. The program, comprising four modules at present, is written in C for IBM

personal computers.

In the course of developing these modules, we experimented with different teaching strategies

and system designs to deal with student errors. In two of the modules the error analysis,

although quite good, was performed in an ad hoc manner and was not sufficiently systematic to

be easily generalized. Adding new modules and enhancing the error analysis required

considerable effort.


In the other two modules a system emerged which provides detailed tutoring, the kind a student

gets from an instructor. When an error occurs in the student's answer to a question, the student is

engaged in a dialogue specific to the error, using material from the sentence on which the student

is working. This article describes the error diagnostics and subsequent instruction that lead to

this teacher-like behavior.

I. The Language Modules

The modules described here are intended to teach students how to replace subjects with

pronouns and how to give negative answers to questions. In the first module, the student is

asked a question and instructed to use a subject-pronoun in the response, and in the second

module, which extends the first, to respond in the negative and make the appropriate subject-pronoun substitution.


In case of errors, the student is engaged in a dialogue leading to the identification of the mistakes

and their correction. Figures 1 and 2 give examples of this interaction.

#1 Est-ce que tu gagnes beaucoup?

*Non, tu gagnes beaucoup.

The question uses the pronoun "tu".

Which pronoun should you use in your answer?


Use "je". You need to answer in the negative.

Use the negative in the next sentence.

Answer: Non, je ne gagne pas beaucoup.

#2 Est-ce que tu parles espagnol?

*Non, je ne parles espagnol.

The subject is "je". "parl" is correct.

What should the ending be?


Right. Your answer is negative.

What do you need after the verb?

*I don't know.

Use "pas". Answer: Non, je ne parle pas espagnol.

Figure 1. Examples #1 and #2


Although the exercises are simple, the objective is to imitate the behavior of an experienced

instructor. To this end, the instruction is focused on the specific error, and the instructional

messages use material from the sentence in question.

II. Overview of the Design

The two modules are based on a common internal strategy which incorporates three major

components: parsing, diagnosis and tutoring. As outlined in Figure 3, the parsing and diagnosis

functions analyze the student's response and record errors by setting a flag for each error


The syntax errors, those at the sentence level, are detected by the parser as described in section

III. For example, in sentence #3 of Figure 2, the parser determines that "n'" must be used instead

of "ne" because the next word begins with a vowel. In sentence #4, because the root of the verb

ends in "g" and is followed by "o", the parser detects that an "e" must be inserted.

#3 Est-ce que Mike étudie beaucoup?

*Non, Mike ne étudie pas beaucoup.

You need to substitute a pronoun for "Mike".

Which pronoun should you use?


That's right.

What do you use instead of "ne" if a vowel or mute 'h' follows?



Answer: Non, il n'étudie pas beaucoup.

#4 Est-ce que vous mangez beaucoup?

*Non, nous ne mangons pas beaucoup.

What do you need between a 'g' and a verb ending starting with 'a' or 'o'?


Use "e". Answer: Non, nous ne mangeons pas beaucoup.

Figure 2. Examples #3 and #4

The other type of error involves the context and the nature of the exercise. Errors of this type are

detected by the diagnostic component. This component compares the student's answer to the

parse of both the question and the correct answer. In sentence #3 of Figure 2, it is the diagnostic

component that determines that a pronoun has not been substituted for the noun as required by

the exercise and that the correct pronoun to replace "Mike" is the 3rd person masculine singular

pronoun "il". The diagnostic component also detects the errors in sentence #1 of Figure 1, where

the response to a question with "tu" must be "je" and the exercise requires that the response be negative.


Instruction is handled by the tutoring component which engages the student in a dialogue by

asking specific questions about the incorrectly answered exercise items.


III. The Components of the System

A. Parsing

Parsing offers the most effective way of specifying and checking the syntax in exercises. The

parser relies on a grammar and a lexicon to perform its task. Let us look at each in turn.

Sentence Grammar

Our parser uses a context-free grammar augmented with the kind of constraints (for example,

subject-verb agreement) found in definite clause grammars (Pereira & Shieber, 1987). A context-free

grammar (Allen, 1987; Winograd, 1983; Sanders & Sanders, 1989) is a set of rules defining the

grammatical sentences in a language. These rules are written using non-terminals (phrase and

clause categories) and terminals (words of the language). A word matches a terminal if it has the

same part of speech. The words used in the exercises (about 100 in all) are stored in the lexicon

along with their attributes (part of speech, number, gender, etc.). Each rule defines the structure

of a non-terminal as a combination of terminals and non-terminals.

Figure 4 lists the grammar for the exercises handled by our modules. The terminals are in italics.

The first three rules define the grammatical sentences in a small subset of elementary French.

The first rule expresses the fact that a grammatical sentence in the grammar can consist of a

terminal followed by a set of non-terminals. In this case, the terminal is called 'question', which

we use to represent "est-ce que" in the lexicon. The non-terminals are 'np' (noun-phrase) and 'vp'

(verb-phrase). These are defined by other rules in the grammar. There are, for example, a number

of rules which can define 'np', any one of which will produce a grammatical sentence.

For example, the rule

np(G,N,P) --> pronoun (G,N,P)

means that an 'np' can be a pronoun. In a similar manner, other rules indicate that an 'np' can also

be a proper noun or an article followed by a noun.

The braces immediately following a rule indicate the additional constraints that also apply to the

rule. The first rule in the grammar has the constraint {N1=N2, P1=P2} which stipulates that the 'np'

and the 'vp' must agree in number (N1=N2) and person (P1=P2). A similar constraint exists in the rule

np1(G,N,P) --> det(G1,N1,P), n(G2,N2,P) {G1=G2, N1=N2}

where the determiner and the noun must be of the same number and gender.
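
The effect of such a constraint can be illustrated with a short sketch. Miniprof itself is written in C; the Python below is only an illustration, and the names `Word` and `check_np1`, as well as the sample words "le", "la" and "livre", are assumptions that do not come from the program.

```python
from typing import List, NamedTuple, Optional


class Word(NamedTuple):
    text: str
    pos: str                 # part of speech, e.g. "det" or "n"
    gender: Optional[str]    # "MASC", "FEM" or None
    number: Optional[str]    # "SING", "PLUR" or None
    person: Optional[int]


def check_np1(det: Word, noun: Word) -> List[str]:
    """Apply np1 --> det, n {G1=G2, N1=N2} and list the violated constraints."""
    if det.pos != "det" or noun.pos != "n":
        return ["structure"]          # the words do not fit the rule at all
    errors = []
    if det.gender != noun.gender:     # constraint G1=G2
        errors.append("gender")
    if det.number != noun.number:     # constraint N1=N2
        errors.append("number")
    return errors


# Illustrative entries (hypothetical, not taken from the Miniprof lexicon).
le = Word("le", "det", "MASC", "SING", 3)
la = Word("la", "det", "FEM", "SING", 3)
livre = Word("livre", "n", "MASC", "SING", 3)

print(check_np1(le, livre))   # []
print(check_np1(la, livre))   # ['gender']
```

Returning the list of violated constraints, rather than a simple yes or no, is what lets a tutor later ask about the specific feature that failed.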


 1. s --> question, np(G,N1,P1), vp(N2,P2) {N1=N2, P1=P2}

 2. s --> adverb, np(G,N1,P1), vp(N2,P2) {N1=N2, P1=P2}

 3. s --> np(G,N1,P1), vp(N2,P2) {N1=N2, P1=P2}

 4. np(G,N,P) --> pronoun(G,N,P)

 5. np(G,plural,P) --> np1(G,N,P), conj, np1(G,N,P)

 6. np(G,N,P) --> np1(G,N,P)

 7. np1(G,N,P) --> proper-noun(G,N,P)

 8. np1(G,N,P) --> det(G1,N1,P), n(G2,N2,P) {G1=G2, N1=N2}

 9. vp(N,P) --> v(N,P), complements

10. vp(N,P) --> neg1(Vowel Rule), v(N,P), neg2, complements

11. complements --> prep, proper-noun

12. complements --> int, adv

13. complements --> adv

14. complements --> n

Figure 4. The Grammar

In the rule

vp(N,P) --> neg1(Vowel Rule), v(N,P), neg2, complements,

'Vowel Rule' refers to the use of "n'" or "ne" before the verb. "N'" is used if the verb begins with a

vowel or a mute 'h' and "ne" in the other cases.
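
The Vowel Rule lends itself to a small sketch. The Python below is an illustration only (Miniprof is written in C); the function names and the `mute_h` parameter are assumptions, on the further assumption that the lexicon would mark verbs beginning with a mute 'h'.

```python
VOWELS = set("aeiouàâéèêëîïôûù")


def expected_neg1(next_word: str, mute_h: bool = False) -> str:
    """Return the form of the leading negative required before next_word."""
    first = next_word[0].lower()
    if first in VOWELS or (first == "h" and mute_h):
        return "n'"
    return "ne"


def neg1_error(neg1: str, next_word: str, mute_h: bool = False) -> bool:
    """True when the student's leading negative breaks the Vowel Rule."""
    return neg1 != expected_neg1(next_word, mute_h)


print(expected_neg1("étudie"))       # n'
print(neg1_error("n'", "parlent"))   # True
```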

The Lexicon

The lexicon contains the list of words used in the exercises along with grammatical information

about them. Each word is stored using five slots. The first two slots contain the word and its part

of speech. The remaining three slots contain the gender, number and person, when applicable.

The pronoun 'il' and the proper noun 'Jean-Pierre', for example, are listed as follows:

"il", pronoun, MASC, SING, 3,

"Jean-Pierre", proper-noun, MASC, SING, 3.
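
As an illustration of the five-slot layout, the entries above could be represented as follows. This is a sketch in Python rather than the C of the actual program; the names `Entry`, `LEXICON` and `lookup`, and the choice of a case-insensitive lookup, are assumptions.

```python
from typing import Dict, NamedTuple, Optional


class Entry(NamedTuple):
    word: str
    pos: str                 # slot 2: part of speech
    gender: Optional[str]    # slots 3-5 are None when not applicable
    number: Optional[str]
    person: Optional[int]


LEXICON: Dict[str, Entry] = {
    e.word.lower(): e
    for e in [
        Entry("il", "pronoun", "MASC", "SING", 3),
        Entry("Jean-Pierre", "proper-noun", "MASC", "SING", 3),
    ]
}


def lookup(word: str) -> Optional[Entry]:
    """Return the lexicon entry for a word, or None if it is not listed."""
    return LEXICON.get(word.lower())


print(lookup("Il").pos)   # pronoun
```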


Verbs are listed differently. At present we only deal with verbs that do not involve stem changes.

The lexicon only contains the root of the verb; for example, there is an entry

"parl", verb,-,-,-.

There is also a separate dictionary of verb endings which lists each possible ending along with

the corresponding number and person, as shown below:

"e", SING, 1,-,-,

"es", SING, 2,-,-,

"e", SING, 3,-,-,

"eons", PLUR, 1, G,-,

"ons", PLUR, 1,-,-,

"ez", PLUR, 2,-,-,

"ent", PLUR, 3,-,-.

As we expand the lessons, the lexicon will have to contain additional information. The

information on a verb, for example, will specify the past participle, the auxiliary it uses in

compound tenses, and the kinds of complements it allows (a direct object, an indirect object, an

infinitive preceded by a preposition).
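
The way a verb form can be analyzed with a root lexicon and the dictionary of endings listed above can be sketched as follows. The Python is illustrative only; the function names are assumptions, the root "mang" is inferred from the "mangeons" example, and reading the "G" slot of the "eons" entry as "root must end in g" is also an assumption.

```python
VERB_ROOTS = ("parl", "mang")

# (ending, number, person, letter the root must end in, or None)
ENDINGS = [
    ("e", "SING", 1, None),
    ("es", "SING", 2, None),
    ("e", "SING", 3, None),
    ("eons", "PLUR", 1, "g"),
    ("ons", "PLUR", 1, None),
    ("ez", "PLUR", 2, None),
    ("ent", "PLUR", 3, None),
]


def analyze_verb(form):
    """Split a verb form into root and ending; return (root, readings) or None."""
    for root in VERB_ROOTS:
        if form.startswith(root):
            ending = form[len(root):]
            readings = [(number, person)
                        for e, number, person, after in ENDINGS
                        if e == ending and (after is None or root.endswith(after))]
            if readings:
                return root, readings
    return None


def missing_e(root, ending):
    """True when an 'e' is needed between a 'g' and an ending in 'a' or 'o'."""
    return root.endswith("g") and ending[:1] in ("a", "o")


print(analyze_verb("parle"))      # ('parl', [('SING', 1), ('SING', 3)])
print(missing_e("mang", "ons"))   # True
```

Note that "parle" has two readings, first and third person singular; the agreement constraint on the sentence rule is what selects between them.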

The Parser

The parser uses the grammar to assign a grammatical structure to the question and the student's

actual response. If the student's answer is not a valid sentence, it tries to parse as much of the

sentence as possible and identify the errors.

Let us look at an example of how the parser works. Suppose the sentence is "Elle ne parle pas

beaucoup." The parser is a top-down parser: it starts from the rules defining a sentence, namely

1. s --> question, np(G,N1,P1), vp(N2,P2) {N1=N2, P1=P2},

2. s --> adverb, np(G,N1,P1), vp(N2,P2) {N1=N2, P1=P2},

3. s --> np(G,N1,P1), vp(N2,P2) {N1=N2, P1=P2}.

Rule 1 says that the sentence may start with the terminal 'question'. After consulting the lexicon

which contains information on each word, the parser determines that the rule is not applicable

because the first word is not a question. Nor is rule 2 applicable. The parser then applies rule 3

and tries to recognize a noun phrase and a verb phrase. A noun phrase must conform to one of

the rules 4, 5 or 6. These are:


4. np(G,N,P) --> pronoun(G,N,P),

5. np(G,plural,P) --> np1(G,N,P), conj, np1(G,N,P),

6. np(G,N,P) --> np1(G,N,P).

Rule 4 requires the word to be a pronoun, as is the case here. Thus the parser records the fact that

the noun phrase is the third person feminine singular pronoun “elle”.

Next it tries to parse a verb phrase. Rules 9 and 10 are relevant here:

9. vp(N,P) --> v(N,P), complements,

10. vp(N,P) --> neg1(Vowel Rule), v(N,P), neg2, complements.

Rule 9 requires a verb followed by a complement. The parser checks the next word ("ne") in the

lexicon. It is not a verb. It therefore goes to the next possible form of a verb phrase: the leading

negative, a verb, the trailing negative and complements. The "ne" is recognized, as is the verb

"parle" (whose person and number indeed match those of the noun phrase), as well as the trailing

negative "pas". Finally, using rules 11-14, which are:

11. complements --> prep, proper-noun,

12. complements --> int, adv,

13. complements --> adv,

14. complements --> n,

"beaucoup" is recognized as an adverb. To the extent that rule 3 for the structure of a sentence is

satisfied and all the words in the sentence have been used, the parse has been successful. Each

part of the sentence has been classified, and the overall structure is represented by the following

parse tree:

                 s
       +---------+---------+
       |                   |
      np                  vp
       |        +------+---+--+-------+
       |        |      |      |       |
    pronoun    neg1   verb   neg2   adverb
       |        |      |      |       |
      Elle      ne   parle   pas   beaucoup

What happens if the parser encounters an incorrect student response? It attempts to assign a

structure to as much of the sentence as possible and to flag the incorrect parts as errors. The way

the negative is handled illustrates the point. Negatives in French have two parts; an error cannot

therefore be confirmed until the second part has been reached. If the sentence includes the first

part of the negative, 'ne', but 'pas' is missing, the parser sets an error flag when it

reaches the position where 'pas' should be. If, on the other hand, 'ne' is missing, the parser

continues to parse to see if there is a 'pas'; only when it finds 'pas' does it set the error code for the

missing 'ne'.

Let us now look at a complete example of an incorrect answer. Assume that the student has

responded "Il n'parlent beaucoup." The parser starts with rules 1-3 defining a sentence and

proceeds much as in the previous example. It chooses rule 3 for the definition, and uses rule 4 to recognize

the pronoun "Il" as the noun phrase. In trying to parse the verb phrase, the presence of the

negative "n'" dictates rule 10. The next word, "parlent", does not start with a vowel, so an error is

recorded. "Parlent" is recognized as a verb, but the number (plural) of the ending does not match

that of the subject ("il", singular) and another error is recorded. The next word, according to rule

10, must be the trailing negative. It is not, so yet another error is recorded. Finally, just as in the

previous example, rule 13 is used to classify the adverb "beaucoup" as a valid complement. The

parser has thus built the following parse tree and signaled the appropriate errors.





               s
       +-------+-------+
       |               |
      np              vp
       |        +------+-------+
       |        |      |       |
    pronoun    neg1   verb   adverb
       |        |      |       |
       Il       n'  parlent beaucoup
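
The error-tolerant behavior described above can be sketched for the small fragment covered by rules 3, 4, 10 and 13. The Python below is an illustration under stated assumptions, not the Miniprof parser (which is written in C): the flag names, the tiny word lists and the regular-expression tokenizer are all hypothetical, and only sentences that begin with a subject pronoun are handled.

```python
import re
from typing import List

PRONOUNS = {"je": ("SING", 1), "tu": ("SING", 2), "il": ("SING", 3),
            "elle": ("SING", 3), "nous": ("PLUR", 1), "vous": ("PLUR", 2)}
VERB_ROOTS = ("parl",)
ENDINGS = {"e": [("SING", 1), ("SING", 3)], "es": [("SING", 2)],
           "ons": [("PLUR", 1)], "ez": [("PLUR", 2)], "ent": [("PLUR", 3)]}
ADVERBS = {"beaucoup"}
VOWELS = set("aeiouàâéèêëîïôûù")


def verb_readings(word):
    """Return the (number, person) readings of a verb form, or None if not a verb."""
    for root in VERB_ROOTS:
        if word.startswith(root):
            return ENDINGS.get(word[len(root):], [])
    return None


def parse_with_flags(sentence: str) -> List[str]:
    """Parse 'pronoun [neg1] verb [pas] [adverb]' and return the error flags."""
    tokens = re.findall(r"n'|[a-zàâçéèêëîïôûù-]+", sentence.lower())
    flags = []
    if not tokens or tokens[0] not in PRONOUNS:          # rules 3 and 4
        return ["NO_SUBJECT_PRONOUN"]
    subject, i = tokens[0], 1
    neg1 = None
    if i < len(tokens) and tokens[i] in ("ne", "n'"):    # rule 10 applies
        neg1, i = tokens[i], i + 1
    readings = verb_readings(tokens[i]) if i < len(tokens) else None
    if readings is None:
        return flags + ["NO_VERB"]
    verb, i = tokens[i], i + 1
    if neg1 is not None:                                 # Vowel Rule
        expected = "n'" if verb[0] in VOWELS else "ne"
        if neg1 != expected:
            flags.append("NEG1_FORM")
    if PRONOUNS[subject] not in readings:                # {N1=N2, P1=P2}
        flags.append("AGREEMENT")
    has_pas = i < len(tokens) and tokens[i] == "pas"
    if has_pas:
        i += 1
    if neg1 is not None and not has_pas:
        flags.append("MISSING_PAS")   # confirmed where 'pas' should be
    if has_pas and neg1 is None:
        flags.append("MISSING_NE")    # confirmed only once 'pas' is found
    if i < len(tokens) and tokens[i] in ADVERBS:         # rule 13
        i += 1
    if i != len(tokens):
        flags.append("UNPARSED_WORDS")
    return flags


print(parse_with_flags("Il n'parlent beaucoup."))
# ['NEG1_FORM', 'AGREEMENT', 'MISSING_PAS']
```

The three flags appear in the same order as the three errors recorded in the walk-through, because the sketch checks each constraint at the point where the corresponding word is consumed.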

If the student were responding to the question "Est-ce que Monique parle beaucoup", the

student's response should contain the feminine singular pronoun "elle". But this is not a syntax

error and cannot be detected by the parser. Such context errors are detected by the diagnostic

component described in the next section. The example above illustrates a few of the errors

detected by the parser. A complete list of the errors detected in the exercise on the negative

appears in Figure 5. It should be noted that the parser only assigns a structure to the correct

answer in an indirect manner. The parse tree of the correct answer is generated by making

substitutions to the parse tree of the question. These are done using simple rules:

1. Drop the question marker and insert "non" in the sentence.

2. a. If the subject is a noun, replace it with a pronoun of the same number and gender.

   b. If the subject is a first person pronoun, change it to second person or vice versa.

3. If the person of the pronoun has been changed, adjust the ending of the verb.

In the second module, for example, the student is asked to answer questions in the negative.

Assume that the question is "Est-ce que tu travailles beaucoup?". The program first removes the

question marker "est-ce que" and substitutes the adverb "Non" to make the sentence negative.

Then the first person pronoun "je" is substituted for the second person pronoun "tu", and the

verb is changed to "travaille" to agree with the new subject. The exercise requires a negative

answer so the two parts of the negative are added to the response. In the example, the sentence is

made negative by adding "ne" before the verb "travaille" and "pas" after.

Finally, the remainder of the question is copied to the response, i.e. "beaucoup" is copied to the

end of the sentence to produce the correct response "Non, je ne travaille pas beaucoup."
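
The substitution steps can be sketched as a single function. The Python below is an illustration, not the program's C code: it works on an already analyzed question (subject, verb root, remaining words) rather than on a parse tree, and the function name, its parameters and the lookup tables are assumptions.

```python
SWAP = {"je": "tu", "tu": "je", "nous": "vous", "vous": "nous"}
PRONOUN_FEATURES = {"je": ("SING", 1), "tu": ("SING", 2), "il": ("SING", 3),
                    "elle": ("SING", 3), "nous": ("PLUR", 1), "vous": ("PLUR", 2),
                    "ils": ("PLUR", 3), "elles": ("PLUR", 3)}
THIRD_PERSON = {("MASC", "SING"): "il", ("FEM", "SING"): "elle",
                ("MASC", "PLUR"): "ils", ("FEM", "PLUR"): "elles"}
ENDING = {("SING", 1): "e", ("SING", 2): "es", ("SING", 3): "e",
          ("PLUR", 1): "ons", ("PLUR", 2): "ez", ("PLUR", 3): "ent"}
VOWELS = set("aeiouàâéèêëîïôûù")


def correct_negative_answer(subject, verb_root, rest, noun_features=None):
    """Build the expected negative answer from an analyzed question.

    noun_features is (gender, number) when the subject is a noun, else None.
    """
    if noun_features is not None:
        pronoun = THIRD_PERSON[noun_features]      # step 2a: noun -> pronoun
    else:
        pronoun = SWAP.get(subject, subject)       # step 2b: 1st <-> 2nd person
    ending = ENDING[PRONOUN_FEATURES[pronoun]]     # step 3: adjust the verb ending
    if verb_root.endswith("g") and ending[0] in "ao":
        ending = "e" + ending                      # mang + ons -> mangeons
    verb = verb_root + ending
    neg1 = "n'" if verb[0] in VOWELS else "ne "    # Vowel Rule
    words = [pronoun, neg1 + verb, "pas"] + list(rest)
    return "Non, " + " ".join(words) + "."         # step 1: "Non" replaces the marker


print(correct_negative_answer("tu", "travaill", ["beaucoup"]))
# Non, je ne travaille pas beaucoup.
```

The same function reproduces the answers shown in Figure 2, for example `correct_negative_answer("Mike", "étudi", ["beaucoup"], noun_features=("MASC", "SING"))`.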

Parsing New Material

In order to make sure that new exercise set items can be handled by the parser, our program

checks to make sure that the words used in the sentence are in the dictionary and that the

sentence is compatible with the limited capabilities of the grammar. The entry module used to

create or augment an exercise file first ensures that each word is in the dictionary. If it is not, it

asks the user for the necessary information about the word (part-of-speech, gender, etc.). It then

attempts to parse the sentence. If the parse fails, the sentence is clearly unsuitable; otherwise the

dictionary is updated and the sentence is included in the exercise file.

B. The Diagnostic Component

The diagnostic component detects errors the parser cannot identify. More specifically, it

recognizes context errors and exercise-related errors.

Context errors are those where the student's response, although syntactically correct, is not an

appropriate answer to the question (see part A of Figure 6). For instance, the pronoun substitute

for the proper noun "Janine" must be "elle", the 3rd person feminine singular pronoun; a

question asked with "tu", a 2nd person singular pronoun, requires "je", a 1st person singular

pronoun, in the response. The diagnostic component checks the student’s answer for both types

of context errors.

Other context errors are more subtle. Assume the student's answer is "Non, tu ne parles pas

espagnol" to the question "Est-ce que tu parles espagnol?". The correct answer is "Non, je ne

parle pas espagnol." The parser will not detect an error in the student's response because it is a

syntactically correct sentence. Yet, because the student has responded with the wrong subject

pronoun, an error is detected by the diagnostic component: the response is wrong in this context.

The tutor later asks the student to use the right subject pronoun "je". Verbs in French,

however, are highly inflected, so a change in the subject may require a modification in the verb.

Once the student has corrected the subject pronoun, there is an error in the verb ending, which

ought to be "e", not "es". The diagnostic component detects this error by matching the verb form

in the student's answer with the verb in the correct answer.

Unlike the errors noted above, other errors detected by the diagnostic component are exercise-specific.

For example, if the student does not substitute a subject-pronoun for the noun or answer

in the negative as required by the exercise, the diagnostic component detects the error. It does so

by checking the parse tree of the student's answer to see if the response is negative and

contains a pronoun as subject.

Figure 5. Errors detected by the parser

These exercise-related errors cannot be detected by the parser. The number of exercise-specific

errors is quite small relative to the total number of errors, and we expect that the proportion

will not increase considerably. In the tutor, the exercise-related errors listed in part B of Figure 6

are handled separately from the others and can be easily deactivated when not required by the

exercise. The diagnostic component does its work by using the parse trees of the original

question, the correct answer and the student's response. Information about the sentence and

about the various errors mentioned above is passed on to the tutor using a set of flags. Figure 6

contains a complete list of errors the system can diagnose.
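
A minimal sketch of the comparison performed by the diagnostic component follows. It is written in Python for illustration (the program itself is in C) and stands in for the parse trees with small dictionaries; the flag names, the dictionary keys and the `require_negative` switch are assumptions.

```python
from typing import Dict, List


def diagnose(student: Dict, correct: Dict, require_negative: bool = True) -> List[str]:
    """Compare the analyzed student answer with the analyzed correct answer."""
    flags = []
    if student["subject_pos"] != "pronoun":
        flags.append("SUBJECT_NOT_PRONOUN")    # exercise-related error
    elif student["subject"] != correct["subject"]:
        flags.append("WRONG_PRONOUN")          # context error
    if student["verb"] != correct["verb"]:
        flags.append("WRONG_VERB_FORM")        # context error
    if require_negative and not student["negative"]:
        flags.append("NOT_NEGATIVE")           # exercise-related error
    return flags


# "Non, tu ne parles pas espagnol." against "Non, je ne parle pas espagnol."
student = {"subject": "tu", "subject_pos": "pronoun", "verb": "parles", "negative": True}
correct = {"subject": "je", "subject_pos": "pronoun", "verb": "parle", "negative": True}
print(diagnose(student, correct))   # ['WRONG_PRONOUN', 'WRONG_VERB_FORM']
```

Passing `require_negative=False` corresponds to deactivating the exercise-related check for a module that does not ask for a negative answer.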

C. The Tutor

The tutor is a rule-based system that asks the student questions about errors detected by the

parsing and diagnostic components. It attempts to model the way in which a teacher would

respond when actually tutoring a student in French. The order in which the tutor handles the

errors is determined by a built-in priority list. Each item on this list corresponds to a group of

errors. The priority list for the exercises on the negative is shown below.


(1) Pronoun/Exercise: exercise type errors in the subject-pronoun (error B.11 from Figure 6)

(2) Pronoun: syntax or context errors in the subject-pronoun (errors 1-6 from Figure 6 and 1)

(3) Verb: syntax or context errors in the verb (errors 2-8 from Figure 5 and 8 from Figure 6)

(4) Negative: syntax errors in the negative (errors 9-12 from Figure 5)

(5) Negative/Exercise: exercise type errors in the negative (error B.12 from Figure 6)

(6) Answer: prints the correct answer.

It will be noted that errors are categorized by part of speech, and within each part of speech, the

exercise-type errors are separated from the syntax and context errors. This facilitates the changes


needed to handle exercises of different types. Thus, the only difference between the module

which requires the student to respond in the negative and the one which does not is the presence

or absence of “Negative/Exercise” on the list.

The tutor-calling routine is rather straightforward. It first checks to see if the student’s answer is

correct and, if it is, prints an appropriate message. If the answer is not correct, it sequentially

takes each item on the priority list and calls the corresponding function. The last item on the

priority list, “Answer”, calls a function which prints the correct answer.
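
The calling routine just described can be sketched as a loop over the priority list. The Python below is illustrative only; the handler signature, the flag name and the returned transcript (used here instead of printing) are assumptions, and the single handler shown is a stub.

```python
from typing import Callable, Dict, List, Set

Handler = Callable[[Set[str]], List[str]]

PRIORITY_NEGATIVE = ["Pronoun/Exercise", "Pronoun", "Verb",
                     "Negative", "Negative/Exercise", "Answer"]
# The module that does not require a negative answer simply omits one item.
PRIORITY_PRONOUN_ONLY = [item for item in PRIORITY_NEGATIVE
                         if item != "Negative/Exercise"]


def run_tutor(flags: Set[str], correct_answer: str,
              handlers: Dict[str, Handler], priority: List[str]) -> List[str]:
    """Return the tutor's messages for one student answer."""
    if not flags:
        return ["Right."]                     # the answer is correct
    transcript: List[str] = []
    for item in priority:                     # handle groups in priority order
        if item == "Answer":
            transcript.append("Answer: " + correct_answer)
        elif item in handlers:
            transcript.extend(handlers[item](flags))
    return transcript


def negative_exercise(flags: Set[str]) -> List[str]:
    """Stub handler: one clause for the 'answer is not negative' error."""
    if "NOT_NEGATIVE" in flags:
        return ["You need to answer in the negative."]
    return []


handlers = {"Negative/Exercise": negative_exercise}
print(run_tutor({"NOT_NEGATIVE"}, "Non, je ne gagne pas beaucoup.",
                handlers, PRIORITY_NEGATIVE))
```

Because each group of errors is a separate entry on the list, switching between modules amounts to choosing a different list rather than changing the handlers.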

Each tutorial function contains a sequence of clauses (‘if’ statements), each related to an error.

Each statement contains the actions taken by the tutor when that error occurs. If there are no

errors of the type handled by the function, control returns to the main routine.

The negative in French, for example, requires ‘ne’ before the verb and ‘pas’ after the verb. In

addition, if ‘ne’ is followed by a word beginning with a vowel, ‘n’’ is used. So, four errors are

possible, three with ‘ne’ and one with ‘pas’. ‘Ne’ can be missing or used when the next word

begins with a vowel or ‘n’’ used when the next word begins with a consonant, or ‘pas’ can be

missing. The function therefore has four “if” statements each related to one of these errors. The

statements contain the actions taken by the tutor when the error occurs.

When a function is called and an error triggers one of its clauses, the words related to the error


are looked up in the parse tree. The student is then asked specific questions about the error.

All the functions contain similar conditions and actions for errors. Figure 7 contains a sample of

the procedure for the verb-errors function.

If the question to the student is “Est-ce que tu voyages beaucoup?” and the student has

responded “Non, je ne voyages pas beaucoup.”, the parser will signal that the verb ending is

wrong and the code in Figure 7 from the tutor will be triggered. The program first traverses the

parse tree of the correct answer to find the subject “je” and the verb of the sentence “voyage”. A

message is then printed on the screen telling the student that the right verb has been used, but

that there is a problem with the ending of the verb. The program then points out to the student

that the subject is “je” and that the root of the verb that the student has correctly used is “voyag”,

and asks what the right ending should be. After reading the student’s answer, the program tries


to match it to the correct answer “e”. If they match, a message indicating that the student is right

is printed; otherwise the student is given the correct form.
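
The verb-ending exchange can be sketched as a single clause. This Python sketch is not the code of Figure 7; the function name and the `ask` callback, which shows a prompt and returns the student's reply, are assumptions. The message wording follows the dialogue in Figure 1.

```python
from typing import Callable


def verb_ending_clause(subject: str, root: str, correct_ending: str,
                       ask: Callable[[str], str]) -> str:
    """Ask the student for the verb ending and return the tutor's reaction."""
    prompt = (f'The subject is "{subject}". "{root}" is correct.\n'
              "What should the ending be?")
    reply = ask(prompt).strip().lower()
    if reply == correct_ending:
        return "Right."
    return f'Use "{correct_ending}".'      # give the correct form


# Simulated student replies for "Non, je ne voyages pas beaucoup."
print(verb_ending_clause("je", "voyag", "e", lambda prompt: "e"))    # Right.
print(verb_ending_clause("je", "voyag", "e", lambda prompt: "es"))   # Use "e".
```

In an interactive session `ask` could simply be the built-in `input`; passing it as a parameter keeps the clause testable without a keyboard.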

Sometimes more than one question is asked of the student. A clause in the pronoun errors

function shown in Figure 8 serves as an illustration. Assume the question is “Est-ce que Bob et

Monique parlent espagnol?” and that the student has answered with “Non, il ne parlent pas

espagnol.” This pronoun-error clause is triggered when the third person subject pronoun the

student has used is of the wrong number, as in the example above. The program first finds the

subject of the sentence “Bob et Monique” in the parse tree of the question. It then asks the student

if the subject is singular or plural.

IV. Conclusion

The three-component design (parsing, diagnostics and tutoring) allows instruction which is

specific to the student’s errors and models the behavior of a human instructor. The parsing and

diagnostic components provide the detailed information about the syntax, context and exercise-related

errors, as well as the syntactic information contained in the parse trees, all of which are crucial

for creating the instructor-like dialogue in the tutor. Furthermore, the three-part design allows the

program to be modular: separate functions are identified and isolated and hence can be modified

with relative ease.

The design can be generalized (within the limited scope of elementary French) to other types of

exercises. For each type of exercise, one has to classify the possible errors into three categories

(syntactic, contextual and exercise-related). These have to be grouped and the groups prioritized

for tutoring. Finally, the corresponding tutoring rules have to be tailored to the errors.

We need, however, to make several improvements. The grammar can be enlarged to handle a

larger variety of sentences. In addition the lexicon and the verb morphology are very limited and

need to be expanded. Finally, the diagnostic strategy does not handle certain errors commonly

made by students, such as misspellings and misplaced words. We are developing a scheme to

deal with some of these errors.

Still, we have taken a modest step towards a system that would mimic human tutoring.

References


Allen, James. 1987. Natural Language Understanding. Menlo Park, CA: Benjamin/Cummings.

Bailin, Alan and Lori Levin, eds. 1989. Intelligent Computer-Assisted Language Instruction. A special

issue of Computers and the Humanities 23, 1, 1-90.

Bailin, Alan and Philip Thomson. 1988. “The Use of Natural Language Processing in Computer-

Assisted Language Instruction.” Computers and the Humanities 22, 99-110.

Kearsley, Greg, ed. 1987. Artificial Intelligence and Instruction: Applications and Methods. Reading,

MA: Addison-Wesley.

Pereira, Fernando C. N. and Stuart M. Shieber. 1987. Prolog and Natural-Language Analysis.

Stanford, CA: Center for the Study of Language and Information.

Polson, Martha C. and J. Jeffrey Richardson, eds. 1988. Intelligent Tutoring Systems: Lessons

Learned. Hillsdale, NJ: Lawrence Erlbaum.

Sanders, Alton F. and Ruth H. Sanders. 1989. “Syntactic Parsing: A Survey.” Computers and the

Humanities 23, 13-30.

Wenger, Etienne. 1987. Artificial Intelligence and Tutoring Systems: Computational and Cognitive

Approaches to the Communication of Knowledge. Los Altos, CA: Morgan Kaufmann.

Winograd, Terry. 1983. Language as a Cognitive Process. Vol. 1 (Syntax). Reading, MA: Addison-Wesley.



Gilles Labrie is an Associate Professor of French at Central Michigan University. His present

research interests are in computer applications to elementary and intermediate language


L.P.S. Singh is an Associate Professor of Computer Science at Central Michigan University. His

research interests are in the areas of intelligent tutoring systems, parallel computing and




Gilles Labrie

Department of Foreign Languages, Literatures and Cultures

Central Michigan University

Mt. Pleasant, MI 48859


Bitnet: 33U2EPJ@cmuvm.bitnet

L.P.S. Singh

Department of Computer Science

Central Michigan University

Mt. Pleasant, MI 48859

