CHAPTER 5

LEXICAL TUTORING RESEARCH & DEVELOPMENT

Concordance software is not the only idea ever put forward for using computers to facilitate lexical acquisition. There are dozens if not hundreds of vocabulary tutorials in existence, both on the commercial market and in language-teaching institutions where CALL enthusiasts ply their trade.
Vocabulary has always been seen as one of the most computerizable of learning tasks, mainly because of the apparently manageable size of the learning unit, and also because of the huge variance in learners' prior knowledge. The classic measure of this variance is a study by Saragi, Nation, and Meister (1978), which found that after Indonesian learners had all used a typical language coursebook, only 12% of the words in the book were known to every learner, while every word was known to at least 30% of them. So much variability argues for some sort of individualized instruction.

However, most vocabulary programs, whether commercial or homespun, have little connection to either theoretical or empirical research. There are only a handful of efforts that are theoretically motivated, empirically validated, and which attempt to facilitate contextual knowledge. Five of the best motivated lexical tutors of recent years will be described in light of the themes and issues discussed above, with a view to showing what a concordance approach could add to the existing options.

A caveat: these tutors are at various stages of completion; some were designed mainly for use and others mainly for research; and most, but not all, are for second-language learners. No systematic taxonomy is intended (for this see Goodfellow, 1995a), but rather a collection of designs illustrating themes discussed in the two preceding chapters.

A mnemonics-based tutor

Coady, Magoto, Hubbard, Graney and Mokhtari (1993) have developed and tested a tutor based on a mnemonic approach to word learning. The problem the tutor addresses is the one discussed in Chapter 3, that naturalistic vocabulary learning in second language is impractical given the time available. The goal is to speed up the acquisition of the 1200 most frequent words of English for a group of academic language learners in the US.

The approach is as follows: Learners meet the 1200 words in 60 groups of 20. Twenty words appear on the screen and the learner selects one for attention. A short definition of the word appears in the learner's first language, along with an example sentence in English, and there is a place for the learner to type in a mnemonic of up to 30 characters. The learners have been trained in the mnemonic keyword method of learning words (Levin and Pressley, 1985). In this method, a new second-language word is associated with a word already known in the first language via an "interactive image." For example, a Spanish speaker learning "payment" might think of "pimento," and form an image of handing over money for some peppers; this memorable image would make the word "payment" more recoverable the next time it was needed. In this case, "pimento" is stored in the computer with "payment," and can be used at test time to help the learner remember the word.
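The record the tutor keeps for each word can be pictured as a simple data structure. The sketch below is hypothetical; the field names and the code are illustrations of the description above, not taken from Coady and colleagues' program.

```python
# A minimal sketch of the kind of record a mnemonic tutor might keep per
# target word (field names are hypothetical, not from the tutor itself).
from dataclasses import dataclass

@dataclass
class WordEntry:
    word: str            # the L2 target, e.g. "payment"
    l1_definition: str   # short first-language definition
    example: str         # one English example sentence
    mnemonic: str = ""   # learner-typed keyword, up to 30 characters

    def set_mnemonic(self, text: str) -> None:
        # Enforce the 30-character limit mentioned in the description.
        self.mnemonic = text[:30]

entry = WordEntry("payment", "pago", "The payment is due on Friday.")
entry.set_mnemonic("pimento")
```

At test time the program would display `l1_definition` and, on request, `mnemonic`, asking the learner to recover `word`.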

The learners are tested by the program after each group of 20 words. They see a Spanish word and are asked for the English equivalent, and they can access their mnemonic to help them recover it. The tutor itself was also tested pre- and post-treatment against a control group, with the students making small but significant gains in both vocabulary knowledge (control 5%, treatment 13%) and comprehension of a text using the learned words (control 10%, treatment 20%).

The authors regard the tutor as a success, which it appears to be up to a point, but it has some weaknesses. First, there is the strong case against teaching learners translation equivalents and encouraging their naive lexical hypotheses (discussed in Chapter 3), especially in the area of the most frequent words, which are likely to be the most polysemous and the least equatable to words in the first language. Coady and colleagues know this, and argue that their goal is simply to establish an initial representation for a word, leaving the remainder of the learning process for a later date, presumably in natural reading. This raises the question of whether more of the learning process could be built into the tutorial itself.

The second weakness is providing just a single example sentence for a word: transferable knowledge is unlikely to be created, just as Gick and Holyoak's subjects could not transfer a single problem solution to an analogous context.

Third, while mnemonic learning strategies are known to have the power to strengthen memory traces, it is not clear that the strategy is ever actually used once the specific training period is concluded, i.e. that it will ever account for more than a tiny minority of the thousands of words that must be learned. Dozens of studies show that "mnemonics works" or "mnemonics works better than contextual inference" (Pressley and McDaniel, 1987; McDaniel and Pressley, 1989), but none examine whether students trained in the strategy ever use it when the study is over.

Fourth, the report does not mention whether the students actually bothered to think up and enter very many "pimentos." Any who were not availing themselves of the theory-based part of the program were then merely using the program to get first-language synonyms for second-language words, leaving the relevant research question to be this: Did they get anything from the computer program that they could not get from their small bilingual dictionaries?

Fifth, the extensibility of the tutor beyond 1200 words could be problematic. Each of the 1200 vocabulary items required devising a first-language synonym and a second-language example sentence, a total of 2400 entries. If, as seems apparent, about 3500 words is the desirable number for direct instruction (Hirsh and Nation, 1992; Sutarsyah, Nation, and Kennedy, 1994), then 7000 entries would need to be written, a task of some considerable labour.

A corpus approach would face none of these objections. There would be no encouragement for thinking in terms of translation equivalents; several examples would be provided for every word, allowing schema induction mechanisms to operate; the learning strategy involved would be the normal one used outside the tutorial; the computer's processing ability would be used to do more than store a vocabulary list; and the system would be infinitely extensible, since its main resource would be natural text which is now abundantly available.

A pregnant contexts approach

Following Schouten-van Parreren's (1985) theory that words are ideally learned in very supportive, pregnant contexts, Beheydt (1985) has developed a lexical tutor called CONTEXT that introduces and gives practice in recovering the 1000 highest-frequency words of English. Each word is stored in the program in three very simple, very pregnant contexts. Learners are presented with one of the context sentences, with the target word replaced by a gap, and try to guess the word "sensibly." If they cannot guess it, a second pregnant context is presented, then a third, and after that the word is given. Words unguessed are recycled for further practice.
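The presentation cycle just described can be sketched in a few lines. The function names, the guessing interface, and the recycling policy shown here are assumptions for illustration, not details taken from Beheydt's program.

```python
# Sketch of the CONTEXT cycle described above: up to three pregnant
# contexts per word, the word revealed after a third failure, and
# unguessed words recycled for further practice.
def drill(word: str, contexts: list[str], guess) -> bool:
    """Present gapped contexts one at a time; return True if guessed."""
    for sentence in contexts[:3]:
        gapped = sentence.replace(word, "____")
        if guess(gapped) == word:
            return True
    return False  # word is revealed, then queued for recycling

def run_session(items: dict[str, list[str]], guess) -> list[str]:
    """Drill every word once; return the words to recycle."""
    return [w for w in items if not drill(w, items[w], guess)]
```

The weakness noted below falls out of the sketch directly: `run_session` re-presents a failed word with the same `contexts` list, so a second pass tests surface recall of those sentences rather than fresh inference.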

This work was done before Mondria and Wit-de Boer (1991) raised doubts about pregnant contexts. From their results, one might predict that learning words from Beheydt's pregnant contexts would be easy, but then later the words would not be remembered. Unfortunately we do not know if this happened, because Beheydt does not offer any test of his program, either immediate or delayed, on either retention or comprehension of novel text.

What we do know, however, is that devising (1000 x 3 =) 3000 dedicated context sentences is a considerable labour. But the labour would hardly be finished there, because with 1000 words in their heads learners' lexical needs have hardly begun to be met. The challenge of a systematic approach to lexis lies not in the first 1000 words, a frequency range already well covered in commercial language materials (Meara, 1993) and classrooms (Lightbown, Halter and Meara, 1995), but in pushing learners toward the 3500 words needed for unassisted reading. To teach 2500 more words with Beheydt's tutor, a further (2500 x 3 =) 7500 pregnant contexts would have to be devised.

Nor would that be the end of it. When words enter remedial recycling in Beheydt's tutor, they are merely presented again in the original set of pregnant contexts. The second time around, of course, the learner can simply rely on a surface association to provide the answer. So ideally, to give the learner an opportunity to process the word as deeply in the remedial cycle as in the original cycle (in other words to fit it to a novel context), three more dedicated pregnant sentences would have to be devised for each word. Hand-coding quickly goes out of phase with the size of the learning task.

Impregnating contexts with AI

A lexical tutor with a similar theory to Beheydt's but a more sophisticated technology is Kanselaar's IT'S ENGLISH (1993; Kanselaar, Jaspers and Kok, 1993). "IT'S" is a pun on intelligent tutoring systems, of which a growing sub-species is devoted to language instruction, normally with a focus on syntax rather than lexis (Swartz and Yazdani, 1991; Cobb, 1993b).

The starting point of Kanselaar's tutor is once again Schouten-van Parreren's (1985) finding that words are best learned by inferring meanings from context, which as noted raises the problem that natural contexts are not always as helpful as one would wish. Kanselaar's solution is once again to make contexts pregnant, not by devising special sentences but by providing lexical resources borrowed from artificial intelligence that can make any context pregnant. For example, if a student reading a text comes across an unknown word, he or she stands a good chance of working out its meaning if, for every word in the context, he or she can access on-line a definition, a synonym, an antonym, a superset, a subset, a synthesized pronunciation, a grammar rule, and the part of speech as computed by a syntactic parser. These lexical resources should be enough to transform any context into a pregnant one, so that inferential learning can take place from natural text.

But would learners not run into the problem at test time that pregnant learning is unretained? In fact, it is not clear that the pregnant learning actually takes place. The subjects set out to read the texts, in which certain new words are marked for attention. But the subjects, as one might expect, use the dictionary not to clarify surrounding contexts, but to look up the target words themselves (like Coady and colleagues' subjects, failing to follow the learning strategy proposed for them). So it is no surprise that, when learning results are tested, IT'S ENGLISH produces the usual outcome described in Chapter 3: small gains over a control group in definitional knowledge, but no gains in comprehension of a novel text.

Loaded up with so many resources, IT'S ENGLISH apparently runs slowly enough to irritate its users, at least on the machines used in the experiment. This is ironic, since only three of the many facilities weighing the system down actually get used to any degree (definitions, pronunciation, and occasionally example sentences), a usage pattern similar to one found in a comparable study by Bland, Noblitt, Armstrong, and Gray (1990). Still, the commitment to AI seems to predominate over the commitment to learning, because Kanselaar and colleagues' plan is to proceed with more intelligent lexical resources, not fewer. As of 1993, however, IT'S ENGLISH is effectively a collection of texts linked to a CD-ROM dictionary, inviting all the problems of dictionary learning already discussed.

On-line dictionary support for reading

Addressing the paradox that children appear not to learn words very well from either context or dictionaries, Reinking and Rickman (1990) hypothesize that the problem with dictionaries might really lie with paper dictionaries in particular, and so could be remedied by using a computer.

In the studies that show poor learning from definitions, the problem may have been that stopping to use a dictionary distracts attention from reading, as well as raising confusion about which senses and examples are applicable to a given context. An on-line dictionary, particularly one linked to a text that learners will be reading, can have instantaneous look-up as well as pre-coded linkage to the relevant senses and examples of words that the learners will encounter. With these advantages, dictionary learning might have a more positive effect on comprehension.
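The pre-coded linkage just described amounts to a lookup table keyed by word and text position, so that the reader is never left to choose among senses. This small sketch is hypothetical, with invented data and names, not a reconstruction of Reinking and Rickman's software.

```python
# Sketch of pre-linked dictionary senses: each occurrence of a target
# word in the text is tied in advance to the one sense that fits its
# context. Data and names here are invented for illustration.
SENSE_LINKS = {
    ("bank", 12): "financial institution",  # (word, position in text)
    ("bank", 58): "edge of a river",
}

def lookup(word: str, position: int) -> str:
    """Instant, context-appropriate lookup; no sense selection by the reader."""
    return SENSE_LINKS[(word, position)]
```

The extensibility problem discussed further on is visible here too: every entry in such a table has to be hand-coded for a specific occurrence in a specific text.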

Reinking and Rickman selected appropriate texts and located 32 words they thought would be difficult for their subjects. They connected these words to an on-line dictionary, pre-linking the senses and examples relevant to particular contexts. On a definitional measure, an experimental group using the on-line services slightly outperformed control groups who used either a paper dictionary or a paper glossary (87% of words learned on-line, 78% off-line). However, in terms of comprehending a novel passage using the target words, scores were lower and roughly equal for all groups, with one exception. If the program forced the on-line subjects to look up every target word before allowing the text to advance, then a comprehension difference appeared (76% for the on-line group, 62% off-line).

However, by forcing a visit to every definition for one group, the study loses both internal validity (introducing a time-on-task confound) and external validity (being told which words to look up corresponds to nothing in the wider world of either school or life). In a normal learning environment, learners are always meeting a mix of known and unknown words, and they must somehow be left the responsibility of deciding for themselves which ones to pay attention to.

Further, just as there are pregnant contexts, what Reinking and Rickman propose here is pregnant definitions. Pregnant contexts, as discussed above, are easy to get meaning from, but often with no retention on a delayed test. The default assumption is that pregnant definitions would behave the same way, with effortless learning going unretained. Unfortunately the authors provide no retention measure; both comprehension and vocabulary tests were administered immediately after the subjects had finished reading.

Reinking and Rickman's approach also has extensibility problems. It is not easy to see how the principles of their tutor could ever be the basis for a training program of any practical size. Here they have developed a super-intensive system capable of presenting 32 new words: how many texts would have to be found or created and hand-linked to dedicated dictionary information to support the learning of 1000 words let alone 3500? It seems unlikely that this type of tutor will move beyond the in-principle phase.

On-line dictionary and concordance support for reading

Goodfellow (1994, 1995b) has developed and begun testing a lexical tutor called LEXICA. The initiating activity is for learners to begin reading a text on the computer screen, and any unknown word can be selected for further information from either a linked monolingual dictionary or a concordance program accessing a 50,000-word corpus. Once a learner has selected a word for attention, the tutor suggests several things to do with it, in order to process it further and learn it. The word can be stored in one of two lists, either under the heading "form" or "meaning," depending on whichever is most interesting or problematic; words from these lists can be sorted into further lists with the learner's own headings. The main role proposed for the dictionary and concordance is to aid with these sorting tasks, and any information a learner thinks is particularly interesting can be added to a notes window in the program.

At any point learners can volunteer to be tested on the words they have been working on. The test is to replace each word into the (gapped) line from the text where it was first seen. Various clues can be requested if the word cannot be recovered, namely the sortings and notes the learners themselves have previously entered. For example, they can see that they sorted the word for meaning rather than form, and look at their companion notes with some concordance or dictionary information (with headword deleted), or their mnemonics, or their first-language translations.

LEXICA clearly offers a large number of strategy options, and Goodfellow has developed a strategy-tracking system and is currently experimenting with ways of relating strategies to outcomes. In fact, it is the tracking system that seems to hold the most interest for Goodfellow at present rather than the tutor's practical or institutional uses. The information the tracking system provides may eventually be fed into further development of the tutor, so the project must be evaluated long term. But for the moment, it has a number of weaknesses.

First, while Reinking's tutor specifies exactly which words learners should pay attention to, at the other extreme LEXICA offers no guidance at all about which words in a text might be worth paying attention to. Since the words are presented in running text, chances are good that Mondria effects will operate in some unknown proportion of cases, i.e. when the overall meaning of a text is clear, then learners will not be especially aware of which words they know well and less well.

Second, the outcome measure is to match a word to the exact context it was first presented in, not to a novel context, so the crucial dimension of transfer is not emphasized or tested. In other words, only initial learning is attempted, surely an underestimation of the tutorial potential of a corpus, which contains a great deal of information about how words adapt themselves to different contexts. Why not use the concordance to extract novel contexts, and ask the learner to fit words to these, promoting and testing transfer? A possible reason is that the theory-base of LEXICA is not the literature of reading research, where transfer is a key issue, but instead an adaptation to lexical acquisition of Marton's (1986) very general theory of "deep" vs "surface" learning styles (discussed in Goodfellow and Powell, 1994).
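The suggestion above, using the concordance to pull out novel contexts for transfer testing, can be sketched as follows. The corpus, function names, and gapping scheme are illustrative assumptions, not part of LEXICA.

```python
# Sketch of concordance-driven transfer testing: pull lines for a word
# from a corpus, drop the line the learner first met it in, and gap the
# rest. Names and data are hypothetical.
import re

def concordance(word: str, corpus: list[str]) -> list[str]:
    """Return corpus lines containing the word as a whole token."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [line for line in corpus if pattern.search(line)]

def transfer_items(word: str, corpus: list[str], seen: str) -> list[str]:
    """Gap the word in lines OTHER than the one it was first seen in."""
    novel = [l for l in concordance(word, corpus) if l != seen]
    return [re.sub(rf"\b{re.escape(word)}\b", "____", l, flags=re.IGNORECASE)
            for l in novel]
```

Fitting the word back into these unseen lines, rather than into its original line, would test exactly the transfer dimension that LEXICA leaves untouched.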

Third, the level of motivation and metacognitive awareness LEXICA presupposes will strike readers with teaching experience as optimistic. Not every learner could make much sense of dividing words by form and meaning, and indeed no evidence is offered that the pilot subjects ever took to the idea. What the test subjects seemed to do most, in fact, was use LEXICA to look up words in the dictionary, which leads straight back to problems with the quality of definitional knowledge.

Fourth, Goodfellow's user data suggest that his subjects dealt with remarkably few words over the course of a session, roughly 18 in 4 hours in one case (Goodfellow and Powell, 1994) or a word every 13 minutes. While a novel word may be worth 13 minutes and more of a learner's attention, this is not a quick way to build up vocabulary size.

A multicontextual approach

None of the tutors reviewed so far ask their users to infer the meanings of new words by reading text on a computer screen. This is odd because, as shown above, few any longer doubt that "most words are learned from context," however ill-defined that process remains. Definitions, pregnant contexts, and mnemonics are all dubious contenders, and yet these are the strategies of choice for lexical tutors.

In an attempt to find out what type of on-screen information best helped second-language students learn new words, Markham (1989) devised a program to teach the same 15 words in two versions, one providing a definition for each word, the other presenting each word in three paragraphs of running text. The subjects were then tested using a measure with two parts: a definitional task (multiple choice) and a contextual task (choose the most appropriate use of the word from a series of novel sentences). In an immediate post-test, learning was about equal between treatments on both measures (around 72%).

However, on a surprise repeat post-test four weeks later, there were interesting differences. There was still no difference in ability to choose correct definitions, but there was a difference in ability to choose correct contextualizations. The group that had read the three paragraphs now chose 71% of the contextualizations correctly, as originally, but the definition group had dropped to 60%, a loss of 15% [(71-60)/71=.15]. Markham concludes that "long-term, depth oriented gains [are] associated with exposure to words embedded in a variety of natural paragraph level contexts" (p. 121). In other words, with definitions you get a weak grip on novel contexts, but with context you get definitions for free.

So here at least is an existence proof for words being learned from text on a computer screen. Further, it confirms that definitions provide easy learning while texts provide deeper and more transferable learning. This is support in principle for the concordancing concept, because three paragraphs of context for each target word is just the sort of thing a concordance excels at providing.

The only problem is that Markham does not specify whether his texts were hand-coded or authentic. If hand-coded, then teaching 1000 words would mean finding or devising 3000 paragraphs of pregnant text. But if they were authentic, or pulled from a corpus set for a certain lexical range, then this is a good basis for the development of a corpus-based tutor for the present study. Markham also does not appear to have tested his tutor in an ongoing institutional curriculum, which will be a further feature of the present study.

Conclusion

Each of these tutorials has points of value and interest, but each also fails in one or more of four important ways: by fostering definitional or translation knowledge rather than contextual knowledge; by offering contexts too few or too supportive to build transferable knowledge; by proposing learning strategies that learners do not in fact use; or by depending on hand-coded materials that cannot be extended to the real size of the lexical learning task.

These are all points to consider in the development of a corpus-based tutor and in assessing its effectiveness. A good lexical tutor, of course, is likely to be designed with a particular learner in mind. The next chapter looks at the proposed users of the corpus tutor and their particular lexical needs.


