CHAPTER 7

CALL AT SQU

The design of a corpus-based lexical tutor evolved naturally out of trends in educational computing at the Language Centre of Sultan Qaboos University. Several kinds of programs had already been developed and used by students, mainly text reconstruction and a hand-coded lexical tutor, so designing a corpus-based tutor was a matter of adapting and integrating existing concepts and technologies.

Text reconstruction

SQU students have been using CALL activities as part of their language training since the University opened in 1986. The initial design concept for CALL development was text reconstruction, in which a supply of texts is stored in a computer along with templates for transforming them into activities such as cloze passages, systematic deletion passages, scrambled sentences, crosswords, and many others. The learner chooses a text and an activity, and when the computer has deconstructed the text in the chosen way the learner reconstructs it. He or she presumably gleans insights, gets practice, and gains confidence with the language in the process, especially if the computer uses its "knowledge" of the original text, or other texts it can access, to help out. Stevens and Millmore's TEXT TANGLERS (1987) was under development even before SQU opened, later joined by SUPERCLOZE (Millmore and Stevens, 1990), and then TEXPERT (Cobb, 1993a) when a Macintosh lab was added. Cobb and Stevens (1996) discuss the rationale for reconstruction; Stevens (1995) discusses some patterns of students' use.
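The deconstruction step these programs perform can be sketched in a few lines. The following is a minimal illustration of one transformation, nth-word (systematic) deletion; the function name and details are invented for illustration and are not the actual TEXT TANGLERS or TEXPERT code.

```python
# Minimal sketch of systematic (nth-word) deletion, one of the
# text-reconstruction transformations described above.

def make_cloze(text, n=7):
    """Blank out every nth word; return the gapped text and an answer key."""
    words = text.split()
    answers = {}
    for i in range(n - 1, len(words), n):
        answers[i] = words[i]
        words[i] = "_" * len(words[i])   # gap length preserves word length as a cue
    return " ".join(words), answers

gapped, key = make_cloze(
    "The cell is the basic structural and functional unit of all known organisms",
    n=5)
```

The learner's task is then the inverse operation: restoring the original words, with the program checking attempts against the answer key.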

Initially, the texts were adapted scientific texts which the students had already met in the reading classroom. Science texts were used because the Language Centre at SQU was an experiment in "content-based English," teaching English through the medium of content courses. Professors in the College of Science supplied the Language Centre with the biology, chemistry, and physics texts they wanted the students to read, often simplified in collaboration with language course designers, and it was the task of language instructors to help the students understand these and learn English through them. One means of doing this was to put the texts into the computer lab and invite the students to work them over as many times and as many ways as possible.

There was also a practical reason to choose the text reconstruction technology for the CALL lab. Once developed, this technology was not tied to any particular set or level of texts. Flexibility was an advantage in the early days of SQU, where professors who had never taught foreigners met foreigners who had never seen professors, to the mutual mystification of each. In the search for a workable relationship, there was a great deal of curricular turmoil; a succession of approaches and materials came and went. For CALL designers, the instability of this environment indicated a type of software with as little hand-coding as possible, since any approach tied to a particular text or teaching point could quickly become obsolete. Software was needed that could build useful activities from whatever texts were thrown at it. With text reconstruction, the science professors could change their courses as often as they liked, as long as they provided their new materials on computer disk.

Text reconstruction and concordancing

This text-reconstruction pedagogy and technology has evolved over the years at SQU, in line with new ideas, new hardware, and user feedback. An obvious way of extending text reconstruction was to build a concordance routine into the students' interface, since the textbase (corpus) was already in place. The concordance could appear as an exploratory tool, or as a HELP option within a reconstruction activity. For example, a learner requesting HELP to fill a gap in a paragraph could be given, instead of the usual hint of first letter, then second letter, and so on, a concordance of other sentences using the missing word. The learner's search of memory is constrained rather than short-circuited, and steered onto the semantic not phonological plane. TEXPERT contains many schemes for making concordancing available to learners as an option within text reconstruction activities.
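A concordance routine of the kind that could back such a HELP option is simple to sketch. This is a minimal keyword-in-context (KWIC) lookup over a small corpus; the function name and fixed-width display format are illustrative assumptions, not the TEXPERT implementation.

```python
# Minimal keyword-in-context (KWIC) concordance over an in-memory corpus.

def concordance(corpus, keyword, width=20):
    """Return each occurrence of keyword with `width` characters of context per side."""
    text = " ".join(corpus)
    lower, key = text.lower(), keyword.lower()
    lines, start = [], 0
    while True:
        i = lower.find(key, start)
        if i == -1:
            break
        left = text[max(0, i - width):i]
        right = text[i + len(key):i + len(key) + width]
        lines.append(f"{left:>{width}}[{keyword}]{right}")  # align the keyword column
        start = i + len(key)
    return lines

corpus = ["Water has a higher density than oil.",
          "The density of a gas falls as it expands."]
for line in concordance(corpus, "density"):
    print(line)
```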

Vocabulary by computer

Text manipulation can be seen as practice in vocabulary, but it is not a useful way of introducing new vocabulary. Finding a word for a gap assumes the missing word is already known. So in a separate line of development, a more specific attempt was made to deliver lexis by computer.

It has always been clear that SQU students were extremely weak in English vocabulary, as lamented by their science lecturers. Arden-Close (1993b) observed chemistry lectures extensively and interviewed the professors, concluding: "Language problems in these lectures are seen as almost exclusively vocabulary problems" (p. 251). Yet it has not been easy for the Language Centre to address these problems.

Course instability meant among other things that it was difficult to identify any lexical base to build a vocabulary course on. However, as the collection of machine-readable science materials grew into a sizable corpus (Griffiths, 1990), it became apparent that it would be possible to use a concordance program to scan for whatever lexical bedrock might be forming below the shifting sand. (This use of concordancing in the early days of SQU is discussed in Stevens, 1991, and Flowerdew, 1993b). Gradually a lexical base was discerned, comprising several hundred scientific terms that seemed to recur whatever the approach or subject matter. These terms were fashioned into a 500-word vocabulary course consisting of a workbook and computer program.

The computer program, LEXIQUIZ (Cobb and Poulton, 1991), gives students additional exposure to the 500 scientific terms they have already met in their workbook. The rationale is simply that a great deal of practice and recycling is required if the students are to learn and retain any significant portion of these words. The tutorial interface resides atop a database of science terms, each tagged to a short definition and example sentence. The program asks multiple-choice questions about words in 25 groups of 20, and the learner cycles through the items until each has been answered correctly. The questions are presented in one of four modes, from which the learner chooses. Here is the interface in one of its modes, with the user about to select a definition for "density":

Figure 7.1 LEXIQUIZ - word + definition mode

The modes consist of all possible combinations of the three items in the database: a word with four definitions to choose from; a word and four gapped sentences; a definition and four words; or a gapped sentence and four words. The third, leftover item in each case becomes the HELP, should the learner get stuck.
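All four modes can thus be generated from a single (word, definition, example-sentence) record, with whichever field is unused becoming the HELP. The sketch below illustrates this design; the database entries, field names, and function are assumptions for illustration, not the original LEXIQUIZ code.

```python
# Sketch of LEXIQUIZ-style item generation: any two of the three fields
# form the question; the leftover field becomes HELP.
import random

DB = {
    "density":  {"definition": "mass per unit volume",
                 "sentence": "Lead has a high ______."},
    "pressure": {"definition": "force applied per unit area",
                 "sentence": "Gas ______ rises with temperature."},
    "velocity": {"definition": "speed in a given direction",
                 "sentence": "The ______ of the car was 30 m/s."},
    "mass":     {"definition": "the amount of matter in a body",
                 "sentence": "Weight depends on ______ and gravity."},
}

def make_item(target, prompt_field, choice_field, n_choices=4):
    """Build one multiple-choice item; the unused field becomes HELP."""
    help_field = ({"word", "definition", "sentence"}
                  - {prompt_field, choice_field}).pop()
    def value(word, field):
        return word if field == "word" else DB[word][field]
    distractors = random.sample([w for w in DB if w != target], n_choices - 1)
    choices = [value(w, choice_field) for w in [target] + distractors]
    random.shuffle(choices)
    return {"prompt": value(target, prompt_field),
            "choices": choices,
            "answer": value(target, choice_field),
            "help": value(target, help_field)}

# Mode 1 in the text: a word with four definitions to choose from.
item = make_item("density", prompt_field="word", choice_field="definition")
```

The other three modes fall out of the same function by swapping `prompt_field` and `choice_field`.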

Figure 7.2 shows another interface mode, with the user requesting help-in this case the leftover definition.

Figure 7.2 LEXIQUIZ - gapped sentence mode + HELP

LEXIQUIZ has been used by hundreds of science students over at least five years, and could be described as a modest success. Students have shown a strong interest in using the computer to learn words, far more than they ever did for text manipulation. A common pattern of use has been to sit and review hundreds of words at a time, suggesting a thirst for vocabulary on the part of the students, as well as a role for self-paced word-learning opportunities.

 

The limits of LEXIQUIZ

Nevertheless, LEXIQUIZ is far from the last word in vocabulary tutors. In fact it resembles some of those criticized in Chapter 5 in that it purveys pregnant contexts, does not have much for students to read, and is not extensible without enormous labour. The biggest limitation, however, is that short definitions and single contexts are unlikely to affect learners' ability to comprehend newly learned words in novel contexts, as discussed in Chapter 3. This might not be a problem, since the words are being met in other contexts in both language and science classrooms. Unfortunately, no research has attempted to test the tutor's effectiveness.

An even more serious problem with LEXIQUIZ is in the kind of words it draws to the students' attention. It gives students practice in medium and low-frequency scientific words, while they actually do not know very many high-frequency words-a case of going about things backwards. The highest frequency 2000 words of English are crucial for any sort of reading, including scientific, since with heavy repetition they comprise about 80% of the words in any text. It seems unlikely that either reading or listening to science lectures could proceed very smoothly for learners with few words in the 0 to 2000 range, however many they knew at other levels. The theory of English-through-science had somehow obscured this problem.
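The coverage claim above is easy to make concrete: the proportion of running words (tokens) in a text accounted for by a high-frequency headword list. The tiny list and text below are illustrative stand-ins for the 2000-word list and a real passage.

```python
# Sketch of lexical coverage: what share of a text's tokens come from
# a given high-frequency word list.

def coverage(text, wordlist):
    """Percentage of tokens in `text` that appear in `wordlist`."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    known = sum(1 for t in tokens if t in wordlist)
    return 100 * known / len(tokens)

# Stand-in for the 2000-word high-frequency list:
high_freq = {"the", "a", "of", "in", "is", "water", "and", "it", "has", "than"}
text = "The density of water is higher than the density of oil."
print(f"{coverage(text, high_freq):.0f}% of tokens covered")
```

Run over full texts with the real 2000-word list, this kind of count is what yields the roughly 80% coverage figure cited above.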

Technical vs sub-technical lexis

Arden-Close (1993b) provides some touching anecdotes that indicate where the main vocabulary problems at SQU lie. His research consisted of observing numerous science lectures, where professors unversed in language issues attempted to communicate with students. He describes one chemistry lecturer backing up further and further in a search for common lexical ground. Trying to get across the idea of "carbon fluoride bonds" and meeting incomprehension, the lecturer tries a succession of progressively more common analogies: teflon pans, a tug of war, an assembly line-to no avail. In the light of the size-testing undertaken in the present study, it is no wonder; "pan," "war," "line" and other words from the 2000 wordlist were no doubt themselves unknown, let alone any compounds derived from them.

In another anecdote, a biology lecturer describes searching for a common analogy to convey "hybridization," and in the process indicates the real level of the problem:

The first time I gave a hybridization analogy, I talked about dogs, and then I switched to goats; and then it even dawned on me that some of them aren't going to be in touch with the fact that if you mix two different kinds of goats they come out looking in between, and I didn't know all the specific terms there, what their two different breeds of goats are called-you can talk about [mixing] colours, but a lot of them don't know their colours yet (p. 258, emphasis added).

Numerous similar interchanges have taken place over the years. There is no common lexical ground for lecturers to retreat to.

The lexical profile suggested by these and many similar anecdotes is supported by size testing. When Nation's (1990) test was given to SQU students for the first time in 1993, this was the typical profile of words the students knew at various levels after one year of study:

Table 7.1 Words known at five levels (all figures are percentages correct)

 Level        2000   3000   5000   University   10,000
 Student 1     27     22     17        0           0
 Student 2     39     22     11       27          22
 Student 3     33     27     11       11           0
 Student 4     33     44     17       27          17
 Student 5     27     17      5       22           5
 Student 6     27     17      0        5           5
 Student 7     50     33     22        0           0
 Student 8     27     11     22        5           0
 Student 9     33     33     17       11          11
 Student 10    39     17      0        0           0
 Student 11    33     17     11       17           0
 MEAN %      33.5   23.6   12.1     11.4         5.5
 S. Dev.      7.1    9.7    7.8     10.5         7.9

As predicted, the students had a smattering of words at all levels, but only about (2000 x 33.5% =) 670 words at the 2000 level. In fact, they had more words beyond the 2000 level than within it--words met only in the 20% of text left over when the high-frequency words have claimed their 80%.
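The estimate rests on the usual interpretation of the levels test: each level samples a frequency band, so the percentage correct can be scaled up to an estimated number of known words in that band. The band size below follows that interpretation and is an assumption of this sketch.

```python
# Scaling a levels-test score up to an estimated word count for its band.

def words_known(pct_correct, band_size):
    """Estimate the number of words known in a frequency band of `band_size` words."""
    return round(band_size * pct_correct / 100)

# Mean 2000-level score from Table 7.1, scaled to the 2000-word band:
print(words_known(33.5, 2000))  # → 670
```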

Further, it appears that high frequency words, not scientific words, are precisely the ones students have the most trouble learning. English scientific terms are often already known in the first language, as concepts merely awaiting new labels or even as loan-words. They are often inferable from context and diagrams, get emphasized in lectures, and so on. Numerous empirical studies from English-medium universities in developing countries trace student reading difficulties to high-frequency (sub-technical) lexis, rather than technical (Sutarsyah, Nation and Kennedy, 1994; Marshall and Gilmour, 1993; Parry, 1991; Robinson, 1989). A revealing study by Cohen, Glasman, Rosenbaum-Cohen, Ferrara, and Fine (1988) tracked the words Arabic and Hebrew-speaking learners looked up in dictionaries while reading an academic text: 85% were sub-technical.

Arrival of the PET

So introducing the PET into SQU in 1991, with its emphasis on general English and its lexical base of high-frequency words, was probably a good move. However, the nature of the challenge it posed became clear only gradually. From one point of view, the PET was just one more upheaval in a landscape already strewn with curriculum wreckage. From another, the arrival of a test with a 2400-word base was a challenge unlike anything that had gone before. It is doubtful that SQU students had ever had to learn anything like that many words, of whatever type, however counted. Further, the PET included a stiff reading comprehension section, so these words would have to be learned well enough for use in comprehension of novel texts. No wonder the first PET result was called "the slaughter of the innocents."

Re-tooling LEXIQUIZ

There appeared to be a role for CALL in the new PET era, but it could not be merely an expansion of LEXIQUIZ for two reasons. First, hand-coding 2400 definitions and example sentences would be a labour of huge proportions; as noted in Chapter 2, hand-coding normally trails off at about 1000 words. Second, many words in the 2400 range are extremely polysemous (such as "run") compared to lower-frequency words (such as "density"), so that writing short definitions for them is not simple, nor is finding a typical example sentence. In other words, some approach that did not require hand-coding was indicated. And given the reading comprehension aspect of the PET, an approach with more text for the students to read and operate on was desirable.

What was needed was a marriage of text reconstruction and LEXIQUIZ. For example, this could take the form of some sort of list-driven text reconstruction, where particular words would be learned by meeting them several times in text-based activities. The two technologies were already well developed, so only an integration was required, as well as the development of a corpus of non-scientific texts.
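The list-driven idea can be sketched as follows: given a target word list and a corpus, find every occurrence of a target word and gap it, so that learners meet each list word repeatedly in running text. All names here are illustrative; this shows the design concept, not the eventual tutor's code.

```python
# Sketch of list-driven text reconstruction: gap each occurrence of a
# target-list word in a corpus of sentences.

def list_driven_gaps(corpus, target_words):
    """Yield (gapped_sentence, answer) pairs, one per target-word occurrence."""
    items = []
    for sentence in corpus:
        for word in sentence.split():
            clean = word.strip(".,;:!?").lower()
            if clean in target_words:
                gapped = sentence.replace(word, "_" * len(clean), 1)
                items.append((gapped, clean))
    return items

corpus = ["The price of oil can change quickly.",
          "A change in pressure causes the gas to expand."]
items = list_driven_gaps(corpus, {"change", "price"})
```

Each target word thus generates one activity item per occurrence, and a corpus of sufficient size guarantees the repeated meetings the text-based approach depends on.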

The prospects for a text-based tutor

But could a vocabulary tutor appeal to Arab learners if it did not give them practice with definitions? Arab learners are well known to tend toward a deductive learning style and to be fond of memorizing short definitions. But (as discussed in Chapter 6) there is nothing inevitable about this learning habit. Language learners, even in the Arabian Gulf, cling to definitions less and pay attention to use and context more as they become more proficient in a second language. Evidence of this can be found in the following mini-study of user responses to LEXIQUIZ. As mentioned before, the program offers four ways of setting up its multiple-choice questions. The reason for this design, other than to provide variety and motivation, was to see if there were any patterns to student learning preferences. Unfortunately, no tracking system was built into the program, so no large scale information is available, but the following survey information is still suggestive.

LEXIQUIZ's four modes really boil down to two: fitting a word to a definition, and fitting a word to a context. Instructors who had supervised lab sessions for several terms consistently reported that students preferred to choose a word for a definition (as could be predicted, if only because it involves less reading). However, putting questions to the students themselves revealed some subtlety. The following questionnaire was completed by a class of 11 male engineering students at the end of a six-week summer course with several sessions on LEXIQUIZ. The number of students choosing each option is indicated in parentheses after each choice:

Figure 7.3 How LEXIQUIZ was used

 END-OF-COURSE QUESTIONNAIRE, AUGUST 1993

Here are the four ways to use LEXIQUIZ:

 1. Find the definition that goes with a word  (1 word, 4 definitions)
 2. Find the example sentence that goes with a word  (1 word, 4 sentences)
 3. Find the word that goes with a definition  (1 definition, 4 words)
 4. Find the word that goes with an example  (1 sentence, 4 words)

 QUESTIONS (number opting for each mode in parentheses)

 Which way of using LEXIQUIZ do you think is most
 useful for learning new words? Circle one.        1 (2)  2 (2)  3 (5)  4 (2)
 Which way of using LEXIQUIZ have you used most
 of the time?                                      1 (3)  2 (2)  3 (3)  4 (3)
 Which way did you mainly use LEXIQUIZ at the
 BEGINNING of the term?                            1 (2)  2 (2)  3 (5)  4 (2)
 Which way did you mainly use LEXIQUIZ at the
 END of the term?                                  1 (2)  2 (2)  3 (2)  4 (5)

 

In line with expectation, the students believe that definitions are the most useful way to learn words (mode 3), and the options involving the most reading (modes 1 and 2) are the least used. Counter to expectation, between the beginning and the end of the term three students had switched from a definitional to a contextual learning strategy (mode 3 to mode 4). This trend appears not to be a fluke, inasmuch as it was replicated in the following term.

Design implications

This finding suggests two principles for the development of a text-based training program: most students will gradually adapt to a contextual approach, but in the beginning they will probably find a definitional component useful and motivating. So a decision was made to develop a text-based lexical tutor in two stages, in line with PET Bands 2 and 3. The first would incorporate definitions as well as concordances, and the second would be a text-based, concordance-driven system with no definitions or other hand coding. The next chapter discusses the design and implementation of the first tutor, PET·200.


