Robert Stockwell and Donka Minkova. 2001. English words: History and structure. New York: Cambridge University Press. 208 pp + xi.

Reviewed by Thomas Cobb, Université du Québec à Montréal

For CAAL Journal, Spring 2004

English Words gives its reader a wealth of information about the Greco-Latin strand of the English lexicon, i.e. about what many would call the non-English words of English (like lexicon and information) despite the book’s title. These Greco-Latin words (henceforth GL words) make up the vast majority of entries in any dictionary (over 80 percent, p. 2). However, they do not have quite the same importance in texts other than dictionaries, owing to the repetition of the less numerous but more frequently used Germanic and Anglo-Saxon items (henceforth AS items, like reader, give, wealth, and knowledge, plus the pronouns and prepositions) that continue to form the core of the language.

The subtext of the book is English and how it grew from its diminutive AS base. The reader is introduced in Ch. 1 to native growth processes (mainly compounding) and thereafter to the unusually extensive lexical borrowing and adaptation that has characterized English and provides its users with lexical resources that can only be called vast and whose role and result in matters both intellectual and practical can only be guessed at. Stockwell and Minkova’s book provides an interesting description of this borrowing as well as an array of tools from linguistics for generalizing about growth processes, change processes, assimilation processes and several other processes that arise as a result of having twin lexical strands in a language.

Along the way, fascinating details invite pause:

· It is well known that the arrival of GL words into AS England was a feature of the Norman invasion in 1066, but there are some less obvious twists to the story. One was that the Normans who imposed GL via French upon the English were themselves only two generations away from speaking a Nordic language quite similar to AS. The effects of a common background on the success of the Norman French implantation must be interesting but are probably unrecoverable and are not alluded to here, and, in any event, by 1204 the Norman French influence on English had been replaced by a Parisian French influence. Another twist is that the majority of GL terms did not arrive via French at all, but directly from Greek and Latin during the Renaissance and earlier eras when Latin was the lingua franca of European scholarship, first religious and then scientific.

· Not every language borrows as extensively as English does. German for example tends to meet its lexical growth needs through recycling, i.e. compounding native resources (Fernsehen => far seeing => television, p. 53). English does this too (doorbell, stronghold) but to a lesser extent (or at least prefers to compound GL resources, e.g. television). Why different languages adopt different ways of building the lexicons they need is not speculated on, although it would be consistent with other parts of the book to say that accidents of history are the cause, in this case the repeated waves of conquest that ebb and flow through English history.

· While English has borrowed vastly from languages that may seem quite distant from it, these sources are distant only in an immediate perspective. In a broader perspective, Latin and Greek, like Old English, are themselves Indo-European languages, so most of the loans have actually been “in-house”.

· There are two assimilation routes for words into a language. One is via the street or market, where items are shared between two or more language groups for purposes of accomplishing tasks in a shared existence; in this case, new words are likely simply to replace old words. The other is the borrowing of words to meet conceptual needs that have not yet found linguistic expression in the borrowing language. These might include words to talk about relations, relationships, subtleties of emotion, and of course words to talk about words (discourse, logic, argument, conclusion). English speakers apparently at some point found themselves in need of such words and readily adopted them from GL sources and naturalized them. But here an interesting cause and effect question arises. It is to be assumed that English, like German, could have used native linguistic resources to meet its opening conceptual gaps, had GL items not been available. So did the availability of such words effectively stimulate the need for them in an interactive process? Did English “get ahead” in the sixteenth and seventeenth centuries by not having to wait for the slow emergence of native linguistic resources? These matters are not speculated on in the book, but abundant data will spur curious readers to speculate for themselves.

These are just a few of the fascinating moments in the tale of lexical borrowing.

But who is the reader the tale is intended for? The authors do not state their proposed audience, but it seems safe to say it is not other linguists. For one thing, there is no reference list. In fact there are few references – just a few standards from the 1970s and 1980s in occasional footnotes. Also, there are no novel interpretations of any of the data (although some of the book’s topics seem ripe for this, as indicated already). And none of the usual linguistic issues or controversies are mentioned or suggested, apart from a mention that syntax may be innate while morphology, at least in English, is clearly a product of historical accident joined to a predilection for easy articulation.

Gradually it seems, at least for this reader, that the book is a sort of textbook for American undergraduates, either native or advanced ESL (English as a Second Language), headed toward studies in law, medicine, or other areas where GL words and word-roots are likely to be encountered. It seems a lexical equivalent for the good writing manual that greets every undergraduate (once Sheridan Baker’s The Practical Stylist but nowadays probably replaced by something else). There are several reasons for postulating these readers and this purpose. The first is that the book has a complete set of accompanying workbook exercises on the Internet[1]. The second is a series of oblique hints including that such words as cognate are “generally unknown to today’s undergraduate” (p. 1); that “if you already use a good dictionary then you probably don’t need this book” (p. 1); that “people cannot call themselves ‘educated’ who do not have a minimal acquaintance with the history and structure of the words in their own language” (p. 1); and so on. The authors believe, no doubt rightly, that the youth of today are missing important lexical preparation for higher study, not to mention cultivated life in general, so they have written a cross-curricular Lexical Primer. If this analysis is correct, then the book’s theme is lexical expansion in two senses.

My identification of a plausible intended reader for English Words is rather speculative, but assuming it is accepted to some extent, how well does the book stand up as a textbook for “lexing up” the pre-law or pre-med undergraduate? Linguistically, as mentioned, this work will not turn either morphology or historical linguistics on its head; nor is it quite on the cutting edge educationally.

Nothing that follows in any way suggests that a curious and capable undergraduate who had not yet discovered the pleasures of a proper dictionary would not benefit from working through this book’s descriptions, explanations, concepts, and most of all its lively examples. The 12 units of roughly 13 pages each of Internet worksheets are useful practice in, for example, determining the GL root of a word, or working out a pronunciation of a multi-syllabic word met in reading but never heard pronounced, and so on. Particularly valuable is an appendix with advice for choosing a good dictionary, an appendix that will make sense for the reader who has been through the book and at the same time will allow the development of further language awareness of the type this book proposes. But will the intended reader make his or her way through the book?

The first problem I find with this book as a learning tool is the paradox, which one often finds in educational materials written by content specialists, that the medium of instruction presupposes the target knowledge. This book is not only about the GL strand of English, it also uses the GL strand of English as the medium of instruction. What does this mean? Below is a passage of about 200 words from a relatively un-technical section of the book (p. 37). The passage has been typed into a computer program called Vocabprofile, which breaks a text into four zones of frequency, including one called the Academic Word List (Coxhead, 2000), which mainly comprises the more common GL words of the type treated in this book. Here is the text, followed by the profile in Table 1 and a break-out of its GL items in Figure 1.

Interestingly, at these early stages of massive diversification of the vocabulary of English, there seem to be no negative attitudes to borrowed words. Literacy in medieval times was very much an accomplishment related to social standing. It is likely therefore that the large majority of people who could read and write were either members of the Norman aristocracy, or people trained to serve the Normans in some capacity: clerks, scribes, chroniclers, religious and court writers, scholars, poets. This situation might conceal both potential negative attitudes and the rate at which new words were actually adopted by speakers of English. Thus, an early record of a French word is no guarantee that that word was familiar and current throughout the linguistic community. Conversely, we can imagine that many words, especially words which would not make their way easily into religious, legal, or didactic writing, might have been used in the spoken language for decades before they actually went on record. More manifestly, the class-based distinction between the literate and the illiterate is reflected in the type of words that Middle English borrowed from French. The two chronological layers of borrowing discussed below show how the new political and social realities shaped the English lexicon.

The vocabulary profile of this passage is shown in Table 1, with a breakout of the GL items shown in Figure 1.

Table 1: Sample of GL density in English Words

	Families	Types	Tokens	Percent
K1 Words (1 to 1000):	84	95	156	76.85%
K2 Words (1001 to 2000):	6	7	8	3.94%
AWL Words (academic):	13	13	15	7.39%
Off-List Words:	?	23	24	11.82%
	103+?	138	203	100%

Figure 1: Break-out of GL items

1. Relatively high-frequency GL items from Academic Word list:
attitudes capacity community conversely decades distinction diversification guarantee layers legal majority negative potential

2. Less frequent or off-list GL items not on AWL

accomplishment aristocracy chroniclers chronological conceal didactic French illiterate interestingly lexicon linguistic literacy literate manifestly massive medieval Norman poets realities scholars scribes vocabulary

The analysis shows that roughly 20 percent of the individual words or tokens in the text (one word in five) derive from the lexical zone the authors are assuming the reader needs some sort of help with. While clearly any college student will know some of these words, or maybe all of them to some extent, this is nevertheless a high rate of target lexis within the medium of instruction. Research in applied linguistics has found that comprehension begins to suffer when more than 5 percent of words, or one word in 20, are unknown or inadequately known (Laufer, 1992). In other words, the reader who needs to read this book will probably struggle to do so.
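The percentages behind this estimate can be rechecked from the token counts in Table 1. The short Python sketch below is only an illustration: the variable and function names are my own, and the final ratio assumes the worst case in which every AWL and off-list token is unknown to the reader, which, as noted above, will not be true of any real college student.

```python
# Token counts per frequency band, copied from Table 1.
TOKENS = {"K1": 156, "K2": 8, "AWL": 15, "Off-list": 24}

# Comprehension threshold in percent of unknown words (Laufer, 1992, as cited above).
THRESHOLD = 5.0


def band_shares(tokens):
    """Return each band's share of all tokens, in percent."""
    total = sum(tokens.values())
    return {band: 100 * n / total for band, n in tokens.items()}


shares = band_shares(TOKENS)

# The two bands where most of the passage's GL items fall.
gl_share = shares["AWL"] + shares["Off-list"]

for band, pct in shares.items():
    print(f"{band:9s}{pct:6.2f}%")
print(f"AWL + off-list: {gl_share:.2f}%")
# Worst-case assumption: all of these tokens are unknown to the reader.
print(f"Ratio to the 5% threshold: {gl_share / THRESHOLD:.1f}")
```

The combined AWL and off-list share works out to a little over 19 percent of tokens, which is the "roughly 20 percent" referred to above.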

A defense against this criticism might be that GL words are basically known to undergraduates in terms of meaning, so that reading the book would present no problem, and that these words need only to be raised to the zone of active use through meta-information and word-analysis skills. However, my own research on corpora of undergraduate writing suggests that GL words are not generally well known by current undergraduates, and that this is especially true of the increasing number of ESL students heading for graduate work in medicine, commerce, and information technology in North American universities. Do these learners know the meanings of GL words? Server records kept by the Cambridge Advanced Learner’s Online Dictionary, which is used by many of these learners, show that 90 percent of 23-million look-ups in 2001 and 2002 were GL items. The 2002 look-ups are shown in Table 2, ranked by number of look-ups (with the 2001 rank shown in parentheses).

Table 2: Top 50 look-ups by academic ESL learners (2002)

1	serendipity (1)	14	regard (23)	26	endeavour (49)	39	intend (-)
2	idiom (2)	15	foible (10)	27	love (18)	40	implement (-)
3	paradigm (3)	16	inform (21)	28	metaphor (29)	41	aesthetic (30)
4	ubiquitous (4)	17	appreciate (40)	29	foray (12)	42	emphasize (-)
5	effect (7)	18	assess (31)	30	procure (20)	43	ethic (43)
6	advice (14)	19	assert (25)	31	fob (16)	44	sarcasm (-)
7	liaise (13)	20	irony (22)	32	pertain to (32)	45	relate (-)
8	enquire (19)	21	commit (-)	33	diverse (38)	46	empathy (39)
9	pragmatic (6)	22	allege (46)	34	ambiguous (26)	47	propose (-)
10	affect (11)	23	acquire (24)	35	benefit (-)	48	use (-)
11	analyse (15)	24	cynic (36)	36	mitigate (41)	49	leverage (-)
12	jingoism (9)	25	provide (-)	37	criterion (48)	50	rhetoric (42)
13	comply (37)			38	elude (-)

Source: http://dictionary.cambridge.org/top20/top50_02.asp . See also top50_01.asp for 2001 figures.

These heavily looked-up items seem to be mainly GL words. Running the list through Vocabprofile, we find that 30% of the items are GL words from the AWL (diverse, ambiguous, criterion), 50% are off-list or lower frequency GL words (ubiquitous, mitigate, elude), and the remainder are GL items that have entered the high frequency zone (effect, provide, regard). I would venture that this amount of looking-up suggests that the ESL contingent is learning rather than fine-tuning these items, and further that they would have difficulty reading a text that contained a high proportion of them.

Another problem with this book as an instructional text concerns the notion of learning about words. The authors seem to equate “learning” and “learning about”, and the writing often glides seamlessly from one to the other. Take the following passage:

Often the extent of one’s vocabulary becomes a measure of intellect. Knowledge about the history and structure of our words – both the core and the learned vocabulary – is a valuable asset. (p. 3)

The thought moves seamlessly between vocabulary knowledge (“the extent of one’s vocabulary”) and meta-knowledge (“knowledge about”) as if these were one and the same. But they are not. In language acquisition research, the role of meta-linguistic knowledge in language use is hotly debated. It is well known that acquisition of a first language can proceed to a basic level and possibly quite a high level with little or no help from meta-linguistic knowledge or awareness. Acquisition of a second language is not necessarily the same, particularly if attempted post-childhood, and specialists in this area take positions ranging from the necessity of conscious awareness and meta-linguistic knowledge (Schmidt, 1997) to the useless and even damaging effects of it (Krashen, 1985). There is ample evidence at present to support both positions.

The authors’ unawareness of learning issues related to the GL is quite general. For example, theirs is one of at least three discernible attempts at a GL pedagogy, and of the other two they apparently know nothing. Both are major educational research programs that I happen to know of. One, concerning first-language acquisition, has been under way for years at the Ontario Institute for Studies in Education (OISE) in Canada. Stockwell and Minkova briefly mention that the GL strand of English is not picked up equally by all, while it would be nice if it could be, but Olson (1994), Olson and Astington (1990), and Corson (1997) at OISE have spent years itemizing both the cognitive functions of these words and the devastating cognitive, academic, and social effects of being unable to perform these functions. Children who fail to control the GL strand come up against a “lexical bar” (in Corson’s 1985 phrase) which can last over lifetimes and generations. These researchers detail the role that GL items play in analyzing events and structuring discourse for English speakers, as well as the effects of their absence. A live issue here (harkening back to a point made earlier) is whether teaching the words is equivalent to teaching the concepts, whether conceptual need must be felt before such words will make sense, or whether some sort of word-concept interaction can be planned and managed.

An unrelated but similarly longstanding second-language GL pedagogy has been under way for several years in New Zealand. Nation (2001) and his colleagues have determined the importance of the vast GL holdings of English for a second-language learner not on the basis of the cognitive properties of these words but rather the frequency properties. Using the Vocabprofile analysis mentioned above and its key concept of text coverage, these researchers have shown that without a knowledge of the GL side of the lexicon, English texts of any sophistication are impenetrable. They have therefore sought to make these lexical holdings pedagogically accessible through corpus analysis, for example determining which particular GL items are employed across academic domains and which are mainly found only in specific domains. The cross-domain items turn out to number only 570 word families, and these have been grouped and made pedagogically accessible as the aforementioned Academic Word List (AWL, Coxhead, 2000). Several schemes are under development world-wide to help second-language learners, many of them en route for graduate schools in North America, to learn relevant aspects of these words.

One such aspect is pronunciation of GL items, which is dealt with both by our linguists and by applied linguists in the frequency-based tradition. In Ch. 10 of English Words we turn to the matter of how to pronounce GL items, which of course is mainly a matter of determining where to locate the main stress. AS words are simple, with stress falling on the first syllable of the root word (warden, doorknob), whereas GL words are long and complex with some but not all suffixes pulling the stress up next to them (syncopate, syncoPAtion). Our authors propose a five-step rule-based system or algorithm to determine the stress of an unknown word, which passes through many rather hard concepts and exceptions that, once again, most learners who needed this might not be up for. In contrast, applied linguists Murphy and Kandil (2004), again following the principle that frequency is probably the most useful kind of information for learners, have categorized and counted the stress patterns of all the high frequency GL words, that is to say the AWL words, and determined that in fact there are only a handful of patterns that actually get used very much, with just 14 patterns covering 90% of the 2,979 individual words (in 570 families) of the AWL. This looks like a usable body of information that a learner, with some input from a teacher, might actually be able to use on a simple memorization basis. The linguists, on the other hand, send a learner off with an abstruse set of rules mainly applicable to infrequent and one-off items.
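The frequency-based reasoning here amounts to a cumulative coverage count: rank the patterns by how many words follow each one, then add them up until a target share of the word list is reached. The Python sketch below illustrates only that logic. The pattern labels and the counts are invented for illustration and are not Murphy and Kandil's data; the function name is likewise my own.

```python
def patterns_for_coverage(counts, target=0.90):
    """Return (number of patterns, share covered) for the smallest set of
    most frequent patterns whose combined share reaches the target."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no words counted")
    covered = 0
    # Walk through the patterns from most to least frequent.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for k, (_pattern, n) in enumerate(ranked, start=1):
        covered += n
        if covered / total >= target:
            return k, covered / total
    return len(ranked), covered / total


# HYPOTHETICAL toy data: label = "syllables-stressed syllable", value = words.
# These figures are made up and do not come from the AWL or any study.
TOY_COUNTS = {"2-1": 40, "3-2": 25, "3-1": 15, "4-2": 12, "4-3": 5, "5-3": 3}

k, share = patterns_for_coverage(TOY_COUNTS)
print(f"{k} of {len(TOY_COUNTS)} toy patterns cover {share:.0%} of the toy words")
```

Run on real counts, a tally of this kind is what would yield a figure such as the 14 patterns reported for the AWL.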

In conclusion, linguists, language educators, and applied linguists are attacking the same problem from three directions. The problem is academic learners’ lack of control of the GL strand of English. Stockwell and Minkova’s solution is to impart enough linguistic analysis to allow learners to use a decent dictionary. Language educators (Olson, Corson, Astington) have worked more on detailing why this is a problem in the first place, something the linguists have taken for granted. Applied linguists (Nation, Coxhead) have used learning principles like frequency, on a somewhat opportunistic basis, to increase the GL’s learnability. Oddly, the GL problem is being investigated on learners’ behalf from three points of view with almost no overlap. There is probably a case for integrating these approaches to what the GL strand means in English and what we should do about it.


[1] The URL provided for these exercises, and occasionally whole new workbook units of several pages’ length, is incorrect as given in the front matter of the volume; it should be http://uk.cambridge.org/resources/0521793629/toc/default.htm .

References