Oxford English Dictionary Online

Publisher: Oxford University Press
URL: http://www.oed.com
Tested: 5 March-17 April, 2000
Price: $550/year for individuals, for institutions contact http://www.oup.com

I was tempted to do with the Oxford English Dictionary Online what the otherwise garrulous Ed Sullivan did when presenting the Beatles, and just say:

Ladies and Gentlemen, the OED
and then hot-link you to the demo site (http://www.oed.com/public/tour/), which is not only informative and well-designed, but also worth a ton of promotional flyers.

I don't think I could have gotten away with it as a review. And I really did not want to because I spent many joyous days in a row, first with the beta version of this gem and then with the real version. So much so that I managed to miss my original deadline, and several newspapers (Boston Herald) and magazines (PC World.com) already did at least a short review of it. It perhaps indicates how important a reference source OED is that the New York Times did an unusually long (and disappointingly hackneyed) review in its April 10th issue that was reprinted by syndicates.

I still trust that I can highlight some of the virtues of OED not covered by the demo or the reviews, as well as some deficiencies that may be of interest to the readers and the software developers of OED. This review is exceptionally long — as befits the exceptional subject. Because of its details and illustrations this review may also be appropriate for use in bibliographic instruction units that deal with OED.

OED Online is expensive, and I don't think that its overly complex institutional charging scheme, based on warm bodies and not actual users, will be attractive, but the yearly licensing fee (which depends on the institution and the number of users) is affordable for research universities. It is less expensive than some other digital ready reference sources that I would replace immediately with an equally good or better free Web alternative in order to release money for the Online OED. OED simply has no peer for a serious library that has scholars and students engaged in studies of the English language. It is an excellent implementation of magnificent content with some easy-to-correct deficiencies of the impressive software.

THE CONTENT

OED started off as the New English Dictionary and has more than 115 years' history. It went through name changes and format changes, has been supplemented, re-issued, integrated and enhanced with volumes of addenda. As I will refer to some of the various historical incarnations and components of OED, it is worthwhile to look at the summary genealogy chart for background references.

OED has been available online for years at many universities, such as the University of Illinois and the University of Michigan, for their students, faculty and staff. The state of Georgia made it available online for the citizens of Georgia. This is the first time that Oxford University Press has offered it online directly.

The Size of the OED

There are many ways to approach the comparison of dictionaries. Statistics are one of them, and OED has a long tradition of providing good statistics. The not exactly marketing-minded first editor of OED, Sir James Murray, presented the first issue with some impressive statistics (made mostly by one of his daughters, as I learned from the excellent biography Caught in the Web of Words: James A. H. Murray by one of his granddaughters). The OED site includes many interesting statistics about the various editions, including the Supplements and Additions volumes.

However, there are many statistics, or rather factoids, floating around about dictionaries that can be misleading. I never thought that the number of volumes or the number of pages of a dictionary would be good guidelines as so much depends on the font size, the line spacing, the size of the paper, the layout, and the thickness of the paper. Just the fact that the small pronunciation guide from the bottom of the pages was removed in Webster's Third International Dictionary made it shed a few hundred pages and thus saved the publisher hundreds of dollars in printing costs.

In the digital world such measures as the number of pages or the number of volumes are obviously meaningless. Although journalists crave such numbers, they tend to mix them up, as did the PC World review mentioned above, claiming 750,000 words defined; the New York Times review mixed it up in a different way, which I will discuss at the end of the review. If you want to see some of the gaffes that I found in some of the reviews published in respected newspapers such as the Financial Times, the Observer, the Chicago Sun-Times and the Independent, then see the linked sampler. It does not say enough that the Online OED includes the 20 volumes of the Second Edition plus the 3 volumes of the Additions, along with 1,045 revised and new main entries created for the Third Edition (more about it later).

The real measure that puts OED so clearly above all the other dictionaries in terms of size is that the entire text of it consists of more than 59 million words, dwarfing all other dictionaries and even the Encyclopaedia Britannica with its nearly 45 million words.

And quantity in this case means quality, too. The breadth of the definitions, the etymologies, and especially of the quotations that document the historical evolution of the words and their meanings requires this many words. Once you submerge into this magnificent ocean of words it is hard to come ashore. It is like the nitrogen narcosis that divers experience when they descend deep in the ocean.

No one can blame Sir James Murray for grossly underestimating the volume and timeframe of the projected work. He thought that the dictionary would amount to 6,400 pages in 4 volumes and be completed in 10 years. The first edition in book form — that Murray never saw — was 15,490 pages in 10 volumes and took 70 years to complete. By the original guidelines, there was to be about a 6:1 ratio in terms of the average length of entries between the OED and the chief competitor and model of the era, the original Webster dictionary, but it turned out to be an 8:1 ratio.

Main Entries et al.

The number of main entries is not always meaningful, let alone comparable. One dictionary may use main entries in a situation where the other uses sub-entries or subordinate entries for a combined form of the word within a main entry. Take as an example the term "big time," which is not a main entry in OED but a subordinate entry under the adjective "big," with a definition and quotations. There is also a cross reference under the noun "time." "Big time" also appears in the quotations of more than two dozen other main entries. (It is one of the idiosyncrasies of OED that while "big time" is listed only under "big," "small time" has its own main entry, not merely a sub-entry under "small" or rather "time," which would have been the logical choice for me in the case of a printed dictionary.)

OED lists only "Big Brother", big-head, the highly redundant big-headed, big house, big-side, and bigwig as main entries beginning with the word "big." In the online version of the Random House Webster's Unabridged Dictionary (RHWUD) I found 57 main entries starting with "big" from "Big Apple" to "big top" (the largest tent of the circus). Most of them appear within the main entry for "big" in OED.

Similarly, "megaloblast," "megalocardia," "megalocephalic," and "megalomaniac" are main entries in RHWUD, while in OED they are listed under the long entry for the prefix "megalo-". When it comes to adjectival and other derivative forms it is odd how many appear as main entries in OED, which is so parsimonious in giving that treatment to compound terms.

The OED does not seem to have a consistent system for deciding when compound terms are given the main entry treatment and when they are treated as subordinate entries, or under which element of the compound term they are defined. This is not an idle academic question, considering the trouble of zigzagging between volume 19 and volume 3 and trying to guess which volume will have the definition. When using the printed edition it is very inconvenient if you assume, for instance, that "white paper" is defined under "paper" only to learn that it is defined under "white." The knowledge you gain from the experience may not even be useful the next time, as "white elephant" is listed as a subordinate entry under "elephant," not "white." It could just as well have been a main entry, as "white bread" is. It is strange that "megaspore" is not a main entry but "megastar" is. Luckily, in the digital version you don't need to guess the right volume.

It does not help when comparing the size of dictionaries that publishers use slightly different terminology in their publicity blurb — or attach different meaning to measures that have been routinely used to classify dictionaries. The publicity material for Encarta World English Dictionary, for example, uses the measure that the dictionary has 400,000 references. On inquiry Microsoft clarified that there are 92,000 header words, i.e. the equivalent of headwords or main entry words in OED's parlance. Only with this clarification can justice be done to OED, which has nearly 300,000 main entries and a total of 640,000 main entry and sub-entry terms combined.

It is also to be noted that OED does not include personal names and geographic names unless these are used attributively like the word "Zürich" denoting the porcelain manufactured in Zürich, the largest city in Switzerland. In contrast, RHWUD has thousands of entries for geographic and personal names.

Similarly, entries that are just cross-references from variant forms of a word should be counted in any dictionary comparisons only when this distinction is made clear. Incidentally, in most digital dictionaries, cross-references are less important, since the variant formats listed under the preferred entry can be easily searched. (In OED Quick Search mode, however, this does not apply because in this mode only the main entry headwords and the subordinate entry headwords are searched, which I consider to be an important oversight in design).

The OED entry for "aesthetic" identifies "esthetic" as a correct variant term. But when using it as a term in Quick Search mode there is no result. If the dictionary editors decided to list such variants, they should have realized that these are likely to be used as search terms and they should lead the user to the preferred format. This does not happen, unless the variant format is a cross-reference main entry, as is the case with "abbat" — a variant for "abbot."
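The fix suggested here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny ENTRIES table, the function names and the data layout are all hypothetical and say nothing about how the OED Online software actually stores its headwords. The idea is simply that every variant listed inside an entry is indexed alongside the headword, so that a Quick Search on either one leads to the preferred form.

```python
# Hypothetical miniature of a headword index; the layout is an assumption,
# not the actual OED Online data model.
ENTRIES = {
    "aesthetic": ["esthetic"],  # variant spelling listed inside the entry
    "abbot": ["abbat"],
}


def build_variant_index(entries):
    """Map every headword and every listed variant to its preferred headword."""
    index = {}
    for headword, variants in entries.items():
        # A real headword always wins over a variant with the same spelling.
        index[headword.lower()] = headword
        for variant in variants:
            # setdefault leaves an existing mapping untouched.
            index.setdefault(variant.lower(), headword)
    return index


def quick_search(term, index):
    """Return the preferred headword for a search term, or None if no match."""
    return index.get(term.strip().lower())


VARIANT_INDEX = build_variant_index(ENTRIES)
print(quick_search("esthetic", VARIANT_INDEX))  # aesthetic
print(quick_search("abbat", VARIANT_INDEX))     # abbot
```

With such an index, a variant would no longer need its own cross-reference main entry in order to be findable.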

Integration

Just as the Second Edition of OED integrated the First Edition and its four supplements, the online edition merges the Second Edition and the three volumes of Additions in a nicely integrated format. There are also 1,045 main entries in the Online OED that were created in the framework of the OED Revision Programme after the publication of the 3rd volume of Additions. These are listed separately when the search results are displayed, and can be invoked by pressing the New Edition button. They are considered draft or interim entries and include revised, enhanced and completely new words like "machohood" from the first range of revisions. These are to be added to the Third Edition, which is scheduled to be published by 2010, and will represent the complete revision of the entire OED.

Interestingly, the software refers to New Edition but in the introductory material the term Third Edition is used. The good news is that in every quarter about 1,000 entries will be added to the online version. Chief editor John Simpson provides details about the third edition at this URL: http://www.oed.com/public/guide/preface.htm.

At the online launch date in mid-March, the Revision Programme debuted with the beginning of the letter 'M'. It may seem odd to start at the middle of the alphabet, but because the revisions will not be published in print in themselves, there is no reason to stick to the beginning of the alphabet. Actually, these quarterly revision batches remind me of the original form of issuing the dictionary as fascicles, i.e. as serials. These revisions can be considered digital fascicles perfectly befitting the online version.

It is too early to evaluate the extent of the revisions based on the first 1,045 entries, but the sample I took suggests significant changes. Take, as an example, the original article about the word "magyar" and its revised version from the new edition. Beyond making the definition much more politically correct, the new edition makes a distinction between the British and the U.S. pronunciations, adds an important pronunciation note to the original and includes new quotations.

It is exceptional when the New Edition turns out to be politically less correct. This is the case with the entry about Mae West. The Second Edition entry provides this etymological note:

[f. the professional name of an American film actress and entertainer (1892-1980), with reference to her curvaceous figure (see quot. 1941).]

The New Edition gets somewhat more specific perhaps because some avid readers may have complained that the allusion was unclear in spite of the full-fledged quotations. Oddly, the New Edition replaced the original first quotation from Reader's Digest by another one of the same year (1940) from Listener that wondered about the etymological obscurity of the word.

Obsolete and Archaic Words

OED is the dictionary with the largest vocabulary. It is also unique in the sense that it has by far the most archaic and obsolete words (and some criticize it for this). OED was meant to be a historical dictionary to unearth the written words of the English language, so it's no wonder that it includes nearly 100,000 senses/meanings of words that are labeled obsolete and about 5,000 that are labeled archaic. (This number is not exact, as in searching it is not possible to distinguish words that are labeled "arch." for "archaic" and "Arch." for "architecture" because the search engine is not case sensitive. Of course, Murray could not foresee this when devising the intelligent but extremely complex system of notation, abbreviation and labeling). To me it is not a negative fact that there are so many words that are considered obsolete. To me this documents the richness of the birth, life and death of words.
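The counting problem described in the parenthesis above is easy to reproduce. The following Python sketch uses an invented list of labels (not real OED data) and a hypothetical helper, count_label, to show how a case-insensitive match lumps "arch." (archaic) together with "Arch." (architecture), while a case-sensitive match keeps them apart.

```python
import re

# Invented sample of sense labels, for illustration only.
LABELS = ["arch.", "Arch.", "Obs.", "arch.", "Arch."]


def count_label(labels, label, case_sensitive):
    """Count the labels that match `label` exactly, with or without case folding."""
    flags = 0 if case_sensitive else re.IGNORECASE
    # re.escape keeps the trailing period from acting as a wildcard.
    pattern = re.compile(re.escape(label), flags)
    return sum(1 for item in labels if pattern.fullmatch(item))


print(count_label(LABELS, "arch.", case_sensitive=False))  # 4
print(count_label(LABELS, "arch.", case_sensitive=True))   # 2
```

A case-sensitive option in the search engine would therefore be enough to separate the two labels; whether the underlying index preserves case is something only the developers can say.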

No doubt the OED is not a dictionary for casual use. It definitely requires some learning to understand the structure of the entries, the meaning of the symbols and abbreviations and the relationship between the entries. You would certainly have a hard time persuading a student to consult the OED for a paper just hours before it is due. OED is not meant for that purpose. But this dictionary has no peers when it comes to tracing back the origin of words or finding their most subtle meanings, including the archaic and obsolete meanings — as long as you are not limiting your readings to airport paperbacks.

Just recently I was arguing for using in an article the word "restaurator," meaning the person who restores the painting or other artwork to a state as close as possible to its original form after the object is damaged or has deteriorated seriously. None of the dictionaries had this meaning of the word, except for OED. Although it was clearly labeled as an obsolete word and I lost my argument, I was happy to find that indeed it was once used in this manner in English (and I know it is being used in other languages as well).

It cannot be overemphasized that OED gives full treatment to subordinate entries, their meanings and sub-meanings and shades of differences, including their definition, temporal, stylistic, regional labeling, and dating through the more than 2.5 million illustrative quotations. That explains the whopping size of OED, including the whopping size of obsolete words. A simple example may illustrate the breadth and sophistication of temporal labeling of words.

There are two main entries in OED for the verb "abate." One presents a strictly legal use with a single meaning, but the other lists 19 sub-senses, which seems excessive even for a logophile. Seven of the sub-meanings are labeled obsolete, including its special meanings in falconry and horsemanship. In one case only the non-legal meaning is obsolete. Three of the sub-meanings are labeled archaic, and one is archaic except for the legal meaning. It is clear even from the screenshots that the obsolete and archaic meanings are intertwined with the "contemporary" meanings, and quite often you need to scroll down several screens to find those.

Given the fact that this enormous collection is available in a fully marked-up SGML format invisible to the user, it would be possible to offer the option for the users to sort the results with the obsolete and archaic meanings listed last if they so desire. This would be similar to the option of instant sorting of the main entry index list by date as discussed in the software section later.
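As a rough illustration of such a display option, the Python sketch below sorts the senses of an entry so that current senses come first, archaic ones next and obsolete ones last. The Sense class, the simplified labels and the sample sense numbers are assumptions made for the example; they do not reflect the actual SGML markup of OED.

```python
from dataclasses import dataclass


@dataclass
class Sense:
    number: str  # sense number as printed in the entry
    label: str   # "", "arch." or "Obs." (hypothetical simplified labels)


# Rank current senses first, archaic next, obsolete last.
LABEL_RANK = {"": 0, "arch.": 1, "Obs.": 2}


def current_first(senses):
    """Stable sort: senses sharing a label keep their original relative order."""
    return sorted(senses, key=lambda sense: LABEL_RANK.get(sense.label, 0))


senses = [Sense("1", "Obs."), Sense("2", ""), Sense("3", "arch."), Sense("4", "")]
print([sense.number for sense in current_first(senses)])  # ['2', '4', '3', '1']
```

Because Python's sort is stable, the historical ordering within each group is preserved, which seems the least intrusive way to rearrange an entry on request.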

Neologisms

OED is also very impressive at the other end of the vocabulary spectrum when it comes to new or relatively new words and word formations. These words, understandably, are not labeled as neologisms, as what constituted a new word in the First Edition or the Supplements would not be new now. Searching by date of quotation could make it possible to find words that came into use, say, in the 1990s if the first quotation dates were indexed in a separate index rather than combined in the index with all the other quotation dates. In lieu of such a valuable index, when you search for words with a quotation date of 1996 the list will include words that are used in quotation samples of, say, 1486, 1712, 1823 and 1996.
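The difference between the two kinds of index can be shown with a small Python sketch. The headwords and years below are invented placeholders and the function name is hypothetical; the point is only that an index keyed on the earliest quotation year answers the question "which words first appeared in 1996?", whereas an index of all quotation years cannot.

```python
from collections import defaultdict

# Invented headwords and quotation years, purely for illustration.
QUOTATION_YEARS = {
    "word_a": [1486, 1712, 1823, 1996],
    "word_b": [1996, 1998],
    "word_c": [1979, 1996],
}


def build_indexes(quotation_years):
    """Return (any_quotation_index, first_quotation_index), both keyed by year."""
    any_year = defaultdict(set)
    first_year = defaultdict(set)
    for headword, years in quotation_years.items():
        if not years:
            continue  # an entry without quotations cannot be dated
        for year in years:
            any_year[year].add(headword)
        # Only the earliest quotation goes into the separate index.
        first_year[min(years)].add(headword)
    return any_year, first_year


any_year, first_year = build_indexes(QUOTATION_YEARS)
print(sorted(any_year[1996]))    # ['word_a', 'word_b', 'word_c']
print(sorted(first_year[1996]))  # ['word_b']
```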

My battery of nearly 150 test words helped in spot-checking for neologisms. OED fared as well across the board as the most comparable current dictionary, the Random House Webster's Unabridged Dictionary. Both OED and RHWUD included "carjacking," "chill out," "cyberspace," "DAT" [for digital audio tape], "libber," "outsource," "perestroika," "roadkill" and "trophy wife." For some words, such as "chick flick," "DVD" and "ebonics," neither dictionary had a definition. It is true that for some neologisms, such as "netizen," "phat" and "ethnic cleansing," RHWUD had a definition and OED did not. On the other hand, only OED had a definition and illustrative quote for "mezzanine funding/financing," "mad cow disease" and a number of other words where RHWUD drew a blank. There was even a relatively new word for which only OED had a definition among all the dictionaries that I checked. The word is "affluenza" and it was a title word in the March 20th issue of U.S. News and World Report. The engine of the impressive OneLook site did not retrieve this word from any of the 599 dictionaries it searches; neither the American Heritage Dictionary nor the Random House Webster's Unabridged Dictionary had it. OED has not only a definition for "affluenza" (albeit one that does not indicate the jocular nature of the term), but five perfectly documented quotations from very credible and widely read sources, proving that the word has been in use at least since 1979.

Loanwords and Etymologies

Loanwords best illustrate the depth of etymological research that goes into tracing the origin and evolution of words included in OED. Sometimes the etymology part is more detailed than an encyclopedic entry. This is the case, for example, with the entry on the word "maelstrom", which gives a lengthy but understandable explanation for the transformation of the word, including a critical note about the misconception regarding its Faroese origin.

When it comes to inclusion of and information about words borrowed from other languages that maintain their form and spelling completely or almost completely, as is the case with "paparazzo," "spiel," "Schadenfreude" or "csardas," OED beats all the other monolingual dictionaries. Again, beyond providing the etymology and the definition of the words, OED clarifies the senses by offering many quotations. With that said, there still seem to be deficiencies when it comes to loan words. One is the somewhat lopsided selection of terms that struck me especially with regard to the Hawaiian and the Japanese languages. Although I don't know either, having lived in Hawaii for a time I have first-hand impressions about the contemporary use of Hawaiian and Japanese loan words in English language publications. It does not seem realistic that while there are only 29 terms (in 66 quotations) identified as of Hawaiian origin, there are more than 300 words (used in 371 quotations) of Japanese origin.

I think it has more to do with the very successful OED Reading Programme in Japan (which supplied plethoric quotations to the editors) than with the frequency of actual use of those words. If the U.S. Reading Programme had included some additional novels or published letters of such best-selling (hence widely read) authors as Mark Twain, Robert Louis Stevenson, Jack London, James A. Michener, Somerset Maugham and Paul Theroux, many more Hawaiian words would have made it to OED with first-class literary warrant. I am not lamenting the omission of words mostly known to and used by people living on the islands, but words that appear in mainland publications and certainly have literary warrant, such as "haku" (the flower headdress), "keiki" (children), "pupus" (hors d'oeuvres) or "mahalo" (thank you).

It is also odd that while the OED correctly points out that the abbreviated form of Japanese (both as a noun and as an adjective) has a strong derogatory connotation and is now falling into disuse, the dictionary uses exactly this derogatory form in the etymological labeling of the words. The Library of Congress had the good taste to use "jpn" for both language and country codes in its MARC records, even though it is a divergence from its implicit rule of using the first three letters of the English version of the name of languages or countries unless two languages/countries would produce the same code like "India" and "Indonesia."

Hungarian, the language I know best, does not get its fair share for its contribution to English in OED either. While this may bother hardly any other readers of this column, it is not merely a chauvinistic plug on my part; the same deficiencies may apply to many other languages that lent words to English. One issue is that the words that are listed as of Hungarian origin are spelled in a Germanized version such as "Dobos torte" instead of "Dobos torta" (a cake), "palatschinken" instead of "palacsinta" (another trademark Hungarian food), "hussar" instead of "huszár" (a horseman of the light cavalry). Worse is the treatment of some other Hungarian words that are not even labeled as Hungarian, such as the Anglo-German spelling of the correct "gulyás" or "cigány". Ironically, in the latter case the entry makes it clear that the main entry spelling is imperfect ("a better spelling would be tsigan"), and all the variants listed came from the Hungarian original version. Still, the main entry is this touristy version of the word. The ultimate insult is that the main entry for "paprikahuhn" defines it as an "Austrian dish, perhaps of Hungarian origin." It is like defining pizza as an "American fast food perhaps of Italian origin". I think some large public libraries in New Jersey would boycott OED for such an entry.

Such language searches may not be perfect because the etymological labeling has not been consistent in OED, to say the least. Partly, some of the ignorant travel diaries used for OED are to be blamed for the etymological gaffes, but it should be a matter of editorial common sense to recognize that there is a discrepancy between the definition and the majority of the quotations in most of the examples quoted above. Clearly, the word "paprika" should have been labeled as of Hungarian origin as attested by the quotations.

Occasionally, OED fails to provide an explanation of the origin of the word. This is the case with "paparazzo". OED does not mention that it originates from Fellini's La Dolce Vita where the name of a rather sleazy photographer is Paparazzo, i.e. the word is an eponym.

Beyond the omission of obvious etymological labels, their massive inconsistency must be criticized, especially because they would be easy to correct in the digital version. It is one thing to reproduce the variations of the language as they appeared in print. It's quite a different thing to use 4-6 variants for the etymological label; this is an editorial matter and should be handled like uniform titles in book cataloging. If the entries are not to be changed, at least the language variants known to the editors should be automatically OR-ed behind the scenes when any of them is used in searching the etymology index.

It is excessive how many spelling versions there are beyond the preferred Anglo-French (AF. AFr. Anglo Fr. Anglo-Fr. AngloFr.) but it may be chalked up to the fact that there are so many such words (over 1,700), entered at various times in the history of OED. This excuse would not hold true for such languages as Algonquin (close to 60 words), which also appears as Algonkin, Algonquian and Algonquin Indian. True, a truncated search in the etymology index like Alg* would retrieve all of these, but it is below the standard of OED. In the case of Béarnese (fewer than a dozen words) and Faeroese (about 50 words) the inconsistency becomes absurd: there are etymological labels for Bearnese, Béarn., Béarnais, and Faer., Faroese, Faerose, Faröe., Faeroic and Feroic. Frankly, no heroic action is needed to consolidate these with a global replace command.
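Both remedies, consolidating the labels and OR-ing the known variants behind the scenes, can be sketched with a simple lookup table in Python. The variant spellings are the ones quoted above; the choice of canonical forms, the table itself and the function names are my assumptions for the sake of illustration, not a description of the OED software.

```python
# Canonical label -> variant spellings quoted in this review.
# Which form counts as canonical is an assumption made for this sketch.
LABEL_VARIANTS = {
    "Anglo-French": ["AF.", "AFr.", "Anglo Fr.", "Anglo-Fr.", "AngloFr."],
    "Algonquin": ["Algonkin", "Algonquian", "Algonquin Indian"],
    "Béarnese": ["Bearnese", "Béarn.", "Béarnais"],
    "Faeroese": ["Faer.", "Faroese", "Faerose", "Faröe.", "Faeroic", "Feroic"],
}

# Reverse lookup from any known spelling (case-folded) to the canonical label.
CANONICAL = {}
for canonical, variants in LABEL_VARIANTS.items():
    CANONICAL[canonical.casefold()] = canonical
    for variant in variants:
        CANONICAL[variant.casefold()] = canonical


def normalize(label):
    """Return the canonical label, or the label unchanged if it is unknown."""
    return CANONICAL.get(label.strip().casefold(), label)


def expand_query(label):
    """List every spelling to OR together when searching the etymology index."""
    canonical = normalize(label)
    return [canonical] + LABEL_VARIANTS.get(canonical, [])


print(normalize("Feroic"))   # Faeroese
print(expand_query("AFr."))  # the canonical label followed by its five variants
```

The normalize function corresponds to the global replace, and expand_query to the gentler alternative that leaves the entries untouched.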

Some of the labels are really obscure. It is hard to understand why a label like "nosology" is used, especially in cases where "psychology" would have been natural as, for example, for "megalomania". There are very few entries where this word would be used in the definition or the etymology parts but even one seems to be one too many. I don't think that "nosography," "nosological," "nosologist" and "nosology" deserve main entry status, and the latter one receives two main entries, one with several sub-senses, while the other is a cross-reference to noseology.

Definitions and Quotations

The most impressive aspect of OED to me is the incredible richness of the definitions and their beautiful illustrations with more than 2.5 million quotations. It was already demonstrated through the seemingly simple verb "abate" how many shades of meaning OED unearths and documents. There are words that have several hundred sub-meanings and illustrative quotations, such as the longest one, "set," or the verbs "take," "make," "get" and "run." There are separate entries for homonyms with unrelated meanings. The meaning of "bimbo" for a punch made with Cognac brandy instead of arrack has its own entry, and so does the disparaging meaning, which has a submeaning for men and another for women.

The variety of meanings and shades of meanings and their definitions is breathtaking. The word "avoidance" has seven meanings and two of those have two sub-meanings. The Random House Webster's Unabridged Dictionary provides merely two meanings of the word, the legal and the social ones.

OED also provides the definition for the anthropological meaning of the word. Often there are shades of meaning within a sub-meaning of a word, such as the slang use of the verb "dig", which are lavishly illustrated from a variety of sources.

It is a valuable practice of OED that it also warns of the incorrect use and meaning of words, such as using "beautify" for "beatify" as in the quote "beautified by Pope Paul." These are clearly marked, although labeled with the rather obscure abbreviation "catachr." that stands for the somewhat archaic "catachrestical" derived from "catachresis". Sometimes a definitely incorrect spelling gets labeled as an obsolete variant instead of as a catachrestical use, as is the case for "concensus" for consensus. It is another issue that the phraseology used in warning about incorrect use is very inconsistent, varying from "misused for," to "incorrectly used for," to "catachrestically used for." Nevertheless, the illustrative quotes can be really funny, such as the first one misusing "prostitute" for "prostrate".

It is not merely the length of the definition that matters; capturing the essence or the allusions of a word is what makes the difference. This can be seen, for example, in the definition of "carpetbagger" in OED versus RHWUD. While RHWUD merely registers the meaning of "dulcinea" as a sweetheart, OED gives the reference to Don Quixote and in the etymological section correctly notes that the word is a derivative of the Spanish word for sweet. No doubt sometimes OED shies away from clarifying the origin of a word, as is the case with the slang term "futz", which the Merriam-Webster College edition rather credibly traces back and explains.

Occasionally the definition in OED is also used to informally but significantly antedate the first quotation. While the definition of "glasnost" in itself is very informative, the real revelation is in the substantial small print. It advises the reader that the word appeared in Russian dictionaries of the 18th century in its general sense of publicity and in Solzhenitsyn's famous open letter 15 years before Gorbachev made it a household term and 12 years before the New York Times used it first. Given the breadth of OED it is no wonder that it is the only dictionary that registers with the correct "jocular" function label the adjective and noun "glasnostic" used by the New York Times, the Christian Science Monitor and the Chicago Tribune.

While most of the largest dictionaries define the contemporary meaning of "hacker" well, only OED gives the old American meaning of the word ["a tool for making an . . . incision in a tree as a channel for the passage of sap, gum, or resin"], which to me sheds new light on the possible origin of the term "computer hacker."

In most dictionaries some words are labeled offensive or disparaging. Even guidebooks warn against using some words that your practical experience tells you are okay to use. I recall that "Siam" and "Siamese" are treated like that (allowing for Siamese cat and Siamese twins) in most guidebooks. In Thailand you can see the word proudly used in English language brochures, on billboards and on the neon signs of hotels across the country. The same applies to the word "Hun." I don't know any Hungarian for whom it would be offensive from the lips of an English-speaking person. Still, most dictionaries label it offensive. "Hun" is used in Hungarian and can even take on a poetic sense. (It's also used in many company names.) Most dictionaries indicate, too, that the word "Hun" can also be used in a pejorative sense for German soldiers. I never found an explanation for this until I looked up the term in OED, which solved the mystery by tracing it back to a take-no-prisoners speech by Emperor Wilhelm II in 1900 to German troops.

Another perfect example for the depth of OED was the widely publicized tracing down of the origin of the word "nacho" by an enthusiastic OED contributor, to be seen at http://www.oed.com/public/news/9907.htm.

One of the beauties of OED is that if the definition should fall short a bit, the quotations help out. We can see that in the case of "pheromone" where two quotations suggest its importance also in human communication, which is missed in the definition and quotations of all other dictionaries. Similarly, the definition of "imprinting" in OED misses the sense of the word that refers to a newborn animal's attachment to any species based on conditioning at and after birth. But if the first and most authentic quote by Konrad Lorenz himself, who coined the German word and its English equivalent, does not drive the point home, the second by Huxley certainly will.

OED consistently stays away from the prescriptive attitude to usage and prefers the descriptive approach. It applies negative usage labels quite rarely or adds an escape label for those who err. This can be seen with undoubtedly incorrect words such as "irregardless", which is labeled as non-standard or humorous. It tells you something about the non-judgmental attitude of OED that only about 20 words are labeled as non-standard or sub-standard.

In the past, Oxford University Press has burnt its fingers with politically sensitive words in some of its other dictionaries. The editors now hold to a long-established policy of not including geographic names and their definitions in OED, thus avoiding definitions that may not please all readers about places like Kashmir, Pakistan or Israel. When words like Kashmir are used attributively they are included, but the entries swiftly gloss over risky parts that have, even very recently, resulted in censuring, boycotting, banning and burning encyclopedias and other dictionaries by governments (and of course by individuals) that felt offended. It is surprising to see that in the entry about Palestinian, OED volunteers a relatively long definition note that this work rarely uses in entries for derivatives of geographic names. This may make trouble for the dictionary.

In the territory of another controversial issue such diplomacy cannot be applied. A decision must be made regarding the inclusion of coarse, vulgar and obscene words. However, the treatment of those words can show diplomacy. The OED First Edition ignored most of the obscene words, used Latin terms for unavoidable anatomical words and derivatives, and gave terse definitions, moving on to the next clean word as quickly as possible. The Second Edition and the Additions broke with this policy and included practically all of the commonly known four-letter words appropriately, and certainly to the disgust of Mr. Bowdler and his followers. (The ever sarcastic Samuel Johnson characterized this bowdlerism very well when he remarked to the ladies who praised him for omitting vulgar words from his dictionary: "My ladies, then you must have been looking for them.")

I will spare you examples here, but suffice it to say that while OED still likes to use the Latin terms and to beat around the bush in the definitions, it is generous with the inclusion of the words themselves and with the illustrative quotes as well, without the excessive labeling seen in the Encarta English Dictionary, which uses the label "taboo" for words that you can hear in kindergartens.

In some cases, no OED definitions are given in a word entry. This avoids redundancy when the quotations from respected sources already provide definitions — as is the case with "bionics".

Some of OED's meanings in the scientific, technical and medical sciences may not be as up-to-date as those in the American Heritage Dictionary, the various Random House unabridged dictionaries, or the 10th edition of Merriam-Webster Collegiate Dictionary, which have all been continuously updated. OED has always been meant to focus on the words of the literary world. It is quite telling to look at the bibliography of the First Edition. It clearly shows that literary works were used almost exclusively as sources for its quotations and thereby implicitly defined the perimeters of the vocabulary. Undoubtedly, the Supplements corrected this bias significantly, relying on a large variety of non-literary works. General newspapers, magazines, published letters and even transcripts of television programs have been increasingly monitored for words and their meanings by the editorial staff and the volunteers in the Reading Programme. As these sources increasingly cover science, psychology, sociology, technology and medicine a significant number of such words have appeared in the Additions series. Still, OED does not want to compete with, say, the Dictionary of Science and Technology of McGraw-Hill.

With all this said, the OED fared very well in my test words in science, technology and medicine. It is the only dictionary I tested that provides a definition and quotes for "e-mail" as a verb, and the only one that defines "presbyotic", which hard-of-hearing patients may mistake for "presbyopic." The definition in the OED may not be uplifting for those diagnosed with presbyopia but it is informative, and may even light a lamp in the reader's mind as to what the name "presbyter" alludes to.

For my test words, OED also had the best definitions for mad cow disease and the computer technology aspect of "daisy-chaining," with excellent illustrative quotations. OED listed the variant meanings of the acronym "CAT" with supportive quotations. As mentioned earlier, OED is the only one of the major dictionaries that alludes to the fact that "pheromone" may play a role in human meta-communication. Although the definition limits "pheromone" to the animal world, two of the quotations in OED bring up the human side of the pheromone phenomenon. On the negative side, OED did not have an entry for carpal tunnel syndrome.

OED is definitely a descriptive dictionary — not a prescriptive one — and this shows in its well balanced, non-judgmental definitions for sensitive terms that are favorite targets of advocacy groups in off-season periods. If you look at the definition of "homosexual" or "abortion" they are exemplary in their neutrality. The latter also deserves praise for mentioning the restricted medical use of the word, although it does not mention the highly controversial partial birth abortion.

The definition of "educationese" is a typical example of OED's diplomatic approach, defining it as a "term for the jargon-ridden language supposedly characteristic of educational administrators." Supposedly? The same approach can be seen in another term, "Haigspeak", which you will find only in OED. The definition for "scientology" puts in a qualifying word "claiming" in the definition for good measure (to put it mildly — so that I don't get sued).

In a very few cases the definitions are a little vague or wax lexicographical when a good example would have been perfect. You can sense the joy and pride of the lexicographer who did the definition for the word "whatever" — finally able to unleash the desire to talk shop in describing it as a pronoun that introduces "a qualifying dependent clause equivalent to a conditional or disjunctive clause, often with verb in subjunctive." The definition note rubs it in, explaining that as "predicate sometimes (esp. of persons) expressing quality or character, and thus approaching a pred. adj. (cf. WHAT A. 17). Often with ellipsis ..." Whatever.

The quotations sometimes can turn out to be confusing. Under the 11th sense of the meaning of the verb "accommodate" in some of the quotes the verb is spelled with a single "m" without any indication that this may have been a variant spelling during a certain time period. But there are few spelling errors considering the size of this dictionary. Among my test words, the misspelled version "millenium" was the worst with seven "hits," especially because three of them were in the millennium entry, which could confuse the reader about the correct spelling. Despite that, running the test in the index created from the definition parts of the entries showed what an excellent proofreading job was done on OED.

On the other hand the lack of use of an authority file for language names, titles of works and personal names of authors is very bothersome. The variety of language names is ugly and hinders finding all of the words of, say, Anglo-French origin. The variations are listed in the help file but it would have been much better to standardize them a posteriori at least — a relatively easy job with the software that was available for the editors.

Searching by name of quoted authors is difficult because of the partly identical format of two or three authors' names. The search for R. Chandler retrieves both Raymond Chandler and Robert Chandler. "A. Miller" is used not only for Arthur Miller, but is also part of G. A. Miller and W. A. Miller. While it is easy to get only the results for the latter two, the search for the playwright retrieves entries that quote the other two persons. The quotation date certainly helps in this case, but not in those cases where the quoted authors are contemporaries.

OED has been criticized for relying too much on British literary works for quotations. This was true for the first edition but changed a lot in the preparation of the Supplements and the Additions. Best-selling American novels, novelists like Philip Roth, Bernard Malamud and John Updike, literary magazines, news magazines and newspapers alike are sumptuously quoted. It is interesting to see how much more preferred a source Newsweek is with 408 quotations than is U.S. News & World Report with 54 quotations (unless I missed a lot by not figuring out all the possible abbreviations). The New York Times yields more than 2,700 quotations (43 from the New Edition) while the Washington Post has only 1,190 quotations (merely 8 from the New Edition). It is not logical that while the New York Times is abbreviated rather consistently as N.Y. Times, the Los Angeles Times is spelled out except for one quotation.

THE SOFTWARE

The OED software does justice to the marvelous content. It is powerful and graceful, and it reproduces the full character set of the printed OED without the need for plug-ins or Java applets. It offers sophisticated options in an intuitive way. Still, there is some room for improvement, especially in the area of the default Quick Search mode and customization.

Interface and Navigation

The interface is exemplary in every regard with the seeming exception of screen density. The screen layout is as good as it gets considering the unparalleled amount of text that comes with the typical entry. Screen density is way above the recommended 30-40 percent of the "screen estate," but the primary purpose in this case must have been to accommodate as much as possible of the content of an entry. A typical screen showing the main entry that I borrowed from the excellently structured and illustrated help file serves as a good starting point for discussing software issues.

The solution of using buttons to display different parts of the entry is the best that I have ever seen for user-defined display format. It is quick, elegant, intuitive and convenient. I use the entry for the word "hunk" to illustrate how one can easily change the content of the display of an entry. The shortest format used by default merely includes the headwords, the part of speech code, the sense number, the geographic and function label and the definition. There are five buttons right above the definition. Sometimes there are fewer buttons when there is no pronunciation provided, or more if there is a revised entry for the term. Pressing a button will display the selected element of the entry, such as the pronunciation, the spelling variants, the etymology (which is, in this case, a see reference) or the quotations section. The date chart in OED is an attractive feature that shows on a graph the distribution of the quotes across the time period for the term. These buttons change the display format immediately and are a great convenience.

While the typical full main entry requires several screens for displaying all of the parts — and all of the senses/meanings — there are entries that are short cross-references. Some of these are easy to understand but others use unfamiliar and inconsistent abbreviations that are probably not clear to all the users. The cross-reference from "kist" (as in "sunkist") informs the users that this is an "occas. pa. tense and pa. pple. of KISS" — which may make them tense. As the display is otherwise empty it would be quite useful to show the decoded form of these abbreviations at least in case of cross-references.

The left pane of the window accommodates a variety of functions. Its primary purpose is to show from the index those entries that are before and after the term the user searched for. For practical reasons navigation is in chunks of about 25 terms. This pane is also used to display the map of the entry that is discussed below.

Quick Search

There are two search modes: Quick Search and Full Text Search. The former is the default search represented by a query cell and the Find Word icon, which is somewhat of a misnomer as it finds only headwords in main entries and subordinate entries. The search is exact, i.e. there is no automatic singularization or pluralization, let alone automatic stemming. Searching for "nachos" will not find the lovely entry for nacho showing the likely origin of the word from the nickname of the person who first cooked up nachos and served them. The search term "signaling" will not find the main entry signal. Irregular plurals like "women" or Greek plural forms like "criteria" will not retrieve the main entry under the singular heading, and this may be a stumbling block for a novice who doesn't realize this limitation in the Quick Search mode. It would be an important improvement to extend the search to the regular plural form and to the Greek and Latin plurals automatically. The index used in this mode of searching should also include all the variant forms listed in the entry. They are listed in the entries because they are or were in use and therefore are likely to be used also as search terms.

The British and American spelling variations should also be taken care of automatically, as is the case in the InfoSeek search engine, which retrieves both encyclopaedia and encyclopedia or archeology and archaeology no matter which format is used as a search term. The only way this can happen in OED is when the headword includes both variants. This is, however, surprisingly inconsistent. While both formats appear in the headword for encyclopedia, only the British format is used for pediatric without acknowledging the U.S. format anywhere in the record. However, because it is a cross-reference the user is correctly taken to the British headword "paediatric." "Archeology" is not even listed as a variant in the main entry "archaeology" and therefore the Quick Search does not find the entry. The inconsistency of displaying the American spelling as alternate headword becomes really odd sometimes, as in the case of the word "vaporware," which was born in the U.S.A. OED admits this fact up front and uses three quotes from U.S. sources and only one from Punch. Still, only the British version "vapourware" appears as the headword, so the user would not find this word through Quick Search. Even when an entry makes it clear that there are different (and implicitly valid) spelling formats of a word, if the users enter one of the variants, they may not find the word, as is the case with the search term "perestroyka". The variants should be included in the index used by the Quick Search mode.
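The suggestion of feeding the variant forms into the Quick Search index can be pictured with a small sketch. It is only an illustration under stated assumptions: the function names, the sample entries and the plural fallback (stripping a final "s") are hypothetical and say nothing about how the OED software is actually built.

```python
def build_quick_index(entries):
    """Map every headword and every recorded variant to its headword.

    `entries` is a hypothetical structure: headword -> list of variant
    spellings (or irregular plurals) that the entry itself records.
    """
    index = {}
    for headword, variants in entries.items():
        for form in [headword, *variants]:
            # Keep the first headword registered for a given form.
            index.setdefault(form.lower(), headword)
    return index


def quick_search(index, term):
    """Exact lookup first, then a naive regular-plural fallback."""
    key = term.lower()
    if key in index:
        return index[key]
    # Assumed rule for illustration: only a plain trailing "s" is removed.
    if key.endswith("s") and key[:-1] in index:
        return index[key[:-1]]
    return None


# Illustrative sample only; not taken from the OED data.
SAMPLE_ENTRIES = {
    "vapourware": ["vaporware"],
    "archaeology": ["archeology"],
    "criterion": ["criteria"],
    "nacho": [],
}
QUICK_INDEX = build_quick_index(SAMPLE_ENTRIES)
```

With such an index, looking up "vaporware", "Archeology" or "nachos" would lead to the British or singular headword instead of a dead end.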

On the positive side, explicit stemming for either an unlimited or a limited number of characters is possible. The '?' symbol will retrieve a single character while the '*' character will retrieve any number of characters after the stem. This is a routine feature, but it is not routine that you may apply left-hand truncation as well to retrieve all the words ending in a character string, e.g. "*phobia" will find all the entries about the daily growing number of phobias from acrophobia to zoophobia.

These characters may also be used for masking a specified number or an unlimited number of characters within a search string. This comes in handy when you are not sure of the spelling of a word. The search term "acr*ous" will find "acrimonious" among other words, and you don't have to sweat to look up "otorhinolaryngology." It will show up among the four words that start with "oto" and end in "gy" if you just type "oto*gy." The '?' symbol requires the presence of a character that may be any character (such as in "organization" versus "organisation"), e.g. "vapo?ri*" will retrieve only "vapouring," "vapourish," "vapourized" and "vapourizer" but not such words as "vaporizer," "vaporizing," or "vaporiferous" because the "?" symbol requires that one character be present between the "o" and the "r". The '*' character stands for any number of characters or no characters at all, so "*p*dia" finds words ending either in "pedia" or "paedia," thus saving you the trouble caused by such inconsistencies as "logopedia" being spelled with an "e" while "hypnopaedia" is spelled with "ae."
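The behavior of the two symbols can be restated as a tiny pattern translator. This is a minimal sketch in Python of the semantics described here ('?' is exactly one character, '*' is zero or more); the function names and the short headword list are mine, chosen for illustration, and nothing is implied about the actual OED implementation.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a '?' / '*' wildcard pattern into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")      # exactly one character, any character
        elif ch == "*":
            parts.append(".*")     # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


def wildcard_search(pattern, headwords):
    """Return the headwords that match the whole pattern, in list order."""
    rx = wildcard_to_regex(pattern)
    return [w for w in headwords if rx.fullmatch(w)]


# A few headwords mentioned in this review, used as a toy index.
HEADWORDS = [
    "acrophobia", "hypnopaedia", "logopedia", "otorhinolaryngology",
    "vaporiferous", "vaporizer", "vapouring", "vapourish", "vapourizer",
    "zoophobia",
]

print(wildcard_search("vapo?ri*", HEADWORDS))
print(wildcard_search("*p*dia", HEADWORDS))
print(wildcard_search("*phobia", HEADWORDS))
```

Matching against the whole headword (rather than any substring) is what makes left-hand truncation meaningful: a leading '*' has to be typed explicitly to reach words that merely end in the string.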

Equally convenient in the Quick Search mode is the fact that you don't need to know if a word is spelled with or without a hyphen or a space, i.e. in searching "call-girl," "callgirl" or "call girl," the software will find all the three variants. This is not true in the Full Text mode.

The results display of the Quick Search depends on whether one or several matches were found. In the former case the entry is immediately displayed. If there is an entry both in the New Edition and the Second Edition, the former is displayed first. In case there are two or more matches in the basic index (which is created from headwords as main entries and subordinate entries), a part of the index is listed in the left pane, highlighting the search term among the words in front of it and following it. You may scroll up and down in the index. Clicking on an index term will display the entry.

The content of the entry depends on what options were chosen in displaying the last entry because these options are persistent. For example, if you have previously chosen to display the Etymology segment of a main entry, Etymology will continue being displayed until you disable it. Changing the content of the entries displayed is swift and intuitive, as was illustrated earlier using the entry on "hunk." The matching terms are highlighted in bright red wherever they appear. For example, the search string "J. Lennon" in the quoted author field will make that name appear in red when the record on the word "vibe" is displayed, including a quote from John Lennon. The index entries are displayed by default alphabetically, but at the press of a button they are listed in date order. Re-sorting is instantaneous and provides an interesting snapshot for showing words that appeared first in print in a given year, i.e. words that are of the same vintage year.

The left pane is also used to display the map of the entry. Although the Arabic (and occasionally also Roman) numerals and the letters per se don't mean anything, except to indicate the variety of meanings of a word at a glance, they serve a very useful purpose. Entries often have references to a specific meaning within another, possibly very long, entry. On the map the exact location of the match is marked with a red dot. There may be more than one match within one entry. Clicking on the red dot marker will position on the screen the appropriate segment from the entry, so you don't have to scroll screen after screen to find the referenced part of the entry.

Beyond being displayed, the entries may be printed in a format that makes perfect use of the standard size (8.5 by 11 inch) paper in producing a much better output than merely printing the screen via the browser. You may also send an e-mail to someone who will be able to look at a full entry by clicking on the link embedded in the mail automatically. This link will automatically expire in 3 days, and does not allow the recipient any navigation in OED, of course. Still, it is a great idea when you want to illustrate the meaning of a word — and you may add a note to the e-mail. It is the perfect compromise that OUP could make.

Full Text Search

The software shines in this area both functionally and in terms of elegant and intuitive interface design. Searching the full text globally is done via a small query form that also offers a variety of field-specific searches through a pull-down menu. It allows the users to restrict the search to the definition, etymology and quotation areas of the entries. As for the quotation area, the options are generous, offering to search anywhere in the quotation area or only in specific subfields within quotations, such as the date, author, work and the text of the actual quotation. The sample searches below clearly show the differences when searching the word "purgatory" in the various segments:

The results list shows the name of the entry in which the search term appears. It also shows the date of the earliest quotation for the main entry term — not the search term. If you click on the underlined entry name you get to the top of the entry for "long-haired." If you click on the underlined match the entry is displayed showing the matching part directly. In case of the full text search for the word "archeology" there are 5 matches (because it is usually spelled as archaeology). The results list shows that one of the headwords that has a quotation that includes the word archeology is "long-haired." The year next to it indicates that the first quotation for "long-haired" is from 1552. It is followed by a segment from the quotation area where the match was found which is similar to the form used by the Keyword in Context (KWIC) indexes. The KWIC-segment provides sufficient information to decide if it is a relevant match or not. In case of searching the quotation index or the date, author, title or text quoted areas, the date of the quotations found will appear, in our case by Joseph Heller.

The results list is displayed by default in alphabetical order of the main entry words. If the search term occurs more than once in the same entry, the entry name is listed with each match. When there is no value in the date column on the result list for one or more hits it means that the search term was found in the definition or etymological area or was a cross reference entry. The entries can be sorted by date as well, very quickly, and the software is smart enough to ignore the codes that precede the year. These codes qualify the year (the letter "c" stands for "circa" and an estimated year, the letter "a" is the code for "ante," indicating that the word was used before the year given, but no literary example could be found). The user may choose how many items should appear on the result list at once, ranging from 10 to 100 in increments of ten.
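Sorting by date while ignoring the qualifying codes amounts to pulling the digits out of the date label. A hedged sketch follows; the function name and the sample rows are hypothetical (only the 1552 date for "long-haired" comes from this review), and placing undated hits last is my own assumption, not documented behavior.

```python
import re


def year_key(date_label):
    """Sort key that ignores a leading 'c' (circa) or 'a' (ante) code.

    Rows with no date (for example hits in a definition or etymology)
    are sorted after all dated rows; that placement is an assumption.
    """
    match = re.search(r"\d+", date_label or "")
    if match is None:
        return (1, 0)
    return (0, int(match.group()))


# Hypothetical result rows: (entry name, date label as displayed).
rows = [
    ("long-haired", "1552"),
    ("entry B", "c1386"),
    ("entry C", ""),
    ("entry D", "a1400"),
]

by_date = sorted(rows, key=lambda row: year_key(row[1]))
for name, label in by_date:
    print(label or "(no date)", name)
```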

The software offers proximity and positional operations between words that can refine the precision of the search. You may specify how far and in what order the two search terms should be to qualify. For example, requiring that the word "ulysses" in the quotations field follow the word "joyce" 1 word apart will bring up 1,282 items. The distance between the terms may be 1, 2, 5 or 10 words in the same section (quotation, definition or etymology). Alternatively, you may specify the words to be anywhere in the same section. The positional operators include before, after or before/after.
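The proximity and positional options can be illustrated with a short sketch. It assumes that "N words apart" means a difference of at most N in word position within one section, which is my reading and may not match the exact counting rule of the OED software; the function names and the sample strings are invented for the example.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within(text, first, second, max_distance, ordered=True):
    """Check whether `second` occurs near `first` in `text`.

    ordered=True  : `second` must come after `first` (the "after" operator).
    ordered=False : either order is accepted (the "before/after" operator).
    Distance is the difference in word positions (an assumption).
    """
    words = tokenize(text)
    first_pos = [i for i, w in enumerate(words) if w == first.lower()]
    second_pos = [i for i, w in enumerate(words) if w == second.lower()]
    for i in first_pos:
        for j in second_pos:
            gap = j - i
            if ordered and 0 < gap <= max_distance:
                return True
            if not ordered and 0 < abs(gap) <= max_distance:
                return True
    return False


# Invented sample text, shaped like a quotation header.
print(within("1922 Joyce Ulysses 41", "joyce", "ulysses", 1))
```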

The truncation operation works the same as described under the Quick Search function, and so do the display, print and e-mail functions. However, in Full Text Search the hyphens and spaces are treated differently. It is a nice feature that in the Quick Search mode the software ignores hyphens and spaces and when you enter the term "car-jacking" it finds "carjacking" the form used as the main entry, and would find "car-jacking" and also "car jacking" if these were the headwords. This, however, does not apply to the Full Text Search mode where "car-jacking" finds the entries that have a quotation using the "car-jacking" or "car jacking" variant but not the ones using only "carjacking." The process should be identical.
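One way to make the two modes behave identically would be to fold hyphens and spaces away both when a term is indexed and when it is queried. The sketch below shows the idea at the level of single terms only; the names are hypothetical, and a real full text index would need to apply the same folding to adjacent words at indexing time.

```python
def fold(term):
    """Lowercase a term and drop hyphens and spaces."""
    return "".join(ch for ch in term.lower() if ch not in "- ")


def build_folded_index(forms):
    """Group the spellings found in the text under their folded key."""
    index = {}
    for form in forms:
        index.setdefault(fold(form), set()).add(form)
    return index


# Spellings discussed in this review, treated as if found in quotations.
FOLDED = build_folded_index(["carjacking", "car-jacking", "car jacking", "call-girl"])


def lookup(term):
    """Return every stored spelling that folds to the same key as `term`."""
    return sorted(FOLDED.get(fold(term), set()))


print(lookup("car-jacking"))
```

Because the query is folded the same way as the indexed forms, any of the three spellings retrieves all three.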

Suggested software enhancements

A few enhancements to the software would bring out the best of OED. Extending the Quick Search to spelling variants identified in the entry is a top priority. It is hard to accept that the dictionary is aware of the alternate spellings but retrieves them only in the Full Text Search mode. The software should go one step further and recognize common misspellings even when they are not listed in the entry. This delightful feature of intelligent software is found in the fee-based version of Encyclopaedia Britannica and the awesome and free GuruNet utility that I will review soon. To illustrate, I went to GuruNet for a quick search for the misspelled word "wensday", and a full text search for the misspelled word "crucifiction".

Such misspellings are not fiction but real problems for many users who are likely to be turned off by getting no result because of a misspelling. This very problem could be simply alleviated by always showing the index area where the term the user entered is or should be. Currently, the index is not displayed if there is no match but merely issues a message and encourages you to make a full text search. Full text search is not necessarily the best option. If a user has spelled the word "marvelous" the American way, it would be much better to show the index so that the user could see that there is a headword for the British spelling of "marvellous." A savvy user can force the displaying of the index by using the * truncation symbol in the search, but the point is to keep the non-savvy users engaged, or at least not to alienate them.
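Showing the index around the place where a term is or should be is cheap to do on a sorted headword list. A minimal sketch using binary search follows; the function name, the window size and the five-word sample index are assumptions made for illustration.

```python
import bisect


def index_window(headwords, term, radius=2):
    """Return the slice of a sorted headword list around `term`.

    `bisect_left` gives the position where `term` is, or where it would
    be inserted if absent, so a failed search still lands somewhere useful.
    """
    pos = bisect.bisect_left(headwords, term)
    start = max(0, pos - radius)
    return headwords[start:pos + radius]


# Hypothetical fragment of a sorted headword index.
INDEX = ["marvel", "marvellous", "marvellously", "marver", "mary"]

print(index_window(INDEX, "marvelous"))
```

A user who typed the American spelling would then see "marvellous" among the neighbors instead of a bare no-match message.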

I would like to see the option to search by first date when a term appears in a quotation, i.e. to search for words that appeared in sources in, say, 1996 for the first time. You may do a date search but it merely lists the entries that have a quote from that year that includes the search term. It is obviously not the first date of occurrence when the search for the quotation date 1945 brings up such entries as "aerial," "ageing," "agreed," "agronome" and "ahead." These entries do have quotations from 1945 but not as first quotation dates.

To a limited extent first quotation date searches can be done in a round-about way by making a search for words starting with the letter 'a', then sorting the result by date order, and then scrolling down in the index. This is far from being a perfect solution.
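The difference between "has a quotation from year X" and "was first quoted in year X" is easy to state in code. The sketch below uses placeholder entry names and invented years purely to show the two queries side by side; it does not reflect real OED data or the structure of the OED database.

```python
def quoted_in(entries, year):
    """Entries that have at least one quotation from `year`."""
    return sorted(word for word, years in entries.items() if year in years)


def first_quoted_in(entries, year):
    """Entries whose earliest quotation is from `year`."""
    return sorted(
        word for word, years in entries.items() if years and min(years) == year
    )


# Placeholder entries with invented quotation years.
SAMPLE = {
    "word_a": [1604, 1820, 1945],
    "word_b": [1945, 1961],
    "word_c": [1890, 1945, 1972],
    "word_d": [1945],
}

print(quoted_in(SAMPLE, 1945))
print(first_quoted_in(SAMPLE, 1945))
```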

The intimidating appearance of OED is partly due to the excessive and obscure abbreviations used in the entries. This was understandable for the print edition to reduce the length of each entry and prevent spawning more volumes. Even experienced users are sometimes at a loss to decipher the part of speech abbreviation "pa. pple," which stands for past participle and passive participle alike, or "vbl sb" for verbal substantive, "ppl. adjs." for participial adjectives. The abbreviations used in the titles of quoted works are not much easier. How many would recognize that the abbreviation "Jas." means the King James version of the Bible, or "Tr. & Cr." stands for Troilus and Cressida even when it is next to "Shak."?

True, there are lists of abbreviations through the help file for the ones used in definitions, etymologies, authors and titles of works, books of the Bible and the language names. But casual users are reluctant to leave the search process and dig down just to find the explanation for an abbreviation. The digital version makes it easy to replace the abbreviations in the entries with the fully spelled out format. This could be offered as an option so that experienced users can keep enjoying their mastery of the abbreviations and professors can show off in the classroom by instant deciphering.

Alternatively, making these abbreviations hot in the sense that clicking on them would display their meaning or fully spelled out version in a small window would provide a perfect compromise (and would require Java-enriched software, which is getting more and more common). This change could make much difference for those who stay away from OED because of its byzantine conventions. By the way, Byzantine Greek also appears in many records as "Byzant. Gr.," "Byzantine Gr." and "Byz. Gr."

The sophisticated notation system used by OED can handle rather difficult situations, although it may look confusing to the casual user. Searching for "scotophobia" (fear or dislike of the dark) yields a slightly odd-looking result with cross references to two senses of the prefix SCOTO-. OED uses all uppercase cross references to indicate that the word is a main entry. When you click on the cross-references, the main entry terms appear as Scoto-1 and scoto-2, respectively — for good reason. The former variant is used in compound words for things related to Scotland or the ethnic group, including "Scotophobia" the "morbid dread or dislike of the Scots or things Scottish." The latter version is used as prefix for things related to darkness, including "scotophobia," the fear of darkness.

Sometimes a definition may include a term that the user may not fully understand, not even from the quotations. This was my case with the word "slush" in the definition of "slush fund". It would be nice to be able to click on those words and see their definition in a small pop-up window. You may type the word in the small Quick Search window but you abandon your original entry, and, in case of a long archaic word, chances are high for misspelling when typing.

The massive volume of inconsistency and abbreviations accumulated throughout the 115 years of OED in these elements becomes the bane of searching by quoted author, quoted work and language of origin. If it looks suspicious to you that there are only 32 quotations from E. A. Poe, you are right. Trying just "Poe" in the quoted author index yields 442 additional hits. Eyeballing the result list indicates that all of them are from the poet. It also seemed unlikely that Dante Alighieri's Purgatory would be quoted only 4 times. It turned out to be a good hunch to search for "Purg," which yielded 179 hits, but many of them were not Dante's Purgatory. Combining Dante with the abbreviated form yielded a more likely 58 hits from the various translations of this classic. Using the last name alone is often a bad search strategy as there are many authors with the same surname, as I discussed above. If you are not cautious and merely search for quotations from the Bard by the abbreviation "Shak." that you see so often in entries you find "only" 29,312 hits, missing out on the several thousand that abbreviate his name as "Shakes." This may not be a problem in recognizing the author when looking at an entry but it definitely is when searching by author.
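An authority file of the kind whose absence is lamented here is, at its simplest, a lookup table from every label seen in the data to one canonical form. The sketch below uses only labels mentioned in this review; the canonical forms, the table itself and the function names are my own illustrative choices, not anything OUP maintains.

```python
# Hypothetical authority table: label as printed -> canonical heading.
AUTHORITY = {
    "shak.": "Shakespeare, William",
    "shakes.": "Shakespeare, William",
    "e. a. poe": "Poe, Edgar Allan",
    "poe": "Poe, Edgar Allan",
    "byzant. gr.": "Byzantine Greek",
    "byzantine gr.": "Byzantine Greek",
    "byz. gr.": "Byzantine Greek",
}


def canonical(label):
    """Return the canonical heading, or the cleaned label if unknown."""
    cleaned = label.strip()
    return AUTHORITY.get(cleaned.lower(), cleaned)


def count_by_heading(labels):
    """Tally quotation labels under their canonical headings."""
    counts = {}
    for label in labels:
        heading = canonical(label)
        counts[heading] = counts.get(heading, 0) + 1
    return counts


print(count_by_heading(["Shak.", "Shakes.", "Poe", "E. A. Poe", "Shak."]))
```

Searching on the canonical heading would then gather the "Shak." and "Shakes." quotations in one pass. Ambiguous labels such as "R. Chandler" cannot be resolved by a table like this alone and would still need editorial attention.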

These are minor tasks compared to the Revision Programme that has already started and is estimated to be completed by 2010. Chief editor John Simpson has had enough experience with OED to accept his estimation as realistic. Certainly there will be hundreds of new words added and the work will be enormous by dictionary standards. But not to the tune of 600,000 new words as the New York Times article may lead you to believe.

After uncharacteristically trite passages and cliché numbers about the OED that "stretches over 20 volumes, weighs in at 138 pounds and so thoroughly plumbs the history of 640,000 words and phrases there would seem to be nothing left to plumb," the New York Times author claims that the "10-year overhaul will add more than 600,000 new words and revise the 19th-century entries for many old ones." It is surprising that neither the author nor the editor of this most respected newspaper spotted the contradiction between the two sentences and the absurdity of the second one. Neither did editors at those newspapers that directly or indirectly parrot the New York Times. For the record: all the 640,000 existing words will be revised, not just the 19th century entries. As for new words there will be hundreds added because no matter how comprehensive OED is, there are new words born that will deserve their place in the New Edition, which may be published only in digital format. If you have gotten so far in this long review you probably will come for the next one in 2010.
