Title: Gale Virtual Reference Library 2.0
Publisher: Thomson Gale
URL: http://www.gale.com/gvrl/
Cost: depends on the collection selected
Tested: Feb. 4-20, 2005
Disclosure: Thomson Gale is the publisher and sponsor of the Péter's Digital Reference Shelf column.
Thomson Gale is one of the largest publishers of unabridged, subject-specific reference books. The Gale Virtual Reference Library (GVRL) represents only a subset of Thomson Gale's reference titles, as many of its directories, dictionaries and encyclopedias are part of its other specialized digital reference suites, such as Biography Resource Center and Gale Ready Reference Shelf. On the other hand, it includes important encyclopedias from partner publishers, such as seven encyclopedias from Sage and four encyclopedias from Wiley as of this writing. I would love to see Wiley's three-volume Internet Encyclopedia become part of GVRL.
Only xreferplus and Oxford Reference Online (ORO) are in the same league as GVRL when comparing unique works (between 130 and 200 in the collections of this league) rather than number of volumes, where GVRL is in a league of its own. The partnership with xreferplus and the linking to the optional, customizable add-on collection make GVRL by far the most comprehensive digital ready-reference source.
This is not merely a theoretical semantic issue. As my full-text test searches have shown, size and variety really do matter, and a lot. In both xreferplus and ORO, single-volume and concise encyclopedias and dictionaries dominate, whereas in GVRL, multi-volume, unabridged reference works are predominant.
These include such works as 41 volumes of Contemporary Authors, six volumes of Authors and Artists for Young Adults, nine volumes of Children's Literature Review, 10 volumes of Contemporary Musicians, five volumes of the Gale Encyclopedia of Medicine (all averaging 700 pages per volume) and the equally hefty Encyclopedia of Education. The more-than-three-times-as-large, brand new print edition of the Encyclopedia of Religion, which is to become available on GVRL in March, weighs in with the equivalent of 13,500 pages in 15 volumes. This edition is in addition to the other eight reference works on religion already on GVRL. Obviously, an encyclopedia on Judaism would round out GVRL well.
For comparison, Oxford Reference Online has 10 reference titles on religion. (There are 22 Oxford Companion titles in the ORO Premium Collection on various subject matters. These usually have much longer articles of 500-1,200 words than the rest of the titles in the ORO collection.) It will not change the dominance of Thomson Gale in the digital reference collection landscape when Oxford University Press launches the Oxford Digital Reference Library with 10 unabridged encyclopedias as a separate collection in March 2005. Just for good measure, GVRL will offer some encyclopedias from Cambridge University Press later in 2005.
GVRL has comprehensive ready-reference coverage of almost all of the subject fields in arts, humanities, sciences, social sciences, medicine and technology. The exceptions are reference works in languages and mythology (beyond religion), which have never been the turf of Thomson Gale and the companies it has acquired. The lacuna in these two subject fields should be of no concern for two reasons. One: Thomson Gale's partner in GVRL, xreferplus, has fairly broad coverage in these two areas, thus filling the gap. Two: there are many excellent open-access monolingual and bilingual general dictionaries, thesauri and style guides. Many of them can be cross-searched in one fell swoop through the superb OneLook and YourDictionary multi-search engines.
Breadth and Depth of Coverage
In-depth, multidisciplinary coverage is the forte of GVRL. This comes through particularly well when searching for issues that have many angles, such as euthanasia. For the casual user, ORO would, at first glance, appear to have an edge over GVRL, as a search in the title (main entry or header) field yields 28 hits versus 21 in GVRL. On closer examination, however, five of the ORO hits are so-called "lead-in terms" that provide a see reference to the preferred term or to the preferred format of the entry heading in the source publication. Two hits are short thesaurus entries listing two or three synonyms. Four hits are single-word entries from concise and pocket bilingual dictionaries. Most of the other hits are concise entries ranging from 50 to 200 words, such as the ones from the Oxford Dictionary of Nursing and the quite similar entry in the Oxford Concise Dictionary of Medicine. Only three or four hits have longer informative entries (655 to 1,300 words) from the Oxford Companion titles.
In contrast, GVRL had "only" 21 hits for the title search for "euthanasia." Every item is available in both PDF and HTML format. Many include illustrations. None of the hits were merely lead-in terms. More importantly, the articles matching the test query ranged from one to 27 pages, averaging nearly eight pages. In fairness, some of the pages are from small-format handbooks, and articles rarely start at the top of a page and end at the bottom of a page, so a seven-page average may be more accurate. For purposes of comparison with ORO, the average article length for this query was close to 2,000 words, not counting the often substantial bibliography. It would be wise of Thomson Gale to include not only the number of PDF pages, but also the number of words with the bibliographic citation in the results list to emphasize the substance of the articles in an objective manner.
Variety of Information-rich Sources
The variety of sources is another remarkable trait of GVRL. For the euthanasia title-only query, there were articles and chapters from Contemporary American Religion, Dictionary of American History, Encyclopedia of Aging, Encyclopedia of Bioethics, Encyclopedia of Crime and Justice, Encyclopedia of Genocide and Crimes, Encyclopedia of Population, Encyclopedia of the American Constitution, Encyclopedia of Nursing and Allied Health, Handbook of Death and Dying, International Encyclopedia of Marriage and Family, Macmillan Encyclopedia of Death and Dying, New Catholic Encyclopedia and West's Encyclopedia of American Law.
The richness of this collection, both in terms of the article content and the variety of sources, comes through loud and clear when performing full-text searching, even for the casual user. A search for euthanasia found 97 hits in ORO and 403 in GVRL. The pattern of article length illustrated for the euthanasia title-only search was prevalent in all of my test searches. Almost all of my full-text test searches also yielded more hits, and much longer articles (often with illustrations and bibliographies), in GVRL than in ORO. Some examples of the number of hits in GVRL and ORO (respectively) were: "polygamy" (433 vs 103), "gender bias" (87 vs 22), "cold fusion" (29 vs 10), "Islamic jihad" (278 vs 16), "tsunami warning system" (4 vs 2), "toxoplasma" (25 vs 7), "toxoplasmosis" (65 vs 21), "Raoul Wallenberg" (20 vs 2), "Mark Knopfler" (14 vs 0) and "Indigo Girls" (23 vs 0).
In my test suite there was one term, toxoplasma, which yielded no hits in GVRL when limiting to the title field, but found two hits in ORO. In a full-text search, GVRL showed its typical superiority with 25 hits versus seven in ORO. In one test, for the word kamikaze, ORO was better than GVRL both in the title search (13 vs 4) and the full-text search (56 vs 51), although several of the dictionary definitions in ORO were almost identical. For the word umrah (the small pilgrimage to Mecca), ORO had eight hits while GVRL had only four (and two articles were very similar), but the launch of the huge Encyclopedia of Religion by Thomson Gale is likely to change the current hit counts for religion-related queries in my test suite.
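To make the gap in the full-text counts easier to see at a glance, the figures quoted above can be turned into simple ratios. The following is only a small illustrative sketch: the hit counts are the ones reported in this review, while the function names (hit_ratio, summarize) are hypothetical helpers introduced here, not part of either product.

```python
# Full-text hit counts as (GVRL, ORO), copied from the figures in the review.
HITS = {
    "euthanasia": (403, 97),
    "polygamy": (433, 103),
    "gender bias": (87, 22),
    "cold fusion": (29, 10),
    "Islamic jihad": (278, 16),
    "tsunami warning system": (4, 2),
    "toxoplasma": (25, 7),
    "toxoplasmosis": (65, 21),
    "Raoul Wallenberg": (20, 2),
    "Mark Knopfler": (14, 0),
    "Indigo Girls": (23, 0),
}


def hit_ratio(gvrl, oro):
    """Return GVRL hits per ORO hit, or None when ORO has no hits."""
    return gvrl / oro if oro else None


def summarize(hits):
    """Return one (query, gvrl, oro, ratio) row per test query."""
    return [(q, g, o, hit_ratio(g, o)) for q, (g, o) in hits.items()]


if __name__ == "__main__":
    for query, gvrl, oro, ratio in summarize(HITS):
        label = f"{ratio:.1f}x" if ratio is not None else "n/a (no ORO hits)"
        print(f"{query:<24} {gvrl:>4} vs {oro:>3}  {label}")
```

A ratio is left undefined where ORO returned nothing, since dividing by zero would overstate rather than describe the difference.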
Interdisciplinary and Age-specific Sources
Beyond the rich content at the article level and the variety of substantial sources, the most remarkable features of GVRL are the number of interdisciplinary sources and age-specific sources. For example, health issues specific to children and the elderly are as much a part of psychology as medicine and have their own legal and ethical aspects. Beyond the obvious discipline-specific resources, such as the Encyclopedia of Psychology, Encyclopedia of Medicine, Encyclopedia of Mental Disorders and Encyclopedia of Nursing and Allied Health, there are also such symbiotic and age-specific resources as Child Development, Encyclopedia of Aging, Encyclopedia of Children's Health: Infancy through Adolescence, Nutrition and Well-Being A to Z, Encyclopedia of Health and Behavior, Handbook of Death and Dying and the Encyclopedia of Death and Dying. It should also be noted that the age-specific approach is present from the perspective of the target audience. Beyond the scholarly and professional titles, in many subjects there are several sources meant for middle school students, such as the Drama for Students, Poetry for Students and Novels for Students series, the Junior Worldmark Encyclopedia family and the U*X*L series. This range of choices facilitates the creation of the most appropriate mix of resources to be acquired by the library.
The new interface and software are good, but not flawless, offering basic and advanced search modes. The latter has extra options shown when requested by the user. This helps to avoid unnecessarily complex search templates. The software allows not only the usual limits to a specific resource or to the title or full-text field, but also to keywords that are assigned by indexers and/or occur in cited references. This makes it very feasible to use the keyword index to strike a balance between the precision and recall provided by limiting the search to the title versus the entire text. The extra search options allow limiting the search to articles with images and to specific user groups, which is particularly useful given the fact that the collection includes references targeted to middle school students, college students and laypersons, in addition to professional and scholarly users.
The results can be sorted by relevance, source title or article title. There is an oddity in relevance ranking. The relevance percentage scores rarely reach 100%, even in cases where there is no doubt that the article is a perfect hit. Five articles in the result list get only an 80% relevance score, even though they are very informative, substantial articles dedicated to the topic of euthanasia. The result list could also be displayed somewhat more tightly to show more hits at a glance.
Acronyms matching the query term represent a challenge in the results display and need some modification, especially because they hog the top of the results list by "virtue" of having a very short entry (the fully spelled out version of the acronym) and thus being assigned the highest rank score. For example, the search for Mecca in the title yields 22 hits. Half of them are acronyms that have nothing to do with the pilgrimage site. They are legitimate results, but as they are displayed in a row without any distinguishing data elements, they look like subtitles scrolling across the TV screen.
The links to related articles discussed in the same reference title are clearly listed at the end of the articles and provide a jump start for further research. Some encyclopedias have certain words in boldface or a different typeface in the body of the articles. These indicate titles (or quasi titles) of related articles. This typographical notation for cross-references is born of the obvious limitations of the print edition, and the marked words could be turned into links in the digital implementation.
For those who also have the xreferplus collection, there is a link to pass and execute the query in that collection. This is a good idea, but could be improved by passing some of the preferences specified for the original search in GVRL, such as the title field limiter. All it takes is to add the h: prefix (for header) before the word, as in h:polygamy, which would retrieve 26 hits. Currently, the passed query searches in the full-text index of xreferplus, which yields 141 hits.
In some cases, passing an exact-phrase query is a bit more complex. The quotation marks are correctly encoded into the usual %22 code pair, as in %22cold fusion%22 for "cold fusion," when sent to xreferplus. However, in the xreferplus query they are not converted back to quotation marks, and the query yields an absurd result of 12,511 "hits." The correct phrase query "cold fusion" should yield 16 hits. Limiting the phrase to the title, the result should be 10 hits. The query-passing process needs to be refined to bring out the best from the potential synergy of GVRL and xreferplus.
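The two refinements suggested here (carrying over the title limiter and restoring the quotation marks on the receiving side) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of how either product is implemented: build_passed_query and receive_query are hypothetical names, the actual xreferplus URL and parameter names are not known and are left out, and placing the h: prefix directly in front of a quoted phrase is an assumption. Only the h: prefix and the %22 encoding come from the behavior described above.

```python
from urllib.parse import quote, unquote


def build_passed_query(term, title_only=False, phrase=False):
    """Sending side (hypothetical): build the percent-encoded query text."""
    text = f'"{term}"' if phrase else term
    if title_only:
        # h: is the header (title) limiter described in the review.
        # Combining it with a quoted phrase is an assumption.
        text = "h:" + text
    # Keep ":" readable; quotation marks become %22 and spaces %20.
    return quote(text, safe=":")


def receive_query(encoded):
    """Receiving side (hypothetical): decode before running the search."""
    return unquote(encoded)


if __name__ == "__main__":
    sent = build_passed_query("cold fusion", phrase=True)
    print(sent)                 # encoded form that travels in the URL
    print(receive_query(sent))  # phrase with its quotation marks restored
    print(build_passed_query("polygamy", title_only=True))
```

The point of the sketch is that the decoding step is the missing piece: if the receiving side searches the encoded text as-is, the phrase markers are lost and the query no longer behaves as an exact-phrase search.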
GVRL is the most comprehensive digital ready-reference collection with the richest content at the article level. Beyond its current and forthcoming assets, GVRL has a latent ace up its sleeve that none of the other publishers' reference collections have without relying on third parties: the potential of dynamic linking to and from Thomson Gale full-text databases, such as InfoTrac OneFile Plus (46 million records) and the ad-hoc calibrated pass-forward searches.
While the user is looking up the matching articles in GVRL, a calibrated search with several options could be passed forward and run automatically behind the scenes, making use of the rich metadata in InfoTrac OneFile. A small pop-up window could show a score card with the number of hits for articles:
Conversely, users of journal article mega databases could be advised about the number of:
A click on the appropriate combination would show those items for background information about tsunamis, their development, history and mitigation. In cases of issues with wide but temporary interest — such as we saw during the tsunami disaster — Thomson Gale could set a switch to offer a few free entries for those who only have GVRL and not the journal article database or vice versa. This would help enlighten potential users about the advantages of metadata-enriched databases (also known as full-text databases with abstracting/indexing metadata). Google can't handle such sophisticated searches even if the full set of metadata is made available to it, as was the case in the sources poorly harvested for Google Scholar, let alone for the rest of the non-proprietary Web with no or very little metadata assigned by humans.