Help:Using Worldcat data

WorldCat is a collection of 200+ million searchable bibliographic records maintained by OCLC (Online Computer Library Center), an inter-library cooperative. It covers books, serials, music, films, and some other types of holdings. WorldCat is the largest union catalog currently in existence.


 * There are several different interfaces to this data available:


 * When using data from Worldcat to create ISFDB publication records, or to add data to pub records (as opposed to using it to confirm existing data), a number of cautions must be observed:
 * Publisher inaccuracies:
 * Worldcat often appears to shorten publisher names, so the reported publisher name may not match the form on the actual book.
 * Publisher imprints are often not noted at all, and publisher names are sometimes abbreviated to the point of ambiguity. For example, "London: Lane" could be either "John Lane" or "Allen Lane".
 * Publication locations (listed with the publisher name) may be accurate only to the level of the country, if that. For example, any publisher in England, and even some headquartered in Scotland, may be listed as "London".
 * Therefore, use Worldcat only to confirm publisher data from other sources, unless no other source is available.
 * Author inaccuracies and ambiguities:
 * When a book is published as by a pseudonym known to the Worldcat system (or perhaps to the entering librarian) the "author" is listed as their canonical name, and you need to check the "responsibility" line to see the actual name on the printed work. (The responsibility tag is on the details tab if not using the First Search interface. It is not always present.)
 * Often an illustrator is listed as a co-author; again, you need to check the "responsibility" line, where this is usually spelled out.
 * When an illustrator is listed, there is often no way to tell if the credit is for interior art or cover art, or both.
 * Anthologies/collections which have contents level data often use abbreviated author names like "R. Heinlein" or "R. A. Heinlein"; there is no easy way of telling how the authors were actually credited. Punctuation, capitalization and even spelling are also often mangled.
 * Date Issues:
 * Dates are rarely given more precisely than by year -- month and day are not noted.
 * Often multiple dates are listed, with the earlier being a copyright date or the date of an earlier edition, particularly if the later edition is a facsimile.
 * Publication formats must be determined from the size in cm. Paperbacks are often noted as such, but that note covers both trade and mass-market formats, so the size must still be checked.
 * "18cm" is the size of the standard US/Canadian mass market paperback.
 * "19cm" is probably a small tp/hc.
 * 20+ cm is either a tp or a hc, but distinguishing between tps and hcs can be tricky. Sometimes OCLC will print "(pbk.)" next to the ISBN, which is self-explanatory, but if they don't, then it's time to check other sources. Price is not always a reliable indicator, but it can help.
 * If the book was published pre-WWII, it's usually either a hardback or, in some cases, a pamphlet since paperbacks didn't take off until WWII -- first in the UK, then in the US. There were some cheap editions in the early part of the 20th century whose binding sometimes approached the current paperback binding, but few have survived and fewer are owned by libraries. If the catalog information is ambiguous, then leave the binding field blank and record the volume size in the Notes field.
 * Anthologies and collections may have contents listed, but care must be taken when using this information to record titles in the ISFDB. Quite often initial articles will be dropped from the title, e.g. "The House on Blackmore Street" might be listed as "House on Blackmore Street". Don't assume the presence or absence of "The", "A" or "An" without using several secondary sources as a backup.
 * Older works, printed in multiple volumes, often give no page counts but only volume counts.
 * Some editions have multiple records in WorldCat and you have to pull them up side by side to see if there is additional information to be derived from the interplay of different fields. Often, but not always, there is one single "best" record for a given edition/printing. When there is not, multiple OCLC record numbers may need to be recorded as sources.
 * Prices are usually reflective of whatever Baker and Taylor charged the last time the book was available to libraries, which may be significantly higher than the original price of the book. Also, library editions are sometimes priced differently. Unless the book is fairly recent (and is in the current B&T catalog) no price is likely to be listed at all.
 * Some authors are much better covered in WorldCat because the cataloging librarian had access to a specialized collection, e.g. Andre Norton's books are exceptionally well covered (although the data is not easy to find).
 * WorldCat rarely lists printing numbers and its subject headings leave much to be desired, but you can always trawl other online catalogs for this data (which opens a whole different can of worms). WorldCat does, however, usually indicate a first edition as such.
 * Data listed in brackets such as "[First Ed.]" is by convention inferred rather than stated in the publication. Other data will normally be stated in the pub, but may have been abbreviated.
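
The size-based rules of thumb for binding given above can be sketched in code. This is only a rough illustration, not an ISFDB tool: the function name `guess_binding`, the 1939 cutoff for "pre-WWII", and the use of "pb" and "tp" as binding codes are assumptions made for the example. Whenever the record is ambiguous, it returns a blank binding and a note recording the size, as recommended above.

```python
from typing import Optional, Tuple

PRE_WAR_CUTOFF = 1939  # assumed cutoff year for "pre-WWII"


def guess_binding(height_cm: int, marked_pbk: bool = False,
                  year: Optional[int] = None) -> Tuple[str, str]:
    """Return (binding, note); binding is "" when the record is ambiguous."""
    # When in doubt, leave the binding blank and keep the size for the Notes field.
    fallback = ("", f"{height_cm} cm per OCLC record; binding not stated.")
    # Pre-war books are usually hardbacks or pamphlets; do not guess from size alone.
    if year is not None and year < PRE_WAR_CUTOFF and not marked_pbk:
        return fallback
    # 18 cm is the standard US/Canadian mass-market paperback size.
    if height_cm == 18:
        return ("pb", "")
    # 19 cm and up is a trade paperback only if the record says "(pbk.)".
    if height_cm >= 19 and marked_pbk:
        return ("tp", "")
    # Otherwise it could be a trade paperback or a hardcover: check other sources.
    return fallback
```

For example, `guess_binding(21)` leaves the binding blank and returns a note with the size, while `guess_binding(21, marked_pbk=True)` returns "tp". The result is a starting point only and should be checked against other sources.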


 * When a publication appears in Worldcat, it is a good idea to list the Worldcat record number (shown as "accession number" in FirstSearch records). When data is derived from Worldcat as a source, this is essential.
 * The simplest way of doing this is: "OCLC: nnnnnnnnnn".
 * A basic link can be provided like this:
 * A link to the details tab can be provided like this:
 * Or like this:


 * When any data in the entry is derived from Worldcat, this should be specifically indicated in the notes. If the entire entry is derived from Worldcat, something like "Entry based on data from OCLC/Worldcat" is probably a good idea.


 * When a publication is being verified against a Worldcat record, the Worldcat/OCLC accession number should be listed in the notes for the publication, with or without a hypertext link.


 * Sometimes a single record will list multiple ISBNs. This generally means that multiple states of the publication were cataloged together, such as hardcover, trade paperback, mass-market paperback, and/or library binding. This appears to be done only when the different states have the same year of publication and page count. It seems particularly likely for a library binding version based on an otherwise identical (except for binding and ISBN) state.


 * Since Worldcat records are derived from reports of individual libraries, if two or more member libraries catalog a publication differently (different forms of the publisher or author name, for example) there may be multiple records for what is actually a single publication. Try to avoid creating multiple ISFDB records in such cases.


 * Worldcat often lists the LCCN, the Library of Congress Control Number (formerly Catalog Number). This is generally worth recording in the notes of the relevant publication. For books published before 2000, the number is generally shown as a two-digit year, a hyphen, and a serial number of up to 6 digits. For books after 2000-01-01, the year is given as 4 digits, and the hyphen is often omitted. A permanent link to the LoC record for a given LCCN, labeled "LCCN: YY-NNNNN" or "LCCN: YYYYNNNNNN", is formed by appending the number to http://lccn.loc.gov/. The number in the URL should start with the 2 or 4-digit year, omit the hyphen, and pad the serial number to 6 digits with leading zeros, if needed. Thus 45-7832 gives http://lccn.loc.gov/45007832, 76-89432 gives http://lccn.loc.gov/76089432, and 2007683290 gives http://lccn.loc.gov/2007683290.
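
The padding rule for LCCN links can be expressed as a short function. This is a minimal sketch that handles only the two forms described above (a hyphenated number with a 2 or 4-digit year, or an unhyphenated number that is already full length); the names `normalize_lccn` and `lccn_url` are made up for the example.

```python
import re

LCCN_BASE = "http://lccn.loc.gov/"


def normalize_lccn(lccn: str) -> str:
    """Drop the hyphen and pad the serial number to 6 digits."""
    text = re.sub(r"\s+", "", lccn)
    # Hyphenated form: 2- or 4-digit year, hyphen, serial of up to 6 digits.
    match = re.fullmatch(r"(\d{2}|\d{4})-(\d{1,6})", text)
    if match:
        year, serial = match.groups()
        return year + serial.zfill(6)
    # Unhyphenated form: accepted only if it is already 8 or 10 digits long.
    if re.fullmatch(r"\d{8}|\d{10}", text):
        return text
    raise ValueError(f"Unrecognized LCCN form: {lccn!r}")


def lccn_url(lccn: str) -> str:
    """Build the permanent link for an LCCN as shown in a catalog record."""
    return LCCN_BASE + normalize_lccn(lccn)
```

With this, `lccn_url("45-7832")` yields the first link shown above. LCCNs with alphabetic prefixes or other older forms are not handled and raise a `ValueError`.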

How to Tell When a Book Cataloged by WorldCat Never Appeared
WorldCat data is only as good as the data that OCLC receives from individual libraries. If a library enters an announced book into its catalog and the data is transferred to WorldCat, then this record may remain in WorldCat indefinitely even if the book is never published.

There are a few ways of telling when an OCLC record is bogus or "vaporware". First, unlike Amazon, libraries rarely enter the page count or the book size until they receive at least one copy, so an OCLC record with an empty page count and no size designation is likely to be vaporware. Second, if OCLC reports that a book supposedly published by a major publisher is held by only 1-10 libraries, something is wrong with the picture. Third, if you check OCLC's list of libraries that supposedly owned a copy at some point and all (or almost all) of them are not hyperlinked (i.e. do not report a current copy on file), that's a big red flag as well. Finally, the "libraries" that Worldcat lists as holding a publication may include one or two purchasing services, such as Baker & Taylor. These should be disregarded, and if no other libraries are listed, that is again a red flag that the book was never actually published.
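
The warning signs above can be collected into a simple checklist function. This is only a sketch: the function name, its parameters, and the flag wording are hypothetical, the values are assumed to have been copied by hand from a record, and "all (or almost all)" is simplified here to "none of the listed libraries is hyperlinked".

```python
from typing import List

# Purchasing services to disregard when counting holders (example entry only).
PURCHASING_SERVICES = {"Baker & Taylor"}


def vaporware_flags(page_count: str, size: str, holders: List[str],
                    linked_holders: List[str],
                    major_publisher: bool = False) -> List[str]:
    """Return the red flags that apply to one OCLC record."""
    flags = []
    # Ignore purchasing services when looking at who holds the book.
    real = [h for h in holders if h not in PURCHASING_SERVICES]
    if not page_count and not size:
        flags.append("no page count or size")
    if major_publisher and 1 <= len(real) <= 10:
        flags.append("few holding libraries for a major publisher")
    if real and not any(h in linked_holders for h in real):
        flags.append("no holder reports a current copy")
    if holders and not real:
        flags.append("only purchasing services listed")
    return flags
```

An empty result does not prove that a book exists, and a non-empty one does not prove it never appeared; the flags only indicate that the record deserves a closer look.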

See also Online Computer Library Center