Quality Control in Scholarly Publishing on the Web
by WILLIAM Y. ARMS
When the Web was young, a common complaint was that it was full of junk. Today a marvelous assortment of high-quality information is available on line, often with open access. As a recent JSTOR study indicates, scholars in every discipline use the Web as a major source of information. There is still junk on the Web -- material that is inaccurate, biased, sloppy, bigoted, wrongly attributed, blasphemous, obscene, and simply wrong -- but there is also much that is of the highest quality, often from obscure sources. As scholars and researchers, we are often called upon to separate the high-quality materials from the bad. What are the methods by which quality control is established and what are the indicators that allow a user to recognize the good materials?
This paper is motivated by three interrelated questions:
- How can readers recognize good quality materials on the Web?
- How can publishers maintain high standards and let readers know about them?
- How can librarians select materials that are of good scientific or scholarly quality?
The traditional approaches for establishing quality are based on human review, including peer review, editorial control, and selection by librarians. These approaches can be used on the Web, but there are economic barriers. The volume of material is so great that it is feasible to review only a small fraction of materials. Inevitably, most of the Web has not been reviewed for quality, including large amounts of excellent material.
Peer review is often considered the gold standard of scholarly publishing, but all that glitters is not gold.
The Journal of the ACM is one of the top journals in theoretical computer science. It has a top-flight editorial board that works on behalf of a well-respected society publisher. Every paper is carefully read by experts who check the accuracy of the material, suggest improvements, and advise the editor-in-chief about the overall quality. With very few exceptions, every paper published in this journal is first rate.
However, many peer-reviewed journals are less exalted than the Journal of the ACM. There are said to be 5,000 peer-reviewed journals in education alone. Inevitably the papers in them are of uneven quality. Thirty years ago, as a young faculty member, I was given the advice, "Whatever you do, write a paper. Some journal will publish it." This is even more true today.
One problem with peer review is that many types of research cannot be validated by a reviewer. In the Journal of the ACM, the content is mainly mathematics. The papers are self-contained. A reviewer can check the accuracy of the paper by reading the paper without reviewing external evidence beyond other published sources. This is not possible in experimental areas, including clinical trials and computer systems. Since a reviewer cannot repeat the experiment, the review is little more than a comment on whether the research appears to be well done.
[snip] ACM conference papers go through a lower standard of review than journal articles. Moreover, many of the papers in this conference summarize data or experiments that the reviewers could not check for accuracy by simply reading the papers; for a full review, they would need to examine the actual experiment. Finally, the threshold of quality that a paper must pass to be accepted is much lower for this small conference than for the ACM's premier journal. This is a decent publication, but it is not gold.
In summary, peer review varies greatly in its effectiveness in establishing accuracy and value of research. For the lowest-quality journals, peer review merely puts a stamp on mediocre work that will never be read. In experimental fields, the best that peer review can do is validate the framework for the research. However, peer review remains the benchmark by which all other approaches to quality are measured.
Incidental Uses of Peer Review
Peer-reviewed journals are often called "primary literature," but this is increasingly becoming a misnomer. Theoretical computer scientists do not use the Journal of the ACM as primary material. They rely on papers that are posted on Web sites or discussed at conferences for their current work. The slow and deliberate process of peer review means that papers in the published journal are a historic record, not the active literature of the field.
Peer review began as a system to establish quality for purposes of publication, but over the years it has become used for administrative functions. In many fields, the principal use of peer-reviewed journals is not to publish research but to provide apparently impartial criteria for universities to use in promoting faculty. This poses a dilemma for academic librarians, a dilemma that applies to both digital and printed materials. Every year libraries spend more money on collecting peer-reviewed journals, yet for many of their patrons these journals are no longer the primary literature.
As we look for gold on the Web, often all we have to guide us is internal evidence. We look at the URL on a Web page to see where it comes from, or the quality of production may give us clues. If we are knowledgeable about the subject area, we often can judge the quality ourselves. Internal clues, such as what previous work is referenced, can inform an experienced reader, but such clues are difficult to interpret.
Strategies for Establishing Quality
The Publisher as Creator
Many of the most dependable sites on the Web contain materials that are developed by authors who are employed by the publisher. Readers judge the quality through the reputation of the publisher.
In the three previous examples, the content was created or selected by the publisher's staff. As an alternative, the publisher can rely on an editorial process whereby experts recommend which works to publish. The editors act as a filter, selecting the materials to publish and often working with authors on the details of their work.
Outsiders sometimes think that peer-reviewed materials are superior to those whose quality is established by editorial control, but this is naive. For instance, the Journal of Electronic Publishing contains some papers that have gone through a peer review and others that were personally selected by the editor. This distinction may be important to some authors, but is irrelevant to almost all readers. Either process can be effective if carried out diligently.
In every example so far ... the author, editor, or publisher has a well-established reputation. The observations about the quality of the materials begin with the reputation of the publisher. How can we trust anything without personal knowledge? Conversely, how can a new Web site establish a reputation for quality?
Caroline Arms of the Library of Congress has suggested that everything depends upon a chain of reputation, beginning with people we respect. As students, we begin with respect for our teachers. They direct us to sources of information that they respect -- books, journals, Web sites, datasets, etc. -- or to professionals, such as librarians. As we develop our own expertise, we add our personal judgments and pass our recommendations on to others. Conversely, if we are disappointed in the quality of materials, we pass this information on to others, sometimes formally but often by word of mouth. Depending on our own reputations, such observations about quality become part of the reputation of the materials.
Reviews provide a systematic way to extend the chain of reputation. Reviewers, independent of the author and publisher, describe their opinion of the item. The value of the review to the user depends on the reputation of the reviewer, where the review is published, and how well it is done.
The Web lends itself to novel forms of review, which can be called "volunteer review" processes. [snip] In a volunteer review process, anybody can provide a review. The publisher manages the process, but does not select the reviewers. Often the publisher will encourage the readers to review the reviewers. The reputation of the system is established over time, based on readers' experiences.
The success of volunteer reviews shows that systematic aggregation of the opinions of unknown individuals can give valuable information about quality. This is the concept behind measures that are based on reference patterns. The pioneer is citation analysis [snip] More recently, similar concepts have been applied to the Web with great success in Google's PageRank algorithm. These methods have the same underlying assumption. If an item is referenced by many others, then it is likely to be important in some way. Importance does not guarantee any form of quality, but, in practice, heavily cited journal articles tend to be good scholarship and Web pages that PageRank ranks highly are usually of good quality.
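The assumption that heavily referenced items are likely to be important can be illustrated with a small sketch of link-based ranking in the spirit of PageRank. This is not Google's implementation: the function name, the damping factor of 0.85, the iteration count, and the four-item graph are all assumptions chosen for illustration.

```python
def rank_by_references(links, damping=0.85, iterations=50):
    """Score items by who references them (simplified PageRank-style iteration).

    `links` maps each item to the list of items it references.
    All names and parameter values here are illustrative assumptions.
    """
    # Collect every item that appears as a source or as a target.
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    nodes = sorted(nodes)
    n = len(nodes)

    # Start with equal scores that sum to 1.
    score = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Every item receives a small baseline share.
        new_score = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # An item passes its score on, split evenly among its references.
                share = damping * score[node] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # An item with no references spreads its score over all items.
                share = damping * score[node] / n
                for target in nodes:
                    new_score[target] += share
        score = new_score
    return score


# Hypothetical graph: items a, b, and d all reference c; c references a.
example_links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = rank_by_references(example_links)
best = max(scores, key=scores.get)
```

In this toy graph the most-referenced item, c, receives the highest score, and a outranks b only because the highly ranked c refers to it. As the text notes, such a score measures importance and does not by itself guarantee quality.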
Quality Control in the NSDL
The goal of the NSDL is to be comprehensive in its coverage of digital materials that are relevant to science education, broadly defined. Achieving this goal requires hundreds of millions of items from tens or hundreds of thousands of publishers and Web sites. Clearly, the NSDL staff cannot review each of these items individually for quality or even administer a conventional review process. The quality control process that we are developing has the following main themes:
- Most selection and quality control decisions are made at a collection level, not at an item level.
- Information about quality will be maintained in a collection-level metadata record, which is stored in a central metadata repository.
- This metadata is made available to NSDL service providers.
- User interfaces can display quality information.
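The themes above can be sketched in a few lines of code. This is only an illustration of the collection-level idea, not the NSDL schema or software: the class names, field names, and example URL are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CollectionRecord:
    """Collection-level metadata; every field name here is hypothetical."""
    collection_id: str
    publisher: str
    quality_notes: List[str] = field(default_factory=list)


class MetadataRepository:
    """Stand-in for a central repository that service providers can query."""

    def __init__(self) -> None:
        self._collections: Dict[str, CollectionRecord] = {}
        self._item_collection: Dict[str, str] = {}

    def add_collection(self, record: CollectionRecord) -> None:
        self._collections[record.collection_id] = record

    def add_item(self, item_url: str, collection_id: str) -> None:
        # Items carry no quality data of their own, only a link to a collection.
        if collection_id not in self._collections:
            raise KeyError(f"unknown collection: {collection_id}")
        self._item_collection[item_url] = collection_id

    def quality_for_item(self, item_url: str) -> List[str]:
        # Quality information is looked up on the item's collection.
        collection_id = self._item_collection[item_url]
        return list(self._collections[collection_id].quality_notes)


# Hypothetical usage: one quality decision covers every item in the collection.
repo = MetadataRepository()
repo.add_collection(
    CollectionRecord(
        collection_id="example-journal",
        publisher="Example Society",
        quality_notes=["peer reviewed", "editorial board"],
    )
)
repo.add_item("https://example.org/paper-1", "example-journal")
notes = repo.quality_for_item("https://example.org/paper-1")
```

The design choice this illustrates is economy: a reviewer records a judgment once per collection, and a user interface can display that judgment for any item in it without an item-level review.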
How does a scholar or scientist build a reputation outside the traditional peer-reviewed journals? A few people have well-known achievements that do not need to be documented in journal articles. [snip] More often, promotions are based on a mechanical process in which publication in peer-reviewed journals is central. Although it is manifestly impossible, most universities wish to have an objective process for evaluating faculty and this is the best that can be done. As the saying goes, "Our dean can't read, but he sure can count."
Meanwhile we have a situation in which a large and growing proportion of the primary and working materials are outside the peer-review system, and a high proportion of the peer-reviewed literature is written to enhance resumes, not to convey scientific and scholarly information. Readers know that good quality information can be found in unconventional places, but publishers and librarians devote little effort to these materials.
The NSDL project is one example of how to avoid over-reliance on peer review. Most of the high quality materials on the Web are not peer-reviewed and much of the peer-reviewed literature is of dubious quality. Publishers and libraries need to approach the challenge of identifying quality with a fresh mind. We need new ways to do things in this new world.
The Journal of Electronic Publishing / August, 2002 / Volume 8, Issue 1 / ISSN 1080-2711