ROADKILL ON THE ELECTRONIC HIGHWAY?
THE THREAT TO THE MATHEMATICAL LITERATURE

Frank Quinn
Virginia Tech
Blacksburg VA 24061-0123
quinn@math.vt.edu

Final version, January 1994

ABSTRACT

This article begins with an analysis of reliability and usability in the mathematical literature. Mathematical practice is seen to be adapted to very high standards in these regards, and quite sensitive to even modest declines in quality. Unfortunately most scenarios for the transition to electronic publication suggest serious loss of quality. This is a problem special to mathematics: other sciences are adapted to lower standards, and should be much less sensitive to changes. Some ways to prevent this decline are suggested.

% Written for the Emath electronic journal of the American
% Mathematical Society, emath@math.ams.org
% A condensed version will appear in the Notices of the AMS
% Intended for a mathematical audience

INTRODUCTION

There has been much discussion of the potential benefits of the electronic medium for scholarly publication. There are also concerns. The benefits tend to be emphasized by users, particularly technically adept users with a tolerance for large volumes of information [Odlyzko], [Harnad]. The concerns crop up among those involved in the infrastructure of publication: librarians, publishers, and editors [Franks]. In either case mathematics is usually considered as one of many essentially similar branches of science or scholarship. The benefits and problems apply to all scholarly work, not just math. And experiences in theoretical physics, molecular biology, or psychology, are assumed to have direct relevance for mathematics. In this article we focus on issues specific to mathematics. The conclusion is that in fact mathematics is different: we have more to lose than do the other sciences. We should therefore be more careful about the precedents we set, and more prepared to take action to protect our advantages.
A key issue is: who should be served by publication, authors or readers? Is the primary purpose to establish priority and record the achievements of authors, or is it to be a useful resource for readers?

I would like to thank John Franks, Arthur Jaffe, Andrew Odlyzko, and Dick Palais for their influences on this analysis.

RELIABILITY AND USABILITY

The published mathematical literature is, by and large, reliable; much more so than in other sciences. Mathematical papers may be boring or useless, but they are nearly always correct. And if one is interested then they are usually usable in the sense that techniques and details can be reconstructed. This is not universal: most mathematicians know of papers which are wrong or opaque. But it is much more often true than in other fields. A colleague has estimated that a third of the primary literature in biology is wrong. Imagine picking up a journal with twelve articles, and knowing that of these, four are likely to be seriously flawed. In mathematics it would be unusual for even one to be in error. A clarification about conjecture and applied mathematics may be useful here. ``Complete reliability'' suggests mathematical proof, and might seem to exclude these other modes of mathematical thinking. This is not correct: here ``reliability'' used in connection with a conjecture does not mean the conjecture is true, but that a careful distinction is made between what is really known and what is conjectured. The entire statement, including these distinctions, is reliable. Similarly ``reliability'' of a paper describing a numerical method does not mean the method works, but that careful distinctions are made between what is known for sure, and what experience leads the author to expect. In other words reliability is careful honesty and ``mathematical precision'' in describing conclusions. We start with a discussion of the consequences of these properties of mathematics.
Users from outside, or from other fields in mathematics are significantly empowered by reliability. They may find math hard to understand, like a legal contract with a lot of fine print. But after proper allowance for the fine print they very confidently expect delivery of the goods. And they get the goods so routinely that many do not even wonder about checking details. A higher error rate would certainly make it harder to use. Mathematicians also accept long elaborate proofs by contradiction which would be ridiculous if the ingredients were not completely reliable. In the other sciences information may be very good, but can never be absolutely reliable. And accordingly elaborate arguments are viewed with suspicion: the output is at best a hypothesis which should be tested. Usability is significant in several ways. It is important in the internal development of a field. Few things are ever absolutely complete, and to go further it is necessary to understand how previous understanding was achieved. Frequently tomorrow's advances grow from the details of today's work. If these details are incomprehensible or unavailable then this process is blocked. A usable record of detail is also a great help to students. Usability is also important in interactions among research areas. Mathematics itself is a seamless whole, but the limitations of human understanding lead us to see it as a great many separate threads. The real unity shows itself as a ceaseless splitting, joining, and intertwining of these threads. Threads come together when experts in one field discover they need methods or results from another. In most cases the reliability and usability of the literature makes a profitable interaction straightforward. We consider these mathematical threads in more detail. Many people have the image of a research field as an active community with common goals, exchanging ideas and checking each other's work. 
This image is appropriate in other sciences, and in some areas in mathematics. But the Math Reviews classification divides mathematics into approximately 5,000 subjects. These are not all separate research areas, but there are a great many. And most of these do not fit the active-community model: they are small groups or individuals working in near isolation. Or we can think of them as communities distributed in time, communicating through the literature. There are two points to this: first the reliability of the methods of mathematics enables isolated groups to make steady progress. And second, reliability encourages specialization: there is little benefit to replication or duplication (checking) of work, so mathematicians spread themselves out. If the reliability of the literature were seriously compromised these practices would have to change. Among other things mathematicians would have to reorganize into fewer, larger groups to have the immediate interactions necessary to correct errors. Reliability influences mathematical practice in many other ways. For instance mathematicians are more likely to be familiar with the older literature, and build upon it. When a literature is unreliable there is less benefit to knowing it: it is often easier to rediscover material than sort through and check publications. In such areas there is also a tendency to work from preprints, and depend on word of mouth to identify the good ones. Many areas in theoretical physics follow this pattern. There are important exceptions to the familiarity of mathematicians with the literature. The best mathematicians often internalize their subject (develop intuition) and expand it to such an extent that they seldom use the literature. They are also able to rapidly assess the plausibility of new work, so often do not see reliability as a key issue. It is the rank-and-file who tend to know and depend on the literature.
Reliability and usability of the literature enables such people to contribute significantly to the mathematical enterprise rather than simply being camp followers of the great. So it is mathematicians of average ability who have the most to lose if reliability is compromised. An unfortunate corollary is that in this instance mathematics may not be well served by the leadership of outstanding individuals. Ignorance or distrust of the literature also leads to duplication of effort and repetitive publication. In disciplines with unreliable literatures there is often a relaxed attitude toward publication of very similar papers. It is sometimes even comforting: like duplication of an experiment it seems to increase the probability that the conclusion is correct. By contrast the reliability of mathematics has led to a definite avoidance of overlapping publication. When something is done there is seldom any benefit to doing it again. A final difference between mathematics and other fields is that we have less tradition of review articles: secondary literature which sifts and consolidates the primary literature. It is less necessary because the primary literature is more directly usable. There is also less benefit since with fewer errors and less duplication to discard there is less compression. In emphasizing the importance of reliability and usability we have implicitly taken the view of users. But the literature also serves authors, to record and publicize their achievements. There is a tension between these two functions, since in an author-oriented literature, like much of theoretical physics, fast publication is a goal and loose standards are accepted if not preferred. One conclusion from the analysis above is that mathematics has somehow evolved a very user-oriented literature, and that the benefits are so profound that (so far) it has been unresponsive to author-oriented pressures. 
This distinction between user and author-oriented functions of the literature is often not clearly understood. Sometimes authors will try to articulate their interests in reader-oriented language: fast publication is important because there are unknown persons who will be vitally interested in their latest results. But this misrepresents the interests of the unknown persons: there is little benefit in getting something fast if it is wrong or unreliable. And even correct statements with sketchy or incomplete details can be damaging: if the unknown person was working on the same problem this can render their work obsolete and at the same time deny them access to the tools necessary to reproduce or extend the result. Authors do have legitimate interests and needs, but they must be accurately presented, and honestly balanced against those of users. In conclusion, mathematics has a reliable and useful (reader-oriented) primary literature. Our practices are adapted to this in profound ways. In particular we lack customs used in other fields to protect against, correct, and refine an unreliable literature. It is therefore particularly important for mathematicians to be aware of, and cautious about, changes that might threaten reliability and the reader orientation.

REASONS FOR RELIABILITY

Reliability in mathematics is not an accident. Mathematics is unique in that its methods, when correctly applied, do yield conclusions which are in practice completely reliable. It is true that Gödel has shown we cannot prove complete reliability. But several thousand years of vigorous testing have firmly established reliability as an experimental result. It is probably the most thoroughly tested conclusion in science. More problematic is the fact that mathematics is done by people. Even the most reliable of methods can be used incorrectly. And usability, which was described above as being as significant as reliability, depends on people for its meaning as well as its implementation.
Therefore the fact that these properties are actually achieved is a consequence of social mechanisms, rather than just hypothetical achievability. We turn to a discussion of the social mechanisms leading to reliability and usability. There is first of all (at least in the West, and with a few exceptional areas) a long tradition of careful work and critical self-examination. Next, published papers are refereed, often very carefully. This catches many errors, and often results in revision which increases usability. These two practices reinforce each other. People write carefully because standards are high for acceptance into the literature. And high standards are practical because people write carefully. This is a very beneficial equilibrium, but an unstable one which could easily be disturbed. A consequence of this equilibrium concerns the meaning of ``publication.'' At present there is a relatively black-and-white distinction between published and unpublished work. This enforces the standards of the literature: either write carefully or don't get published. If there were a continuum of levels of publication then standards would be less clear and would have far less force. Authors would tend to write to their own comfort level of quality and then negotiate the level of publication. Overall quality of the literature would decline, possibly dramatically. Unfortunately this strong published/unpublished distinction is an artifact of paper publication, and will disappear in the transition to electronic media unless consciously maintained. Another aspect of the quality equilibrium involves refereeing. Refereeing in any science can be thought of as a centralization of the ``self-correcting nature of science.'' Referees do the first cut of checking, to reduce the load on users further down the line. A balance tends to evolve: standards should be high, to reduce the burden on users as much as possible. 
But standards must be low enough that nearly all work can be brought up to this standard and go through the system. If standards are too high a ``black market'' unpublished literature (preprints or announcements) develops and again a burden is passed on to users. Each area evolves its own compromise, ideally to minimize the burden on users. It follows from this that standards are adapted to individual research areas, and are not interchangeable. An attempt to enforce mathematical standards in the physical literature would drastically reduce the amount published. This would lead to use of uncontrolled outlets of lower quality than the current literature, so would increase the burden on users. Conversely the use of physical standards in mathematics would lead to a serious decline in quality which would also burden users. Another point is that these area-specific standards are social agreements among authors, editors, and referees, and evolve over an extended period of time. If these agreements are widely ignored, or even just lose credibility, they could dissolve and take a long time to reconstitute. Thus standards which work should be cherished and protected. In a transition such as the one to electronic publication great care should be taken that not only are the standards preserved, but credibility in them is maintained. In conclusion, standards of reliability and usability for published work are social agreements which evolve to fit specific fields. In a reader-oriented literature an important pressure shaping this evolution is the minimization of the burden of checking and reconstruction left to the user. The methods of mathematics enable, and the maturity of mathematics has led to, very high standards. But all this is unstable in several ways, and without careful management the transition to electronic publication is likely to result in a significant decline in standards. 
THE LURE OF SPEED AND ALTERNATIVE PATHS TO KNOWLEDGE

There are several ways in which a desire for speed threatens the quality of the mathematical literature. The first is a craving for faster publication, and a second is a desire to speed up the research process itself. There is also a renewed flirtation with more intuitive approaches to mathematics [Jaffe-Quinn]. These are not specifically electronic publication problems, but electronic publication is likely to weaken the barriers against defective information. In many areas there are already electronic preprint databases through which papers can be immediately circulated world-wide. They sometimes hit the wires a few seconds after the final keystrokes, and needless to say often not in final form. If the work is reviewed and corrected before it is frozen into the literature, then the additional exposure is a good thing. But there are pressures to regard this instantaneous circulation as publication. Information can be transmitted instantly, so authors want credit instantly. They want to stay in the flow of ideas rather than take the time to nail the last one down firmly. The pace of research itself, as well as publication, can be frustrating. The understanding accumulated over decades and centuries is astounding, but on a day to day scale the pace can be maddeningly slow. A major reason for this is the insistence on reliability and usability in publication. It would certainly be possible to speed things up in the short run by relaxing standards. But the lesson of history is that this is counterproductive in the long run. Low quality work is in effect a debt: it must eventually be repaid, and usually with interest. There is a lack of enthusiasm for this, and a really big deficit can run people out of a field. Consistent high quality is the intellectual equivalent of a balanced budget. There is nothing new about the desire for speed. Fast-moving fields have always engendered a sense of urgency.
And there have been fields and times when giving a lecture at Princeton was considered tantamount to instant publication. But in the past the people who moved on too fast, or only lectured at Princeton, did not seriously damage the literature. Instead they reduced their own long-term impact on mathematics. Now it is technically feasible to damage the literature. Another hazard to the mathematical literature is a growing uncertainty about what should be counted as ``mathematical knowledge''. For example one can determine that a number is very, very likely to be prime [Pinch], or that a complicated identity is virtually certain to be true [Zeilberger]. These are useful conclusions. But no matter how high the probability, it would be dangerous to accept primeness or an identity as ``mathematical knowledge'' without the caveat that they are not completely reliable. Our experience is that one often encounters low-probability events in long and elaborate arguments. Indeed many important mathematical developments are based on low-probability events: the representation theory of Lie groups can be regarded as giving solutions to some almost certainly insoluble systems of equations. Therefore an argument in which some steps are only probably true should be handled like arguments in other sciences: the conclusion is a hypothesis which may need further testing even to conclude that it is probably true. From this point of view some earlier concerns about computer proofs were misguided. A proof in which a computer checks 20,000 cases is not essentially different from a proof in which a person checks 20 cases: the argument is still designed to give completely reliable results. Indeed if the algorithms are carefully explained, and documented source code is available, then a computer proof is preferable to a bald assertion that the author has identified and checked all cases. 
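The earlier example of a number that is ``very, very likely to be prime'' can be made concrete with a short sketch. The choice of the Miller-Rabin test here is an assumption made for illustration: the text cites [Pinch] without naming a particular algorithm, and the function name is_probable_prime and the default of 20 rounds are hypothetical. For a composite n at most one quarter of the possible bases fail to expose it, so k independent random rounds wrongly report ``probably prime'' with probability at most 4^(-k). That is small, but it is not zero.

```python
import random


def is_probable_prime(n, rounds=20, rng=None):
    """Miller-Rabin test (illustrative sketch, not taken from the article).

    Returns False only when n is certainly composite.
    Returns True when n is prime, or when n is a composite that slipped
    through every round (probability at most 4**-rounds).
    """
    rng = rng or random.Random()
    if n < 2:
        return False
    # Dispose of small primes and their multiples exactly.
    for p in (2, 3, 5, 7, 11, 13):
        if n == p:
            return True
        if n % p == 0:
            return False
    # Write n - 1 = 2**s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)  # random base with 2 <= a <= n - 2
        x = pow(a, d, n)
        if x == 1 or x == n - 1:
            continue  # this base gives no evidence against primality
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True  # no witness found: n is only *probably* prime
```

The two return values have different standing in the sense discussed above. A result of False rests on an explicit witness and is completely reliable; a result of True is a conclusion that needs the caveat that it is not completely reliable, however many rounds are run.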
There are also non-electronic alternate paths to knowledge being explored (or revisited, since none of this is new). These usually involve a greater reliance on intuition, or direct visualization or ``understanding''; see [Jaffe-Quinn], and the responses to it (in the April 1994 Bulletin; see particularly [Thurston]). If this information is presented without a warning that such knowledge is not completely reliable, and may not be reproducible, then it also threatens the integrity of the literature. Again I want to emphasize that conjectures or intuitions or experimental conclusions **presented as such** are not ``low quality'' or unreliable in the sense used above, even if wrong. It is a false claim of knowledge, or a failure to provide usable details which is a de facto borrowing against the future labor of others, and which creates gaps in the literature. The conclusion is that absolute reliability should not be abandoned as a goal in mathematics. Knowledge obtained by methods which should produce complete certainty should be distinguished from understanding which is not certain, no matter how likely. And in particular the quest for speed should not be permitted to weaken standards of precision and usability.

WHAT SHOULD WE DO?

We have received a wonderful legacy from our predecessors: a remarkably reliable and usable user-oriented literature, and customs and practices to maintain it. Will we pass these on to our successors? Or will they be casualties of the move to the electronic new world? We could just relax and let the new age find its own equilibrium. The analysis presented here leads one to expect a substantial decline in reliability and usability of the literature. This would not fatally cripple the mathematical enterprise. It would disadvantage mathematicians of average ability, but it is widely believed that most advances happen at the top. Perhaps average mathematicians are expendable.
It would force abandonment of smaller research areas in a reorganization into fewer and larger research groups. But most small research areas never contribute in a vital way to the ``big picture'', so could be seen as expendable. ``In groups'' would develop private folklore about the hazards of their local literatures. This would place outsiders at a disadvantage, but most advances are probably made by insiders. And theoretical high-energy physics already has an unreliable author-oriented literature, and is far from dead. Therefore it is a matter of quality-of-life rather than life-or-death. But it would be a significant reduction in quality-of-life. The first line of defense against this loss should be the maintenance of a strong distinction between preprints and material which has been officially accepted into the literature, either paper or electronic. This acceptance must remain a certification that the work is written to high standards of reliability, precision, and usability, and has been refereed. In other words maintain the linkage between high standards and publication, and keep the alternative sufficiently clear and less attractive that people will be willing to write to high standards. At present this distinction is an artifact of the rigors and expense of paper publication, so it must be consciously supported if it is to continue. Some suggestions on how this might be done are given in [Quinn]. The second line of defense should be to build as much feedback as possible into preprint databases. Originally preprints were of very limited circulation, usually to a group which could communicate verbally within itself about the preprint, and provide verbal feedback to the author. Now preprints are distributed very nearly as widely as the final publication. This additional exposure is a good thing if the other features are also extended.
Mechanisms should be provided for getting feedback from the new wider audience, and making this information available to others in this audience. For instance this could be done by allowing readers to post comments, or references to other work, to a file linked to the preprint. It has been suggested [Odlyzko] that such feedback mechanisms might in large part substitute for the current refereeing and editing system, and render the published/unpublished distinction unnecessary. This is unattractive in two ways. First it relieves authors of the necessity to write to high standards to be published. Instead they would write to their own comfort level, and let people complain if there are problems. Weak writing with a file full of complaints is not really a substitute for good writing. Second it still passes on to users the burden of checking and replication now done by referees. The comment file should provide some help with this, if the reader is sufficiently expert to understand it, and if it is not so polluted as to be useless. But the burden is still on the reader.

SUMMARY

The mathematical literature is reliable in a sense impossible to achieve in the other sciences. This deeply affects both technical and social practices in mathematics, and has the effect of making the literature unusually useful to readers. This reliability, and its benefits, are threatened in several ways by electronic publication. The transition probably can be managed to avoid loss of these benefits. But this will require a consensus in the community that these benefits are worth preserving, and wide support for the necessary actions.

REFERENCES

[Franks] John Franks, "The impact of electronic publication on scholarly journals", Notices of the AMS 40 (November 1993) 1200-1202.
[Harnad] Stevan Harnad, "Scientific quality control in scholarly electronic journals", in: Proceedings of international conference on refereed electronic journals.
[Jaffe-Quinn] Arthur Jaffe and Frank Quinn, "'Theoretical mathematics': toward a cultural synthesis of mathematics and theoretical physics", Bulletin AMS 29 (1993) 1-13.
[Odlyzko] Andrew Odlyzko, "Tragic loss or good riddance? The impending demise of traditional scholarly journals", Preprint December 1993.
[Pinch] R. G. E. Pinch, "Some primality testing algorithms", Notices of the AMS 40 (November 1993) 1203-1210.
[Quinn] Frank Quinn, "A role for libraries in electronic publication", electronic preprint Jan. 1994.
[Thurston] William Thurston, "On proof and progress in mathematics", To appear in the Bulletin AMS, April 1994.
[Zeilberger] Doron Zeilberger, "Theorems for a price: Tomorrow's semi-rigorous mathematical culture", Notices of the AMS 40 (October 1993) 976-981.