Thursday, July 17, 2008

Who says the Internet broadens your horizons?
[Here’s the long version of my latest, understandably shortened Muse for Nature News.]

A new finding that electronic journals create a narrowing of scientific scholarship illustrates the mixed blessings of online access.

It’s a rare scientist these days who does not know his or her citation index, most commonly in the form of the h-index introduced in 2005 by physicist Jorge Hirsch [1]. Proposed as a measure of the cumulative impact of one’s published works, this and related indices are being used informally to rank scientists, whether this be for drawing up lists of the most stellar performers or for assessing young researchers applying for tenure. Increasingly, careers are being weighed up through citation records.
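For readers who have not met it, the h-index is the largest number h such that h of your papers have each been cited at least h times. A minimal sketch of that definition follows; the function name and the sample citation counts are illustrative assumptions, not anything taken from Hirsch's paper.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break
    return h


# Hypothetical citation counts for one researcher's five papers.
print(h_index([10, 8, 5, 4, 3]))  # prints 4
```

Note how little of a record the index actually sees: one paper with 25 citations and another with 250 would give the same answer here.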

All this makes more pressing the question of how papers get cited in the first place: does this provide an honest measure of their worth? A study published in Science by sociologist James Evans at the University of Chicago adds a new ingredient to this volatile ferment [2]. He has shown that the increasing availability of papers and journals online, including what may be decades of back issues, is paradoxically leading to a narrowing of the number and range of papers cited. Evans suggests that this is the result of the way browsing of print journals is being replaced by focused online searches, which tend both to identify more recent papers and to quickly converge on a smaller subset of them.

The argument is that when a journal goes online, fewer people flick through the print version and so there is less chance that readers will just happen across a paper related to their work. Rather, an automated search, or following hyperlinks from other online articles, will take them directly to the most immediately relevant articles.

Evans has compiled citation data for 34 million articles from a wide range of scientific disciplines, some dating back as far as 1945. He has studied how citation patterns changed as many of the journals became available online. On average, a hypothetical journal would, by making five years of its issues available free or commercially online, suffer a drop in the number of its own articles cited from 600 to 200.

That sounds like a bad business model, but in fact there are some important qualifications here. It doesn’t necessarily mean that a journal gets cited less when it goes online, but simply that its citations get focused on fewer distinct articles. And all these changes are set against an ever-growing body of published work, which means that more and more papers are getting cited overall. The changes caused by going online are relative, set within the context of a still widening and deepening universe of citations.

All the same, this means that the trend for online access is making citation patterns narrower than they would be otherwise: fixated on fewer papers and fewer journals.

In some ways, the narrowing is not a bad thing. Online searching can deliver you more quickly to just those papers that are most immediately relevant to your own work, without having to wade through more peripheral material. This may in turn mean that the citation lists in papers are more helpful and pertinent to readers.

Online access also makes it much easier for researchers to check citation details – to look at what a reference actually said, rather than what someone else implies it said. It’s not clear how often this is actually done, however – one study, using mis-citations as a proxy, has suggested that 70-90 percent of literature citations have simply been copied from other reference lists, rather than being directly consulted [3,4]. But at the very least, easier access should reduce the chances of that.

Yet there are two reasons in particular why Evans’ findings are concerning. One is in fact a mixed blessing. With online resources, scientific consensus is reached more quickly and efficiently, because, for example, hyperlinked citations allow you to deduce rapidly which papers others are citing. Some search strategies also rely on consensual views about relevance and significance.

This might mean that less attention, time and effort get wasted down dead ends. But it also means there is more chance of missing something important. “It pushes science in the direction of a press release”, says Evans. “Unless they are picked up immediately, things will be forgotten more quickly.”

Moreover, feedback about the value judgements of others seems to lead to amplification of opinions in a way that is not necessarily linked to ‘absolute’ value [5]. It’s an example of the rich-get-richer or ‘Matthew’ effect, whereby fame becomes self-fulfilling and a few individuals get disproportionate rewards at the expense of other, perhaps equally deserving cases. While highly cited papers may indeed deserve to be, it seems the citation statistics would not look very different if these papers had simply benefited from random amplification of negligible differences in quality [6]. Again, this could happen even with old-style manual searching of journals, but online searches make it more likely.
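The rich-get-richer idea can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the model used in [5] or [6]: every paper is taken to be of identical quality, and each new citation goes to a paper with probability proportional to one plus the citations it already has. The function name, parameters and default values are all hypothetical.

```python
import random


def simulate_citations(n_papers=100, n_citations=2000, seed=0):
    """Hand out citations one at a time among equally good papers.

    Assumption: the chance of a paper being cited next is proportional
    to (citations so far + 1), so early luck is amplified.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    counts = [0] * n_papers
    for _ in range(n_citations):
        weights = [c + 1 for c in counts]
        chosen = rng.choices(range(n_papers), weights=weights, k=1)[0]
        counts[chosen] += 1
    return counts


counts = simulate_citations()
top_ten = sorted(counts, reverse=True)[:10]
share = sum(top_ten) / sum(counts)
print(f"Share of citations held by the 10 most-cited papers: {share:.0%}")
```

If citations were spread evenly, the top ten papers out of a hundred would hold 10 percent of them; comparing that figure with the printed share shows how far copying alone, with no difference in merit, skews the record.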

The other worry is that this trend exacerbates the already lamented narrowing of researchers’ horizons. It is by scanning through the contents pages of journals that you find out what others outside your field are doing. If scientists are reading only the papers that are directly relevant to their immediate research, science as a whole will suffer, not least because its tightly drawn disciplines will cease to be fertilized by ideas from outside.

Related to this concern is the possibility of collective amnesia: the past ceases to matter in a desperate bid to keep abreast of the present. Older scientists have probably been complaining that youngsters no longer ‘read the old literature’ ever since science journals existed, but it seems that neglecting the history of your field is made more likely with online tools.

There’s a risk of overplaying this issue, however. It’s likely that so-called ‘ceremonial citation’, the token nod to venerable and unread papers, has been going on for a long time. And the increased availability of foundational texts online can only be a good thing. Nonetheless, Evans’ data indicate that online access is driving citations to become ‘younger’ and reducing an article’s shelf-life. This must surely increase the danger of reinventing the wheel. And there is an important difference between having decided that an old paper is not sufficiently relevant to cite, and having assumed it, or having not even known of its existence.

In many ways these trends are just an extension to the scientific research community of things that have been much debated in the broader sphere of news media, where the possibilities for personalization of content lead to a solipsistic outlook in which individuals hear only the things they want to hear. (The awful geek-speak for this – individuated news – itself makes the point, having apparently been coined in ignorance of the fact that individuation already has a different historical meaning.) Instead of reading newspapers, the fear is that people will soon read only the ‘Daily Me.’ Web journalist Steve Outing has said that “90 percent of my daughters’ media consumption is ‘individuated’. For kids today, non-individuated media is outside the norm.” We may be approaching the point where that also applies to young scientists, particularly if it is the model they have become accustomed to as children.

Ultimately, the concerns that Evans raises are thus not a necessary consequence of the mere fact of online access and archives, but stem from the cultural norms within which this material is becoming available. And it is no response – or at least, a futile one – to say that we must bring back the days when scientists would have to visit the library each week and pick up the journals. The efficiency of online searching and the availability of archives are both to be welcomed. But a laissez-faire attitude to this ‘literature market’ could have some unwelcome consequences, in particular the risk of reduced meritocracy, loss of valuable research, and increased parochialism. The paper journal may be on the way out, but we’d better make sure that the journal club doesn’t go the same way.

References
1. J. E. Hirsch, Proc. Natl Acad. Sci. USA 102, 16569-16572 (2005).
2. J. A. Evans, Science 321, 395-399 (2008).
3. M. V. Simkin & V. P. Roychowdhury, Complex Syst. 14, 269-274 (2003).
4. M. V. Simkin & V. P. Roychowdhury, Scientometrics 62, 367-384 (2005).
5. M. J. Salganik et al., Science 311, 854-856 (2006).
6. M. V. Simkin & V. P. Roychowdhury, Annals Improb. Res. 11, 24-27 (2005).

1 comment:

JimmyGiro said...

Hey, dis' the internet at your peril:

SINNER