Who’s downloading pirated papers? EVERYONE

In rich and poor countries, researchers turn to the Sci-Hub website

(28 April 2016) Science has published an article by John Bohannon which quantifies the use of Sci-Hub, the go-to website for pirated scientific papers.

He explores the cost of document delivery from legitimate services and why researchers turn to free, copyright-busting alternatives.

Increasing numbers of researchers around the world use Sci-Hub, which hosts 50 million papers. Over the six months leading up to March, Sci-Hub served up 28 million documents. More than 2.6 million download requests came from Iran, 3.4 million from India, and 4.4 million from China. The papers cover every scientific topic. The publisher with the most requested Sci-Hub articles? It is Elsevier by a long shot: Sci-Hub provided half a million downloads of Elsevier papers in one recent week.

With the founder of Sci-Hub, Bohannon attempts to answer: Who are Sci-Hub’s users, where are they, and what are they reading? The results lead him to say that the Sci-Hub data provide the first detailed view of what is becoming the world’s de facto open-access research library. And if you think that it’s the poorer countries that are to blame for this massive theft of intellectual property, think again: Bohannon’s data show that the United States is the fifth largest downloader after Russia, and a quarter of the Sci-Hub requests for papers came from the 34 members of the OECD.

Bohannon’s assertions are supported by data, as you can see in his freely available article, published by AAAS in Science.

Toby Green, Head of Publishing at the OECD, commented on LIBLICENSE:

This is interesting, but the numbers need to be put into context (always a good idea with numbers – to put them in context). I have no idea, for example, how many articles are being downloaded from Science Direct, JSTOR, or other platforms and repositories in order to gauge whether Sci-Hub’s 28 million is ‘small’, ‘medium’ or ‘large’. For what it’s worth, OECD Publishing’s downloads last year were 28 million (so we’re running at around 50% of Sci-Hub) but our catalogue is much, much smaller – we have around 200,000 items on our platform, a far cry from Sci-Hub’s 50 million. Does anyone (STM, perhaps?) have data on journal article downloads worldwide? 

However, this data does support a conjecture that we have at OECD: the potential audience is always far larger than one thinks. I recently had one of our authors say her latest paper would have an audience of ‘200’ and she swore blind that it wouldn’t be any larger. Based on our past performance with similar papers, I reckon we’ll reach twice or three times that number. This thinking is quite widespread. I was recently challenged at a conference, at which I had shared data on the growth in accesses to our content following the introduction of our freemium publishing model, by someone arguing that OECD content was somehow different from scholarly content published in journals and was bound to have a larger audience. I countered by stating that 40% of OECD populations are now educated to first-degree level as are many in non-OECD countries, especially in places like Iran, China and India.

Therefore, the potential audience that has the skill and ability to read a journal article is really very large indeed. The data from Sci-Hub seems to be proving the point. 

The final anecdote about ease of discovery and access is sobering.

If we (publishers and librarians together) can’t get this right, especially at subscribing institutions, then we’re failing badly. But, this brings me back to the first point – the context of this data.

What is the share of Sci-Hub downloads at subscribing institutions? If it becomes significant, then we are failing; if it isn’t, then we’re not.

Laura Brown, JSTOR Managing Director, commented on LIBLICENSE: 

(3 May 2016) For 2015, JSTOR made 10.2 million journal articles available. We had just over 70 million downloads (this figure excludes bots and counts in compliance with COUNTER 4).  This includes 1.9 million downloads of our open content (Early Journal Content) with the remainder coming predominantly from people at nearly 10,000 institutions that license JSTOR journal collections either for a fee or for free through our African Access and Developing Nations Initiatives.

We also offer anyone in the world the option of registering with JSTOR to read journal articles online for free.  To Toby’s point about audiences being larger than one thinks, we started this program when Google began sending significant traffic to JSTOR from beyond our participating institutions. About 80% of the journals on JSTOR are now available this way (some publishers have opted out of this program).

Last year, we gave this free reading option to users over 110 million times (as a point of comparison in terms of potential audience, we passed authenticated users into articles 147 million times). However, users actually took the option and registered or logged in to read the articles about 1.6 million times.

So our usage is higher than Sci-Hub’s, but there is more we can do.

We recognize and have benefited from the reality that use is a function of convenience in our core community and the growing one outside. Even our small registration barrier can be too high for some people. Users simply expect access to be as easy as possible, especially when they know what they are looking for. We have no illusions about this fact and see delivery of a fantastic, easy, multi-device experience as the bar we should be aiming for at JSTOR, for institutional and unaffiliated users alike. We are pursuing this reality in a way that respects the rights of others and that takes into account the other things that matter beyond frictionless delivery of a PDF. These include structured data (mentioned in the Science article), preservation, reliable long-term access, and the ability to invest new resources in continuing to bring content from the past online in useful ways. JSTOR digitized 4.3 million pages of new content last year and, with the investment of those institutions that contributed fees to JSTOR, has digitized 63 million over the last 20 years.

It is not easy but I am optimistic we can find ways to continue to invest as a community in the things that matter even as access becomes more convenient for everyone.