The existing public access policy for our three journals—The Journal of Cell Biology, The Journal of Experimental Medicine, and The Journal of General Physiology—is fully compliant with new policies from the Research Councils UK (RCUK) and the Wellcome Trust. In addition to mandating public access, the new policies specify licensing terms for reuse of content by third parties, in particular for text and data mining. We question the need for these specific terms, and we have added a statement to our licensing policy stipulating that anyone, including commercial entities, is permitted to mine our published text and data.
On April 1 (no fooling), the RCUK, a government funding system, and the Wellcome Trust, a private funder, will implement new public access policies. RCUK will require journals to release all articles reporting RCUK-funded research to the public within six months after publication. This unifies the policies of the seven research councils in the UK, which had varied considerably in their terms. If a journal offers immediate access for a fee, RCUK encourages authors to take advantage of this option. RCUK indicates that they will pay these fees through block grants to universities and other research institutions.
In the same manner, the Wellcome Trust continues their existing policy of paying immediate access fees on their authors’ behalf (if this option is available) or requiring journals to release articles reporting Wellcome-funded research to the public six months after publication. What is new about their policy is that, in exchange for paying the immediate access fees, they (and RCUK) now dictate the licensing terms for reuse of articles by third parties. These terms do not apply to articles released by the publisher after six months.
The existing publication policies of The Rockefeller University Press (RUP) are fully compliant with the new British policies. We release all of our content six months after publication, regardless of funding source. RUP articles are made freely available on our own website, and they are also posted in PubMed Central (PMC) and Europe PMC. We do not offer an immediate access option because we do not believe in providing immediate access to just a subset of our articles.
Clarifying licensing terms
RUP has long championed public access to research articles via a six-month embargo, and we thus support this aspect of the Wellcome Trust policy and the new RCUK policy. However, we question the value of a provision in their policies that specifies licensing terms for reuse of content by third parties. Both organizations now require that the articles for which they have paid an immediate access fee be made available under a Creative Commons attribution license (CC-BY). This means that third parties are allowed “to copy, distribute and transmit the work, … to adapt the work, and to make commercial use of the work, [as long as they] attribute the work in the manner specified by the author or licensor” (1).
CC: Creative Commons—a nonprofit organization that provides standardized copyright licenses, which define the terms of reuse of creative work.
BY: Attribution must be given to the original work.
NC: Noncommercial—published work can be reused without permission, as long as it is for noncommercial purposes.
SA: Share alike—any subsequent distribution must follow the rules set out in the original license.
RUP is able to avoid this requirement by not offering an immediate access option. Indeed, we could not release all of our content under a license that permits unfettered commercial reuse of the work. This would allow commercial content aggregators to bundle and sell our content without any recompense to us, thus jeopardizing our subscription revenues. RUP has done the next best thing. Since May 2008, our content has been released to the public under a CC-BY-NC-SA license, which allows third-party use of the content for noncommercial purposes but prevents its use for commercial gain without express permission (2).
Subscription-based publishers who do take immediate access fees from RCUK and the Wellcome Trust will, by necessity, have content under two different licenses: the one that applies to the bulk of their content and preserves their subscription revenues, and the one that applies to immediate access articles funded by RCUK and the Wellcome Trust. This will cause considerable confusion in the research community about what third parties can and cannot do with a particular journal’s content. There is an initiative by the National Information Standards Organization to tag individual articles with licensing information and to make that metadata accessible to the public (3), but those discussions are in their infancy, and implementation is probably several years away.
A final issue with the licensing provision is that it is not necessary to achieve one of the stated goals of the Wellcome Trust—to ensure “that the content will be fully accessible by anyone” (including commercial entities) for text and data mining (4). The need to differentiate access to content for mining from the right to mine content has been considered in detail elsewhere (5). The Intellectual Property Office of the UK government recently issued a clarification of the rights afforded by existing UK copyright law “to allow non-commercial researchers to use computers to study published research results and other data without copyright law interfering” (6). There is legal precedent in the United States to indicate that the right to mine text and data from published research articles can fall into the category of “fair use,” and thus any entity—noncommercial or commercial—may be permitted to do so under any license (7, 8). The designation of fair use is context specific* and is often controversial, confusing, and ambiguous. Thus, to clarify the RUP’s position, we have added a statement to our licensing policy specifying that anyone, including commercial entities, is permitted to mine the text and data:
“The RUP acknowledges that text or data mining by commercial entities for their internal research purposes is allowed without further permission from RUP. Commercial entities may develop indexing or search services—available to the public for free or for a fee—based on text or data mining without further permission from RUP, but they may reproduce only snippets of text up to 156 characters in length, or thumbnails of images up to 72 pixels in the long direction, as part of such a service” (10).
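For developers of indexing or search services, the two numeric limits in the statement above can be sketched in a few lines of Python. This is only an illustration of one possible reading of the limits (156 characters per snippet, 72 pixels on the long side of a thumbnail); the function names, the word-boundary trimming, and the rounding behavior are assumptions made for the example and are not part of RUP's policy.

```python
from typing import Tuple

SNIPPET_MAX_CHARS = 156    # snippet limit quoted in the RUP statement
THUMBNAIL_MAX_PIXELS = 72  # limit on the long side of a thumbnail


def make_snippet(text: str, max_chars: int = SNIPPET_MAX_CHARS) -> str:
    """Return at most max_chars characters, trimmed to a word boundary (hypothetical helper)."""
    collapsed = " ".join(text.split())  # normalize runs of whitespace
    if len(collapsed) <= max_chars:
        return collapsed
    cut = collapsed[:max_chars]
    # If the cut lands inside a word, back up to the previous space.
    if collapsed[max_chars] != " " and " " in cut:
        cut = cut[:cut.rindex(" ")]
    return cut.rstrip()


def thumbnail_size(width: int, height: int,
                   max_long_side: int = THUMBNAIL_MAX_PIXELS) -> Tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_long_side (hypothetical helper)."""
    long_side = max(width, height)
    if long_side <= max_long_side:
        return width, height  # already small enough; do not enlarge
    # Integer arithmetic keeps the aspect ratio and always rounds down.
    new_width = max(1, width * max_long_side // long_side)
    new_height = max(1, height * max_long_side // long_side)
    return new_width, new_height
```

Rounding down means a generated snippet or thumbnail never exceeds the stated limit. Whether a particular service stays within the terms of the license is a question for the policy itself, not for a sketch like this one.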
The RUP’s existing license to publish does not restrict mining of the content to advance scientific research by commercial or noncommercial entities; it only restricts reproduction of the content for commercial gain. This restriction is an essential component of the business models of the vast majority of publishers, which remain subscription based (11).
What about access?
All of RUP’s content is available in the “Open Access Subset” of PMC six months after publication. This provides easy access to a machine-readable form of the content for text and data miners. We encourage all publishers to participate in this service. If all participants agreed that text and data mining constitutes fair use, the content could be mined by any third party regardless of the copyright or licensing terms attached to it. The name Open Access Subset is actually a misnomer, as the content does not have to be available on an immediate basis (or under a CC-BY license) to be included. Efforts to open up citation data for mining (12) are commendable if one is trying to break the stranglehold on citation data by large commercial entities, but citations are only a small portion of the textual data in an article. There are no licensing or technical barriers to making all of the text and data published by subscription-based publishers available for mining.
*For example, The Code of Best Practices in Fair Use for Academic and Research Libraries (9) indicates that “it is fair use for libraries to develop and facilitate the development of digital databases of collection items to enable nonconsumptive analysis across the collection for both scholarly and reference purposes.”