Publishers Accept the Inevitable and Are Now Teaching LLMs: Is It the Right Move?

Artificial intelligence is now a cornerstone of publishing, whether authors want to admit it or not. AI tools and large language models (LLMs) are being used not just to write sections of manuscripts but also to assist with editing and peer review at scholarly journals. We’ve discussed AI’s growing role in scholarly publishing repeatedly, including its role in reviewing and writing manuscripts. We’ve also covered publishers’ efforts to build “responsible” AI, such as Elsevier’s launch of Scopus AI. Now other publishers are getting in on the action, putting their money (and extensive archives of material) where their mouths are to improve AI’s potential in scholarly publishing.


As AI use becomes more widespread, multiple publishers have signed large-scale deals licensing their material to AI companies to help train AI tools, particularly LLMs. In May 2024, Informa, the parent company of publisher Taylor & Francis, announced a $10 million deal to license much of its journal content to Microsoft to help train its Copilot AI system. The following month, academic publisher Wiley announced to its investors a $23 million deal with an unnamed firm developing generative-AI models; in September, the company said it expected to earn another $21 million from agreements with other AI firms in the same financial year. A Taylor & Francis spokesperson stressed that these deals with developers serve the greater good of the publishing world, as the publishers “are providing data and content under license for the purposes of training AI, such as LLMs, so that those models become more accurate and relevant for the benefit of everyone who uses them.”

Taylor & Francis further argues that it sets strict parameters for these agreements and that they will ultimately benefit authors in both the short and long term. The spokesperson stated that royalties from these deals will be paid to the authors of the licensed content, and that the content can be used only to train AI tools, with no permission granted to reproduce it in any format. Similarly, a Wiley spokesperson said that royalties from its agreement will be paid to book authors and other publishing partners, and that the company is monitoring the developer’s output for any unauthorized use of copyrighted material. Several publishers have also put measures in place to prevent AI tools from drawing on published web content in their analyses without proper permission.

With the influx of these licensing agreements, the non-profit Ithaka S+R has created a Generative AI Licensing Agreement Tracker to catalog the agreements and their terms as they are announced, helping keep authors aware of the potential use of their content in LLM training. In many ways, these deals represent an about-face on AI use in scholarly publishing. In a June 2023 guide notice, the NIH stated that using AI tools to assist in the review of grant applications was a clear “breach of confidentiality” and was prohibited. And while some publishers allow AI tools in the writing of articles, provided authors acknowledge the use at submission and a statement is included upon publication, many remain wary of AI tools at various stages of scholarly research. These agreements suggest that many publishers see widespread AI use as inevitable and are doing their best to safeguard their journals from irresponsible use of these tools. Sage, for instance, confirmed that it signed a deal to license its published content largely out of fear that the content was already being harvested by developers through less-than-legal channels to train LLMs. Agreeing to license the content allows the publisher to better control reproduction of the material while passing royalty fees on to authors and journal societies.

It should be noted that only a few publishers have so far announced agreements to use their material in LLM training. As of this writing, many publishers (including Elsevier) have not inked licensing deals with AI developers, or at least have not announced any. Licensing agreements with AI companies were a major talking point at the IPG 2024 Autumn Conference in September, where Cambridge University Press’ Briar May outlined the publisher’s new “opt-in” approach, in which it asks for the consent of all authors and rightsholders before licensing their content to generative-AI technologies.

What do you think of these new licensing agreements? Are they the correct step in safeguarding material for publishers and authors, or are they just a capitulation to the inevitable AI takeover of the publishing world? Let us know in the comments below.

By Chris Moffitt
Chris is a Managing Editor at Technica Editorial

