In past blog posts, we’ve nearly talked to death the topic of ChatGPT’s role as an author and the ethics of authors using it and other AI tools to produce manuscripts. Yet AI continues to infiltrate every step of the scholarly publishing process, from writing manuscripts to searching databases during the investigation phase. Can it also be used in perhaps the most overburdened area of scholarly publishing: peer review?
As a volunteer-based part of the publishing process, peer review has become arguably the most time-consuming aspect of scholarly publishing. There always seem to be more papers coming in than qualified reviewers available to review them, creating a bottleneck for many editors and publishers. We’ve talked about reviewer fatigue in past posts, and even the notion of paying reviewers in order to get more qualified reviewers to submit timely reviews. Some think that AI could be the key to fighting this growing fatigue. In 2021, researchers created an AI tool with the goal of providing an alternative to human peer review when evaluating journal articles. Early studies showed that the tool often produced reviewer comments comparable to those provided by human peer reviewers. However, it must be noted that the researchers focused on “a rather superficial set of features,” including general readability and formatting. The tool also does nothing to correct for the biases that currently exist in chatbot programming, and there is still no clear AI tool for fraud detection.
In late 2022, researchers Hosseini and Horbach explored using a large language model (LLM) like ChatGPT to assist in peer review processing, just one of many studies of AI assisting in peer evaluation. While they found that the AI helped with summarizing reviewer comments and even worked well in drafting decision letters for editors, its role in writing peer reviews should be highly scrutinized. Based on this report, they feared that AI chatbots could be further used by fraudulent services to produce fake reviews for journals. Another study, in 2023, found that LLMs produced well-written peer review reports with what seemed like genuine reviewer comments, but the comments sometimes had nothing to do with the manuscript being reviewed. The LLM also presented a list of fake references that it asked the authors to cite upon revision. The result was a reviewer report that seemed genuine and thoughtful but was ultimately hollow and illogical, and in the hands of the wrong editor it could lead to the rejection of an article that deserves publication.
It is likely because of studies like these that many publishers and groups are taking steps to keep AI out of their peer review processes. In a June 2023 guide notice, the NIH stated that using AI tools to assist in the review of grant applications was a clear “breach of confidentiality” and was prohibited. That being said, it can be tough for many editors to tell the difference between a peer review written by a qualified scientist and one written by AI, and this will only become harder as the AI gets smarter and the technology improves. Because of this, reviewers need to be very careful about incorporating any AI-generated comments into their reviews. These chatbots might make things easier for overwhelmed reviewers, but an inaccurate review can cost authors, editors, and journals an enormous amount of credibility.
By: Chris Moffitt
Chris is a Managing Editor at Technica Editorial