Scientific studies written with the help of artificial intelligence are polluting the online ecosystem of academic information, according to a worrying report published in the Harvard Kennedy School Misinformation Review.
A group of researchers studied the prevalence of scientific articles showing signs of artificially generated text on Google Scholar, an academic search engine that makes it easy to find research published across a wide range of academic journals.
The team specifically investigated the misuse of generative pre-trained transformers (or GPTs), a type of large language model (LLM) that includes familiar software such as OpenAI’s ChatGPT. These models can rapidly interpret text prompts and generate responses in the form of numbers, images, and long passages of text.
In their study, the team analyzed a sample of research papers found on Google Scholar that showed signs of GPT use. The selected papers contained one or two common phrases that conversational agents (typically chatbots) powered by large language models tend to produce. The researchers then examined how widely these questionable papers were distributed and posted online.
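For readers who want a concrete picture of what this kind of screening might look like, here is a minimal sketch in Python that flags text containing the sort of boilerplate phrases chatbots often leave behind. The phrase list and function name are purely illustrative assumptions, not the study's actual query terms or method.

```python
# Minimal, hypothetical sketch: flag text that contains boilerplate phrases
# conversational chatbots often emit. The phrase list is illustrative only;
# it is not the study's actual set of search terms.
TELLTALE_PHRASES = [
    "as of my last knowledge update",
    "i don't have access to real-time data",
    "as an ai language model",
]

def looks_gpt_fabricated(text: str) -> bool:
    """Return True if the text contains any telltale chatbot phrase."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in TELLTALE_PHRASES)

# Example: a sentence with leftover chatbot boilerplate is flagged.
sample = "As of my last knowledge update, the dataset covers 2021."
print(looks_gpt_fabricated(sample))  # True
```

In practice, hits from a filter like this would still need manual review, since a genuine paper can quote such phrases when studying chatbots themselves.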
“The risk of what we call ‘evidence hacking’ increases significantly when research generated by artificial intelligence is shared on search engines,” said Björn Ekström, a researcher at the Swedish School of Library and Information Science and co-author of the paper, in a press release from the University of Borås. “This can have tangible consequences, as incorrect results can seep further into society and possibly into more and more areas.”
According to the researchers, Google Scholar does not weed out papers whose authors lack an academic affiliation or whose work has not been peer reviewed; the system scoops up academic byproducts (student papers, reports, preprints, and so on) along with research that has cleared a higher bar of scrutiny.
The team found that two-thirds of the papers they studied were at least partially created through the covert use of GPT. The researchers found that 14.5% of the GPT-generated articles were related to health, 19.5% to the environment, and 23% to computer technology.
“The majority of these GPT-fabricated articles were found in non-indexed journals and working papers, but in some cases, the studies were published in mainstream scientific journals and conference proceedings,” the authors write.
The researchers outlined two main risks associated with this development. “First, the abundance of fabricated ‘research’ seeping into all areas of the research infrastructure threatens to overwhelm the scientific communication system and jeopardize the integrity of the scientific record,” the group writes. “The second risk is that it is increasingly likely that convincing-looking scholarly content has in fact been fraudulently created using AI tools and optimized for search by public academic search engines, including Google Scholar.”
Google Scholar is not a curated academic database, but its ease of use makes it a popular gateway to the scientific literature, and that accessibility is a good thing. Unfortunately, it is harder for members of the public to separate the wheat from the chaff when it comes to reputable journals; even the difference between a peer-reviewed study and a working paper can be confusing. Moreover, AI-generated text has turned up in some peer-reviewed papers as well as in less thoroughly vetted ones, indicating that GPT-fabricated work is muddying the waters throughout the online academic information system, not just in work that sits outside the official channels.
“If we can’t trust that the research we read is genuine, we risk making decisions based on incorrect information,” study co-author Jutta Haider, also a researcher at the Swedish School of Library and Information Science, said in the same release. “But as much as this is an issue of scientific dishonesty, it is an issue of media and information literacy.”
In recent years, publishers have repeatedly failed to weed out scientific articles that were in fact complete nonsense. In 2021, Springer Nature was forced to retract more than 40 articles from the Arabian Journal of Geosciences which, despite the journal’s title, covered a wide variety of topics, including sports, air pollution, and pediatric medicine. Beyond being off-topic, the articles were so poorly written that they made little sense, and they often lacked a convincing train of thought.
Artificial intelligence is exacerbating this problem. Last February, the publisher Frontiers was criticized for publishing an article in its journal Frontiers in Cell and Developmental Biology that contained images created with the Midjourney software, including wildly anatomically incorrect depictions of signaling pathways and rat genitalia. Frontiers retracted the article a few days after it was published.
AI models can be a boon to science: such systems can decipher fragile texts from the Roman Empire, identify previously unknown Nazca lines, and reveal hidden details in dinosaur fossils. But AI’s impact can be as positive or as negative as the people who use it.
Peer-reviewed journals, and perhaps the hosts and search engines of academic texts as well, need safeguards to ensure that the technology works in favor of scientific discovery rather than against it.