Junk science has been forcing a reckoning among scientific and medical researchers for the past year, leading to thousands of retracted papers. Last year, Stanford president Marc Tessier-Lavigne resigned amid reporting that some of his most high-profile work on Alzheimer’s disease was at best inaccurate. (A probe commissioned by the university’s board of trustees later exonerated him of manipulating the data.)
But the problems around credible science appear to be getting worse. Last week, scientific publisher Wiley decided to shutter 19 journals after retracting 11,300 sham papers. There is a large-scale industry of so-called “paper mills” that sell fabricated research, sometimes written by artificial intelligence, to researchers who then publish it in peer-reviewed journals — which are sometimes edited by people placed there by those same paper mills. Among the institutions exposing such practices is Retraction Watch, a 14-year-old organization co-founded by journalists Ivan Oransky and Adam Marcus. I spoke with Oransky about why there has been a surge in fake research and whether fraud accusations against the presidents of Harvard and Stanford are actually good for academia.
Give me a sense of how big a problem these paper mills are.
I’ll start by saying that paper mills are not the problem; they are a symptom of the actual problem. Adam Marcus, my co-founder, had broken a really big and frightening story about a painkiller involving scientific fraud, which led to dozens of retractions. That’s what got us interested in the subject. There were all these retractions, far more than we thought but far fewer than there are now. Now, they’re hiding in plain sight.
That was 2010. Certainly, AI has accelerated things, but we’ve known about paper mills for a long time. Everybody wanted to pretend all these problems didn’t exist. The problems in scientific literature are long-standing, and they’re an incentive problem. And the metrics that people use to measure research feed a business model — a ravenous sort of insatiable business model. Hindsight is always going to be 20/20, but a lot of people actually were predicting what we’re seeing now.
Regarding your comment that paper mills are symptoms of a larger problem, I read this story in Science and was struck by the drive for credentialing — which gets you better jobs, higher pay, and more prestige. In academia, there aren’t enough jobs; are the hurdles to these jobs impossibly high, especially for people who may be smart but are from China or India and may not have entry into an American or European university?
I actually would go one step higher. When you say there aren’t enough jobs, it’s because we’re training so many Ph.D.’s and convincing them all that the only way to remain a scientist is to stay in academia. It’s not, and that hasn’t been true for a long time. So there’s definitely a supply-and-demand problem, and people are going to compete.
You may recall the story about high-school students who were paying to get medical papers published in order to get into college. That’s the sort of level we’re at now. It’s just pervasive. People are looking only at metrics, not at actual papers. We’re so fixated on metrics because they determine funding for a university based on where it is in the rankings. So it comes from there and then it filters down. What do universities then want? Well, they want to attract people who are likely to publish papers. So how do you decide that? “Oh, you’ve already published some papers, great. We’re gonna bring you in.” And then when you’re there, you’ve got to publish even more.
You’re replacing actual findings and science and methodology and the process with what I would argue are incredibly misleading — even false — metrics. Paper mills are industrializing it. This is like the horse versus the steam engine.
So they’re Moneyballing it.
Absolutely. They’ve Moneyballed it with a caveat: Moneyball sort of worked. The paper mills have metricized it, which is not as sexy to say. If you were to isolate one factor, citations matter the most, and if you look at the ranking systems, it’s all right there. The Times Higher Education world-university rankings, U.S. News — look at whichever you want, and somewhere between like 30 percent and 60 percent of those rankings are based on citations. Citations are so easy to game. So people are setting up citation cartels: “Yes, we will get all of our other clients to cite you, and nobody will notice because we’re doing it in this algorithmic, mixed-up way.” Eventually, people do notice, but it’s the insistence on citations as the coin of the realm that all of this comes from.
Your work gets to the heart of researchers’ integrity. Do you feel like you’re a pariah in the scientific community?
I’m a volunteer. Adam is paid a very small amount. We use our funding to pay two reporters and then two people work on our database side. We approach these things journalistically; we don’t actually identify the problems ourselves. It’s very, very rare for us to do that. Even when it may appear that way on a superficial read — we’ve broken some stories recently about clear problems in literature — it’s always because a source showed us the way. Sometimes those sources want to be named, sometimes they don’t.
We’ve been doing this for 14 years. There are various ways to look at what the scientific community thinks of us. We’re publishing 100 posts a year about people committing bad behavior and only getting, on average, one cease-and-desist letter a year. We have never been sued, but we do carry defamation insurance. Our work is cited hundreds of times in the scientific literature. I definitely don’t feel like a pariah. Me saying I’m a pariah would be a little bit like, you know, someone whose alleged cancellation has promoted them to the top of Twitter.
People are unhappy that we do what we do. If you talk to scientists, the things we’re exposing or others are exposing are well known to them. Because of the structures, the hierarchies, and the power differentials in science, it’s very difficult for them as insiders to blow the whistle. There’s a book out by Carl Elliott about whistleblowers, mostly in the more clinical fields. That’s the vulnerable position. That’s where you end up being a pariah even though you should be considered a hero or heroine.
Are some fields better at policing their own research than others?
Yes. Going back to the origin story of Retraction Watch, Adam broke a story about this guy named Scott Reuben, who came from anesthesiology. We have a leaderboard of the people with the most retractions in the world, and at least three out of the top ten right now are anesthesiologists. That is a much higher percentage than one might expect. Some people may say, “Oh, does anesthesiology have a problem?” No, in fact, anesthesiology has been doing something about this arguably longer than any other field has.
What is it about anesthesiology that makes it so anesthesiologists are more willing to scrutinize the work in their own field?
It had a crisis earlier than others, and it’s small. Journal editors are generally considered pretty august personages, leaders in the field. When they realized they had problems, the journal editors got together and took collective action. I’m not saying anesthesiologists are better, but they’re a more tight-knit community, which I do think is important. The same thing happened in social psychology and in psychology writ large. There’s a higher-than-expected number of people from that field on our leaderboard. So it’s a question of, When did they get there, and how did they react to it? There are fields that haven’t actually gotten there, even though it’s been a while. So maybe there are some sociologists who could tell you better than me why that might be the case.
That wasn’t the reason I expected. I thought you would say something along the lines of, well, it’s life or death and anesthesiologists don’t want to see people dying on the table.
If anything, sometimes when the stakes are higher, fields are more resistant.
Geez.
There’s a guy named Ben Mol. Ben is an OB/GYN, and he is a force to be reckoned with. Fascinating character. He’s a pit bull, and he has found tons and tons of problems in the OB/GYN literature. I would characterize the leaders in that field now as still a bit more reluctant to engage with these issues than some of the other fields I mentioned.
Can you tell me how you go about authenticating real language from AI, especially in papers that can be hard to parse and are laden with jargon to begin with?
We rely on experts. We’re not really doing that ourselves. You don’t need to be an expert; you just need to know how to use Ctrl+F if you see certain phrases in a paper. And by the way, a lot of journals are perfectly fine with people using ChatGPT and other kinds of AI. It’s just whether you disclose it or not. These are cases where they didn’t disclose it.
With the resignation of Stanford’s and Harvard’s presidents, do you worry about the way the general public has been using these tools?
The fact that they’re giving speeding tickets to certain groups of people doesn’t mean we’re not all speeding. It means they’re getting targeted in, I would argue, an unfair way. We’re in a great reckoning with Harvard’s Claudine Gay being the key example. Former Stanford president Marc Tessier-Lavigne is not an example of that. The targeting is a concern. And clearly, there are false positives. The flip side of this is that AI is being used to find these problems.
This interview has been edited for length and clarity. The story was updated to include that a probe found that Tessier-Lavigne didn’t manipulate data.