The dataset is massive: lots of papers analyzed, but only the Abstract section of each.
One way you could look at it is: some authors used an LLM to create the abstract from the paper contents. "Hey, there are a lot of new books with AI covers."
Another way is: there could be a correlation between LLM usage in the abstract and LLM usage during the writing of the paper itself. "Hey, I wonder whether this book with an AI cover was also written by an AI. It should be investigated."
The paper was written by four Chinese authors who had previously collaborated. It was one of the worst papers I've ever seen. It went over the same ground as previous studies, so in that sense there was nothing new, but more bizarrely it did things like state well-known equations incorrectly (most notably one of Maxwell's equations) and drift off onto total tangents about how one might calculate certain things in an idealised situation that had no relevance to the method the authors actually used. My assumption is that they generated the methods section with generative AI and did not give it even a cursory check.
But the worst part is that I recommended rejecting it outright, and rather than doing so the journal just passed it on to its slightly less prestigious sister journal.
From the article I saw that they're using "excess words" as an indicator; is that a reliable method?
Also, is it possible that it's just autocorrect adding "excess words" while fixing grammar? If that's the case, should it still count as "use of AI"?
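For context on what "excess words" seems to mean in practice: as I understand it, the idea is to compare how often a word appears in abstracts from a given year against a baseline extrapolated from pre-LLM years, and flag words whose observed frequency far exceeds that expectation. Here is a minimal sketch of that kind of counting; the thresholds, the naive extrapolation, and all function names are my own illustration, not the paper's actual code:

```python
from collections import Counter

def word_frequencies(abstracts):
    """Fraction of abstracts in which each word appears at least once."""
    counts = Counter()
    for text in abstracts:
        counts.update(set(text.lower().split()))
    n = len(abstracts)
    return {w: c / n for w, c in counts.items()}

def excess_words(pre_llm_years, test_year, min_ratio=2.0, min_gap=0.001):
    """
    pre_llm_years: list of abstract lists (e.g. 2021, 2022) used to
    extrapolate an expected frequency for each word.
    test_year: abstracts from the year under test (e.g. 2024).
    Returns words whose observed frequency exceeds the extrapolated
    expectation by the given ratio or absolute gap.
    """
    observed = word_frequencies(test_year)
    yearly = [word_frequencies(year) for year in pre_llm_years]
    flagged = {}
    for word, p in observed.items():
        history = [freqs.get(word, 0.0) for freqs in yearly]
        # naive linear extrapolation of the pre-LLM trend
        if len(history) >= 2:
            expected = history[-1] + (history[-1] - history[0]) / (len(history) - 1)
        else:
            expected = history[-1] if history else 0.0
        expected = max(expected, 1e-6)  # avoid division by zero
        if p / expected >= min_ratio or p - expected >= min_gap:
            flagged[word] = (p, expected)
    return flagged
```

If that kind of frequency spike comes from autocorrect or a grammar tool rather than a full LLM, counting like this obviously can't tell the difference, which is exactly what I'm asking about.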
" while our approach can detect unexpected lexical changes, it cannot separate different causes behind those changes, like multiple emerging topics or multiple emerging writing style changes. For example, our approach cannot distinguish word frequency increase due to direct LLM usage from word frequency increase due to people adopting LLM-preferred words and borrowing them for their own writing. For spoken language, there is emerging evidence for such influence of LLMs on human language usage (32). However, we hypothesize that this effect is much smaller and much slower. "
Since "unexpected" can be left out as there is no "expected" baseline, we are left with "detected lexical changes, cause unknown" followed by a purely speculative and unproven hypothesis.
That of course does not stop them from claiming: "In conclusion, our work showed that the effect of LLM usage on scientific writing is truly unprecedented".
This is slop. Even if you agree with the suspicion of widespread use of LLMs in academic writing (I do, and anecdotally suspect even their upper bound is a big underestimate), they did not show such an effect at all. They ran a projection over a dataset and hypothesized, but did not investigate a cause.
But hey, let not the absence of information restrain us from pushing the agenda: "Our analysis can inform the necessary debate around LLM policies providing a measurement method for LLM usage". In other words: here is what we did not do, but you can cite us as if we did, because nobody reads anything beyond the abstract and the conclusion.
As a bonus kicker: "We hope that future work will meticulously delve into tracking LLM usage more accurately and assess which policy changes are crucial to tackle the intricate challenges posed by the rise of LLMs in scientific publishing." I'm sure that sentence alone would get flagged as LLM slop, even by their own sloppy methodology.