AI detection software discriminates against non-native English speakers - study
Computer programs that identify texts generated by artificial intelligence may discriminate against non-native English speakers.
Here's What We Know
Scientists ran 91 English essays written by non-native speakers through seven popular GPT detectors to test the detectors' accuracy. The results showed that these essays were often falsely labelled as generated by artificial intelligence.
More than half of the essays, written for the widely recognised TOEFL English-language exam, were flagged as written by artificial intelligence. One of the programs classified 98% of the texts as AI-generated.
When the same programs checked texts written by native English speakers, they classified them as human-written in more than 90% of cases.
Scientists attribute the discrimination to the way the detectors distinguish AI-generated text from human writing. The programs analyse so-called "text perplexity": a measure of how "surprised" or "confused" a language model is when trying to predict the next word in a sentence.
If the model predicts the next word easily, the text's perplexity is rated as low. If the next word is difficult to predict, the perplexity score is high.
In other words, if a person uses simple words and phrases, the program is more likely to flag their text as AI-generated. Non-native speakers tend to rely on common words and expressions, which leads to the discrimination. The sketch below illustrates the idea.
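To make the mechanism concrete, here is a minimal sketch of how a perplexity score can be computed with an off-the-shelf language model (GPT-2 via the Hugging Face Transformers library). This illustrates the general idea the researchers describe, not the actual code of any of the seven detectors tested; the example sentences and the choice of GPT-2 are assumptions for illustration only.

```python
# A minimal perplexity sketch: score how "predictable" a text is to GPT-2.
# Low perplexity = the model found the text easy to predict, which is the
# signal detectors tend to read as "AI-like".
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Tokenize the text and have the model predict each token from context.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing the input ids as labels makes the model return the average
        # cross-entropy loss over all predicted tokens.
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    # Perplexity is the exponential of the average per-token loss.
    return float(torch.exp(loss))

# Plain, common phrasing tends to score low; unusual or complex phrasing
# tends to score higher (hypothetical example sentences).
print(perplexity("The weather is nice today."))
print(perplexity("Cerulean skies heralded an improbably serene afternoon."))
```

A detector built on this signal would then apply some threshold: texts scoring below it are labelled AI-generated. The study's finding is that simpler, more formulaic human writing, common among non-native speakers, falls below such thresholds too.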
Having discovered the built-in bias in AI recognition programs, the researchers asked ChatGPT to rewrite TOEFL essays using more complex wording. When the edited texts were run through the detectors again, they were all marked as human.
According to the researchers, with the advent of ChatGPT, many educators have begun to see AI detection as a "critical countermeasure to deter a 21st-century form of cheating". However, they warn that the 99% accuracy claimed by some detectors is "misleading at best".
Source: The Guardian.