The foreign language copycat catcher

Post by **Nipuna** » Tue Sep 17, 2013 8:44 am

Watching the cheats (Image: Chris Shinn/Stone/Getty)

LAZY students take note – lifting an article off the internet, translating it into another language and presenting it as your own work won't necessarily go unnoticed. It used to be really tough to spot this kind of plagiarism, thanks to creativity on the part of online translators. Not any more.

A team led by Alberto Barron-Cedeno at the Polytechnic University of Catalonia, Spain, used a number of statistical methods to analyse suspicious-looking documents. One involved breaking each text down into fragments that were five sentences long and looking for elements of words, such as short runs of characters, that were similar in the two languages.
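To make the fragment-and-compare idea concrete, here is a minimal Python sketch. It is not the team's code: the function names, the naive sentence splitter, the choice of three-character sequences and the cosine score are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def split_fragments(text, size=5):
    """Split text into fragments of `size` sentences (naive split after . ! ?)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [" ".join(sentences[i:i + size]) for i in range(0, len(sentences), size)]


def char_ngrams(text, n=3):
    """Count overlapping character n-grams after lowercasing and dropping
    digits and punctuation. n=3 is an assumed value, not taken from the study."""
    cleaned = re.sub(r"[\W\d_]+", " ", text.lower()).strip()
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


english = "The national information system"
spanish = "El sistema nacional de informacion"
unrelated = "Dogs bark loudly at night"

# Related languages share many character sequences ("nal", "inf", "ion", ...),
# so the translated pair should score higher than the unrelated pair.
print(cosine(char_ngrams(english), char_ngrams(spanish)))
print(cosine(char_ngrams(english), char_ngrams(unrelated)))
```

In a real check, each five-sentence fragment from `split_fragments` would be scored against fragments of the suspected source, and only high-scoring pairs would be passed on for a closer look.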

Another method used a bilingual dictionary to automatically check how many words in each text were the same. The documents could also be translated into a language with a common root to make the analysis easier.
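The dictionary check can be sketched in the same spirit. The snippet below is a rough illustration only: `dictionary_overlap` and the toy Spanish-English dictionary are hypothetical, and counting the share of words with a matching translation is an assumed simplification of whatever scoring the researchers used.

```python
def dictionary_overlap(source_words, target_words, bilingual_dict):
    """Fraction of source words that have at least one dictionary
    translation appearing among the target words."""
    target = {w.lower() for w in target_words}
    words = [w.lower() for w in source_words]
    if not words:
        return 0.0
    # A word counts as a hit if any of its listed translations is in the target text.
    hits = sum(1 for w in words if bilingual_dict.get(w, set()) & target)
    return hits / len(words)


# Toy dictionary (hypothetical): each source word maps to a set of translations.
toy_dict = {
    "casa": {"house", "home"},
    "perro": {"dog"},
    "grande": {"big", "large"},
}

score = dictionary_overlap(["La", "casa", "grande"], ["the", "big", "house"], toy_dict)
print(score)  # two of the three source words have a translation in the target
```

A high score on its own proves nothing, since two texts on the same topic share vocabulary; it is only a signal for flagging pairs worth reading side by side.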

The results surprised even them: their technique showed "remarkable performance" not only in identifying entire documents that had been copied – but in spotting tracts that made use of excessive paraphrasing, too (Knowledge-Based Systems, doi.org/nqc). If a document is flagged by the system as being similar to another, then human experts can take a closer look.

This article appeared in print under the headline "Cheating is cheating – in any language"