Corpus Of Artificial Texts

I am happy to present CoAT: Corpus of Artificial Texts, a robust, general benchmark for artificial (GenAI) text detection.

CoAT Metrics

It is published with the help of Cambridge University Press. The work was done during my research assistantship at MMCP Lab, HSE University.

CoAT is a large-scale corpus of human-written and AI-generated texts in Russian, spanning 6 domains and 13 different text generation models. The study provides insights into artificial text detection capabilities and challenges, highlighting the need for improved detection methods as language models continue to advance.

This post is licensed under CC BY 4.0 by the author.