CoAT: Corpus of Artificial Texts
A large-scale Russian-language dataset containing both human-written and AI-generated texts.
Cambridge
Machine Learning Engineer
I am a versatile and dedicated engineer with a strong focus on ML product development for natural languages and human speech.
I work in the Applied Sciences Group at Microsoft, where I contribute to the LLM platform and Text Intelligence scenarios that bring language models to Microsoft products, including Windows.
Follow the full publication trail on Google Scholar.
A large-scale Russian-language dataset containing both human-written and AI-generated texts.
Cambridge
A unified interface for 20 NLG evaluation metrics, designed for comprehensive generated-text evaluation.
arXiv
Results from the Dialogue Evaluation shared task on artificial text detection in Russian.
arXiv
My BSc thesis project studying how language models distinguish artificial texts from human-written ones, connected with my research at CPLab.
GitHub
A chat-based job-matching assistant. The user describes their experience and what they are looking for in plain language; the system turns that into a structured profile, retrieves the best-matching vacancies from a vector database, re-ranks them with a trained model, and presents them next to the conversation, which the user can keep refining to request more results.
GitHub
A sanitized template for running a personalized OpenClaw agent on a small Azure Linux VM, with GitHub Copilot-backed model access, pull-based GitOps, scheduled workflows, cloud-drive guardrails, Azure Speech integration, and disposable Crabbox/Daytona dev boxes for heavier software testing.
GitHub Write-up