Maty(as) Bohacek

Stanford, CA, USA

I am a student at Stanford University with a passion for AI research. Advised by Professor Hany Farid, I focus on generative AI, deepfake detection, and other problems at the intersection of AI, computer vision, and media forensics. My full name is Matyas, but I go by Maty.

News

  • (January 2025) Discussed AI and disinformation at UNICEF’s DCE workshop in Nairobi, Kenya.

  • (November 2024) Honored to receive the Czech AI Personality Award (Osobnost.ai) as the Discovery of the Year.

  • (November 2024) Discussed my research on Hyde Park Civilization, a Czech TV show about science.

  • (November 2024) Delivered a guest lecture in Stanford’s DATASCI 194D.

  • (November 2024) Joined a panel at the Aspen Institute’s Annual Conference in Prague, Czechia.

  • See full archive here.

Selected Publications & Preprints

Has an AI Model Been Trained on Your Images?

Bohacek M. & Farid H. arXiv, abs/2501.06399.
Paper Dataset — This paper introduces a method to determine whether a generative AI model was trained on a specific image, helping to address concerns around fair use and copyright in generative AI.

Human Action CLIPS: Detecting AI-generated Human Motion

Bohacek M. & Farid H. arXiv, abs/2412.00526.
Paper Dataset — This paper proposes a method for distinguishing real video clips from text-to-video generated ones using multi-modal semantic embeddings, evaluated on DeepAction, a new dataset of real and AI-generated human motion.

DeepSpeak Dataset v1.0

Barrington S., Bohacek M., & Farid H. arXiv, abs/2408.05366.
Paper Dataset — This paper introduces DeepSpeak, a large-scale dataset of real and deepfake footage designed to support research on detecting state-of-the-art face-swap and lip-sync deepfakes.

Lost in Translation: Lip-Sync Deepfake Detection from Audio-Video Mismatch

Bohacek M. & Farid H. CVPR 2024 Workshops.
Paper — This paper presents a method for detecting lip-sync deepfakes by comparing mismatches between audio-to-text transcription and automated lip-reading, evaluated on both controlled and in-the-wild datasets.

Nepotistically Trained Generative-AI Models Collapse

Bohacek M. & Farid H. arXiv, abs/2311.12202.
Paper — This paper demonstrates how some generative AI models, when retrained on their own outputs, produce distorted images and struggle to recover even after retraining on real data.

For a complete list of my academic publications, please refer to my Google Scholar profile.

Contact & Misc.

  • Email: maty (at) stanford (dot) edu

  • Resume (coming soon)