(Photo by Katie Lenhart / Dartmouth)

I am an Assistant Professor in the Department of Computer Science at Dartmouth and lead the SEE Lab @ Dartmouth, where we are Teaching Machines to See and Feel. I am the recipient of the NSF CAREER Award (2026). Our research advances socially and emotionally intelligent AI that can see, feel, and understand the world, spanning empathy-driven video understanding, multimodal reasoning, accessibility for blind and low vision (BLV) audiences, and privacy-preserving learning.

Before joining Dartmouth, I was a postdoctoral associate at the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT), working with Dr. Aude Oliva. I earned my PhD in the College of Information and Computer Sciences (CICS) at the University of Massachusetts Amherst, advised by Dr. Erik Learned-Miller.

Selected Publications

(* indicates equal contribution)
(† denotes equal senior author contribution)
  • Minh Dinh and SouYoung Jin. Unsafe2Safe: Controllable Image Anonymization for Downstream Utility. Computer Vision and Pattern Recognition (CVPR), 2026.
    Highlight [arXiv] [project page]

  • Wayner Barrios and SouYoung Jin. Beyond Final Answers: CRYSTAL Benchmark for Transparent Multimodal Reasoning Evaluation. Preprints, 2026.
    [arXiv] [project page]

  • Wayner Barrios, Andrés Villa, Juan León Alcázar, SouYoung Jin†, Bernard Ghanem†. MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs. International Conference on Machine Learning (ICML), 2026. [arXiv] [project page]

  • Benjamin Lahner, Kshitij Dwivedi, Polina Iamshchinina, Monika Graumann, Alex Lascelles, Gemma Roig, Alessandro Thomas Gifford, Bowen Pan, SouYoung Jin, Ratan Murty, Kendrick Kay, Aude Oliva†, Radoslaw Cichy†. Modeling short visual events through the BOLD moments video fMRI dataset and metadata. Nature Communications, 2024. [paper] [project page] [dataset repository]

  • Wayner Barrios and SouYoung Jin. Multi-layer Learnable Attention Mask for Multimodal Tasks. Preprints, 2024. [arXiv]

  • Bowen Pan, Rameswar Panda, SouYoung Jin, Rogerio Feris, Aude Oliva, Phillip Isola, Yoon Kim. LangNav: Language as a Perceptual Representation for Navigation. Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL) 2024 (Findings). [paper] [MIT News]

  • Howard Zhong, Samarth Mishra, Donghyun Kim, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Aude Oliva, Rogerio Feris. Learning Human Action Recognition Representations Without Real Humans. Neural Information Processing Systems Datasets and Benchmarks Track 2023. [paper] [supp] [project page]

  • Camilo Luciano Fosco, SouYoung Jin, Emilie L Josephs, Aude Oliva. Leveraging Temporal Context in Low Representational Power Regimes. Computer Vision and Pattern Recognition (CVPR), 2023. [paper] [supp] [project page]

  • Yo-whan Kim, Samarth Mishra, SouYoung Jin, Rameswar Panda, Hilde Kuehne, Leonid Karlinsky, Venkatesh Saligrama, Kate Saenko, Aude Oliva, Rogerio Feris. How Transferable are Video Representations Based on Synthetic Data? Neural Information Processing Systems Datasets and Benchmarks Track 2022. [paper] [supp]
    [project page] [MIT News]

  • Alexander H Liu, SouYoung Jin, Cheng-I Jeff Lai, Andrew Rouditchenko, Aude Oliva, James Glass. Cross-Modal Discrete Representation Learning. Annual Meeting of the Association for Computational Linguistics (ACL) 2022.
    oral [paper] [MIT News]

  • Mathew Monfort*, SouYoung Jin*, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva. Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions. Computer Vision and Pattern Recognition (CVPR), 2021.
    [paper] [supp] [project page]

  • Ashish Singh*, Hang Su*, SouYoung Jin, Huaizu Jiang, Chetan Manjesh, Geng Luo, Ziwei He, Li Hong, Erik G. Learned-Miller, Rosemary Cowell. Half&Half: New Tasks and Benchmarks for Studying Visual Common Sense. CVPR 2019 Workshop on Vision Meets Cognition, 2019. [paper]

  • Aruni RoyChowdhury, Prithvijit Chakrabarty, Ashish Singh, SouYoung Jin, Huaizu Jiang, Liangliang Cao, Erik Learned-Miller. Automatic adaptation of object detectors to new domains using self-training. Computer Vision and Pattern Recognition (CVPR), 2019.
    [paper] [project page]

  • SouYoung Jin*, Aruni RoyChowdhury*, Huaizu Jiang, Ashish Singh, Aditya Prasad, Deep Chakraborty, Erik Learned-Miller. Unsupervised hard example mining from videos for improved object detection. European Conference on Computer Vision (ECCV), 18 pages, 2018.
    [paper] [project page]

  • SouYoung Jin, Hang Su, Chris Stauffer, Erik Learned-Miller. End-to-end face detection and cast grouping in movies using Erdos-Renyi clustering. International Conference on Computer Vision (ICCV), 10 pages, 2017.
    spotlight [paper] [supp] [code] [project page]

Team

Current

Alumni

  • Seung Hyun Hahm Undergrad Senior Thesis: High Honors (Fall 2024 – Spring 2025)
  • Duncan Korkosz Undergrad (Fall 2024 – Spring 2025)
  • Dylan Thomas Undergrad (Fall 2023 – Spring 2025)
  • Henry Scheible Undergrad (Spring 2023 – Fall 2023)
  • Aneesh Patnaik Undergrad (Spring 2023 – Summer 2023)

Teaching

  • COSC 74/274 Machine Learning and Statistical Data Analysis. W2023, W2024, W2026
  • COSC 78/278 Deep Learning. F2024, S2026
  • COSC 89.30/189 Video Understanding. S2023, W2024, F2025