Shoumik Saha

[CV]. [Google Scholar]. [LinkedIn].
Live and let live!


CS Ph.D. Student

UMD College Park

smksaha@umd.edu

I am a fourth-year Computer Science Ph.D. student at the University of Maryland, College Park, where I am fortunate to be advised by Prof. Soheil Feizi. I worked as an Applied Scientist Intern at Amazon AWS this summer and last. My research journey began with machine learning for security, particularly malware detection. Over time, my interests have evolved toward the security and reliability of machine learning itself. These days, I’m dedicated to enhancing the robustness and reliability of generative AI and AI agents.

If you check out my CV, you’ll see a consistent theme: I enjoy exploring challenges from both sides of the coin – attack and defense, red team and blue team, or however you’d like to frame it. Sounds interesting? Feel free to reach out to discuss my research or potential collaborations!

Before joining the Ph.D. program, I earned my B.Sc. from Bangladesh University of Engineering and Technology (BUET). I then gained valuable experience as a full-time lecturer at United International University and a part-time research assistant at BUET’s research lab.

news

May 27, 2025 Joined Amazon AWS as an Applied Scientist Intern
May 16, 2025 Research featured in The New York Times!
May 15, 2025 Paper on AI-text Detection accepted to ACL 2025!
Jan 19, 2025 Paper (co-authored) accepted to SaTML 2025!
Dec 20, 2024 Fall 2024: Completed all Ph.D. coursework

selected publications

  1. arxiv
    Breaking the Code: Security Assessment of AI Code Agents Through Systematic Jailbreaking Attacks
    Shoumik Saha, Jifan Chen, Sam Mayers, Sanjay Krishna Gouda, Zijian Wang, and Varun Kumar
    arXiv preprint arXiv:2510.01359, 2025
  2. ACL
    Almost AI, Almost Human: The Challenge of Detecting AI-Polished Writing
    Shoumik Saha and Soheil Feizi
    ACL (Association for Computational Linguistics), 2025
  3. NEURIPS
    Adversarial Paraphrasing: A Universal Attack for Humanizing AI-Generated Text
    Yize Cheng, Vinu Sankar Sadasivan, Mehrdad Saberi, Shoumik Saha, and Soheil Feizi
    NeurIPS (Conference on Neural Information Processing Systems), 2025
  4. EMNLP
    ProcVQA: Benchmarking the Effects of Structural Properties in Mined Process Visualizations on Vision-Language Model Performance
    Kazi Tasnim Zinat, Saad Mohammad Abrar, Shoumik Saha, Sharmila Duppala, Saimadhav Naga Sakhamuri, and Zhicheng Liu
    EMNLP (Empirical Methods in Natural Language Processing), 2025
  5. NEURIPS
    LLM-Check: Investigating Detection of Hallucinations in Large Language Models
    Gaurang Sriramanan, Siddhant Bharti, Vinu Sankar Sadasivan, Shoumik Saha, Priyatham Kattakinda, and Soheil Feizi
    NeurIPS (Conference on Neural Information Processing Systems), 2024
  6. ICML
    Fast Adversarial Attacks on Language Models In One GPU Minute
    Vinu Sankar Sadasivan, Shoumik Saha, Gaurang Sriramanan, Priyatham Kattakinda, Atoosa Chegini, and Soheil Feizi
    ICML (International Conference on Machine Learning), 2024
  7. ICLR
    DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified Robustness
    Shoumik Saha, Wenxiao Wang, Yigitcan Kaya, Soheil Feizi, and Tudor Dumitras
    2023
  8. Computers & Security
    MAlign: Explainable static raw-byte based malware family classification using sequence alignment
    Shoumik Saha, Sadia Afroz, and Atif Hasan Rahman
    Computers & Security, 2024
  9. IEEE SaTML
    ML-Based Behavioral Malware Detection Is Far From a Solved Problem
    Yigitcan Kaya, Yizheng Chen, Marcus Botacin, Shoumik Saha, Fabio Pierazzi, Lorenzo Cavallaro, David Wagner, and 1 more author
    2025