Dat Nguyen

Post‑doctoral Fellow in Computer Science, Harvard SEAS  ·  Basis Research Institute

Short bio

I am a Joint Postdoctoral Fellow at Harvard’s Programming Languages and Formal Methods groups and the Basis Research Institute.

My research focuses on program synthesis and probabilistic programming, with a track record in graph-based learning for code and documents. I completed my PhD at the University of Melbourne and previously worked at Cinnamon AI Lab on visually rich document information extraction.

At Harvard, I work on proof automation in Lean and causal systems for drug repurposing. At Basis, I contribute to MARA and R-ADA.

Research interests

  • Program synthesis and probabilistic programming
  • Graph-based learning for code and documents
  • Neuro-symbolic systems with LLMs and SMT
  • Reliable and explainable ML for software

News

  1. Awarded Gold Reviewer at ICML’26. Thanks to the area chairs and to the authors whose submissions were a pleasure to read.
  2. Our work, WorldTest, has been accepted at ICML! WorldTest frames the evaluation of world-model learning around environment-level queries, which pose general questions about an environment, and we instantiate it with AutumnBench. See you in Korea! arXiv, project.
  3. New preprint, a follow-up to NeuroSymbolicDG: we reformulate image classification as spatial predicate induction over learned image primitives. arXiv.
  4. ExoPredicator learns symbolic state and causal processes (agent actions plus exogenous mechanisms) via variational Bayesian inference with LLM proposals. Accepted at ICLR’26. arXiv, openreview.
  5. AutumnBench featured on the Basis Research Institute blog.

Technical blogs

Project demos

NeuroSymbolicDG
Domain-invariant classifier head for fine-grained bird recognition, via a PCFG over spatial layouts.
code · paper · blog · checkpoints
VRDSynth (ISSTA '24)
Program synthesis for multilingual document information extraction.
code · paper
Autumn.cpp (ICML '26)
Autumn interpreter in C++. Powers MARA and AutumnBench.
code · AutumnBench paper · blog · playground
ExoPredicator (ICLR '26)
Learning abstract models of dynamic worlds for robot planning.
paper · openreview
VirDA (TMLR '25)
Unsupervised domain adaptation by reusing the backbone with visual reprogramming.
code · paper
GNNInfer (ICSE '22, arXiv '24)
Inferring properties of graph neural networks.
paper
FFL (ICSME '22)
Fine-grained fault localization for student programs.
code · paper

Positions

2025 to present
Joint Post-doctoral Fellow
Harvard SEAS · Basis Research Institute
2021 to 2024
PhD, School of Computing & Information Systems
University of Melbourne · Melbourne Research Scholarship
2016 to 2021
AI Research Engineer
Cinnamon AI Lab

Selected Publications

  1. ICML '26 · Benchmarking World-Model Learning with Environment-Level Queries. Archana Warrier, Dat Nguyen, Michelangelo Naim, Moksh Jain, Yichao Liang, Karen Schroeder, Cambridge Yang, Joshua B. Tenenbaum, Sebastian Vollmer, Kevin Ellis, Zenna Tavares.
  2. ICLR '26 · ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning. Yichao Liang, Dat Nguyen, Cambridge Yang, Tianyang Li, Joshua B. Tenenbaum, Carl Edward Rasmussen, Adrian Weller, Zenna Tavares, Tom Silver, Kevin Ellis.
  3. arXiv '25 · A Systematic Survey on Debugging Techniques for Machine Learning Systems. Dat Nguyen, Haoye Tian, Bach Le, Patanamon Thongtanunam, Shane McIntosh.
  4. arXiv '24 · Inferring Properties of Graph Neural Networks. Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, Bach Le, David Lo, ThanhVu Nguyen, Corina Pasareanu.
  5. ISSTA '24 · VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction. Dat Nguyen, Tung Do-Viet, Hung Nguyen-Duy, Tuan-Hai Luu, Hung Le, Bach Le, Patanamon Thongtanunam.
  6. arXiv '24 · Combining Induction and Transduction for Abstract Reasoning. Wen-Ding Li, Keya Hu, Carter Larsen, Yuqing Wu, Simon Alford, Caleb Woo, Spencer M. Dunn, Hao Tang, Michelangelo Naim, Dat Nguyen, Wei-Long Zheng, Zenna Tavares, Yewen Pu, Kevin Ellis.
  7. arXiv '23 · Adversarial Attacks on Code Models with Discriminative Graph Patterns. Dat Nguyen, Yang Zhou, Xuan Bach D. Le, Patanamon Thongtanunam, David Lo.
  8. ICSME '22 · FFL: Fine grained Fault Localization for Student Programs via Syntactic and Semantic Reasoning. Dat Nguyen, Thanh Le-Cong, Duc-Minh Luong, Van-Hai Duong, Xuan Bach Le Dinh, David Lo, Thang Huynh-Quyet.
  9. ICSE '22 · Toward the Analysis of Graph Neural Networks. Dat Nguyen, Thanh Le-Cong*, ThanhVu H. Nguyen, Xuan-Bach D. Le, Quyet-Thang Huynh.
  10. ICPR '20 · End-to-End Hierarchical Relation Extraction for Generic Form Understanding. Tuan-Anh Nguyen Dang, Duc Thanh Hoang, Quang Bach Tran, Chih-wei Pan, Dat Nguyen.
  11. MAPR '20 · PCA-based 3D Facial Reenactment From Single Image. Dat Nguyen, Tuan-Anh Nguyen Dang, Viet Sang Dinh.
  12. BMVC '19 · End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional UNet. Tuan Anh Nguyen Dang, Dat Nguyen.