Dat Nguyen

Post‑doctoral Fellow in Computer Science, Harvard SEAS  ·  Basis Research Institute


Short bio

I am a Postdoctoral Fellow in Harvard’s Programming Languages and Formal Methods groups and an incoming Postdoctoral Scientist at the Basis Research Institute.

My research focuses on program synthesis and probabilistic programming, with a track record in graph-based learning for code and documents. I completed my PhD at the University of Melbourne and previously worked at Cinnamon AI Lab on visually rich document information extraction.

At Harvard, I work on proof automation in Lean and causal systems for drug repurposing. At Basis, I contribute to MARA and R-ADA.

Research interests

  • Program synthesis and probabilistic programming
  • Graph-based learning for code and documents
  • Neuro-symbolic systems with LLMs and SMT
  • Reliable and explainable ML for software

Technical blogs

Project demos

NeuroSymbolicDG
PCFG over a spatial layout DSL as a domain-invariant classifier head for fine-grained bird recognition.
code · blog · checkpoints
VRDSynth (ISSTA '24)
Synthesizing programs for multilingual visually rich document information extraction.
code · paper
Autumn.cpp
An Autumn interpreter in C++ for MARA.
code
ExoPredicator (ICLR '26)
Learning abstract models of dynamic worlds for robot planning.
paper · openreview
VirDA (TMLR '25)
Reusing backbone for unsupervised domain adaptation with visual reprogramming.
code · paper
GNNInfer (ICSE '22, arXiv '24)
Inferring properties of graph neural networks.
paper
FFL (ICSME '22)
Fine-grained fault localization for student programs via syntactic and semantic reasoning.
code · paper

Positions

Period          Role & Affiliation
2025 – present  Post-doctoral Fellow, Harvard SEAS & Basis Research Institute
2021 – 2024     PhD, School of Computing & Information Systems, University of Melbourne (Melbourne Research Scholarship)
2016 – 2021     AI Research Engineer, Cinnamon AI Lab

News

Jan 26, 2026 Co-authored paper “ExoPredicator” accepted at ICLR’26. Authors: Yichao Liang, Thanh Dat Nguyen, Cambridge Yang, Tianyang Li, Joshua B. Tenenbaum, Carl Edward Rasmussen, Adrian Weller, Zenna Tavares, Tom Silver, Kevin Ellis.
Jul 27, 2025 AutumnBench featured on the Basis Research Institute blog.
Mar 5, 2025 arXiv preprint: “A Systematic Survey on Debugging Techniques for Machine Learning Systems”.
Jan 15, 2025 Paper “VirDA” published in TMLR’25. Authors: Duc-Duy Nguyen, Dat Nguyen.
Dec 2, 2024 “Combining Induction and Transduction for Abstract Reasoning” won Best Paper at the ARC contest (arXiv).

Selected Publications [Full List]

  1. ICLR
    ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning
    Yichao Liang, Thanh-Dat Nguyen, Cambridge Yang, and 7 more authors
    In International Conference on Learning Representations (ICLR) 2026
  2. arXiv
    A Systematic Survey on Debugging Techniques for Machine Learning Systems
    Thanh-Dat Nguyen, Haoye Tian, Bach Le, and 2 more authors
    arXiv preprint arXiv:2503.03158 Mar 2025
  3. TMLR
    VirDA: Reusing Backbone for Unsupervised Domain Adaptation with Visual Reprogramming
    Duc-Duy Nguyen and Dat Nguyen
    Transactions on Machine Learning Research Mar 2025
  4. arXiv
    Inferring Properties of Graph Neural Networks
    Dat Nguyen, Hieu M. Vu, Cong-Thanh Le, and 4 more authors
    arXiv preprint arXiv:2401.03790 Jan 2024
  5. ISSTA
    VRDSynth: Synthesizing Programs for Multilingual Visually Rich Document Information Extraction
    Thanh-Dat Nguyen, Tung Do-Viet, Hung Nguyen-Duy, and 4 more authors
    arXiv preprint arXiv:2407.06826 Jul 2024
  6. arXiv
    Combining Induction and Transduction for Abstract Reasoning
    Wen-Ding Li, Keya Hu, Carter Larsen, and 11 more authors
    arXiv preprint arXiv:2411.02272 Nov 2024
  7. arXiv
    Adversarial Attacks on Code Models with Discriminative Graph Patterns
    Thanh-Dat Nguyen, Yang Zhou, Xuan Bach D. Le, and 2 more authors
    arXiv preprint arXiv:2308.11161 Aug 2023
  8. ICSME
    FFL: Fine-grained Fault Localization for Student Programs via Syntactic and Semantic Reasoning
    Thanh-Dat Nguyen, Thanh Le-Cong, Duc-Minh Luong, and 4 more authors
    In 2022 IEEE 38th International Conference on Software Maintenance and Evolution, Research Track Aug 2022
  9. ICSE
    Toward the Analysis of Graph Neural Networks
    Thanh-Dat Nguyen, Thanh Le-Cong*, ThanhVu H. Nguyen, and 2 more authors
    In 2022 IEEE/ACM 44th International Conference on Software Engineering: New Ideas and Emerging Results Aug 2022
  10. ICPR
    End-to-End Hierarchical Relation Extraction for Generic Form Understanding
    Tuan-Anh Nguyen Dang, Duc Thanh Hoang, Quang Bach Tran, and 2 more authors
    ICPR 2020 Aug 2020
  11. MAPR
    PCA-based 3D Facial Reenactment From Single Image
    Thanh-Dat Nguyen, Tuan-Anh Nguyen Dang, and Viet Sang Dinh
    In MAPR Aug 2020
  12. BMVC
    End-to-End Information Extraction by Character-Level Embedding and Multi-Stage Attentional UNet
    Tuan Anh Nguyen Dang and Thanh-Dat Nguyen
    British Machine Vision Conference (BMVC) Aug 2019
  13. Non-local DenseNet for Plant CLEF 2019 Contest
    Thanh-Dat Nguyen, Georges Quénot, and Lorraine Goeuriot
    CEUR-Workshop Aug 2019