Nathaniel Diamant

Email:
Website:
GitHub:

I am a mathematician and computer scientist with a strong background in the theory and engineering of machine learning. I have experience with the full pipeline of a deep learning research project, from collaborating with domain experts to designing, implementing, and validating neural networks. My ideal role involves cutting-edge projects that apply deep learning to biological data.

Education

Harvey Mudd College, Claremont, CA - 2015 to 2019

Major: Joint Major in Computer Science and Mathematics
GPA: 3.87
Awards and Honors: Henry A. Krieger Prize for Outstanding Promise in the Field of Probability, Statistics, or Operations Research; Dean's List all eight semesters; graduated with high distinction and math departmental honors

Experience

Machine Learning for Health at Broad Institute, ML Engineer II - Summer 2019 to Present

  • Collaborated with top research clinicians and communicated complex machine learning concepts to non-experts, leading to many successful research projects
  • Contributed multiprocess data loading and automatic construction of multimodal neural network architectures to an open-source codebase

Computational Cardiovascular Research Group at MIT, Research - Winter 2019 to Spring 2020

  • Researched the best way to generate representations of ECGs for use in downstream tasks. The representations are currently being used in follow-up research projects for predicting cardiovascular disease
  • Wrote a paper on the method, which was selected for a spotlight talk at SAIL 2021

AdRoll, Engineering Intern - Summer 2018

  • Contributed to every part of the modeling pipeline: data backend in C++ and SQLite, modeling in Python, batch job creation and scheduling in Luigi and AWS, visualizations in D3
  • Used inverse transform sampling to generate synthetic data for testing a complex statistical model of customer conversion

Yelp, Engineering Intern - Summer 2017

  • Implemented automatic batch training of models using Redshift and SQLAlchemy with thorough unit and integration tests
  • Developed framework responsible for filtering 72k photos and 6k photo caption edits every day with thorough manual and unit tests

Harvey Mudd CS Department, Research - Summer 2016

  • Generated and analyzed arbitrary-order Markov models of how students progress through programming problems
  • Implemented complex MySQL and NumPy data munging pipeline

Publications

Inherited Basis of Visceral, Abdominal Subcutaneous and Gluteofemoral Fat Depots Broad Institute - 26 Aug 2021

Saaket Agrawal, Minxian Wang, Marcus D. R. Klarqvist, Joseph Shin, Hesam Dashti, Nathaniel Diamant, Seung Hoan Choi, Sean J. Jurgens, Patrick T. Ellinor, Anthony Philippakis, Kenney Ng, Melina Claussnitzer, Puneet Batra, Amit V. Khera. medRxiv
  • Helped provide neural network derived phenotypes for GWAS

Deep Learning to Predict Cardiac Magnetic Resonance–Derived Left Ventricular Mass and Hypertrophy From 12-Lead ECGs Broad Institute - 15 Jun 2021

Shaan Khurshid, Samuel Friedman, James P. Pirruccello, Paolo Di Achille, Nathaniel Diamant, Christopher D. Anderson, Patrick T. Ellinor, Puneet Batra, Jennifer E. Ho, Anthony A. Philippakis, Steven A. Lubitz. Circulation: Cardiovascular Imaging
  • Top contributor to the ML4H codebase used to design and train the models for the research

Cohort Design and Natural Language Processing to Reduce Bias in Electronic Health Records Research: The Community Care Cohort Project Broad Institute - 30 May 2021

Shaan Khurshid, Christopher Reeder, Lia X. Harrington, Pulkit Singh, Gopal Sarma, Samuel F. Friedman, Paolo Di Achille, Nathaniel Diamant, Jonathan W. Cunningham, Ashby C. Turner, Emily S. Lau, Julian S. Haimovich, Mostafa A. Al-Alusi, Xin Wang, Marcus D.R. Klarqvist, Jeffrey M. Ashburner, Christian Diedrich, Mercedeh Ghadessi, Johanna Mielke, Hanna M. Eilken, Alice McElhinney, Andrea Derix, Steven J. Atlas, Patrick T. Ellinor, Anthony A. Philippakis, Christopher D. Anderson, Jennifer E. Ho, Puneet Batra, Steven A. Lubitz. medRxiv
  • Top contributor to and designer of the code still used to build and update the dataset of millions of patients

Association of Machine Learning-derived Measures of Body Fat Distribution in >40,000 Individuals with Cardiometabolic Diseases Broad Institute - 10 May 2021

Saaket Agrawal, Marcus D. R. Klarqvist, Nathaniel Diamant, Patrick T. Ellinor, Nehal N. Mehta, Anthony Philippakis, Kenney Ng, Puneet Batra, Amit V. Khera. medRxiv
  • Led the exploratory phase of the project (deriving fat measurements from MRIs), which led to two papers (including the one above), with more in progress

Patient Contrastive Learning: a Performant, Expressive, and Practical Approach to ECG Modeling Broad Institute, MIT - 9 Apr 2021

Nathaniel Diamant, Erik Reinertsen, Steven Song, Aaron Aguirre, Collin Stultz, Puneet Batra. arXiv
  • Developed and implemented a novel contrastive learning objective, applied to a dataset of 3 million ECGs
  • Selected for one of five spotlight talks at SAIL 2021

Deep learning to estimate cardiac magnetic resonance–derived left ventricular mass Broad Institute - Apr 2021

Shaan Khurshid, Samuel Freesun Friedman, James P. Pirruccello, Paolo Di Achille, Nathaniel Diamant, Christopher D. Anderson, Patrick T. Ellinor, Puneet Batra, Jennifer E. Ho, Anthony A. Philippakis, Steven A. Lubitz. Cardiovascular Digital Health Journal
  • Top contributor to the ML4H codebase used to design and train the models for the research

Physiology as a Lingua Franca for Clinical Machine Learning Broad Institute - 8 May 2020

Aaron Aguirre, Chris Anderson, Puneet Batra, Seung-Hoan Choi, Paolo Di Achille, Nathaniel Diamant, Patrick Ellinor, Connor Emdin, Akl C. Fahed, Samuel Friedman, Lia Harrington, Jennifer E. Ho, Amit V. Khera, Shaan Khurshid, Marcus Klarqvist, Steve Lubitz, Anthony Philippakis, James Pirruccello, Christopher Reeder, Collin Stultz, Brandon Westover. Patterns Opinion
  • Helped develop and implement the philosophy outlined in the paper

Genome Wide Associations of Learned Low Dimensional Representations of Cardiac MRI Broad Institute - 13 Dec 2019

Samuel F. Friedman, Nathaniel Diamant, James P. Pirruccello, Puneet Batra. NeurIPS: Learning Meaningful Representations of Life
  • Helped develop latent spaces of cardiac MRIs and ECGs for GWAS using variational autoencoders