Design, train, and evaluate foundation-style machine learning models that learn robust and reusable representations from large-scale datasets.
Develop and maintain scalable model training infrastructure using PyTorch and distributed training paradigms (e.g., multi-GPU and multi-node setups).
Train and adapt transformer-based architectures for representation learning across diverse data sources.
Apply self-supervised, weakly supervised, and other representation learning techniques to leverage partially labeled or unlabeled data.
Build flexible modeling frameworks capable of integrating multiple data sources and heterogeneous signals.
Collaborate with pathologists, scientists, and engineers to ensure models are biologically meaningful and aligned with translational research goals.
Process, curate, and analyze large, complex datasets using efficient and reproducible workflows.
Support exploratory analyses, downstream modeling, and internal research initiatives using learned representations.
Contribute to internal technical documentation, research outputs, and long-term modeling strategy.
Follow best practices in software engineering, experiment tracking, and collaborative model development.
Requirements
PhD in Computer Science, Data Science, Computational Biology, Bioinformatics, Engineering, Mathematics, or a related quantitative field, with exposure to biological or medical data.
0–4 years of experience applying machine learning or deep learning in research or industry settings (postdoctoral experience is acceptable).
Strong understanding of deep learning model training, optimization, and evaluation.
Hands-on experience with transformer-based models, including both language-focused and vision-focused architectures.
Proficiency in Python and PyTorch.
Hands-on experience with distributed training (e.g., PyTorch DDP, multi-GPU or multi-node workflows).
Experience working in Linux environments and using Git for version control.
Ability to work with large datasets and complex data pipelines.
Strong written and verbal communication skills.
Tech Stack
Linux
Python
PyTorch
Benefits
Training
Periodic travel and work during evenings, weekends, or holidays may be required.