We Make Change is a startup using AI to combat cervical cancer. The role involves designing and implementing synthetic data generation pipelines that expand cervical image datasets and make screening models more robust.
Responsibilities:
- Develop synthetic image pipelines using diffusion models (e.g. Stable Diffusion fine-tuned with DreamBooth or LoRA) or GAN-based approaches
- Generate class-balanced datasets, particularly for underrepresented abnormal cases
- Design and run experiments comparing real vs synthetic vs hybrid datasets
- Evaluate synthetic data quality using Fréchet Inception Distance (FID), Kernel Inception Distance (KID), and downstream model performance
- Implement prompt engineering and conditioning strategies for medically realistic outputs
- Perform domain gap analysis between real and synthetic data
- Collaborate with ML engineers to integrate synthetic data into training pipelines
Requirements:
- Experience with PyTorch and Hugging Face Diffusers or GAN frameworks
- Strong understanding of the tradeoffs between data augmentation and synthetic generation
- Experience evaluating generative models (FID, distribution alignment)
- Familiarity with medical imaging challenges
- Experience with GAN-based models and LoRA fine-tuning pipelines
- Prior work with imbalanced or medical datasets
- Understanding of bias in synthetic data
- Experience with healthcare AI, low-resource environments, or global health applications