Developing pipelines to integrate new data elements into our normalized oncology schema
Overseeing and monitoring our existing data infrastructure for stability, performance, and accuracy
Improving our data warehousing and reporting capabilities to support real-time analysis of tens of thousands of patients representing millions of data points
Integrating standard and proprietary ontologies into our data enrichment processes
Enhancing our de-identification capabilities to support machine learning and clinical research use-cases
Building reusable integrations with major clinical systems (e.g., EMRs/EHRs)
Deploying updates frequently to immediately improve the state of cancer care
Providing constructive feedback to your team members through code and architecture reviews
Requirements
A solid base of software engineering experience, typically 1-5 years, with at least part of that time in data-focused roles or projects
Fluency with a functional or imperative language (we use Python)
Experience working with relational and non-relational databases (we use Postgres, MongoDB, Redis, and Elasticsearch)
Tendency to seek simple, elegant solutions to complex problems
Ability to analyze and optimize existing solutions
A focus on writing understandable, testable, and maintainable code
Experience working with asynchronous and distributed systems (we use RabbitMQ)
Familiarity with modern containerized environments (we use Docker & Kubernetes)
Experience with healthcare data standards and integration (HL7, FHIR, DICOM, etc.) is a huge plus
Experience designing data models for analytical and transactional workloads
Tech Stack
Distributed Systems
Docker
Elasticsearch
Kubernetes
MongoDB
Postgres
Python
RabbitMQ
Redis
Benefits
401(k), plus health and dental insurance
flexible vacation policy
paid parental leave
eBooks, online courses
workstation setup
happy hours
team dinners
conversations with oncologists (will return soon!)