Medecision is seeking a Senior Software Engineer for its Data Platform to provide innovative solutions for managing health and care. The role involves designing and building a cloud-native data platform that supports clinical analytics and reporting, while ensuring the reliability and scalability of data pipelines.

Responsibilities:

Design, develop, and maintain production-grade Python and Java data services and pipelines deployed on Google Cloud Platform, following established architectural conventions, coding standards, and data platform patterns
Build and evolve Google Cloud Dataflow batch and streaming pipelines for data ingestion, standardization, curation, and analytics load, handling deduplication, validation, member-reference integrity, and incremental/full-reload modes
Implement and maintain event-driven data workflows using Google Cloud Pub/Sub, including file-complete notification topics, FHIR ingest topics, and Pub/Sub-to-BigQuery export subscriptions
Design and manage BigQuery datasets, table schemas, partitioning strategies (range-bucket by member partition), clustering, and reporting views — across standard analytics, custom analytics, and curated storage layers
Build and maintain Cloud Composer (Apache Airflow) DAGs for workflow orchestration — including file ingestion DAGs, test execution DAGs, and third-party processing DAGs (e.g., MEG)
Develop and maintain Cloud Run microservices (e.g., Ingestion Event Service, Custom Dataset Management Service) and Cloud Functions (inbound GCS bucket triggers)
Participate in design and code reviews; mentor junior engineers and contribute to shared coding standards, patterns, and team knowledge base
Collaborate with on-shore and off-shore teams, architects, and tech leads to ensure on-time delivery and best engineering practices
Contribute to CI/CD pipeline improvements using GitLab CI/CD, including build, test, containerization (Docker), and deployment automation to GCP environments
Engage proactively in the triage and resolution of escalated production issues — diagnosing failures, investigating root causes, and driving durable fixes with a sense of urgency, clear communication to stakeholders, and a commitment to preventing recurrence
Follow and comply with all security policies and procedures established by the organization, including adherence to HIPAA and HITRUST regulations
Own and evolve the end-to-end data ingestion pipeline: from SFTP/GCS file receipt through ingestion, standardization, curation, and load to BigQuery analytics datasets
Design and implement custom dataset ingestion capabilities — including dynamic schema mapping, configuration-driven pipeline execution, and automatic BigQuery table/view provisioning on dataset activation
Maintain and improve the Ingestion Event Service (IES), the platform's central tracking service for data ingestion progress, including Pub/Sub async processing and Firestore document management
Implement FHIR R4 data ingestion workflows via Pub/Sub and GCP Healthcare Datasets, including XSD/schema validation and HL7 resource mapping
Support population matching data flows, ensuring custom dataset filters integrate correctly with population builder services and BigQuery analytics queries
Build and maintain reporting dataset views (BI-compatible BigQuery views) for tenant data exposure
Integrate with third-party clinical analytics engines via GCE VM instance templates, Airflow DAGs, and GCS-based data exchange
Ensure all data services meet HIPAA requirements: PHI handling, tenant data isolation, audit logging, and data classification
Demonstrate a solid understanding of AI concepts, capabilities, and limitations as they apply to software engineering workflows, including code generation, test scaffolding, and documentation
Use Claude Code as a primary productivity tool for code drafting, refactoring, test generation, and technical documentation — applying it with judgment, rigor, and accountability
Leverage AI-assisted workflows to accelerate implementation, surface edge cases, generate structured artifacts, and conduct at-scale analysis of service dependencies and API contracts
Contribute to building and exposing MCP-wrapped APIs that enable AI agents to safely interact with platform services
Maintain strict HIPAA discipline in all AI-assisted work: no real PHI in prompts or AI-generated artifacts; adhere to managed-settings policies and complete mandatory HIPAA + AI training
Contribute to the team's shared AI knowledge base (validated prompts, skills, and workflows) and participate in the AI Champions community of practice

Requirements:

Bachelor's degree in Computer Science, Software Engineering, Data Engineering, or equivalent practical experience
5+ years of data engineering or backend software engineering experience building production data pipelines and platform services
Proven hands-on experience with Google Cloud Platform data services: BigQuery (schema design, partitioning, clustering, query optimization), Cloud Storage, Cloud Pub/Sub, Cloud Dataflow (Apache Beam), Cloud Composer (Airflow), Cloud Run, Cloud Functions, Firestore, Cloud SQL (PostgreSQL), and Secret Manager
Strong proficiency in Java for data pipeline development
Proficiency in Python for Airflow DAG authoring and automation scripting
Experience designing and implementing batch and streaming data pipelines — including file-based ingestion, event-driven processing, deduplication, validation, incremental load, and full-reload patterns
Proficiency with BigQuery data modeling: partitioned and clustered tables, dataset organization (standardized, curated, analytics, custom analytics, reporting layers), and SQL query optimization
Experience with Apache Airflow / Cloud Composer — authoring, deploying, and maintaining production DAGs with parameterized configurations and robust error handling
Experience with containerization (Docker) and deploying services to cloud-native environments
Proficiency with GitLab CI/CD for pipeline automation and multi-environment deployment
Excellent communication skills — able to articulate technical decisions, participate in design reviews, and collaborate effectively with cross-functional teams
Familiarity with Datadog for service monitoring, alerting, and observability in a cloud-native data platform
Familiarity with Sisense or equivalent BI/reporting platforms and BigQuery view-based reporting patterns
Solid understanding of AI concepts, capabilities, and limitations as they apply to software engineering and product delivery workflows
Hands-on experience with Claude Code or equivalent AI-assisted tools, used as a primary productivity tool for code generation, refactoring, test scaffolding, and documentation, not just experimentally
Ability to evaluate AI-generated code critically: identifying hallucinations, logic errors, security gaps, and missing edge cases before they reach production
Practical understanding of MCP (Model Context Protocol) or strong willingness to learn — for building tool wrappers that expose platform APIs to AI agents safely and with appropriate guardrails
Commitment to responsible AI use: applying AI with judgment, rigor, and personal accountability, consistent with the principle that humans own decisions and agents own toil
HIPAA discipline in AI-assisted work: understanding of PHI boundaries in AI workflows and commitment to managed-settings policies and mandatory HIPAA + AI training
Openness to contributing to and learning from a shared AI knowledge base (validated prompts, skills, and workflows) and active participation in the AI Champions community of practice
Knowledge of HIPAA and experience working in HIPAA-regulated product environments, including PHI handling, data classification, and audit requirements
Hands-on experience with HAPI FHIR R4 and healthcare interoperability standards (HL7, FHIR resource mapping, validation workflows)
Understanding of multi-tenant SaaS architecture patterns — tenant context propagation, per-tenant feature flags, and data isolation
