Teladoc Health is empowering individuals to live their healthiest lives through innovative virtual care solutions. The Principal Platform Engineer will play a crucial role in enhancing platform delivery by combining software engineering and operational expertise, while also mentoring other engineers and driving best practices across teams.
Responsibilities:
- Act as a technical “force multiplier” on the highest-priority initiatives; clarify approach, resolve ambiguity, and drive work to completion with high quality and pragmatic trade-offs
- Reduce cross-team friction by defining clear interfaces, breaking work into deliverable increments, and enabling parallelization through strong architecture boundaries
- Establish and model best practices for engineering excellence: design docs/RFCs, architecture reviews, code review discipline, and effective automated testing strategies
- Drive API-first and “platform as a product” behaviors: define and promote consistent platform interfaces that reduce bespoke integrations and siloed solutions
- Create reusable platform capabilities (templates/modules/golden paths) that reduce reinvention and speed up delivery for teams
- Drive automation opportunities (including agentic/AI-enabled workflows) that improve operational and delivery efficiency
- Lead cross-cutting improvements that enhance stability and reduce toil: observability standards, alert hygiene, incident learning loops, and resilience patterns
- Partner with operations and platform stakeholders to measurably improve reliability outcomes and reduce operational drag on platform delivery teams
- Coach senior/staff engineers by pairing on real work, running reviews, and teaching pragmatic system-level thinking
- Set clear examples of technical leadership, collaboration, and accountability without formal people management responsibility
- Participate in the on-call rotation and contribute to restoration, root cause learning, and prevention
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related technical field
- 15+ years of hands-on software engineering experience designing, building, testing, deploying, and operating large-scale distributed systems in cloud-native environments
- 5+ years operating at Staff or Principal scope, leading multi-quarter, cross-team technical initiatives that span 3+ teams and deliver organization-level outcomes
- 8+ years of experience designing and operating microservices-based systems, including API design and versioning, authentication and authorization frameworks (e.g. OAuth, OIDC, IAM), and Infrastructure-as-Code (e.g. Terraform, CloudFormation, ARM)
- Deep hands-on experience (5+ years) in at least three of the following: Kubernetes and container orchestration platforms; public cloud infrastructure (AWS/Azure/GCP); CI/CD systems and deployment automation; Infrastructure-as-Code and configuration management; production operations, reliability tooling, and on-call systems
- Demonstrated ownership of production systems supporting business-critical workloads, including participation in incident response, post-incident reviews, and reliability improvements at scale
- Proven ability to operate as a self-directed technical leader, navigating ambiguity, defining problem spaces, and driving clarity and alignment across multiple teams
- Demonstrated success influencing technical direction across globally distributed teams and multiple levels of the organization without formal authority
- Strong written and verbal communication skills, with the ability to translate complex technical concepts for engineering, product, and executive audiences
- Experience designing or evolving internal platforms or self-service capabilities that materially improve developer experience, delivery throughput, or operational efficiency
- Strong background in observability (metrics, logs, traces), incident management, and reliability practices, with a track record of improving system health and reducing operational toil
- Deep understanding of performance optimization, system resilience, and observability in high-scale production environments
- Experience working in regulated industries such as healthcare or fintech, including familiarity with compliance-driven architectural and security considerations
- Familiarity with healthcare data standards (e.g. FHIR, HL7) and platform security best practices