Tango Analytics is focused on helping businesses make smarter decisions through innovative technology and data. The company is seeking a Principal DevOps Engineer to lead its platform and cloud architecture efforts, drive cloud modernization initiatives, and establish reliability and operational excellence across its systems.
Responsibilities:
- Own platform and cloud architecture
  - Define and evolve Tango’s platform architecture across AWS, GCP, and Azure, including networking, compute, storage, and environment strategy
  - Drive cloud modernization initiatives (migration, re-platforming, standardization of deployment patterns, scalability and resilience improvements)
  - Create reference architectures and guardrails that enable autonomy while improving consistency and operational quality
- Establish reliability and operational excellence
  - Set reliability standards and practices (SLOs/SLIs, error budgets, capacity planning, incident response and postmortems)
  - Improve observability and operational readiness (monitoring, alerting, tracing, dashboards, runbooks) to reduce MTTR and recurring incidents
  - Identify systemic bottlenecks (latency, throughput, resource utilization, toil) and lead cross-team initiatives to address them
- Enable government cloud readiness (FedRAMP)
  - Provide technical leadership for architectures and operational controls needed for FedRAMP / government cloud certification readiness
  - Partner with Security/Compliance to translate controls into engineering requirements and pragmatic implementation plans
  - Lead technical reviews for identity, access, auditability, logging, change management, and incident processes in regulated environments
- Raise the bar through standards and mentorship
  - Establish and evangelize best practices: design docs, architecture reviews, coding standards, testing strategy, CI/CD quality gates, production readiness
  - Mentor senior engineers, tech leads, and emerging technical leaders; coach teams through complex designs and tradeoffs
  - Create reusable patterns and “golden paths” (templates, libraries, tooling) that make the right way the easy way
Requirements:
- Applicants must be authorized to work in the U.S. for any employer
- Strong working knowledge across multiple stacks, including Java/J2EE, Oracle, Postgres, and modern frontend frameworks (React and Vue.js)
- Hands-on cloud experience across AWS, GCP, and/or Azure, with ability to dive deep on architecture, performance, reliability, and operations
- Experience designing and operating distributed systems for B2B SaaS with a strong focus on resiliency, scalability, and cost/performance tradeoffs
- Experience with global cloud systems (multi-region, disaster recovery, high availability, data residency considerations)
- Experience supporting FedRAMP (e.g., Moderate/High) and/or comparable government cloud certification efforts, or operating within similarly regulated environments
- Proven ability to influence without authority—aligning stakeholders, driving decisions, and building consensus across teams
- Excellent written and verbal communication; able to author clear design docs and architectural proposals
- A coaching mindset with demonstrated impact mentoring and developing engineers across teams
- Kubernetes and cloud-native operations experience (platform tooling, service meshes, policy-as-code)
- Experience building internal platforms/shared services (identity/auth, developer tooling, CI/CD platforms)
- FinOps or cloud cost optimization experience at scale