Collins Aerospace is a leader in aviation technology and is seeking a Principal Platform Engineer for its Aviation Intelligence team. In this role, you will lead critical projects related to AI/ML, data science, and data management, while managing AWS infrastructure and collaborating with various engineering teams.
Responsibilities:
- Own and evolve the AWS infrastructure supporting the data platform, ML training, and analytics workloads (Iceberg/Trino, ETL pipelines, Kubeflow/MLflow)
- Design, deploy, and maintain EKS-based services and Kubernetes workloads
- Build and manage Terraform infrastructure across environments (dev/staging/prod)
- Design and maintain CI/CD pipelines for infrastructure and application deployment (GitLab/GitHub)
- Operate and improve Kafka/Redpanda clusters
- Improve reliability, observability, and performance of prediction services
- Support Dagster for data workflow orchestration
- Collaborate with Data Science, ML Engineering, and Data Engineering to productionize models and data pipelines
- Strengthen AWS IAM, networking, and connectivity between cloud and on-prem systems
- Support cyber hardening efforts with our RTX cyber team
- Identify opportunities to improve existing infrastructure and deployment patterns, and deliver those improvements incrementally
Requirements:
- Typically requires a degree in Science, Technology, Engineering or Mathematics (STEM) and a minimum of 8 years of prior relevant experience, or an advanced degree in a related field and a minimum of 5 years of experience
- Proficiency with Python
- Strong experience with AWS (EKS, IAM, VPCs, networking)
- Hands-on Kubernetes experience operating production workloads
- Experience managing Kafka (or Redpanda) in production environments
- Proficiency with Terraform for infrastructure as code
- Experience building and maintaining CI/CD pipelines (GitLab, GitHub Actions, or similar)
- Solid understanding of distributed systems, reliability, and scaling
- Experience supporting production data pipelines or ML systems
- Experience with observability stacks (Prometheus, Grafana, Thanos, etc.)
- Experience with Kafka Connect
- Experience with Rust or a willingness to learn it
- Familiarity with Dagster, Airflow or other modern ETL orchestration tools
- Background supporting ML/AI systems in production (MLOps)
- Experience optimizing infrastructure for performance and cost
- Familiarity with Iceberg, Spark, Elasticsearch and other large-scale data processing systems
- Experience working in hybrid infrastructure environments (cloud + on-prem Kubernetes)