Netflix is a company dedicated to entertaining the world through innovative storytelling and technology. They are seeking a Support Solutions Engineer to enhance the support experience for their developer community by troubleshooting customer requests, automating support processes, and improving support documentation.
Responsibilities:
- Monitor and handle customer requests, troubleshoot and resolve issues, and automate recurring support tasks
- Develop support documentation and runbooks, and improve and maintain support tools and automation
- Understand product offerings and continuously look for ways to improve the engineering support experience
- Provide insights and feedback to partners across Infrastructure Engineering, and champion customer sentiment about the supported tools
- Partner with Product Management, Developer Education and Engineering to track and maintain visibility into ongoing issues and communicate customer needs
- Drive collaboration efforts to reduce product friction and increase usability of the Graph Search platform
Requirements:
- Proven ability to deliver superior customer support and advocate for customer needs across complex organizations, ideally within a central team
- Highly adaptable, thriving in fast-paced, ambiguous environments, and comfortable managing end-to-end investigations with creative problem-solving
- Data-driven decision-maker with strong written and verbal communication skills, including experience enhancing documentation and explaining technical concepts to diverse audiences
- Working knowledge of search engine concepts: index mappings, text analyzers, aliases, data backfill processes, and index lifecycle management. Able to diagnose why data isn't appearing in an index, interpret document count discrepancies, and reason through reindexing and refresh workflows. Experience with OpenSearch or Elasticsearch is strongly preferred
- Ability to read and reason about GraphQL schemas, understand the difference between filter and aggregation semantics, and debug query issues such as unexpected results, unindexed field errors, and schema federation mismatches. Experience with any GraphQL server framework (Apollo, Spring GraphQL, or similar) is a plus
- Experience debugging platform configuration issues: provisioning errors, schema validation failures, variant or environment setup, and deployment conflicts. Comfortable using CLI-based tooling to inspect and manipulate index state, and able to interpret error messages from configuration pipelines without direct access to source code
- Understanding of event-driven architectures, message queue concepts, and dead-letter queue (DLQ) patterns. Able to identify why events aren't flowing — e.g., missing publish, inactive consumer, enrichment errors — and know when to re-trigger vs. escalate. Experience with Kafka, RabbitMQ, AWS SQS/SNS, or similar messaging systems is relevant
- Familiarity with access control models — RBAC, ABAC, or policy-based systems — and experience troubleshooting permission errors. Able to distinguish between self-serve and operator-required changes, and guide users through access requests efficiently
- Proficient in reading distributed traces and structured logs to identify root causes across services. Able to use monitoring dashboards and basic SQL to investigate data inconsistencies and communicate findings to both technical and non-technical stakeholders. Experience with Datadog, Jaeger, Grafana, Splunk, or similar tools is relevant
- Comfortable reasoning about differences between test and production environments. Able to guide users through environment inconsistencies — data not synced, pipelines inactive in test, alerting misconfigured per environment — and document resolutions as reusable guidance
- Demonstrated ability to write clear runbooks, FAQs, and troubleshooting guides that reduce repeat support load. Experience translating complex platform behavior into actionable self-service resources for a developer audience