Join a globally diverse team that both builds and finds best-of-breed tools to bring critical Observability services to Adobe
Craft new tools and UIs to maintain and support one of the largest logging deployments in the industry
Assist in building Adobe’s observability strategy
Solve complex tasks where it’s easy to draw a line from efforts to real accomplishments
Requirements
3-5+ years of production-level experience with distributed applications at scale in public and/or private cloud
Experience architecting and implementing large-scale Observability platforms
B.S. degree in Computer Science or related technical field
Work in a diverse and distributed team environment
Must Have
Experience with internally hosted logging systems such as Splunk, ClickHouse, Loki, and Elastic, including assisting clients and improving environment performance and stability
Experience developing AI agents and integrating AI workflows into large-scale deployments
Programming experience with languages such as Go and Python
Experience building integrations and applications for large-scale Observability environments
Experience designing and implementing systems for fault tolerance, scalability and stability
Experience developing, deploying and running distributed applications on cloud platforms
Experience with container and orchestration technologies (Docker, Kubernetes)
Ensure the highest level of uptime and Quality of Service (QoS) to Adobe’s customers through operational excellence
Knowledge of defining service level objectives (SLOs) and service level indicators (SLIs) to represent and measure service quality
Knowledge of (public and/or private) cloud deployments – AWS, Azure, Data Center
Collaborate with SRE and Engineering/Product teams in driving critical initiatives
Experience in designing and maintaining production monitoring systems
Experience in solving performance and stability issues using a wide variety of tools
Excellent communicator in and across teams, driving projects to completion
Impact the organization through contributions to technical direction and strategic decisions
Good to Have
Experience with other Observability tooling such as Grafana, Cortex, Tempo, and OTel
Experience with open-source products and communities such as OpenTelemetry
Familiarity with a variety of cloud security and automation concepts, practices, and procedures