BigApple Infotech LLC is seeking an Observability Engineer to design and maintain platforms that provide deep visibility into systems, networks, and applications. The role focuses on enabling proactive detection and data-driven operational excellence through metrics, logs, traces, and events.
Responsibilities:
- Design and operate observability platforms (metrics, logs, traces, events)
- Build instrumentation standards and onboarding patterns
- Implement monitoring, alerting, and dashboards for critical systems
- Partner with engineering and operations teams on observability best practices
- Optimize signal quality and reduce alert noise
- Support incident response, post-incident analysis, and reporting
- Enable observability data for AIOps and automation use cases
- Maintain platform reliability, scalability, and cost efficiency
Requirements:
- 7+ years of experience in observability, SRE, or platform engineering
- Strong experience with monitoring and observability tools
- Experience instrumenting applications and infrastructure
- Strong understanding of distributed systems and failure modes
- Ability to translate operational needs into actionable signals
- Experience working within an Agile or SAFe (Scaled Agile Framework) work model
- Familiarity with observability-driven automation or AIOps
- Strong cloud skills (AWS/Azure) and scripting skills (Python)
- Strong communication skills (must-have)