Participate in core production management functions including incident, change, and problem management, collaborating closely with cross‑functional teams to support metrics‑based tracking, planning, risk mitigation, and certification readiness for top‑tier clients.
Support Kafka clusters, topics, consumer groups, and message flows to maintain high availability and consistent data processing across production systems.
Oversee and troubleshoot job scheduling workflows (Chronos experience preferred but not required), ensuring SLA adherence, timely batch execution, dependency handling, and automated recovery for failed tasks.
Perform plan reviews and submit high‑quality changes that improve platform resiliency, change hygiene, and operational success for both Kafka and scheduling workloads.
Collaborate with production assurance teams to maintain production run books, change registers, automation assets, and process improvements; conduct detailed impact and root‑cause analysis for issues involving data streaming or scheduling components.
Provide production support for innovative, high‑quality, large‑scale financial solutions in partnership with engineering and operations professionals.
Execute proactive monitoring using tools such as Splunk, Dynatrace, Grafana, and Moogsoft, including building alerts and dashboards to monitor Kafka lag, consumer health, broker performance, and scheduling job status.
Requirements
5+ years of experience in an enterprise IT environment.
4+ years of experience with Unix and Windows, including scheduling tools (Chronos optional; equivalents accepted).
4+ years of experience with monitoring tools such as Splunk, Dynatrace, Grafana, and Moogsoft.
4+ years of experience working with incident, change, and problem management processes and ticketing systems such as ServiceNow, Remedy, or CA Service Desk.
3+ years of hands‑on experience with Kafka.
3+ years of experience in production management, application/platform monitoring, and participation in on‑call rotations.
Bachelor’s degree in a related field or an equivalent combination of education, military, and work experience.
Tech Stack
Grafana
Kafka
ServiceNow
Splunk
Unix
Benefits
Fuel Your Life program to support your physical, financial, social, and emotional well-being.
Paid holidays and generous time away policies.
No-cost mental health support through Employee Assistance Programs.
Living Proof program to recognize your peers’ extra effort with points redeemable for rewards.
Eight Employee Resource Groups to foster a collaborative culture and expand your network.
Unparalleled professional growth with training, development, and internal mobility opportunities.
Medical, dental, vision, life, and disability insurance options available from day one.
Retirement planning and discounted shares with the Employee Stock Purchase Plan.