The Voleon Group is a technology company specializing in AI and machine learning applications in finance. As a Site Reliability Engineer, you will enhance and monitor production-critical infrastructure and data pipelines, collaborating with software engineers to improve system reliability and efficiency.
Responsibilities:
- Improve fault-tolerance and maintainability of code in proprietary data pipelines and trading systems
- Diagnose and fix bugs in code
- Lead complex deployments
- Automate manual workflows
- Track and prioritize outstanding production-related issues
- Share an on-call rotation, responding to incidents to keep production-critical systems running continuously
Requirements:
- Experience writing and debugging Python code
- Experience with Linux
- Familiarity with relational databases and SQL
- Sharp analytical and problem-solving skills and a persistent drive to make things work (better)
- Strong growth mindset and a passion for learning
- Strong technical communication skills
- Attention to detail
- 2 years of relevant industry experience
- An undergraduate degree or comparable training in a quantitative field, or equivalent relevant industry experience
- Familiarity with best practices for code maintainability, documentation, quality assurance, and continuous integration and deployment
- Experience supporting production systems
- Experience with any of the following: gRPC microservices, Postgres, Pandas, Golang, R, Git, Jenkins, Bazel, Prometheus, Grafana, Airflow, Kubernetes