Domino Data Lab is a company that builds software for AI-driven organizations to develop and operate advanced data science solutions. The Staff Software Engineer will be responsible for integrating model monitoring, enhancing tagging capabilities, and expanding LLM hosting capabilities, while collaborating with cross-functional teams to innovate within the Domino Apps offering.
Responsibilities:
- Integrate model monitoring to provide a holistic view of deployment health and performance
- Enhance tagging capabilities across Domino entities to improve discoverability and tracking
- Expand LLM hosting capabilities to address customer needs for scale, performance, and logging
- Innovate within our Domino Apps offering by incorporating feature requests from major customers
Requirements:
- Hands-on experience developing and managing high-performance back-end systems in distributed computing environments
- Experience working closely with cross-functional teams to integrate systems with front-end interfaces and third-party services
- Experience designing and implementing secure, scalable APIs (e.g., REST, gRPC)
- Experience profiling and optimizing back-end performance, especially in cloud environments or with container technologies like Docker and Kubernetes
- Experience using robust testing frameworks (unit, integration, end-to-end) and setting up CI/CD pipelines
- Familiarity with model registries, versioning, and lifecycle management tools like MLflow or Kubeflow
- Experience with frameworks and platforms such as Apache Spark, Azure ML, or SageMaker
- Proficiency with cloud providers (AWS, Azure, GCP) and deploying services in these environments
- Expertise in languages such as Python, Java, Scala, or Go