Tebra is an all-in-one EHR+ platform dedicated to independent healthcare practices. The Senior Machine Learning Engineer will build, deploy, and optimize machine learning services for the Tebra platform, focusing on the reliability and scalability of AI solutions.
Responsibilities:
- Write high-quality, production-grade software for data ingestion, feature extraction, and model inference, with a focus on optimizing code for latency, throughput, and resource efficiency
- Implement robust CI/CD pipelines, automated testing, and comprehensive logging/monitoring for the models you deploy to ensure immediate detection of issues
- Construct and maintain specific data pipelines required for training and inference, ensuring data quality and consistency at the component level
- Develop reusable software modules and utilities that streamline the development process for the wider team, while championing clean code and test-driven development
- Translate business requirements into technical specifications and execute them with precision, breaking complex tasks down into deliverable units
- Monitor the daily performance of production models, debug incidents, and execute routine retraining workflows to address data drift
- Partner with Engineering team members and Product Managers to estimate effort, flag technical risks, and deliver features on schedule
Requirements:
- 5+ years of professional software development experience including system design, large-scale services, and production-grade infrastructure
- 3+ years of hands-on experience in machine learning engineering or applied AI, with a strong record of deploying and maintaining models in production
- Technical subject matter expertise in 3+ general areas of software development (e.g., server, database, security), including machine learning infrastructure
- Demonstrated ability to deliver significant, measurable real-world impact through applied ML
- Proven ability to design and write modular, performant, and easy-to-read software that solves complex business problems
- Proficiency in Python, TensorFlow/PyTorch, and scikit-learn
- Strong background in MLOps and data infrastructure (e.g., Airflow, Spark, feature stores, MLflow, data versioning)
- Proven ability to deploy and maintain ML models in production with CI/CD, monitoring, and alerting
- Familiarity with cloud ML environments (AWS, GCP, or Azure) and containerization (Kubernetes, Docker)
- Experience building or fine-tuning Large Language Models (LLMs) or generative models for structured business processes
- Experience with retrieval-augmented pipelines or feedback-driven model retraining
- Excellent technical communication and a product mindset, with comfort driving initiatives from concept to delivery
- Background in healthcare software operations or financial automation
- Contributions to open-source ML infrastructure projects
- Published research or conference papers in machine learning, natural language processing, or applied AI
- Experience leading AI reliability and observability initiatives, including designing monitoring frameworks, drift detection, and alerting systems for multiple production models