Company Confidential is seeking a proactive AI Systems Engineer to design, implement, and maintain end-to-end AI systems that scale. The role bridges data science, software engineering, and operations to deliver robust AI-enabled solutions that meet business objectives.
Responsibilities:
- Design, deploy, and operate production-grade AI systems and pipelines (data ingestion, preprocessing, model training, validation, deployment, monitoring, and retraining)
- Collaborate with data scientists to translate research models into scalable, maintainable, and observable services
- Implement MLOps practices: versioning for data, models, and code; CI/CD for ML pipelines; automated testing and canaries; model governance and drift monitoring
- Build and maintain scalable data architectures (ETL/ELT, streaming, data lakes/warehouses) with emphasis on data quality, lineage, and observability
- Develop APIs and services for model inference, including high-throughput, low-latency endpoints; ensure security, authentication, and access controls
- Design and implement monitoring, alerting, and incident response for AI systems (model performance, data quality, system health, latency, cost)
- Optimize infrastructure for cost, performance, and reliability (cloud platforms, containers, orchestration, GPUs/accelerators, edge devices where applicable)
- Ensure compliance with privacy, security, and regulatory requirements; implement audit trails and reproducibility
- Collaborate with product managers and stakeholders to define requirements, success metrics, and acceptance criteria
- Mentor junior engineers; contribute to documentation, engineering standards, and best practices
Requirements:
- Bachelor's or Master's degree in Computer Science, Software Engineering, Electrical Engineering, Analytics, or related field (or equivalent practical experience)
- 3+ years of experience in systems engineering, ML/AI deployment, or MLOps
- Strong software engineering skills: proficiency in one or more general-purpose languages (e.g., Python, Java, Go, C++) and familiarity with software engineering best practices (version control, testing, code reviews)
- Experience architecting and deploying end-to-end AI pipelines (data ingestion, feature engineering, model training, deployment, and monitoring)
- Hands-on experience with ML frameworks (TensorFlow, PyTorch, scikit-learn) and with model serving and MLOps tooling (TensorFlow Serving, TorchServe, MLflow, Kedro, Seldon, or similar)
- Proficiency with cloud platforms (AWS, Azure, GCP), containerization (Docker), orchestration (Kubernetes), and CI/CD tooling
- Strong understanding of data engineering concepts (ETL/ELT, data governance, data quality, lineage)
- Experience with model monitoring and drift detection, A/B testing, and experimentation pipelines
- Familiarity with security and compliance practices (IAM, secrets management, encryption, audit logging)
- Excellent problem-solving, communication, and collaboration skills; able to work cross-functionally
Preferred Qualifications:
- Master's or PhD in a relevant field, ideally with specialization in ML systems, MLOps, or data engineering
- Experience with real-time inference, streaming data (Kafka, Kinesis), and feature stores
- Knowledge of DevOps fundamentals, SRE practices, and reliability engineering for AI systems
- Experience with edge AI deployments or on-device inference
- Publications or contributions to open-source ML systems projects