Teradata is a leading provider of cloud analytics and data platforms for AI, empowering businesses to make informed decisions. The company is seeking a Senior AI Engineer to help build a new service that collects and normalizes data catalogs from a variety of sources, a role that requires strong engineering skills and close collaboration with AI/ML teams.
Responsibilities:
- Design, build, and operate a highly available data catalog collection service that ingests schema and metadata from heterogeneous data sources (RDBMS, data lakes, streaming platforms, APIs)
- Develop robust data pipelines for catalog extraction, normalization, lineage tracking, and semantic tagging to power AI-driven query routing
- Build and maintain RESTful and/or gRPC APIs that expose catalog data to an AI query agent
- Deploy and manage services on Kubernetes (K8s), including Helm chart authoring, autoscaling configuration, and multi-cluster operations
- Ensure service reliability through SLO definition, circuit breakers, retry logic, and distributed tracing
- Integrate with open-source and cloud-native technologies such as Apache Kafka, Spark, dbt, Apache Atlas, and OpenMetadata
- Collaborate with AI/ML engineers to design and iterate on the metadata schema and query routing interface
- Participate in on-call rotations and contribute to incident response, postmortems, and reliability improvements
- Contribute to CI/CD pipelines, infrastructure-as-code (Terraform/Helm), and automated testing frameworks
Requirements:
- 3+ years of software engineering experience building and operating production services
- Proficiency in one or more of Go, Rust, Java, or Python, with a preference for Rust or Python for backend services
- Hands-on experience with data pipeline development: ingestion, transformation, and metadata management at scale
- Solid understanding of RESTful API design principles and service-to-service communication patterns
- Experience deploying and operating services on Kubernetes in production cloud environments
- Familiarity with at least one major public cloud platform: AWS, Azure, or GCP
- Strong knowledge of relational and non-relational database systems and their schema/catalog semantics
- Experience with distributed messaging systems such as Apache Kafka or Amazon Kinesis
- Proficiency with Git, code review workflows, and agile development practices
- Excellent troubleshooting skills and comfort operating in Linux environments
- Experience with data catalog or metadata management tools such as Apache Atlas, OpenMetadata, DataHub, or Collibra
- Familiarity with semantic search, vector databases, or LLM-based query generation systems
- Experience designing or integrating AI/ML model APIs into production backend services
- Knowledge of data governance, lineage tracking, and schema registry patterns
- Experience with infrastructure-as-code tools: Terraform, Pulumi, or AWS CDK
- Background in multi-tenant SaaS platform engineering
- Contributions to open-source data or infrastructure projects