Dice is a leading technology and engineering solutions provider seeking a GenAI Data Automation Engineer to design and implement AI-driven automation solutions. The role involves building intelligent data pipelines and automations across AWS and Azure environments to support analytics and customer engagement.
Responsibilities:
- Design and maintain data pipelines in AWS using S3, RDS/SQL Server, Glue, Lambda, EMR, DynamoDB, and Step Functions
- Develop ETL/ELT processes to move data between multiple data systems, including from DynamoDB to SQL Server on AWS and between AWS and Azure SQL systems
- Integrate Amazon Connect CRM data into the enterprise data pipeline for analytics and operational reporting
- Engineer and enhance ingestion pipelines with Apache Spark, Flume, and Kafka for real-time and batch processing into Apache Solr and Amazon OpenSearch platforms
- Leverage Generative AI services and frameworks to create automated processes for vector generation, enhance data quality, and build conversational BI interfaces
- Develop AI-powered copilots for pipeline monitoring and automated troubleshooting
- Implement SQL Server stored procedures, indexing, query optimization, profiling, and execution plan tuning to maximize performance
- Apply CI/CD best practices using GitHub, Jenkins, or Azure DevOps for both data pipelines and GenAI model integration
- Ensure security and compliance through IAM, KMS encryption, VPC isolation, RBAC, and firewalls
- Support Agile DevOps processes with sprint-based delivery of pipeline and AI-enabled features
Requirements:
- BS in Computer Science or a related field
- 2+ years of data engineering and automation experience
- Hands-on experience with SQL, SSIS, Python, Spark, Bash, PowerShell, and AWS/Azure CLIs
- Experience with AWS services like S3, RDS/SQL Server, Glue, Lambda, EMR, and DynamoDB
- Familiarity with Apache Flume, Kafka, and Solr for large-scale data ingestion and search
- Familiarity with LLM and GenAI frameworks using Amazon Bedrock, Azure OpenAI, or open-source platforms and tools
- Experience with integrating REST API calls in data pipelines and workflows
- Familiarity with Jira and with GitHub, Azure DevOps, or Jenkins for SDLC and CI/CD automation
- Strong troubleshooting and performance optimization skills in SQL, Spark, or other data engineering solutions
- Experience operationalizing Generative AI (GenAI Ops) pipelines, including model deployment, monitoring, retraining, and lifecycle management for LLMs and AI-enabled data workflows
- Good communication and presentation skills
- Certifications: AWS Data Engineer, AWS AI/ML Specialty, Azure AI Engineer, Databricks Certified Data Engineer
- Experience implementing RAG pipelines, embeddings, and vector search with Solr, OpenSearch, FAISS, Pinecone, or pgvector/SQL Server vector types
- Experience with GenAI-powered coding tools such as Claude Code, OpenAI Codex, or VS Code
- Experience with multi-cloud data integration (AWS to Azure SQL)
- Familiarity with Microsoft BizTalk and SSIS for SQL Server ETL workflows
- Knowledge of data lineage/governance tools (Purview, Unity Catalog, AWS Glue Catalog)
- Familiarity with Infrastructure-as-Code (Terraform, CloudFormation, Bicep) for automated deployments
- Experience with compliance frameworks (FedRAMP, PCI-DSS, HIPAA)