OnBoard is a leading board management software company that serves over 5,000 organizations worldwide. The company is seeking a Cloud Data Engineer II to design and operate SQL Server workloads, build retrieval pipelines for LLM applications, and ensure observability and security in data systems.
Responsibilities:
- Design, deploy, and operate Microsoft SQL Server workloads across Azure IaaS (VM) and PaaS (Azure SQL DB / SQL Managed Instance)
- Collaborate with developers and ML teams to build retrieval pipelines for LLM applications, including vector indexing, prompt metadata enrichment, and embeddings
- Integrate observability into all layers of the data/search stack
- Analyze and optimize SQL queries and indexing strategies using Query Store, DMVs, and execution plans
- Secure data access using encryption (TDE, Always Encrypted), private endpoints, and RBAC
- Automate recurring operations using PowerShell/Azure DevOps Pipelines, including job scheduling, alerting, index maintenance, and schema validation
- Implement Infrastructure-as-Code (IaC) for SQL and search platforms using Bicep
- Drive incident resolution and root cause analysis for issues across SQL, search, and LLM pipelines
- Maintain technical documentation and operational runbooks for the combined SQL + Search + AI infrastructure
Requirements:
- 5+ years of experience managing Microsoft SQL Server in production, ideally on both IaaS and PaaS
- Proficiency in tuning and debugging database performance (wait stats, execution plans, indexing)
- Hands-on experience with observability platforms and standards (OpenTelemetry, Grafana, APM tools, etc.)
- Experience migrating and improving database environments
- Automation scripting with PowerShell or Python
- Proficiency with Infrastructure-as-Code (Bicep, Terraform, ARM templates)
- Familiarity with compliance, RBAC, and secure data practices in cloud environments
- Experience with full-text and semantic search using Azure Cognitive Search and/or Elasticsearch
- Experience building or maintaining RAG/LLM architectures, including vector stores, embedding generation, and retrieval APIs
- Experience instrumenting and debugging LLM inference chains or vector database retrieval quality