The Computer Merchant, LTD (TCM) is seeking a Data Automation Engineer to work with a team of subject matter experts and developers. The role involves designing and implementing innovative data automation solutions for Azure cloud-based platforms, translating business requirements into data engineering and AI-based solutions to support an enterprise-scale data analytics and reporting platform.
Responsibilities:
- Utilize Microsoft Azure services, including Azure Data Factory, Synapse Pipelines, and Apache Spark notebooks, along with Python, SQL, and stored procedures, to develop high-performing data pipelines
- Continuously improve and optimize the automation toolset for reliability, scalability, and adaptability
- Research and implement cutting-edge AI/ML and GenAI tools to rapidly develop intelligent applications, scripts, and ETL pipelines that automate data processes and proactively eliminate workflow bottlenecks
- Work closely with implementation specialists, engineering teams, and customers to understand data-driven needs and build solutions that address real operational challenges
- Work closely with client personnel and team members to understand data requirements and develop appropriate data solutions
- Identify, create, and prepare data required for advanced analytics, visualization, reporting, and AI/ML
- Implement data migration, data integrity, data quality, metadata management, and data security functions to optimize data pipelines
- Monitor and troubleshoot data related issues to maintain high availability and performance
- Assist in ETL performance testing for data pipelines by executing test runs and validating data load performance against expected benchmarks
- Support baseline performance analysis by comparing new pipeline performance with legacy system metrics (e.g., load time, throughput, latency)
- Help monitor pipeline execution metrics such as run time, data volume processed, and resource utilization to identify bottlenecks
- Participate in performance test execution for Azure-based pipelines (ADF, Synapse, Databricks) in non-production environments
- Assist in identifying performance issues and contribute to tuning efforts (e.g., query optimization, partitioning, indexing basics)
- Validate data consistency and completeness after performance test runs to ensure no data loss during high-volume processing
- Collaborate with DevOps and infrastructure teams to understand how compute, memory, and scaling impact pipeline performance
- Follow defined performance testing processes, checklists, and guidelines established by senior team members
- Document test results, issues, and observations clearly for team review and tracking
- Actively support Agile DevOps process, including Program Increment planning
- Actively engage in continuous learning to increase relevant skills
- Maintain strict versioning and configuration control to ensure data integrity
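
As an illustration of the post-run validation work described above (this sketch is not part of the posting itself; the function and table names are hypothetical), a completeness check after a high-volume pipeline run might compare per-table row counts between source and target:

```python
# Illustrative sketch only: compare source vs. target row counts per table
# after a pipeline run to flag possible data loss. In practice the counts
# would come from SQL queries against the source and target systems.

def check_completeness(source_counts: dict, target_counts: dict) -> list:
    """Return (table, source_count, target_count) tuples where counts differ."""
    mismatches = []
    for table, src in source_counts.items():
        tgt = target_counts.get(table, 0)  # missing table counts as 0 rows
        if src != tgt:
            mismatches.append((table, src, tgt))
    return mismatches

if __name__ == "__main__":
    src = {"orders": 1_000_000, "customers": 50_000}
    tgt = {"orders": 1_000_000, "customers": 49_998}
    print(check_completeness(src, tgt))  # → [('customers', 50000, 49998)]
```

A real implementation would typically also compare checksums or column-level aggregates, since matching row counts alone do not guarantee value-level consistency.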
Requirements:
- BS degree in Computer Science or related field and 2+ years of experience in relevant field
- 2+ years of experience with more than one of the following scripting languages: SQL, T-SQL, MDX/DAX, Python, and PySpark
- Experience designing, building, scheduling, and monitoring ETL/data engineering solutions utilizing cloud services such as Azure Data Lake Storage, Azure Synapse Analytics, Azure Data Factory, and Integration Runtime
- Experience working with Microsoft database and business intelligence tools, including SQL Server (stored procedures), SSIS, SSRS, SSAS (cubes), and Power BI
- Experience with data automation using Azure/AWS CLI tools with Bash or PowerShell scripting
- Familiarity with Azure DevOps Repos or GitHub and pipeline versioning/release management
- Demonstrated experience in supporting production, testing, integration, and development environments
- Open mindset and the ability to quickly adopt new technologies to solve customer problems
- Experience in Agile projects, working with a cross-functional team
- Must be detail-oriented and able to support multiple projects and tasks
- Demonstrated continuous learning to increase relevant skills
- US Citizenship and ability to successfully obtain a government-issued Public Trust clearance
- Experience and/or certifications in Generative AI development, Generative AI for Data Analytics, and solution delivery