Dexian is a leading provider of staffing, IT, and workforce solutions with over 12,000 employees and 70 locations worldwide. The AI Data Engineer will be responsible for designing, developing, and maintaining efficient and reliable data pipelines, collaborating with stakeholders to gather business requirements and ensure data quality and accessibility.
Responsibilities:
- Design, build, and maintain scalable data platforms and pipelines using Python, SQL, Airflow, and Spark
- Collaborate with stakeholders to understand and translate business requirements into technical specifications
- Develop and implement data models that support analytics and reporting needs
- Ensure data accuracy, consistency, and reliability by implementing robust data validation and quality checks
- Work with cross-functional teams, including data analysts, data scientists, and business leaders, to deliver high-quality data solutions
- Continuously monitor and optimize data pipelines for performance, scalability, and cost-efficiency
- Build and implement monitoring and observability metrics to ensure data quality and detect anomalies in data pipelines early
- Maintain clear and comprehensive documentation of data processes and communicate technical concepts effectively to non-technical stakeholders
Requirements:
- 2+ years of experience in data engineering and infrastructure
- Proficiency in data warehouse management, Python, SQL, Airflow, and Spark
- Strong experience in building and maintaining robust data pipelines and ETL processes
- Ability to gather business requirements and to debug issues in ingestion or any other area of the data warehouse
- Excellent verbal and written communication skills, with the ability to convey technical information to non-technical audiences
- Proven ability to work effectively in a collaborative, cross-functional environment
- Bachelor's degree in Computer Science, Engineering, Information Systems, or a related field
- Experience with cloud platforms such as AWS, GCP, or Azure
- Familiarity with data warehousing technologies (e.g., Delta Lake, Microsoft Fabric, Snowflake, Redshift, BigQuery)
- Knowledge of data governance and data security best practices