Codvo.ai is committed to building scalable, future-ready data platforms that power business impact. The Data Engineer will be a foundational architect responsible for building and maintaining the data ecosystem required to solve complex data challenges in the oil and gas sector.
Responsibilities:
- Architect & Build Data Pipelines: Design, construct, install, test, and maintain highly scalable data management systems and ETL/ELT pipelines
- Integrate Diverse Data Sources: Develop processes to ingest and integrate high-volume, high-velocity data from SCADA systems, historians (like OSIsoft PI, Aspen InfoPlus.21), DCS, PLC, and IoT sensors
- Cloud Data Platform Development: Implement and manage data solutions on the Microsoft Azure cloud platform, leveraging services like Azure IoT Hub, Azure Event Hubs, and Azure Stream Analytics for real-time ingestion and processing of operational technology (OT) data
- Data Modeling & Warehousing: Design and implement data models optimized for time-series data from industrial assets, supporting operational dashboards and real-time analytics
- Enable Advanced AI: Build the data infrastructure to support AI/ML models for predictive maintenance, operational anomaly detection, and process optimization using real-time OT data
- Champion Master Data Management (MDM): Design and implement MDM strategies and solutions to create a single, authoritative source of truth for critical data domains such as wells, equipment, and assets, ensuring data consistency across the enterprise
- Ensure Data Quality & Governance: Implement robust data quality checks, validation rules, and monitoring to ensure the accuracy, consistency, and reliability of our data. Adhere to and help shape our data governance policies
- Embrace Industry Standards: Champion and implement industry-specific data standards and models, such as the OSDU™ Data Platform, to ensure interoperability and a unified data view across the upstream lifecycle
- Collaborate & Innovate: Work closely with a cross-functional team of geoscientists, drilling engineers, data scientists, and business analysts to understand their data needs and deliver effective solutions
- Automate & Optimize: Identify opportunities for process automation and infrastructure optimization to improve data delivery, scalability, and cost-effectiveness
- Security First: Implement and maintain security best practices to protect our sensitive and proprietary data assets
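To make the data-quality responsibility above concrete, here is a minimal sketch of a validation rule for time-series sensor readings. Everything in it is illustrative, not part of any actual Codvo.ai stack: the tag names, the engineering limits, and the `validate_reading` helper are all hypothetical, and a real deployment would load limits from a governed metadata store rather than hard-code them.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Hypothetical per-tag engineering limits; in production these would come
# from a governed metadata/MDM store, not a hard-coded dictionary.
TAG_LIMITS = {
    "WELLHEAD_PRESSURE_PSI": (0.0, 15000.0),
    "SEPARATOR_TEMP_F": (-40.0, 450.0),
}

@dataclass
class Reading:
    tag: str
    timestamp: datetime
    value: float

def validate_reading(reading: Reading) -> list[str]:
    """Return the list of data-quality violations for one sensor reading."""
    issues = []
    limits = TAG_LIMITS.get(reading.tag)
    if limits is None:
        # Tag not registered in the master data catalog.
        issues.append("unknown_tag")
    else:
        lo, hi = limits
        if not (lo <= reading.value <= hi):
            issues.append("out_of_range")
    # Historian data arriving with a future timestamp indicates clock skew.
    if reading.timestamp > datetime.now(timezone.utc):
        issues.append("future_timestamp")
    return issues
```

In practice checks like these would run inside the streaming pipeline (e.g., as a step in a Spark or Stream Analytics job) so that bad readings are flagged before they reach dashboards or ML models.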
Requirements:
- Bachelor's degree in Engineering, Information Systems, or a related quantitative field
- 5+ years of proven experience in a data engineering role
- Experience within the oil and gas industry is highly preferred
- Demonstrable experience building and operationalizing large-scale data pipelines and applications
- Expert-level proficiency in SQL and Python for data manipulation and pipeline development
- Hands-on experience with distributed computing frameworks like Apache Spark (PySpark). Experience with streaming technologies like Kafka is a plus
- Deep experience with Microsoft Azure (Azure Data Lake Storage, Azure Data Factory, Azure Databricks, Azure Synapse)
- Proven experience with modern data platforms such as Databricks Delta Lake
- Understanding of machine learning lifecycles and the data requirements for training and deploying AI/ML models
- Experience with workflow orchestration tools like Airflow, Dagster, or Azure Data Factory
- Strong understanding of both relational (e.g., PostgreSQL, SQL Server) and NoSQL databases. Experience with graph databases (e.g., Neo4j) and vector databases is highly desirable
- Proficiency with Git and CI/CD best practices
- Familiarity with historian systems (e.g., OSIsoft PI System) and their data structures
- Hands-on experience with the OSDU™ Data Platform
- Experience working with industrial communication protocols (e.g., OPC UA, Modbus TCP/IP)
- Understanding of cybersecurity considerations for OT environments and data segregation
- Experience integrating data from ERP systems like SAP
- Professional certifications in Azure or Databricks
- Advanced SQL skills, including query optimization and performance tuning
- Knowledge of containerization technologies (Docker, Kubernetes)
- Experience constructing and maintaining enterprise knowledge graphs
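As a small illustration of the query-optimization skills listed above, the sketch below shows how adding a composite index to a time-series table changes the query plan from a full scan to an index search. It uses SQLite purely because it is self-contained; the table and column names are illustrative, and the same principle (index on the equality column plus the range column) applies to PostgreSQL or SQL Server.

```python
import sqlite3

# In-memory database with an illustrative time-series table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sensor_readings (
        tag   TEXT,
        ts    TEXT,
        value REAL
    )
""")

def plan(sql: str) -> str:
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)

query = (
    "SELECT value FROM sensor_readings "
    "WHERE tag = 'P1' AND ts >= '2024-01-01'"
)

before = plan(query)  # no index yet: full table scan
conn.execute("CREATE INDEX idx_tag_ts ON sensor_readings (tag, ts)")
after = plan(query)   # planner now searches via the (tag, ts) index
```

Ordering the index as `(tag, ts)` rather than `(ts, tag)` matters: the equality predicate on `tag` narrows the search first, letting the range predicate on `ts` scan a contiguous slice of the index.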