OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. They are seeking a Data Engineer to build data-intensive systems that will power People Innovation Labs’ internal products and enable the People Analytics function.
Responsibilities:
- Design, build, and manage people data pipelines, ensuring all data is seamlessly integrated into our Databricks warehouse
- Develop canonical datasets to track key people metrics and People Innovation Labs product metrics
- Work collaboratively with various teams, including Data Platform, Data Science, People Analytics, and Compensation and Equity, to understand their data needs and provide solutions
- Implement robust and fault-tolerant systems for data ingestion and processing
- Participate in data architecture and engineering decisions, bringing your strong experience and knowledge to bear as the primary data engineering expert on the team
- Ensure the security, integrity, and compliance of data according to industry and company standards
Requirements:
- 3+ years of experience as a data engineer and 8+ years of overall software engineering experience (including data engineering)
- Proficiency in at least one programming language commonly used within Data Engineering, such as Python, Scala, or Java
- Experience with data warehousing technologies such as Databricks and Snowflake, and expertise with ETL schedulers such as Fivetran, Airflow, Dagster, Prefect, or similar
- Experience with distributed processing frameworks such as Spark, Hadoop, or Flink, and with distributed storage systems (e.g., HDFS, S3)