The National Resident Matching Program (NRMP) is a private, not-for-profit organization focused on matching applicants to graduate medical education positions. NRMP is seeking a Data Engineer to build and optimize data pipelines, support data governance, and collaborate with stakeholders to operationalize data and analytics initiatives.
Responsibilities:
- Build data pipelines: Managed data pipelines consist of a series of stages through which data flows from acquisition at its sources, through integration, to consumption for specific use cases. These pipelines must be created, maintained, and optimized as workloads move from development to production. Architecting, creating, and maintaining data pipelines will be the primary responsibility of the data engineer
- Drive automation through effective metadata management: The data engineer will be responsible for using innovative and modern tools, techniques, and architectures to partially or completely automate the most common, repeatable, and tedious data preparation and integration tasks in order to minimize manual and error-prone processes and improve productivity. The data engineer will also need to assist with renovating the data management infrastructure to drive automation in data integration and management. This will include (but not be limited to):
  - Using modern data preparation, integration, and metadata management tools and techniques
  - Tracking data consumption patterns
  - Monitoring schema changes
  - Recommending and automating existing and future integration flows
- Collaborate across departments: The data engineer will need strong collaboration skills to work with varied stakeholders within the organization. In particular, the data engineer will work closely with research teams and data analysts to refine their data requirements and data consumption requirements for various data and analytics initiatives
- Become a data and analytics evangelist: The data engineer will be considered a blend of data and analytics “evangelist,” “data guru” and “fixer.” This role will promote the available data and analytics capabilities and expertise to business unit leaders and educate them on leveraging these capabilities to achieve their business goals
Requirements:
- At least 6-8 years of work experience in data management disciplines, including data integration, modeling, optimization, and data quality, and/or other areas directly relevant to data engineering responsibilities and tasks
- At least 3 years of experience working in cross-functional teams and collaborating with business stakeholders in support of departmental and/or multi-departmental data management, analytics, and business intelligence initiatives
- A bachelor's or master's degree in engineering or computer science or a related quantitative field
- Legal authorization to work in the United States without sponsorship or restriction
- Residence in the United States and the ability to work remotely, with occasional overnight travel
- Strong experience with object-oriented/object function scripting using Python and related libraries
- Strong experience with popular database programming languages including SQL and PL/SQL for relational databases
- Strong experience in working with and optimizing ETL/ELT processes and data integration and data preparation flows, and in moving them across environments, including production
- Proficiency in working in an AWS environment (Glue, S3, Lambda, IAM)
- Experience in working with open-source technologies such as Airflow to automate data pipelines
- Adept in agile methodologies and capable of applying DevOps and, increasingly, DataOps principles to data pipelines to improve the communication, integration, reuse, and automation of data flows between data providers and consumers across NRMP
- Ability to implement data quality checks and ensure data integrity within the data warehouse environment
- Experience working with data quality, security, and governance teams to move data pipelines through environments in line with appropriate data quality, governance, and security standards
- Be highly creative and collaborative
- Be a confident, energetic self-starter with strong interpersonal skills
- Be comfortable in a fast-paced, small-company environment, with the ability to manage a variety of projects simultaneously
- Have good judgment, demonstrate initiative, and demonstrate commitment to high standards of ethics, regulatory compliance, customer service, and business integrity
- Be able to collaborate with the Business Intelligence team to build effective solutions
- Have a keen interest in learning and using the latest software tools, methods, and technologies to solve problems with an eye on maintainability
- Be a strategic, intellectually curious thinker with a focus on outcomes
- AWS Certified Developer certification is highly desirable
- The ideal candidate will have a combination of IT skills, data governance skills, and analytics skills
- Familiarity with undergraduate and graduate medical education and the residency selection process is highly desirable