Progressive Insurance is dedicated to helping employees move forward and live fully in their careers. As an intermediate machine learning data engineer, you’ll work on developing machine learning platform solutions for operations and deployment, leveraging cloud services and building necessary data pipelines to support Data Science models.
Responsibilities:
- Work on a Claims IT team focused on developing machine learning platform solutions for model operations (MLOps) and deployment
- Use data-focused software engineering and languages like Python to solve problems with techniques you may need to research, learn, and implement
- Develop and deploy the machine learning platform
- Build and deploy solutions leveraging cloud services and resources from cloud providers
- Build out the necessary data pipelines to support Data Science models
- Enable efficient access to multiple data sources
- Support packaging and deploying new models or updated models to production
- Create the infrastructure needed for model training
Requirements:
- Bachelor's degree or higher in an Information Technology discipline or related field of study and a minimum of one year of work experience designing, programming, and supporting software programs or applications
- In lieu of a degree, a minimum of two years of related work experience designing, programming, and supporting software programs or applications may be accepted
- Data-focused software engineer with experience integrating cloud services and resources (e.g., AWS S3, EC2) and the ability to process large volumes of structured and unstructured data
- Experience with Unix/shell scripting (e.g., Bash), Neo4j, Hadoop, Snowflake or Tecton, Terraform, and Docker; high proficiency in Python; and experience with one or more parallel computing and/or data manipulation tools (e.g., Ray, Dask, Prefect, Spark, SQL, or the multiprocessing package)
- Experience deploying machine learning models for real-time use cases and an understanding of machine learning algorithms
- Experience designing and evaluating approaches to high-volume, real-time data streams
- Experience with CI/CD automation (e.g., Jenkins or Azure DevOps)