The University of Chicago is a research institution focused on impactful problems in various fields. They are seeking a Senior Platform Engineer to provide production support, monitoring, and CI/CD design for open-source software platforms in translational data science.
Responsibilities:
- Deliver technical tasks on time and at the required level of quality
- Participate in complex and challenging activities, including design and implementation
- Provide support and maintenance for existing applications
- Provide technical mentorship to interns and onboarding staff and technical leadership in technical initiatives
- Actively participate in the hiring process and provide fair and productive interview feedback
- Elaborate on technical solutions internally and externally when required
- Investigate, analyze, and resolve day-to-day technical problems using standard procedures
- Work with stakeholders to gather and analyze requirements for developmental programs. Design applications, with a moderate level of guidance, to meet University and business requirements
- Test code components and ensure that appropriate implementation standards are met. Evaluate design alternatives for development cost and solutions using various methods
- Work with web developers and respond to requests from users
- Perform other related work as needed
Requirements:
- Minimum education: a college or university degree in a related field
- Minimum experience: knowledge and skills developed through 2-5 years of work experience in a related job discipline
- Advanced degree in computer science, mathematics, statistics, engineering, or a relevant quantitative field strongly preferred
- 3+ years of experience developing infrastructure, configuration, and/or deployment automation, or equivalent skills and qualifications demonstrated through projects, initiatives, or outstanding performance
- Hands-on scripting experience (Bash, Python, or other dynamic language)
- Unix/Linux programming or system administration experience
- Experience with OpenStack and AWS (EC2/S3) cloud technologies
- Experience with a configuration management utility (Chef, Puppet, Ansible)
- Experience with F5 or other load balancing technologies (Nginx, AWS ELB/ALB, etc.)
- Experience with source control and build systems (SVN, Git, Jenkins, etc.)
- Experience with container-based deployment (Docker, Kubernetes)
- Experience with log aggregation tools (ELK stack, Splunk)
- Experience with security frameworks (FISMA, NIST, FIPS)
- Experience with cloud platforms (AWS, GCP, OpenStack), CI/CD, and Agile methodologies
- Experience provisioning and managing GPU-enabled infrastructure (NVIDIA GPUs, CUDA, multi-GPU systems) in cloud and/or on-prem environments
- Familiarity with GPU orchestration in Kubernetes (e.g., NVIDIA device plugin, GPU scheduling, MIG, node affinity)
- Experience optimizing GPU utilization, memory management, and cost efficiency for compute-intensive workloads
- Ability to collaborate and interact effectively with team members, following guidelines and best practices and ensuring accountability for deliverables and outcomes
- Ability to take and provide constructive and helpful input and feedback on technical issues
- Ability to elaborate on technical solutions internally and externally when required
- In-depth knowledge in core areas of DevOps technologies, including scripting languages, deployment/configuration frameworks, and/or cloud platforms, or having demonstrated the ability to achieve that level of proficiency in a short period of time
- Ability to take multiple complex tasks and break them into smaller ones, estimating the effort needed to complete them, prioritizing them appropriately, and ensuring the completion of each task, meeting the required level of quality
- Ability to work in a collaborative team and ensure accountability for deliverables and outcomes
- Ability to prioritize and manage workload to meet project milestones and deadlines