Architect, develop and deliver fully automated multi-account global cloud infrastructure on AWS using repeatable methods and patterns
Support continuous delivery pipelines for deployment to all layers of our SDLC
Conduct Infrastructure as Code reviews to ensure high programming standards
Monitor and tune the performance of the infrastructure
Identify and correct bottlenecks in the system, while working with Engineering on optimizations and best practices
Support integration of log management and APM solutions to provide continuous monitoring capabilities and track all aspects of the system, including infrastructure, performance, application errors, and roll-up metrics
Provide mentorship and training to other team members on technologies and processes; drive education and knowledge transfer of design patterns, technical practices, and relevant technologies and tools
Troubleshoot issues in the platform and analyze their root causes
Fully participate in the ownership of your services and components, including on-call duties
Support Data Center and Cloud Operations: System Provisioning, Monitoring/Alerting, Network Configuration, Administration of Linux, Windows, VMware/vSphere and Cloud Computing
Requirements
5+ years of experience architecting, building, and operating large-scale distributed systems on a multi-account AWS platform
7+ years of experience in software development and DevOps
7+ years of experience with CI/CD practices and toolchains such as, but not limited to, Jenkins, Artifactory, Bitbucket, Docker, Ansible, and AWS CodeDeploy
7+ years of experience with automation programming in Python, PowerShell, or a similar scripting language
Extensive proven experience with AWS network and security implementations and management. Proven experience with CloudFormation, Docker and ECS
Advanced Windows and Linux experience related to administration, security, configuration, and monitoring
Experience with monitoring and alerting tools such as New Relic, Datadog, ELK Stack