Angel is a platform that amplifies light through storytelling, allowing creators and audiences to connect over creative projects. The company is seeking a skilled Senior DevOps Engineer to keep its systems robust, scalable, and reliable, working closely with development teams to automate processes and manage cloud environments on AWS.
Responsibilities:
- Automate infrastructure provisioning, configuration, and deployment processes using tools such as Terraform or similar
- Collaborate with development teams to integrate CI/CD pipelines and streamline code deployment
- Leverage AI-powered tools to improve deployment workflows, incident analysis, and operational efficiency
- Design, deploy, and manage scalable and secure AWS infrastructure
- Optimize AWS resource usage and cost through effective monitoring and management
- Implement security best practices for cloud-based environments
- Evaluate and implement AI-driven cloud optimization or observability tooling where appropriate
- Implement and maintain monitoring, logging, and alerting systems to ensure high availability and performance of applications
- Identify and resolve performance bottlenecks and reliability issues in production environments
- Contribute to and execute incident response and disaster recovery plans
- Identify opportunities to automate repetitive operational tasks using scripting or AI agents
- Work closely with software engineers to ensure seamless integration of new features and services
- Participate in on-call rotations and provide support for incident management
- Document processes, configurations, and procedures to ensure knowledge sharing and continuity
Requirements:
- Skilled and experienced in DevOps practices
- Experience automating infrastructure provisioning, configuration, and deployment processes using tools such as Terraform or similar
- Experience collaborating with development teams to integrate CI/CD pipelines and streamline code deployment
- Experience leveraging AI-powered tools to improve deployment workflows, incident analysis, and operational efficiency
- Experience designing, deploying, and managing scalable and secure AWS infrastructure
- Ability to optimize AWS resource usage and cost through effective monitoring and management
- Knowledge of implementing security best practices for cloud-based environments
- Experience evaluating and implementing AI-driven cloud optimization or observability tooling
- Experience implementing and maintaining monitoring, logging, and alerting systems to ensure high availability and performance of applications
- Ability to identify and resolve performance bottlenecks and reliability issues in production environments
- Experience contributing to and executing incident response and disaster recovery plans
- Ability to identify opportunities to automate repetitive operational tasks using scripting or AI agents
- Strong collaboration and communication skills
- Experience working closely with software engineers to ensure seamless integration of new features and services
- Willingness to participate in on-call rotations and provide support for incident management
- Ability to document processes, configurations, and procedures to ensure knowledge sharing and continuity