Assist in the operational support of multiple global multi-tenant cloud-based applications
Keep critical IT systems up and running by overseeing day-to-day monitoring, alarm response, and fault and performance management activities
Proactively monitor systems, networks, and applications to provide input in improving the stability, security, efficiency, and scalability of systems
Participate in incident reviews to create improved supportability documentation, diagnostics, tooling, error messages and automation
Manage the secure, scalable and resilient hosting of numerous applications in a regulated (HIPAA) environment
Implement monitoring and security controls across various platforms
Collaborate closely with multiple technology and cross-functional groups within the organization
Requirements
5+ years of demonstrated work experience with AWS (e.g. ECS, Fargate, ELB, S3, Lambda, etc.) and/or Azure (Virtual Machines, Container Instances, Blob Storage, etc.)
Experience with application and infrastructure monitoring tools (e.g. AWS CloudWatch, Azure Monitor, Datadog)
Proven ability to run audit forensics, trend analysis, and cloud data reporting
Experience with change management, incident review, and root cause analysis of maintenance events and network outages
Bachelor's degree in Computer Science, System Administration or equivalent
AWS certifications, preferably Solutions Architect, DevOps Engineer, or SysOps Administrator
Experience with Agile software development
DevOps infrastructure experience, including Terraform, Ansible, CloudFormation, Atlassian Suite (JIRA, Bamboo, Bitbucket), etc.
Experience supporting scalable database technologies, including MongoDB, MS SQL, and Postgres
Experience in supporting secure software platforms in a regulated environment (SOX, HIPAA, GDPR)