Calix is looking for a Summer Intern to join its Products Team. The internship offers an opportunity to learn new skills through training and hands-on experience. The intern will be part of the cloud operations and reliability team, gaining exposure to production systems and Site Reliability Engineering (SRE) principles.
Responsibilities:
- Assist in monitoring and maintaining the reliability of cloud‑based services and platforms
- Support day‑to‑day SRE operations, including incident investigation, root cause analysis, and post‑incident documentation
- Participate in 24/7 rotational shift coverage, under supervision, to support monitoring, alert triage, and operational readiness
- Help build and enhance automation tools and scripts using Python and Shell
- Contribute to observability initiatives using metrics, logs, and traces (Grafana, Prometheus, etc.)
- Assist in managing and troubleshooting databases (relational and/or NoSQL) with guidance from senior engineers
- Support reliability and performance analysis for Kafka or event‑streaming systems, including basic troubleshooting and monitoring
- Work with Infrastructure‑as‑Code (Terraform) to provision and validate environments
- Assist with CI/CD pipelines and environment deployments
- Create and maintain runbooks, dashboards, and operational documentation
- Collaborate with software engineers and platform teams to improve system resilience and scalability
Requirements:
- Currently enrolled in a Bachelor's or Master's degree program in Computer Science, Engineering, or a related field. Preference will be given to students who have completed their junior or senior year and who have previous internship or work experience
- Strong fundamentals in Linux/Unix systems and command‑line usage
- Basic understanding of networking concepts (TCP/IP, DNS, load balancing)
- Familiarity with Python, Shell scripting, or similar languages
- Basic knowledge of databases (e.g., MySQL, PostgreSQL, MongoDB), including queries and schema concepts
- Willingness to participate in 24/7 rotational shifts as part of a structured learning and support model
- Good problem‑solving skills and eagerness to learn about large‑scale systems
- Able to work for the complete summer break (May to August or June to September)
- Exposure to cloud platforms (GCP, AWS, or Azure)
- Familiarity with Kafka or distributed messaging systems (topics, producers, consumers, offsets)
- Basic understanding of database reliability concepts such as backups, replication, failover, and performance tuning
- Awareness of Kubernetes and containerized workloads
- Experience with Git and basic CI/CD concepts (Jenkins or similar tools)
- Interest in SRE principles such as SLIs, SLOs, error budgets, and automation‑first thinking