Vultr is a leading cloud infrastructure company focused on making high-performance cloud solutions accessible for enterprises and AI innovators. The Infrastructure Capacity Analytics Engineer will design forecasting models, build scalable data pipelines, and create dashboards to optimize infrastructure and support strategic decision-making across the organization.
Responsibilities:
- Develop and maintain capacity models for compute, storage, and network infrastructure across global environments
- Build and productionize advanced time‑series forecasts (e.g., ARIMA/ETS, Prophet, XGBoost/LightGBM) to predict demand, saturation points, and runway
- Conduct scenario modeling (“what‑if”) on deployment plans, workload changes, demand spikes, and hardware refresh strategies
- Analyze historical utilization to identify emerging risks, inefficiencies, and optimization opportunities
- Design, build, and maintain Python‑based data pipelines for ingesting, transforming, and validating large‑scale infrastructure telemetry
- Create ETL/ELT workflows to support analytics, modeling, and reporting
- Integrate data from observability platforms (e.g., Prometheus/Grafana), CMDB/asset systems, and internal services
- Develop APIs/services to expose forecast results and capacity signals to dashboards and tooling
- Build executive‑ready dashboards in Power BI (DAX, Power Query, custom visuals) and integrate real‑time forecasting outputs
- Deliver clear, compelling insights to engineering, operations, and finance leaders to support both strategic and tactical decision‑making
- Automate reporting workflows and ensure up‑to‑date visibility into runway, utilization, and risk posture
- Partner with engineering, operations, and finance teams to align capacity plans with growth, reliability, and cost objectives
- Establish standards for model governance, documentation, and data quality
- Drive continuous improvement of capacity planning systems, tooling, and analytics frameworks
Requirements:
- Bachelor's degree in Computer Science, Engineering, or a related discipline
- 6+ years of professional experience in Python development and data engineering
- 4+ years in infrastructure capacity planning, performance analysis, or related fields
- Strong expertise in time‑series forecasting, statistical modeling, and Python libraries (pandas, NumPy, scikit‑learn, statsmodels, XGBoost)
- Proficiency with SQL scripting and column-oriented SQL databases (e.g., ClickHouse); experience designing scalable ETL/ELT pipelines
- Advanced proficiency in Grafana, Tableau, or Power BI (DAX, Power Query, modeling, custom visuals)
- Experience working with infrastructure telemetry and systems (servers, storage, networking)
- Prior experience managing capacity at a cloud service provider or in a large‑scale distributed environment
- Excellent communication and executive‑level presentation skills
- Experience with cloud‑native data platforms (Azure Data Lake/Synapse, AWS Redshift, Google BigQuery)
- Familiarity with containerized environments (Docker) and CI/CD pipelines
- Knowledge of RESTful APIs and microservices architecture
- Experience with version control (Git) and agile engineering practices
- Exposure to machine learning, anomaly detection, or ensemble forecasting methods
- Strong spreadsheet skills (advanced formulas, modeling workflows)