Beatdapp is a company delivering the most advanced streaming integrity technology in the world. They are seeking an ML Engineer II to work on machine learning inference systems for audio at scale, focusing on bridging the gap between raw audio and detection models. The role involves engineering and optimizing inference containers, building cloud infrastructure, and maintaining data pipelines.
Responsibilities:
- Build, tune, and ship our inference containers: the Dockerfile and its dependencies, image size and cold starts, GPU access patterns, the multi-cloud orchestration shape that runs them (ECS, Cloud Run, GKE, EKS), test coverage for the container surface, and the storage abstraction they depend on
- Squeeze more out of each GPU instance: concurrency tuning, VRAM accounting, request timeouts and queueing, rate limiting, multi-GPU distribution on instances that have more than one, and the right-sizing decisions that follow
- Build and run scale and stress scenarios across mock deployments that mirror real customer environments
- Characterize the latency-vs-throughput curves, find the breaking points, and turn the results into autoscaling and instance-sizing decisions
- Operate the Terraform stack across multiple clouds (GCP, AWS): networking, identity, GPU nodes, autoscaling, and per-tenant account configurations
- Build and extend the customer-facing API layer that fronts the inference service: client authentication, rate limiting, per-client data isolation, and request metering
- Maintain and extend the data orchestration pipelines that feed model evaluation, customer reporting, and operational dashboards
- Build and tune the metrics, dashboards, logging, and alarms across three layers: the inference service, the running instances, and the deployed models themselves