Beatdapp is a company delivering the most advanced streaming integrity technology in the world. They are seeking an ML Engineer II to work on machine learning inference systems for audio at scale, focusing on bridging the gap between raw audio and detection models. The role involves engineering and optimizing inference containers, building cloud infrastructure, and maintaining data pipelines.

Responsibilities:

Build, tune, and ship our inference containers: the Dockerfile and dependencies, image size and cold starts, GPU access patterns, the multi-cloud orchestration that runs them (ECS, Cloud Run, GKE, EKS), test coverage for the container surface, and the storage abstraction they depend on
Squeeze more out of each GPU instance: concurrency tuning, VRAM accounting, request timeouts and queueing, rate limiting, multi-GPU distribution on instances that have more than one, and the right-sizing decisions that follow
Build and run scale and stress scenarios across mock deployments that mirror real customer environments
Characterize the latency-vs-throughput curves, find the breaking points, and turn the results into autoscaling and instance-sizing decisions
Operate the Terraform stack across multiple clouds (GCP, AWS): networking, identity, GPU nodes, autoscaling, and per-tenant account configurations
Build and extend the customer-facing API layer that fronts the inference service: client authentication, rate limiting, per-client data isolation, and request metering
Maintain and extend the data orchestration pipelines that feed model evaluation, customer reporting, and operational dashboards
Build and tune the metrics, dashboards, logging, and alarms across three layers: the inference service, the running instances, and the deployed models themselves

ML Engineer II (Inference Platform)
