Revolgy is a leading multinational company providing digital transformation services through online collaboration tools and cloud infrastructure. They are seeking an AWS Cloud Engineer to architect the AWS foundation and middleware layer for a GenAI creation platform based on ComfyUI, focusing on building production-ready infrastructure and serverless middleware.
Responsibilities:
- Build the AWS Platform Foundation: You will design and automate production-ready infrastructure, including secure networking (VPC, private connectivity), identity (RBAC/IAM), and storage lifecycles
- Architect Elastic GPU Compute: You will implement the execution environment (likely EKS with GPU nodes), handling autoscaling, NVIDIA drivers, and safe scheduling defaults
- Develop Serverless Middleware: You will build the "glue" that makes ComfyUI run like a managed service, using AWS Lambda, SQS, Step Functions, and EventBridge to handle job ingestion, decoupled orchestration, and retries
- Create the ComfyUI Adapter: You will write the logic that translates between app inputs and the ComfyUI API, managing execution lifecycles, capturing logs, and handling output manifests
- Implement Real-Time Feedback: You will build the status and progress streaming architecture (WebSocket/SSE) that gives users visibility into their generation jobs
- Enforce Production Rigor: You will implement infrastructure-level guardrails (concurrency limits, cost tagging) and deliver fully automated CI/CD pipelines using IaC (CDK preferred)
Requirements:
- Deep AWS Expertise: You have mastered both serverless services (Lambda, Step Functions, EventBridge) and container platforms (EKS operations, autoscaling)
- ComfyUI & GenAI Fluency: You have demonstrated experience with ComfyUI (API, workflow JSON structure, custom nodes) or strong public evidence of extending open-source GenAI codebases. You understand the runtime behavior of generative models
- Strong Software Engineering: You are not just a config manager; you write clean, testable code for middleware and automation
- Production Mindset: You prioritize security (tenant scoping, secrets management), observability (CloudWatch/X-Ray), and cost-awareness in every architectural decision
- Delivery Automation: You are proficient in Infrastructure as Code (CDK or Terraform) and in building CI/CD for both infrastructure and container publishing