Architect Labs is a frontier AI lab for chip design, focused on building AI models and tools for custom ASICs. They are seeking a Senior Member of Technical Staff to own the compiler stack for their SIMD/VLIW NPU, collaborating closely with the NPU architect to optimize both hardware and software components.

Responsibilities:

Own the compiler end-to-end: graph ingestion (ONNX, PyTorch) through IR optimization, AI-driven code generation, instruction scheduling, and register allocation for a SIMD/VLIW NPU
Implement and own the memory management layer, for instance a software-managed on-chip scratchpad in which the compiler handles data tiling, bank allocation, DMA scheduling, and double-buffering across SRAM banks
Design and iterate on mid-end and backend optimization passes: operator fusion, loop transformations, vectorization, and software pipelining to close the gap between peak and achieved throughput
Co-design the ISA and instruction encoding with the architect and silicon team. Feed real workload performance data back into architectural decisions
Support quantization and mixed-precision lowering (32-bit single-precision FP or INT, along with lower-precision INT8/INT4, BF16, and FP16/FP8/FP4) with correct numerics end-to-end
Benchmark compiler output against cycle-accurate models, RTL simulation, and FPGA prototypes. Own QoR tracking
Grow into a compiler team lead as the team scales

Member of Technical Staff - Compilers
