Hyphen Connect is seeking an AI Specialist Engineer to enhance the performance of large language and vision models for on-device inference. The role involves developing and deploying model compression and deployment pipelines so that models run efficiently across diverse hardware architectures.
Responsibilities:
- Compress and optimize large language and vision models for on-device inference
- Develop pipelines for model distillation and hardware-specific compilation
- Benchmark performance across various NPU/GPU architectures
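To give a flavor of the benchmarking work above, here is a minimal latency-measurement sketch. The `run_inference` callable is a hypothetical stand-in for a single forward pass on an NPU or GPU backend; a real harness would also track memory, throughput, and accuracy.

```python
import statistics
import time

def benchmark(run_inference, warmup=5, iters=50):
    """Measure per-call latency (ms) of an inference callable.

    run_inference: zero-argument callable wrapping one forward pass
    (hypothetical stand-in for an NPU/GPU model invocation).
    """
    for _ in range(warmup):
        run_inference()  # discard warm-up runs (cache/JIT effects)
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        run_inference()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
        "mean_ms": statistics.fmean(samples),
    }

# Dummy CPU workload standing in for a model call
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
```

Reporting tail latency (p95) alongside the median matters on edge devices, where thermal throttling and scheduler jitter can skew single-number averages.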
Requirements:
- Expertise in model distillation, pruning, and 4-bit/8-bit quantization techniques
- Hands-on experience with TensorRT, ONNX Runtime, and edge deployment
- Strong C++ and Python skills
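As an illustration of the quantization expertise listed above, the sketch below shows symmetric per-tensor int8 quantization in plain Python: each weight w is approximated as scale * q with q clamped to [-127, 127]. Production pipelines would instead use library tooling (e.g. ONNX Runtime or TensorRT quantization flows) and per-channel scales.

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map int8 values back to floats."""
    return [scale * v for v in q]

weights = [0.12, -0.5, 0.33, 0.9, -0.77]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Per-weight rounding error is bounded by scale / 2
max_err = max(abs(a - b) for a, b in zip(weights, restored))
```

The same scheme extends to 4-bit by clamping to [-7, 7], at the cost of a coarser scale and larger rounding error, which is why 4-bit flows usually pair it with per-group scales or quantization-aware fine-tuning.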