Design, develop, and optimize new features and algorithms for oneDNN targeting Intel processors, Intel Processor Graphics, and Intel discrete GPUs.
Perform performance analysis and optimization to achieve best‑in‑class deep‑learning inference and training throughput on current and next‑generation Intel platforms.
Develop hardware‑specific parallel algorithms, including multithreading, vectorization, and memory‑layout optimizations.
Contribute to assembly‑level programming and low-level performance tuning for Intel microarchitectures.
Collaborate with cross‑functional teams across software engineering, architecture, and AI performance to ensure strong integration with Intel’s broader AI ecosystem.
Engage with the open‑source community, participate in code reviews, and maintain high-quality coding and documentation standards.

Master's or PhD in Mathematics, Physics, Computer Science, or a related field
5+ years of experience in the following areas: C++ algorithms and data structures, or a mathematical background
Low-level performance optimizations, preferably on GPUs
3+ years of high-performance computing (HPC) application development (preferred)
1+ year of machine learning and deep learning algorithms (preferred)
1+ year in an Agile software development environment (preferred)
1+ year with Intel development tools (preferred)
Software library design and architecture (preferred)
Background in linear algebra solvers, matrix-vector operations, or Fast Fourier Transforms (preferred)
Software development on Linux (preferred)
GPU optimizations (OpenCL, CUDA, SYCL/DPC++, C for Metal or similar) (preferred)
Parallel programming (OpenMP, TBB, or MPI) (preferred)

Senior AI Algorithm Engineer

Key skills