Archetype AI is developing an AI platform that transforms real-world sensor data into actionable insights. The Staff Software Engineer will own data processing and analysis across edge devices, building high-performance data pipelines and ensuring reliable software operation in constrained environments.
Responsibilities:
- Analyze raw data using Python for statistical analysis, visualization, and exploratory techniques to understand quality, patterns, and anomalies
- Prepare datasets for AI workflows: cleaning, normalization, imputation, filtering, resampling, and validation
- Execute iterative preprocessing cycles: refine transformations, evaluate results, compare against baselines, retain improvements
- Build tooling for data validation, quality monitoring, and automated preprocessing
- Generate clear reports and visualizations that communicate findings to technical and non-technical stakeholders
- Build and optimize data processing software in C++ that runs on small, resource-constrained Linux devices
- Ensure pipelines meet real-time performance requirements: low latency, bounded memory, reliable throughput
- Integrate sensor inputs and manage data flow on-device: ingestion, buffering, local processing, and transmission
- Work within device constraints: limited CPU, memory, storage, and intermittent connectivity
- Contribute to device deployment, configuration, and operational tooling
- Partner with Solutions Engineers to assess customer data assets and deployment requirements
- Translate customer data challenges into reusable pipeline components and analysis workflows
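To make the dataset-preparation responsibilities above concrete, here is a minimal sketch of one preprocessing pass over a sensor time series — cleaning, resampling, imputation, filtering, normalization, and validation — using Pandas. All names, rates, and thresholds are illustrative assumptions, not Archetype AI's actual pipeline:

```python
import numpy as np
import pandas as pd

def preprocess(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean, resample, impute, and normalize a sensor time series.

    Expects a DataFrame indexed by timestamp with a numeric 'value'
    column; column name and 10 Hz target rate are illustrative.
    """
    df = raw.sort_index()
    df = df[~df.index.duplicated(keep="first")]      # cleaning: drop duplicate samples
    df = df.resample("100ms").mean()                 # resample to a fixed 10 Hz grid
    df["value"] = df["value"].interpolate(limit=5)   # imputation: fill short gaps only
    df = df.dropna()                                 # filtering: discard unfillable gaps
    mu, sigma = df["value"].mean(), df["value"].std()
    df["value_norm"] = (df["value"] - mu) / sigma    # normalization: z-score
    # validation: output must be monotonic in time with no missing values
    assert df.index.is_monotonic_increasing and not df["value_norm"].isna().any()
    return df

# usage: synthetic sensor readings at an irregular ~27 Hz rate
idx = pd.date_range("2024-01-01", periods=200, freq="37ms")
raw = pd.DataFrame({"value": np.sin(np.linspace(0, 6, 200))}, index=idx)
clean = preprocess(raw)
```

In an iterative preprocessing cycle, a function like this is the unit being refined: each candidate transformation is evaluated against a baseline and retained only if it improves downstream results.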
Requirements:
- 7+ years in data engineering, data analysis, or related technical roles with hands-on data processing focus
- Deep experience with time-series data (video a plus): ingestion, preprocessing, feature extraction, quality assessment
- Proven ability to apply diverse analytical techniques: statistical analysis, signal processing, visualization, anomaly detection
- Experience with iterative data workflows: hypothesis, transformation, evaluation, refinement
- Comfortable building and running software on Linux devices, with familiarity in system-level concerns (resource usage, process management, I/O)
- Experience with real-time or streaming data processing under latency and throughput constraints
- Familiarity with data preparation for ML: dataset formatting, labeling workflows, train/eval splits, data validation
- C++ (production development): Strong proficiency building data pipelines and device software for production. Experience with modern C++, memory management, multithreading, and performance optimization
- Python (analysis & prototyping): Strong proficiency for data exploration, statistical analysis, visualization, and rapid prototyping. Experience with NumPy, Pandas, Matplotlib, and Jupyter notebooks
- Proven expertise in Linux system architecture and performance, including process design, I/O strategies, and diagnosing complex production issues
- Debugging & profiling: Strong skills diagnosing performance issues, memory problems, and data pipeline failures in both C++ and Python
- Clear, structured written communication, including customer-facing documentation of findings, processes, and technical decisions
- Proven ability to present complex analytical and technical results directly to customers, translating them into concrete, actionable insights for technical teams and business stakeholders
Nice to Have:
- Background in signal processing, control systems, or physics-based data analysis
- Experience with embedding-space analysis or other AI/ML diagnostic techniques
- Prior work optimizing data pipelines for resource-constrained environments
- Background in solutions engineering or customer-facing technical work
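As a concrete illustration of the statistical anomaly-detection skills listed above (an example technique only, not the company's actual tooling), a trailing rolling z-score check can be sketched in a few lines of Pandas:

```python
import numpy as np
import pandas as pd

def rolling_zscore_anomalies(series: pd.Series,
                             window: int = 50,
                             threshold: float = 4.0) -> pd.Series:
    """Flag samples deviating from the trailing rolling mean by more
    than `threshold` rolling standard deviations.

    The window is shifted by one so the current sample does not
    contaminate its own baseline. Window size and threshold are
    illustrative defaults.
    """
    baseline = series.shift(1)
    mean = baseline.rolling(window).mean()
    std = baseline.rolling(window).std()
    z = (series - mean) / std
    return z.abs() > threshold   # boolean mask; NaN warm-up rows compare False

# usage: seeded noise with one injected outlier at index 400
rng = np.random.default_rng(0)
signal = pd.Series(rng.normal(0.0, 0.1, 500))
signal.iloc[400] += 2.0          # inject an obvious spike
mask = rolling_zscore_anomalies(signal)
```

Excluding the current sample from its own baseline is a deliberate choice: otherwise a large spike inflates the rolling standard deviation it is measured against and can mask itself.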