EOI Space is developing and deploying a network of satellites in Very Low Earth Orbit (VLEO) to provide ultra-high-resolution Earth imagery. They are seeking a Staff Software Engineer to develop systems and applications for managing an edge-device-based High-Performance Computing (HPC) cluster onboard the satellite, with a focus on building robust infrastructure for on-orbit processing of image data.
Responsibilities:
- Coordinate, code, and lead implementation of cluster management and workload coordination systems and tools to manage the HPC cluster
- Implement and test highly optimized GPU-aware containerized image processing workloads on the cluster
- Work hand in hand with Security, Export Compliance, and DevOps/Platform engineering to ensure we can deploy and maintain test, qualification, and flight-level updates
- Operate in a lean startup environment, maintaining a laser focus on the balance between what we need today and the things we are excited to add and enhance tomorrow
- Champion our development team principles:
  - Clear and explicit communication, within the team and between teams across the company
  - Develop iteratively, integrate early and often, and coordinate functional cross-team demos of the iterative releases
  - End-to-end observable and traceable systems
- Plan, define, prioritize, and track design and development activities to meet milestones and support inter-team dependencies
- Work with other team leads to make sure basic “steel thread” system capabilities are established early and continuously refined and matured
- Develop and deploy both image processing workload and cluster orchestration software to bench, rack, and flight versions of the payload processing system
- Support integration within the payload subsystem (which contains multiple compute elements), as well as integration with bus flight software, wideband RF communications systems, and ground and space-based image processing pipelines
- Support testing and qualification campaigns as well as on-orbit updates
- Optimize for space flight by selecting lightweight but modern open-source (OSS) frameworks and tools and applying them in a bandwidth-conscious way to support on-orbit updates at any level of the system (BSP, OS, and applications)