Arkhya Tech. Inc. is seeking a Senior HPC Engineer to be the single point of contact for design, deployment, maintenance, and optimization of on-prem Linux / High-Performance Computing (HPC) environments. This role involves system administration, software engineering, and collaboration to support bioinformatics pipelines and manage large-scale biological data.
Responsibilities:
- Pipeline Development & Automation: Support integration of existing pipelines on the new cluster and help design, develop, and maintain robust, automated bioinformatics pipelines for next-generation sequencing
- Software Development: Collaborate with software developers to create or modify custom bioinformatics software and tools, ensuring they are scalable and efficient in the cluster compute environment
- Database Management: Create, maintain, and manage databases to store, organize, and track large volumes of biological data and metadata, ensuring data integrity and security
- Troubleshooting and Support: Diagnose and resolve technical problems in system configuration, job schedulers, and scientific applications; train and support researchers in the use of tools and infrastructure
- Collaboration and Communication: Work effectively within interdisciplinary teams of scientists, bioinformaticians, and IT staff to determine data needs, gather requirements, and present analysis results and reports
- Innovation and Evaluation: Research and evaluate new technologies, tools, and algorithms to improve operations and analytical capabilities
Requirements:
- Proficiency in relevant languages such as Python, R, and Bash (Linux shell scripting) is essential
- Proven track record of managing data and infrastructure in High-Performance Computing (HPC) clusters is required
- Experience with workflow management systems (e.g., Nextflow, Snakemake, WDL) and containerization technologies (e.g., Docker, Singularity)
- Knowledge of and experience with version control tools such as Git/GitHub
- Strong analytical, problem-solving, and conceptual skills to address complex data challenges
- Knowledge of cloud computing environments is nice to have
- A solid understanding of molecular biology, genomics, and relevant biological domain knowledge (e.g., oncology, rare diseases) is often necessary for context-specific problem-solving