FUJIFILM America, a leading company in the biotechnology sector, is seeking a Principal Data/AI Engineer to drive the technical strategy and architecture of its data and AI platforms. The role involves planning, designing, and building data pipelines, advocating for engineering best practices, and mentoring junior engineers.
Responsibilities:
- Architect, build, and maintain highly scalable batch and streaming pipelines on the Snowflake Data Platform (Snowpipe, Tasks, Streams, Dynamic Tables, Snowpark, Iceberg)
- Architect and deliver ML/GenAI solutions using managed cloud services (AWS, Azure, Snowflake Cortex)
- Implement modern data modeling and architecture patterns; establish and enforce standards for data quality (tests, expectations, SLAs/SLOs), observability (metrics, logs, traces), and lineage
- Ensure integration of biotech systems (MES, LIMS, SCADA, ERP, QMS) into centralized data platform
- Collaborate with product managers, product engineers, platform architects, and business stakeholders to align data and AI engineering solutions with business requirements
- Enable modern AI use cases: feature stores, vector search/RAG, model serving, safety/guardrails, and continuous monitoring for drift, bias, and performance
- Optimize storage tiers, compute clusters/warehouses, caching, and workload orchestration for latency and throughput
- Partner with cybersecurity and compliance teams to ensure adherence to GxP, FDA 21 CFR Part 11, and data privacy regulations
- Lead design reviews, incident postmortems, and cross-team architecture forums
- Stay current with emerging technologies (data mesh, real-time streaming, digital twins, generative AI platforms) and introduce relevant innovations
- Perform other job duties as assigned
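As one concrete illustration of the data-quality standards named above (tests, expectations, SLAs/SLOs), a minimal expectations check could look like the following. This is a plain-Python sketch, not tied to any specific framework the team uses; the column names ("batch_id", "ph") and bounds are hypothetical examples:

```python
# Minimal data-quality "expectations" sketch: each expectation is a named
# predicate applied to every record; failing records are collected per
# expectation so they can be reported or surfaced to observability tooling.
# Column names ("batch_id", "ph") and the pH bounds are hypothetical.

def run_expectations(records, expectations):
    """Return a dict mapping expectation name -> list of failing records."""
    failures = {name: [] for name in expectations}
    for record in records:
        for name, predicate in expectations.items():
            if not predicate(record):
                failures[name].append(record)
    return failures

expectations = {
    "batch_id_present": lambda r: bool(r.get("batch_id")),
    "ph_in_range": lambda r: 0.0 <= r.get("ph", -1.0) <= 14.0,
}

records = [
    {"batch_id": "B-001", "ph": 7.2},
    {"batch_id": "", "ph": 6.8},      # fails batch_id_present
    {"batch_id": "B-003", "ph": 15.1},  # fails ph_in_range
]

failures = run_expectations(records, expectations)
```

In practice these checks would typically run inside the pipeline itself (e.g., as dbt tests or orchestrated validation tasks) rather than as standalone code, with failure counts feeding the SLO metrics mentioned above.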
Requirements:
- Bachelor's degree in Computer Science, Data Engineering, AI/ML Engineering, or related field
- 12+ years of professional experience in data/software engineering, AI/ML engineering, or cloud platform engineering
- Strong proficiency in Python and SQL/dbt for data processing, automation, and analytics
- Extensive experience designing, building, and maintaining scalable batch and streaming data pipelines using modern frameworks (e.g., Airflow, dbt)
- Proven experience with data modeling for analytics and AI use cases
- Expertise in designing and developing data solutions on Snowflake, including data modeling, performance optimization, and cost-efficient usage
- Experience with modern AI technologies, including LLMs, embeddings, and vector databases
- Proven track record of delivering production-grade, cloud-based solutions (AWS, Azure)
- Containerization and deployment of data and AI workloads using Docker
- Orchestration and operation of containerized workloads using Kubernetes
- Data quality management, observability, lineage, and governance
- Knowledge of biotech IT/OT systems (MES, LIMS, SCADA) and regulatory compliance frameworks (GxP, FDA, EMA, data privacy)
- Strong problem-solving, optimization, and troubleshooting skills for large-scale data systems
- Effective communication with both technical and non-technical stakeholders, influencing at senior levels
- Passion for emerging technologies, continuous improvement, and building innovative engineering cultures
- Advanced degree (MS/PhD) preferred
- Relevant industry certifications (e.g., Snowflake, AWS, Azure) preferred