Pandas, Python, PyTorch, SQL, AI, Machine Learning, ML, NLP, Large Language Models, Data Engineering, Agile, Leadership, Project Management, Product Management, Communication, Problem Solving
About this role
Role Overview
Analyze and manipulate a large, highly connected biological knowledge graph built from multiple heterogeneous data sources, to identify data enrichment opportunities and strategies.
Work with data and knowledge engineering experts to design and develop knowledge enrichment approaches/strategies that can exploit data within our knowledge graph.
Provide solutions related to classification, clustering, more-like-this-type querying, discovery of high value implicit relationships, and making inferences across the data that can reveal novel insights.
Deliver robust, scalable and production-ready ML models, with a focus on optimizing performance and efficiency.
Architect and design ML solutions, from data collection and preparation, model selection, training, fine-tuning and evaluation, to deployment and monitoring.
Collaborate with your teammates from other functions such as product management, project management and science, and other engineering disciplines.
Occasionally provide technical leadership on Knowledge Enrichment projects that seek to use ML to enrich the data in BenchSci’s Knowledge Graph.
Work closely with other ML engineers to ensure alignment on technical solutioning and approaches.
Liaise closely with stakeholders from other functions including product and science.
Help ensure adoption of ML best practices and state-of-the-art ML approaches within your team(s).
Participate in various agile rituals and related practices.
Requirements
Minimum 3, ideally 5+ years of experience working as an ML engineer.
Some experience providing technical leadership on complex projects.
Degree, preferably PhD, in Software Engineering, Computer Science, or a similar area.
A proven track record of delivering complex ML projects working alongside high performing ML, data and software engineers using agile software development.
Demonstrable ML proficiency with a deep understanding of how to utilize state-of-the-art NLP and ML techniques.
Mastery of several ML frameworks and libraries, with the ability to architect complex ML systems from scratch. Extensive experience with Python and PyTorch.
Track record of contributing to the successful delivery of robust, scalable and production-ready ML models, with a focus on optimizing performance and efficiency.
Experience with the full ML development lifecycle from architecture and technical design, through data collection and preparation, model selection, training, fine-tuning and evaluation, to deployment and maintenance.
Familiarity with implementing solutions leveraging Large Language Models, and a deep understanding of how to implement solutions using Retrieval Augmented Generation (RAG) architectures, including both Graph RAG and Vector RAG.
Experience with graph machine learning (e.g. graph neural networks, graph data science) and practical applications thereof. Experience working with Knowledge Graphs, ideally biological, and familiarity with biological ontologies complement this.
Experience with complex problem solving and an eye for details such as scalability and performance of a potential solution.
Comprehensive knowledge of software engineering, programming fundamentals and industry experience using Python.
Experience with data manipulation and processing using tools such as SQL, Cypher or Pandas.
A can-do, proactive and assertive attitude. Your manager believes in freedom and responsibility and in helping you own what you do. You will excel if this environment suits you.
You have experience working in cross-functional teams with product managers, scientists, project managers, and engineers from other disciplines (e.g. data engineering).
Ideally, you have worked in the scientific/biological domain with scientists on your team.
Outstanding verbal and written communication skills. You can clearly explain complex technical concepts/systems to engineering peers and non-engineering stakeholders.
A growth mindset: you continuously seek to stay up to date with cutting-edge advances in ML/AI and actively engage with the ML/AI community.
Tech Stack
Pandas
Python
PyTorch
SQL
Benefits
A great compensation package that includes BenchSci equity options
A robust vacation policy plus an additional vacation day every year
Company closures for 14 more days throughout the year
Flex time for sick days, personal days, and religious holidays
Comprehensive health and dental benefits
Annual learning & development budget
A one-time home office set-up budget to use upon joining BenchSci
An annual lifestyle spending account allowance
Generous parental leave benefits with a top-up plan or paid time off options
The ability to save for your retirement coupled with a company match!