Machine Learning Engineer, Biological Sequence Design

Permanent contract
London
Salary: £60K to £75K
A few days at home

InstaDeep

The position

Job description

InstaDeep, founded in 2014, is a pioneering AI company at the forefront of innovation. With strategic offices in major cities worldwide, including London, Paris, Berlin, Tunis, Kigali, Cape Town, Boston, and San Francisco, InstaDeep collaborates with giants like Google DeepMind and prestigious educational institutions like MIT, Stanford, Oxford, UCL, and Imperial College London. We are a Google Cloud Partner and a select NVIDIA Elite Service Delivery Partner. We have been listed among notable players in AI, fast-growing companies, and Europe's 1000 fastest-growing companies in 2022 by Statista and the Financial Times. Our recent acquisition by BioNTech has further solidified our commitment to leading the industry.

Join us to be a part of the AI revolution!

InstaDeep is seeking talented Machine Learning Engineers to join our Research Team in London. Our team is working at the intersection of machine learning and biology to address diverse challenges in biological sequence design. 

Role Description:

As an ML Engineer in the Research Team, you will be responsible for developing and implementing software that applies promising research ideas in Bayesian optimization, active learning, large language models, representation learning, uncertainty quantification, and distribution shifts to the most relevant challenges in biological sequence design.

The ideal candidate will have a strong foundation in machine learning and software engineering; computational biology experience is a great bonus. As a Machine Learning Engineer, you will work closely with Research Scientists and Research Engineers to support our ambitious research infrastructure, playing a key role in the implementation and validation of machine learning models, along with data curation and library maintenance. If you are passionate about leveraging machine learning to solve complex biological problems and driving advancements in life sciences, we encourage you to apply and join our innovative team.

Responsibilities

  • Lead the engineering components of long-term research projects encompassing all stages of the project life-cycle. Responsibilities include data generation pipelines, database management, development and maintenance of codebases, as well as the design and execution of analysis pipelines and reporting mechanisms.
  • Collaborate closely with the Core ML and Engineering teams to integrate and optimise cutting-edge methodologies for distributing and scaling large ML models (a billion parameters or more).
  • Align with engineering leads across other critical projects to improve standardisation and methodological best practices across the company.
  • Develop and maintain robust, high-quality software solutions. Ensure code is modular, well-documented, and integrates smoothly with continuous integration systems.
  • Work in collaboration with Research Scientists, Engineers, and technical leads from various projects to uphold high coding standards and foster standardisation and methodological best practices across the Research Team.
  • Deploy machine learning models and associated processes across large-scale, distributed computing infrastructures, including CPUs, GPUs, and TPUs, utilising both in-house and cloud-based platforms.
  • Manage the efficient, reproducible, and performant handling of complex, multi-modal biological data. This includes optimising data generation, storage, and retrieval processes, particularly through SQL-based database management systems.
  • Actively contribute to the team's research initiatives, including publishing results and participating in open-source projects.
  • Report and present experimental results and research findings clearly and effectively, both internally and externally, verbally and in writing.

Requirements

  • Master's-level degree in Computational Science, Machine Learning or a related scientific field.
  • Experience using Deep Learning frameworks like PyTorch, TensorFlow and/or JAX.
  • Strong software engineering experience (Object-Oriented Programming, Unit Testing, Profiling, CI, Docker) via previous work or contributions to open-source projects.
  • Excellent communication skills and collaborative spirit.

Desirables

  • Experience in professional research teams, either in industry or through PhD/post-doctoral positions.
  • Computational biology experience, including biological data curation and management.
  • Experience in model-guided optimisation, applying computational modelling techniques to protein sequence design, fitness prediction and/or folding or binding tasks.
  • Published scientific papers in related domains such as ML or bioinformatics.

Our commitment to our people

We empower individuals to celebrate their uniqueness here at InstaDeep. Our team comes from all walks of life, and we’re proud to continue encouraging and supporting applicants from underrepresented groups across the globe. Our commitment to creating an authentic environment comes from our ability to learn and grow from our diversity, and how better to experience this than by joining our team?

We operate on a hybrid work model with guidance to work at the office at least 2 to 3 days per week to encourage close collaboration and innovation. We are continuing to review the situation with the well-being of InstaDeepers at the forefront of our minds.

Right to work: Please note that you will require the legal right to work in the location you are applying for.
