At Joko, we believe that today’s online shopping experience is fundamentally flawed, and we are putting a lot of effort into disrupting it. We are crafting a new experience that enables users to find their desired products in the smoothest way possible, to effortlessly compare all their characteristics, and to obtain transparent and clear information on both their price and environmental cost.
To achieve our goal, we are building the world’s largest product catalog, a universal catalog composed of all the products sold by all e-commerce sites in the world. For this, we need to understand any web page in order to extract structured information from it, and to clean and standardize information from multiple sources in near real-time. We have developed LLM-based approaches to address all these challenges. One of the major challenges is to scale these approaches to colossal volumes of data (we have to process hundreds of millions of products several times a day). We have developed state-of-the-art approaches that rely on fine-tuning relatively small LLMs, but a lot of research is still needed to optimize their performance and resource efficiency, and/or to find more efficient approaches.
We are also developing an AI copilot that helps users find the right products in our gigantic product catalog. Developing this conversational experience is a huge challenge that goes beyond traditional RAG systems: it requires a deep understanding of search engines (to combine traditional full-text search and vector search on huge volumes of data), and a mastery of LLMs to deliver a reliable experience with low latency and controlled costs. We are constantly iterating on this product, and it raises many research problems: improving search accuracy, reducing latency, and better capturing the user’s intent.
Joko has been offering research internships in Machine Learning for several years. All our internships are closely tied to our engineering teams to maximize their tangible impact. Almost all previous interns joined Joko in full-time positions after their internship.
As a Machine Learning Research Intern, you will work on one of the following research subjects:
Improve the performance and the scalability of our LLM-based data processing pipeline for our universal product catalog. For this project, it will be necessary to explore fine-tuning LLMs for specialized tasks as well as various techniques aimed at reducing model size (such as quantization, pruning, or distillation). Rigorous evaluation of model performance (notably using LLMs as judges) will represent one of the challenges.
Improve the search performance in our universal product catalog. For this project, it will be necessary to benchmark the performance of different search techniques, combining full-text search and vector search, and to identify the most effective LLM-based embedding methods.
Improve the performance and the latency of our AI copilot. For this project, it will be necessary to benchmark numerous models, explore their fine-tuning, work on reducing their size, as well as work on ML Ops topics to ensure the best possible latency in a production context. Here again, rigorous evaluation of model performance will represent an important challenge.
Exploration will be an important part of the internship, through experiments, literature reviews, and theoretical developments. You will have full ownership of your projects and the freedom to steer the research direction of your internship based on your results and on what you consider most promising among the directions we have identified. Your goal will be to deploy your work in production and monitor its impact on hundreds of thousands of users.
Your responsibilities:
Research: You will work on all steps of the research process – you will formalize the objectives of your work, conduct literature reviews to have a deep understanding of the problems, design new algorithms, analyze them both theoretically and experimentally, and collect and transform relevant data for your experiments.
Exploration & ownership: You will participate in orienting the internship towards research directions you deem valuable to our users.
Implementation, deployment & monitoring in production: With the help of the engineering team, you will be responsible for integrating the most scalable and robust algorithms you have developed into our product. Finally, you will monitor their impact on our users.
Your profile:
Problem solver: You have strong analytical skills, you are creative, and you love solving complex problems.
Fast learner: You are comfortable in any technical environment and are able to quickly learn new technologies and new practices.
Programming skills: You have experience writing code and are eager to keep improving. You have experience with Python and Python Machine Learning frameworks.
Attention to detail: You know that the devil is in the details, and you have a talent for spotting flaws when they exist.
Tech savvy: You are constantly looking at emerging technologies and you keep a close eye on the latest trends in the domain.
Efficiency: You are fond of productivity tools and able to deliver on time on projects with many stakeholders.
Mindset: You have an entrepreneurial mindset, you like challenges, you welcome feedback and you are willing to get better every day to reach excellence.
Communication: You have strong written and verbal communication skills, and you are able to explain something complex with simple words.
Languages: You are fluent in both written and spoken English, as we are expanding internationally soon. Fluency in French is not required.
Education: You are a graduate student in a Machine Learning master's program.
Experience: You have experience with research, from past internships or projects.
Recruitment process:
15-min call with the Hiring Manager
45-min personality interview with two team members
Live tests with team members
45-min Founders interview
Reference calls
You might also be invited to meet other team members at the office for a coffee or a drink!