We are seeking a skilled Machine Learning Engineer to join our team. The ideal candidate will contribute to the development, optimization, and maintenance of the codebase we use to train large generative neural network models. This role requires a strong background in machine learning and software development, and the ability to work collaboratively in a research-focused environment.
The Swiss AI Initiative is a collaborative research project led by ETH Zurich and EPFL, focused on developing responsible and transparent generative AI. A significant project of this initiative is the Large Language Model (LLM) effort, which aims to create state-of-the-art language models at various scales, including an ambitious 70B parameter model.
This work leverages the Alps supercomputer at the Swiss National Supercomputing Centre (CSCS), which features over 10,000 NVIDIA Grace Hopper GPUs, making it one of the most powerful AI-focused computing resources in Europe. The Swiss AI Initiative plans to distribute 15-20 million GPU hours annually to support various research and development projects in AI.
As a machine learning engineer in this project, you will contribute to the development and optimization of the training pipelines for these large-scale models, working at the intersection of cutting-edge research and high-performance computing to advance Switzerland's position in AI innovation.
In this role, you will be responsible for developing and maintaining software for training large-scale neural networks, such as large language models. You will work closely with researchers and other engineers to design and implement scalable solutions for model training, evaluation, and deployment. A key part of the role is optimizing existing machine learning frameworks to improve performance and efficiency.
To excel in this position, you will need to stay up to date with the latest advancements in AI and machine learning. You will actively participate in code reviews and maintain comprehensive documentation to ensure code quality and reproducibility. You may also be expected to contribute to research papers and technical reports, helping to share our team's technical achievements and research findings with the broader scientific community.