We are seeking a highly skilled Machine Learning Operations Leader to join our team. As an MLOps Team Lead, you will be responsible for leading a group of experienced MLOps Engineers in developing robust pipelines and cutting-edge Generative AI solutions.
Responsibilities
- Manage and develop a team of MLOps Engineers, focusing on growth and execution excellence.
- Lead and oversee project deliveries from inception to completion.
- Develop MLOps pipelines leveraging Amazon SageMaker and its features.
- Create Generative AI solutions and Proof-of-Concepts using the latest architectures and technologies.
- Deliver end-to-end ML products, including model development and performance tuning, training, validation, testing, and version control.
- Provision AWS resources and infrastructure for ML workloads.
- Develop ML-oriented CI/CD pipelines using GitHub Actions or similar tools.
- Deploy Machine Learning models in production.
- Help customers tackle challenges at scale using distributed training frameworks.
- Use and write Terraform modules for infrastructure deployment.
- Maintain infrastructure and environments of all types, from development to production.
- Perform security monitoring and administration.
Requirements
- Proven leadership experience with a track record of managing technical teams.
- Excellent customer-facing skills to understand and address client needs effectively.
- Ability to design and implement cloud solutions and build MLOps pipelines on AWS.
- Experience with one or more MLOps frameworks or platforms, such as Kubeflow, MLflow, DataRobot, or Airflow.
- Fluency in Python, a good understanding of Linux, and knowledge of frameworks such as scikit-learn, Keras, PyTorch, and TensorFlow.
- Ability to understand tools used by data scientists and experience with software development and test automation.
- Experience with large language models and one or more related frameworks or services, such as the OpenAI SDK, Amazon Bedrock, LangChain, and LlamaIndex.
- Experience with Docker and Kubernetes.
- Fluent written and verbal communication skills in English.
Nice to Have
- Working knowledge of vector databases such as OpenSearch, Qdrant, Weaviate, or LanceDB.
- Working knowledge of Snowflake, BigQuery, and/or Databricks.
- GCP or Azure knowledge (DevOps / MLOps).
- ML certification (AWS ML Specialty or similar).
Benefits
- Professional training and certifications covered by the company (AWS, FinOps, Kubernetes, etc.).
- International work environment.
- Referral program: refer people you would enjoy working with and earn a bonus.
- Company events and social gatherings (happy hours, team events, knowledge sharing, etc.).
- Wellbeing and professional coaching through Oliva Health.