Director of MLOps | Sydney, Hybrid/Remote | MLOps, Infrastructure, AI, Computer Vision, Kubernetes
Are you ready to lead a cutting-edge engineering team at a reputable, impactful, tech-first organization operating at the intersection of artificial intelligence and cloud-native engineering?
With proprietary technology, AI-driven analysis pipelines, and robust SaaS offerings, they are seeking a deeply experienced technical and people leader to guide their AI & Infrastructure Engineering squad and manage large-scale, cloud-based infrastructure.
This team connects cloud operations with applied Computer Vision and AI efforts, ensuring workflows are reliable, observable, and fast, whether scaling to thousands of compute nodes or integrating the latest generative AI technologies.
Director of MLOps Responsibilities
- Spearhead design and optimization of highly parallelized, distributed processing systems for large-scale AI workloads — handling petabytes of data and dynamic resource provisioning.
- Lead deployment and monitoring of real-time machine learning models at scale using modern orchestration platforms.
- Build developer-centric tooling and frameworks to help ML Engineers and Data Scientists rapidly prototype, train, and deploy models.
- Drive adoption and integration of emerging technologies, including large language models, into core platform functionality.
- Collaborate across DevOps, Cloud Engineering, and AI teams to ensure seamless alignment of priorities, infrastructure reliability, and delivery timelines.
- Manage and mentor a global team of engineers and ML infrastructure specialists across multiple regions.
Director of MLOps Requirements
- Cloud & Orchestration: AWS, EKS, Karpenter, Terraform, ArgoCD
- AI Infrastructure: Ray, Kubeflow, Weights & Biases (WandB), Kafka (via Confluent)
- Workflows & Pipelines: DAG-based batch systems, streaming pipelines, real-time inference architectures
- Languages: Go, Python, Bash, and other backend-focused languages
- Experience as a backend infrastructure or platform engineer scaling complex systems in cloud-native environments.
- Strong technical leadership skills, with a track record of empowering AI teams by solving hard engineering challenges.
- Experience with large-scale distributed computing, observability, and MLOps principles.
- Ability to operate reliably in high-stakes environments demanding speed and precision.
- Passion for bridging data science and cloud engineering.
If interested, please apply with your most up-to-date CV, or reach out directly.