Lead Machine Learning Engineer

ThoughtWorks

Singapore

On-site

SGD 100,000 - 125,000

Full time

Today

Job summary

A global technology consultancy in Singapore is looking for a Technical Lead in model optimization. In this role, you will lead the design and implementation of advanced optimization pipelines while mentoring engineering teams. Candidates should have deep expertise in model and runtime optimization techniques, as well as strong proficiency in deep learning frameworks. This position offers meaningful growth and career development opportunities.

Benefits

Interactive career development tools
Numerous development programs
Supportive team culture

Qualifications

  • Deep practical expertise in advanced model optimization techniques.
  • Proven experience optimizing inference workloads in production.
  • Ability to diagnose and optimize performance using various tools.

Responsibilities

  • Lead design and implementation of model optimization pipelines.
  • Guide teams in high-throughput serving strategies.
  • Develop benchmarks and performance dashboards for system efficiency.

Skills

Model and runtime optimization techniques
Deep learning frameworks (PyTorch, TensorFlow)
Profiling tools (Nsight, PyTorch/TensorFlow profilers)
Communication and stakeholder engagement skills
Experience designing scalable inference systems

Tools

vLLM
NVIDIA Triton
Dynamo

Job description
Job responsibilities
  • Lead the design and implementation of advanced model optimization pipelines, including quantization, pruning, and distillation.
  • Architect and tune inference runtimes and serving frameworks to achieve optimal performance across deployments.
  • Guide teams in implementing high-throughput serving strategies (continuous batching, KV caching, speculative decoding, asynchronous scheduling).
  • Develop benchmarks and performance dashboards to measure and communicate system-level efficiency improvements (throughput, latency, GPU utilization, cost).
  • Evaluate trade-offs across accuracy, performance, and cost, and design architectures to meet target SLAs across varied hardware environments (cloud, on‑prem, edge).
  • Collaborate with infrastructure, MLOps, and product teams to embed inference optimization into production workflows and platform designs.
  • Provide technical leadership and mentorship to engineers, fostering a culture of experimentation, rigor, and continuous performance improvement.
  • Contribute to the development of internal frameworks, reference architectures, and playbooks for scalable and cost‑efficient inference.
  • Engage with clients to translate optimization outcomes into business value and articulate the ROI of technical improvements.
Job qualifications
Technical Skills
  • Deep practical expertise in model and runtime optimization techniques (quantization, pruning, distillation, batching, caching).
  • Proven experience optimizing inference workloads using frameworks such as vLLM, NVIDIA Triton / Dynamo.
  • Strong proficiency in deep learning frameworks (PyTorch, TensorFlow) with production deployment experience.
  • Ability to diagnose and optimize performance using profiling tools (Nsight, PyTorch / TensorFlow profilers).
  • Solid understanding of GPU and accelerator architectures, and experience tuning workloads for cost and performance efficiency.
  • Experience designing and benchmarking scalable inference systems across heterogeneous environments (GPU clusters, serverless, edge).
  • Familiarity with observability stacks, telemetry, and cost instrumentation for AI workloads.
Professional Skills
  • Demonstrated ability to lead small‑to‑medium engineering teams or technical workstreams.
  • Skilled at balancing hands‑on delivery with architectural oversight and mentorship.
  • Strong communication and stakeholder engagement skills, with the ability to connect low‑level optimizations with business impact.
  • Comfortable in ambiguous and fast‑evolving technology landscapes, with a passion for applied innovation.
  • Commitment to continuous learning and knowledge sharing across teams and communities.
Other things to know

Learning & Development: There is no one‑size‑fits‑all career path at Thoughtworks: however you want to develop your career is entirely up to you. We also balance that autonomy with the strength of our cultivation culture. This means your career is supported by interactive tools, numerous development programs, and teammates who want to help you grow. We see value in helping each other be our best, and that extends to empowering our employees in their career journeys.

Job Details

Country: Singapore

City: Singapore

Date Posted: 10-30-2025

Industry: Information Technology

Employment Type: Regular
