Senior Machine Learning / Reinforcement Learning Engineer

SLEEK R&D AND MANAGEMENT (PTE. LTD.)

Singapore

On-site

SGD 80,000 - 120,000

Full time

Yesterday

Job summary

A leading technology firm in Singapore is seeking a Machine Learning / Reinforcement Learning Engineer to design, build, and scale next-generation ML/RL systems. The role involves hands-on work to improve system efficiency and reliability under real-world constraints. Candidates should have at least 5 years of experience building and shipping applied ML systems in production using Python and PyTorch. If you are passionate about innovative solutions and customer-focused technologies, this position is for you. Competitive compensation and a collaborative environment await.

Qualifications

  • 5+ years building, training, and shipping ML systems.
  • Experience with distillation, quantization, or fine-tuning.
  • Solid RL fundamentals and inference-time optimisation.

Responsibilities

  • Design and build next-generation ML/RL systems.
  • Deliver production-ready ML/RL systems with measurable improvements.
  • Implement test-time RL and multi-step workflows.

Skills

Applied ML in Production
Efficient Model Training
Reinforcement Learning
Agentic Systems
ML/RL Operational Excellence

Tools

Python
PyTorch

Job description

Through proprietary software and AI, along with a focus on customer delight, Sleek makes the back-office easy for micro SMEs.

We give Entrepreneurs time back to focus on what they love doing: growing their business and being with customers. With a surging number of Entrepreneurs globally, we are innovating in a highly lucrative space.

We operate 3 business segments:

Corporate Secretary: Automating the company incorporation, secretarial, filing, Nominee Director, mailroom and immigration processes via custom online robots and SleekSign. We are the market leaders in Singapore with ~5% market share of all new business incorporations.

Accounting & Bookkeeping: Redefining what it means to do Accounting, Bookkeeping, Tax and Payroll thanks to our proprietary SleekBooks ledger, AI tools and exceptional customer service.

FinTech payments: Overcoming a key challenge for Entrepreneurs by offering digital banking services to new businesses.

Sleek launched in 2017 and now has around 15,000 customers across our offices in Singapore, Hong Kong, Australia and the UK. We have around 500 staff with an intact startup mindset.

We have recently raised Series B financing off the back of >70% compound annual growth in Revenue over the last 5 years. Sleek has been recognised by The Financial Times, The Straits Times, Forbes and LinkedIn as one of the fastest growing companies in Asia.

Backed by world‑class investors, we are on track to be one of the few cash flow positive, tech‑enabled unicorns based out of Singapore.

At Sleek, we are on a mission to streamline operations and elevate customer experience through intelligent automation powered by efficient, reliable, and production‑grade ML/RL systems.

We are seeking a Machine Learning / Reinforcement Learning Engineer (Applied) who will be a key individual contributor responsible for designing, building, and scaling next-generation ML/RL systems that operate under real‑world business constraints.

As one of Sleek’s senior applied ML/RL contributors, you will partner closely with Product, Engineering, and AI teams to translate ambiguous business problems into measurable ML/RL outcomes. You will own systems end‑to‑end — from model optimisation and evaluation through deployment and post‑production monitoring — ensuring that ML/RL capabilities are efficient, controllable, observable, and dependable in production.

You will play a central role in moving beyond generic, large‑model approaches, replacing or augmenting them with small, domain‑specific models, test‑time reinforcement learning, and agentic systems that deliver clear improvements in quality, latency, cost, and reliability. Your work will directly shape how ML/RL is deployed across Sleek’s products and internal operations.

You Will Ensure
  • Efficient, production‑ready ML/RL systems that make explicit, data‑driven trade‑offs between quality, latency, throughput, and cost.

  • Robust optimisation and evaluation practices, including benchmarks, regression testing, and production monitoring, to ensure sustained performance over time.

  • Reliable test‑time reinforcement learning and agentic workflows, with guardrails, fallbacks, and observability to manage risk and instability.

  • Pragmatic integration of ML/RL into real systems, designed for scalability, maintainability, and operational excellence rather than experimentation alone.

  • Clear technical communication and cross‑team alignment, enabling predictable delivery and informed decision‑making.

  • A high bar for engineering discipline, including reproducibility, monitoring, documentation, and continuous improvement.

Key outcomes in the first 6‑12 months

Ship High‑Impact ML/RL Systems
  • Deliver production‑grade ML/RL systems that create measurable improvements in quality, latency, cost, or reliability.

  • Replace or augment baseline approaches with small, domain‑specific models where they provide superior performance‑to‑cost trade‑offs.

  • Define and track clear success metrics and benchmarks for all deployed systems.

Establish Efficient Model Training & Serving (SMOL)
  • Build and operate efficient training and serving pipelines using distillation, quantization, and parameter‑efficient fine‑tuning.

  • Maintain benchmark suites covering quality, latency, throughput, memory, and cost.

  • Drive explicit, data‑backed trade‑offs in model and deployment decisions.

Deploy Test‑Time RL & Optimisation
  • Implement test‑time optimisation (TTRL / TPO) to improve generative or agentic outputs within strict latency and cost budgets.

  • Introduce reward‑guided decoding or reranking with measurable gains.

  • Add monitoring, guardrails, and fallback strategies to manage instability and regressions.
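
For readers less familiar with the terms, reward‑guided reranking under a latency budget with a fallback can be illustrated roughly as follows. This is a minimal sketch only, not a description of Sleek's systems: the function name `rerank_best_of_n`, the `reward_fn` scorer, and the `budget_s` and `fallback` parameters are all hypothetical, and a real system would score model outputs with a learned reward model rather than a toy function.

```python
import time
from typing import Callable, Iterable, Optional, Tuple


def rerank_best_of_n(
    candidates: Iterable[str],
    reward_fn: Callable[[str], float],
    budget_s: float,
    fallback: str,
) -> Tuple[str, Optional[float]]:
    """Return the highest-reward candidate scored before the budget runs out.

    All names here are illustrative assumptions. If no candidate is scored
    in time, the fallback is returned with a score of None.
    """
    deadline = time.monotonic() + budget_s
    best, best_score = fallback, None
    for cand in candidates:
        if time.monotonic() >= deadline:
            break  # budget exhausted: keep the best candidate seen so far
        score = reward_fn(cand)
        if best_score is None or score > best_score:
            best, best_score = cand, score
    return best, best_score


if __name__ == "__main__":
    # Toy reward: prefer longer answers (a stand-in for a reward model).
    drafts = ["ok", "a fuller answer", "short"]
    print(rerank_best_of_n(drafts, reward_fn=len, budget_s=0.5, fallback="n/a"))
```

The budget is checked between candidates, so a single slow `reward_fn` call can still overrun it; a stricter design would also bound each scoring call.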

Build Reliable Agentic Systems
  • Design and ship agentic workflows with multi‑step planning and execution across tools and data sources.

  • Implement orchestration for long‑running workflows (state, retries, timeouts, idempotency).

  • Establish evaluation harnesses and regression tests to track agent reliability and cost over time.
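
The orchestration concerns named above (state, retries, timeouts, idempotency) can be illustrated with a minimal sketch. Everything in it is a hypothetical assumption for illustration: `run_step`, `StepFailed`, the `results` dictionary standing in for durable workflow state, and the default limits. A production system would persist state externally and would typically use a workflow engine.

```python
import time
from typing import Any, Callable, Dict, Optional


class StepFailed(Exception):
    """Raised when a step exhausts its retries or its time budget."""


def run_step(
    key: str,
    step: Callable[[], Any],
    results: Dict[str, Any],
    max_attempts: int = 3,
    timeout_s: float = 5.0,
    base_delay_s: float = 0.0,
) -> Any:
    """Run one workflow step at most once per idempotency key, with retries."""
    # Idempotency: a step that already completed under this key is not re-run.
    if key in results:
        return results[key]
    deadline = time.monotonic() + timeout_s
    last_error: Optional[Exception] = None
    for attempt in range(max_attempts):
        # The timeout is checked between attempts; it does not interrupt
        # a step that is already running.
        if time.monotonic() >= deadline:
            break
        try:
            results[key] = step()  # state: record the result under the key
            return results[key]
        except Exception as exc:
            last_error = exc
            time.sleep(base_delay_s * (2 ** attempt))  # exponential backoff
    raise StepFailed(f"step {key!r} did not complete") from last_error
```

Calling `run_step` twice with the same key and the same `results` store returns the recorded result without invoking the step again, which is what makes retries of the surrounding workflow safe.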

Establish ML/RL Operational Excellence
  • Implement production monitoring for quality, latency, cost, and failure modes.

  • Ensure training, experimentation, and deployment are reproducible, documented, and observable.

  • Partner closely with Product and Engineering to translate ambiguous problems into shippable ML/RL solutions.

Must‑have experience:

Candidates must demonstrate hands‑on, production experience across all areas below.

  • Applied ML in Production: 5+ years building, training, and shipping ML systems using Python and PyTorch, with clear ownership beyond experimentation.

  • Efficient Model Training (SMOL): Experience replacing or augmenting large models with smaller, domain‑specific ones using distillation, quantization, or parameter‑efficient fine‑tuning, supported by clear benchmarks.

  • Reinforcement Learning & Test‑Time Optimisation: Solid RL fundamentals and experience deploying inference‑time optimisation systems (e.g. reward‑guided decoding, reranking) under latency and cost constraints.

  • Agentic Systems: Experience building multi‑step agents with orchestration concerns such as state, retries, timeouts, and fallbacks, and improving their reliability and cost in production.

  • ML/RL Operational Excellence: Experience with reproducible training pipelines, evaluation, monitoring, and production debugging, and collaborating closely with Product and Engineering on constraint‑driven problems.
