Senior Site Reliability Engineer - APAC

Tyk Technologies

Remote

SGD 90,000 - 130,000

Full time

Yesterday

Job summary

A technology company in Singapore is seeking a Senior Site Reliability Engineer to optimize and enhance their cloud platform. Candidates should have experience in SRE with strong knowledge of cloud technologies such as Kubernetes, AWS, and Linux. This role allows for unlimited paid holidays and offers a flexible working environment, emphasizing creativity and performance. Successful applicants will collaborate with teams to improve operational processes and implement essential infrastructure improvements.

Benefits

Unlimited paid holidays
Flexible working hours
Employee share scheme
Generous maternity and paternity leave
Volunteering Days
Employee Wellbeing platform

Qualifications

  • Experience in an SRE role.
  • Strong knowledge of cloud technologies and SLA, SLO, and SLI management.
  • Excellent communication and leadership skills.
  • Ability to analyze and improve operational processes and performance metrics.
  • Experience in software design, automation, and root cause analysis.
  • On-call support experience and customer-focused mindset.
  • Collaborative attitude with commercial and technical teams.
  • Experience in launching and operating production Kubernetes clusters.
  • Designing and operating infrastructure on AWS and similar providers.
  • Operating MongoDB or Redis clusters.
  • Administering Linux servers.
  • Operating Prometheus and Grafana.

Responsibilities

  • Optimize, automate, and improve performance of the global Cloud platform.
  • Shape SRE strategy and translate it into actionable plans.
  • Identify reliability issues and implement solutions.
  • Lead performance tuning and fault finding.
  • Design automation for operational tasks.
  • Develop proactive alerting and monitoring.
  • Participate in on-call rotation for incident response.
  • Conduct blame-free post-mortems and maintain operational runbooks.
  • Drive multi-region and multi-cloud platform expansion.
  • Optimize infrastructure performance and cost efficiency.

Skills

Kubernetes (administrator)
Go (advanced)
Python (advanced)
AWS / EKS (advanced)
Linux (advanced)
Terraform and IaC in general (proficient)
Helm (proficient)
MongoDB (or similar)
Redis (or similar)
Monitoring – Prometheus, Grafana, Thanos (familiar)
Networking concepts (subnets, routing, etc.)
Common networking protocols (DNS, TCP/IP, etc.)
Proactive, energetic, innovative
Leadership and mentoring

Job description

Who are Tyk, and what do we do?

The Tyk API Management platform is helping to drive the connected world and power new products and services. We’re changing the way that organisations connect any number of their systems and services. Whether internal, external, public or highly encrypted systems, Tyk helps businesses drive value across the retail, finance, telecoms, healthcare, or media industries (to name just a few!)

Founded in 2015 with offices in London - UK, London - Ontario, Atlanta and Singapore, we have many thousands of users of our B2B platform across the globe. Brands using Tyk range from Lotte, Bell and T-Mobile to RBS, Capital One and Vinci. We have a varied user base hailing from every continent – even Antarctica.

Our Mission

Tyk is on a mission to connect every system in the world. We’ve started by building an API Management platform.

Total flexibility, default remote, radical responsibility

We offer unlimited paid holidays and remote working from anywhere in the world, for everyone. We believe this principle of flexibility and autonomy unlocks the best performance and enables us to build the best possible team; location and working hours are no barrier.

The role

At Tyk, we’re obsessed with building software that solves problems. Our Site Reliability Engineers (SREs) empower users with a rich feature set, high availability, and stellar performance level to pursue their missions. Our customer base is growing, so we’re seeking an experienced Senior SRE to optimize, automate, and improve performance using insights from massive‑scale data in real time. We want an original thinker, a challenger, a technical legend, an opinionated collaborator who wants to make things better.

Requirements
  • Lead hands‑on maintenance and optimization of our global Cloud platform within SLAs, SLIs, and SLOs that you'll help define
  • Collaborate to shape SRE strategy, then translate it into actionable technical plans coordinated through Scrum
  • Identify reliability issues, drive root cause analysis, and implement solutions alongside your squad
  • Lead performance tuning and fault finding through analysis of OS and application metrics
  • Design and implement automation for common operational tasks and cloud‑operations workflows
  • Develop proactive alerting, monitoring roadmap, and relevant dashboards; define and track KPIs
  • Participate in on‑call rotation, ensuring effective incident response and resolution within SLAs
  • Conduct blame‑free post‑mortems, document findings, and maintain operational runbooks
  • Drive multi‑region and multi‑cloud platform expansion with focus on scalability and automation
  • Optimize infrastructure performance and cost efficiency without impacting service delivery
  • Engage with commercial teams on growth plans and translate into technical SRE strategies
  • Coordinate penetration testing through provider liaison, technical setup, and environment configuration
  • Champion continuous improvement across processes, communication, and team practices
  • Model excellence in software design and knowledge sharing
  • Plan and execute software upgrades to enhance cloud services
Experience required
  • Experience in an SRE role
  • Strong knowledge of cloud technologies and SLA, SLO, and SLI management
  • Excellent communication and leadership skills
  • Ability to analyze and improve operational processes and performance metrics
  • Experience in software design, automation, and root cause analysis
  • On‑call support experience and customer‑focused mindset
  • Collaborative attitude with commercial and technical teams
  • Launching and operating production Kubernetes clusters
  • Designing and operating infrastructure on AWS and other providers
  • Operating MongoDB (or other document database) clusters
  • Operating Redis (or other key‑value storage) clusters
  • Administering Linux servers
  • Operating Prometheus and Grafana
  • Operating log collection and analysis systems
  • Participating in the on‑call rotation (04:00 - 16:00 UTC)
Skills
  • Kubernetes (administrator)
  • Go and/or Python (advanced)
  • AWS / EKS (advanced)
  • Linux (advanced)
  • Terraform and IaC in general (proficient)
  • Helm (proficient)
  • MongoDB (or similar)
  • Redis (or similar)
  • Monitoring – Prometheus, Grafana, Thanos (familiar)
  • Grasp of networking concepts (subnets, routing, peering, load balancing, NAT, etc.)
  • Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP)
  • Proactive, energetic, innovative and change‑oriented
  • A desire to lead or mentor a team
Benefits
  • Everyone has unlimited paid holidays.
  • We have total flexibility in hours, as we believe creativity flows better when our people are given freedom to decide when they are most productive. Everyone is unique after all.
  • Employee share scheme
  • Generous maternity and paternity leave
  • Volunteering Days
  • Employee Wellbeing platform

We all share the same vision - we value authenticity, respect, responsibility, independence, honesty, diversity and inclusion, and, most importantly, treating others how you wish to be treated. We look for like‑minded people who bring their personalities to work every day, strive to achieve their personal goals and are willing to challenge the way we do things. Why? To make what we do even better!

Our values tell the story of Tyk - here’s how:
  • It’s ok to screw up!

We’ve found that it’s often the ‘stupid’ or unexpected ideas that turn out to be the successful ones - so try it; at least we can say we have!

  • The only stupid idea is the untested one!

It’s in our DNA - starting a business with founders 12 hours apart, giving our gateway away for free - sure, we did that, and we’d do it again!

  • Trust starts with you - make it count!

Trust is a two‑way street - instil it from day one!

  • Assume best intent!

We have each other’s back - we’re all on the same team. Think before you speak or act.

  • Make things better!

Always try to leave things better than when you found them - change is constant, inevitable and embraced! Be that change we want to see.

What’s it like to work here? Check it out: https://tyk.io/worklife/

Tyk is an equal opportunities employer and we are determined to ensure that no applicant or employee receives less favourable treatment on the grounds of gender, age, disability, religion, belief, sexual orientation, marital status, or race, or is disadvantaged by conditions or requirements which cannot be shown to be justifiable.

You can see more about us here: https://tyk.io
