Site Reliability Engineer, Traffic Platform

TIKTOK PTE. LTD.

Singapore

On-site

SGD 80,000 - 120,000

Full time

22 days ago

Job summary

A leading technology company is looking for a Site Reliability Engineer to manage its global traffic platform. This role involves building, automating, and operating large-scale systems across public and private clouds. Ideal candidates will have a strong background in Linux systems, programming, and cloud services, with an emphasis on analytical skills and problem-solving in a fast-paced environment.

Benefits

Career growth opportunity

Paid leave

Meals provided

Qualifications

Experience with Linux systems and at least 3 years of programming in Go, Python, or Shell.
Experience with cloud services such as AWS, Google Cloud, and Azure.
Strong analytical and problem-solving skills.

Responsibilities

Build and operate ByteDance’s global traffic platform.
Work on optimization and automation for large-scale systems.
Participate in technical operations responding to performance issues.

Skills

Linux systems

Programming in Go

Programming in Python

Shell scripting

Cloud services

CI/CD tools

Analytical skills

Problem-solving

Education

Bachelor's or Master's degree in Computer Engineering

Bachelor's or Master's degree in Electrical Engineering

Bachelor's or Master's degree in Computer Science

Tools

Git

Docker

Kubernetes

ELK stack

About Us

Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok, Lemon8, CapCut and Pico as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join ByteDance

Inspiring creativity is at the core of ByteDance's mission. Our innovative products are built to help people authentically express themselves, discover and connect – and our global, diverse teams make that possible. Together, we create value for our communities, inspire creativity and enrich life – a mission we work towards every day.

As ByteDancers, we strive to do great things with great people. We lead with curiosity, humility, and a desire to make impact in a rapidly growing tech company. By constantly iterating and fostering an "Always Day 1" mindset, we achieve meaningful breakthroughs for ourselves, our Company, and our users. When we create and grow together, the possibilities are limitless. Join us.

Diversity & Inclusion

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Job highlights

Career growth opportunity, Paid leave, 100+ million users, Meals provided

About the Team

Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed infrastructures. Our SREs are tasked with ensuring that the traffic services are reliable, fault-tolerant, efficiently scalable and cost-effective. You will have the opportunity to manage a variety of complex systems at scale, including traffic systems that serve hyperscale data centers and public clouds, and a global load balancer that handles Tbps of traffic.

Responsibilities

- Build, expand and operate ByteDance’s global traffic platform, including large-scale systems in public and private clouds and edge data centers.

- Build tools, automations, visualizations and monitors to facilitate the operation and optimization of the global traffic platform.

- Work in a fast-paced environment. Participate in technical operations and rotations in response to performance and reliability issues.

- Help improve the whole lifecycle of infrastructure services, from inception and design through development, deployment, user support and refinement.

Qualifications

Minimum Qualifications

- Bachelor's or Master's degree in Computer Engineering, Electrical Engineering, Computer Science or a related major.

- Proven years of experience working with Linux systems, from kernel to shell and beyond, including system libraries, file systems, and client-server protocols.

- At least 3 years of experience in one or more programming languages such as Go, Python and shell scripting.

- Familiarity with cloud and CI/CD frameworks and tools, such as Git, Docker, Kubernetes, etc.

Preferred Qualifications

- Experience in designing, analyzing and building automation and tools for large-scale systems.

- Experience in building solutions with AWS, Google, Azure and other cloud services.

- Experience in networking technologies such as TCP/IP, HTTP, DNS, etc. in a carrier-grade environment.

- Experience in developing and operating one or more of the following systems: Kubernetes, Nginx, IPVS, ELK stack, etc.

- Self-driven and capable of coping with ambiguity and moving projects from concept to delivery.

- Strong analytical skills and the ability to solve real-world problems in a fast-moving environment.
