Staff Platform/SRE Engineer

Grasshopper Pte Ltd

Singapore

On-site

USD 90,000 - 150,000

Full time

5 days ago

Job summary

An innovative firm is seeking a Staff Site Reliability Engineer to join their Infrastructure Team. In this pivotal role, you will enhance research and batch computing capabilities while collaborating with cross-functional teams to architect and maintain scalable solutions. The ideal candidate will have extensive experience in SRE engineering, particularly in high-frequency trading environments. This dynamic workplace fosters creativity and autonomy, providing an exciting opportunity to work with cutting-edge technology in a diverse team. If you are passionate about solving complex problems and advancing your career, this position is perfect for you.

Benefits

21 days annual leave

Mentorship and personal growth opportunities

Comprehensive insurance package

Well-stocked pantry

Annual dental and wellness budget

Gym membership

Employee referral bonus program

Competitive compensation

Qualifications

8+ years in Platform/SRE engineering, ideally in HFT or research.
Strong knowledge of Kubernetes and cloud infrastructure.
Proficient in Python or Go with excellent collaboration skills.

Responsibilities

Design and maintain observability systems for high availability.
Develop scalable solutions on Google Cloud and on-premises infrastructure.
Collaborate with developers to enhance CI/CD pipelines.

Skills

Platform/SRE Engineering

Kubernetes

GitOps

Cloud Infrastructure (AWS/GCP)

Python

Monitoring Tools (Prometheus, ELK)

CI/CD Pipelines

Tools

Argo-CD

GitLab CI

Argo Workflows

Puppet

Chef

Ansible

Jenkins

About Grasshopper

Grasshopper is a quantitative trading technology provider based in Singapore. Our state-of-the-art technology, built from the ground up in-house, puts us at the forefront of developments in electronic trading. An unbroken record of consistency and profitability is underpinned by firm values of curiosity, empowerment and flexibility.

About the role:

As a Staff Site Reliability Engineer on the Infrastructure Team, you will play a key role in advancing our research and batch computing capabilities. You will work closely with cross-functional teams to architect, develop, and maintain scalable solutions on our Google Cloud and on-premises infrastructure.

Responsibilities:

Design, implement, and maintain robust observability systems, including monitoring, logging, tracing, and alerting, to ensure high availability, rapid incident detection, and deep system visibility across all services.
Architect, develop, and maintain scalable solutions on Google Cloud and on-premises infrastructure.
Support and advance our research cluster.
Investigate infrastructure/application issues on live production systems.
Collaborate with developers to improve our development environment, including CI/CD pipelines and build tools.
Promote an SRE mindset within the organization.

Qualifications:

At least 8 years of solid experience in Platform/SRE engineering, preferably in high-frequency trading (HFT) or in environments built around research and/or backtesting platforms.
Good understanding of Kubernetes architecture and operational management.
Proficiency in GitOps principles with tools like Argo-CD and GitLab CI.
Experience with workflow orchestration (for example, Argo Workflows) and batch/HPC workloads.
Strong grasp of cloud infrastructure, with practical experience in AWS or GCP.
Ability to investigate infrastructure/application issues independently on live production systems.
Proficiency in programming languages such as Python or Go.
Excellent interpersonal, collaboration, and communication skills.
Strong entrepreneurial spirit and adaptability to changing requirements and technologies.
Passion for learning and innovating solutions to complex problems.

Preferred additional experience:

Experience with Kubernetes operators.
Knowledge of on-premises bare-metal environments.
Proficiency with containerization technologies.
Experience with configuration management tools like Puppet, Chef, or Ansible.
Understanding of security principles from operational and implementation perspectives.
Experience with monitoring tools like Prometheus and the ELK stack.
Familiarity with Red Hat and CentOS Linux distributions.
Contributions to open-source projects.
Experience with CI tools such as Jenkins.

What we offer:

21 days annual leave
Mentorship and personal growth opportunities
Comprehensive insurance package including dependents
Well-stocked pantry
Annual dental and wellness budget
Gym membership
Employee referral bonus program
Competitive compensation

Working at Grasshopper:

At Grasshopper, you will work in a diverse and dynamic environment with a flat hierarchy. Our open office hosts over 100 employees from 15 nationalities, emphasizing communication and collaboration. We foster autonomy, creativity, risk-taking, and learning from mistakes to stay ahead in trading technology. We are an equal opportunity employer.