Observability Platform Engineer

G-Research

Camden Town

On-site

GBP 80,000 - 100,000

Full time

Job summary

A leading quantitative finance firm in Camden Town seeks an Engineer for their Observability Platform team. The role involves developing observability solutions and managing telemetry data for seamless engineering processes. The ideal candidate has deep expertise in observability stacks, experience with cloud environments, and strong communication skills. This position offers highly competitive compensation, 35 days of annual leave, and excellent work/life balance.

Benefits

Highly competitive compensation
Lunch provided
35 days' annual leave
9% company pension contributions
Comprehensive healthcare

Qualifications

  • Experience running large-scale observability platforms for diverse customers.
  • Proven experience on observability or SRE teams in a production environment.
  • Customer focused with enthusiasm for infrastructure as a service.

Responsibilities

  • Develop observability and reliability platforms.
  • Collaborate with cross-functional teams for observability integration.
  • Enable SRE frameworks and promote reliability improvements.

Skills

Deep expertise in observability stacks
Experience with cloud-native or hybrid-cloud environments
Familiarity with SRE principles
Hands-on experience with tools like Prometheus and Grafana
Expertise in infrastructure as code
Excellent communication skills

Tools

Prometheus
Grafana
AWS
Terraform
Ansible

Job description

Overview

We tackle the most complex problems in quantitative finance by bringing scientific clarity to financial complexity. From our London HQ, we unite world-class researchers and engineers in an environment that values deep exploration and methodical execution - because the best ideas take time to evolve. Together we're building a world-class platform to amplify our teams' most powerful ideas. As part of our engineering team, you'll shape the platforms and tools that drive high-impact research - designing systems that scale, accelerate discovery and support innovation across the firm.

The role

As an Engineer on the Observability Platform team, you'll manage the critical entry and exit points to our telemetry services, ensuring engineers across the business can reliably produce and consume telemetry data for their services. You'll work closely with the Observability Engineering team to design and implement robust, scalable data pipelines that ingest, route and visualise telemetry in predictable and composable ways. Your work will empower engineers to gain actionable insight into their systems, enabling informed decision-making and operational efficiency.

Operating under the broader Platform Engineering department, our team also holds responsibility for enhancing the reliability of our entire High-Performance Computing (HPC) stack - from networking and storage through to compute and application platforms.

Responsibilities
  • Being a key contributor to the development of our observability and reliability platforms
  • Contributing to the roadmap for observability tooling, ensuring alignment with business goals and scalability requirements
  • Working with telemetry data at enormous scale, ingesting data from industry-leading GPU clusters
  • Working with AWS services, ensuring seamless integration with the observability platform
  • Collaborating with cross-functional engineering teams to establish observability as a core function of the development lifecycle
  • Working closely with application teams to ensure observability systems are fully integrated and providing the necessary insights
  • Enabling SRE frameworks, promoting SLAs, SLOs and SLIs, and working closely with platform teams to ensure reliability is constantly improving
  • Helping to foster a culture of continuous learning and improvement, encouraging adoption of new observability tools and techniques

Qualifications and experience

We're looking for an engineer with deep expertise in observability stacks and a keen understanding of the unique challenges associated with managing telemetry at cloud-scale volumes. You're passionate about building systems that give customers clear, consistent access to telemetry data, helping them run their services as effectively as possible.

Experience running large-scale observability platforms for a diverse customer base is essential. Familiarity with core Site Reliability Engineering (SRE) principles is highly beneficial.

The ideal candidate will have the following skills and experience:
  • Proven experience on observability or SRE teams in a cloud-native or hybrid-cloud environment, running platforms in production and at scale
  • Well versed in reliability engineering concepts, including different types of testing, progressive deployments, error budgets, fault-tolerant design and the role that observability plays in reliability
  • Hands-on experience with modern observability tools and frameworks, such as Prometheus, OpenTelemetry (OTel) and Grafana, and with enterprise SaaS observability platforms, such as Datadog and Dynatrace
  • Expertise in designing, building and scaling observability solutions for distributed systems
  • Customer focused, with an enthusiasm for providing infrastructure as a service and defaulting to a product lens when evaluating platform scale problems
  • Excellent communication skills and the ability to collaborate with cross-functional teams
  • Experience with cloud platforms, such as AWS, Azure or Google Cloud
  • Familiarity with microservices architecture and containerised environments, such as Kubernetes and Docker
  • Knowledge of infrastructure as code (IaC) and automation tools, such as Terraform and Ansible

Benefits
  • Highly competitive compensation plus annual discretionary bonus
  • Lunch provided (via Just Eat for Business) and dedicated barista bar
  • 35 days' annual leave
  • 9% company pension contributions
  • Informal dress code and excellent work/life balance
  • Comprehensive healthcare and life assurance
  • Cycle-to-work scheme
  • Monthly company events

G-Research is committed to cultivating and preserving an inclusive work environment. We are an ideas-driven business and we place great value on diversity of experience and opinions. We want to ensure that applicants receive a recruitment experience that enables them to perform at their best. If you have a disability or special need that requires accommodation, please let us know in the relevant section.
