Observability Engineer

Sepal

Remote

USD 80,000 - 100,000

Part time

Today

Job summary

A leading technology firm in Boston is seeking an Observability Engineer to design complex queries and optimize telemetry datasets for AI-driven system reliability. The ideal candidate has at least 3 years of experience with ClickHouse, Grafana, and SQL optimization. The position pays $50-$85/hr and is remote with flexible hours. The project is expected to last 5-6 weeks and focuses on supporting effective incident investigation and performance analysis.

Qualifications

  • 3+ years of experience in observability, production engineering, or platform engineering.
  • Deep expertise in ClickHouse and Grafana.
  • Strong SQL skills for performance optimization.

Responsibilities

  • Design complex queries over large telemetry datasets.
  • Create synthetic datasets simulating cloud infrastructure logs.
  • Improve system reliability through incident investigation.

Skills

SQL
ClickHouse
Grafana
Query optimization
Telemetry data analysis

Job description

Overview

The Observability Engineer / Production Engineer will design and optimize complex queries over large telemetry datasets to support AI‑driven incident investigation and system reliability. This role involves creating synthetic datasets that simulate cloud infrastructure logs and working extensively with tools like ClickHouse and Grafana. Candidates should have 3+ years of experience in observability or production engineering, with strong SQL and query optimization skills.

About Sepal AI

Sepal AI builds the world’s hardest tests for AI grounded in real‑world software systems. We are looking for an Observability Engineer with 3+ years of experience to help us understand, debug, and operate complex production systems at scale. You will work deeply with production logs, metrics, and traces, building tasks and datasets that can teach AI to investigate incidents, analyze performance, and improve system reliability.

What You’ll Do

  • Design complex, distributed queries over massive log and telemetry datasets.
  • Explore creative ways to challenge AI's reasoning ability and log analysis skills.
  • Create and manage synthetic datasets that simulate real‑world DevOps, observability, or cloud infrastructure logs.

Who You Are

  • 3+ years of experience as an observability engineer, production engineer, or platform engineer.
  • Deep expertise in ClickHouse and Grafana, with a focus on large‑scale query optimization, schema design, and performance tuning.
  • Strong SQL skills: you know how to reason through performance problems and spot inefficient query patterns.

Compensation & Logistics

Pay: $50–$85/hr, depending on experience. Remote, flexible hours. Project timeline: 5–6 weeks.

Keywords

Observability Engineer, Production Engineer, ClickHouse, Grafana, SQL optimization, Distributed systems, Telemetry data, Cloud infrastructure logs, AI debugging, System reliability