Lead SRE - Observability Lead (1083862)

The Judge Group

Plano (TX)

On-site

USD 120,000 - 150,000

Full time

30+ days ago

Job summary

A leading company in the technology sector is seeking an Observability Technical Lead to enhance its observability capabilities. This role involves implementing and managing tools for logging, tracing, and monitoring, ensuring system reliability and operational excellence. The ideal candidate will have extensive experience in application engineering, observability tools, and a strong background in Agile environments. Join a dynamic team to drive innovation and improve system performance across the organization.

Qualifications

  • 10+ years in application engineering; 7+ years in SRE/Observability roles.
  • Hands-on experience with observability tools and Agile SDLC environments.

Responsibilities

  • Design, configure, and maintain observability tools across environments.
  • Develop automation tools and processes to support monitoring and compliance.

Skills

Monitoring
Automation
Collaboration
Incident Resolution

Education

Bachelor’s degree in Computer Science

Tools

Dynatrace
Splunk
AppDynamics
New Relic
Elastic

Job description

As the Observability Technical Lead, you’ll play a key role in enabling and enhancing the observability capabilities across the Technology organization. You’ll be responsible for implementing and managing tools that support logging, tracing, alerting, visualization, and AIOps. You’ll help drive operational excellence and system reliability through robust monitoring solutions.

Key Responsibilities

  • Design, configure, and maintain observability tools (e.g., Splunk, Dynatrace) across on-prem and cloud environments.
  • Collaborate with Observability leadership to define strategy, roadmaps, and priorities.
  • Develop automation tools and processes to support monitoring, security, and compliance.
  • Integrate additional frameworks to enhance enterprise-wide monitoring automation.
  • Analyze incidents and usage data to proactively identify and resolve issues.
  • Partner with teams to assess and improve system performance and reliability.
  • Promote adoption of observability tools across technology teams.
  • Define and implement Service Level Objectives (SLOs) and Service Level Indicators (SLIs) with service owners.
  • Track and report on platform stability, scalability, and DevOps maturity.
  • Lead incident resolution efforts and ensure SLA compliance.
  • Translate monitoring needs into actionable tasks for engineering teams.
  • Deliver presentations and mentor engineers on observability best practices.

Qualifications

  • 10+ years in application engineering; 7+ years in SRE/Observability roles.
  • Hands-on experience with tools like Dynatrace, Splunk, AppDynamics, New Relic, or Elastic.
  • Strong background in monitoring within Agile SDLC environments.
  • Expertise in system design with a focus on performance, security, and availability.
  • Proficiency in relational and NoSQL databases (e.g., MSSQL, MySQL, PostgreSQL, MongoDB).
  • Scripting experience in Python, PowerShell, or Unix Shell.
  • Familiarity with containerization, cloud platforms, and DevOps practices.
  • Coding experience in Java, JavaScript, Python, or .NET.
  • Exposure to AI/ML concepts and tools is a plus.
  • Experience in regulated industries; financial services preferred.
  • Bachelor’s degree in Computer Science, MIS, Math, or related field (advanced degree preferred).