
Senior Software Engineer (Observability)

GuruLink

Richmond Hill

Hybrid

CAD 100,000 - 120,000

Full time

29 days ago

Job summary

A leading technology consulting firm in Richmond Hill is seeking a Senior Software Engineer with a strong background in application development and system reliability. This hybrid role involves enhancing monitoring and building observability frameworks for microservices in Node.js and/or Java. Candidates should have a bachelor’s degree and 5+ years of experience, with skills in Docker, Kubernetes, and scripting in Linux/Unix environments. Join a team that emphasizes continuous improvement and proactive problem-solving.

Qualifications

5+ years of hands-on software development experience in Node.js and/or Java.
Strong debugging, analytical, and collaboration skills.
Experience working in Linux/Unix environments and writing scripts.

Responsibilities

Design and implement tools for embedding metrics, logs, and traces into applications.
Analyze and improve instrumentation of Node.js and Java services.
Build and maintain scalable observability platforms.

Skills

Software development in Node.js and/or Java

Docker

Kubernetes

Object-oriented programming

SQL and NoSQL databases

Linux/Unix environments

JavaScript frameworks

Debugging and analytical skills

Education

Bachelor’s degree in Computer Science, Software Engineering, or related discipline

Tools

Elastic APM

OpenTelemetry

Spring Boot

Location

Richmond Hill, Ontario

About the Team

Our client’s platform engineering group operates with a Site Reliability Engineering (SRE) mindset, committed to delivering highly reliable, scalable, and performant systems on public cloud infrastructure. The team specializes in enhancing system transparency, enabling deep diagnostics, and ensuring seamless collaboration between development and operations. Shared ownership, proactive problem-solving, and continuous improvement are at the core of everything they do.

The Opportunity

Our client is looking for a Senior Software Engineer with a strong background in application development and a passion for observability and system reliability. This hybrid role blends hands-on development with reliability engineering. You’ll work closely with existing microservices in Node.js and/or Java to enhance instrumentation and build out scalable observability frameworks that support modern containerized workloads on Kubernetes.

What You’ll Be Doing

Create Observability Frameworks: Design and implement tools that make it easier to embed metrics, logs, and traces into applications.
Enhance Application Monitoring: Analyze and improve the instrumentation of Node.js and Java services using Elastic APM to capture performance data and operational context.
Define and Evangelize SRE Best Practices: Collaborate with engineers to define meaningful SLIs, SLOs, and KPIs, integrating them into ongoing development workflows.
Monitoring Systems Architecture: Build and maintain scalable observability platforms using Elastic APM, InfluxDB, and Prometheus.
Performance Analysis: Use system metrics, performance test data, and application code insights to diagnose bottlenecks and suggest optimizations.
Incident Response & Resolution: Serve as a go-to expert during incidents, leveraging observability tools to identify root causes and propose fixes.
Postmortems & Continuous Improvement: Lead structured reviews after incidents, recommending and implementing system improvements to avoid recurrence.
Mentorship & Cultural Impact: Promote observability-first thinking across engineering teams by mentoring peers and embedding SRE practices into the development culture.

Must-Have Skills

Bachelor’s degree in Computer Science, Software Engineering, or related discipline
5+ years of hands-on software development experience in Node.js and/or Java
Professional experience with Docker and Kubernetes
Proficiency in object-oriented programming and an understanding of HTTP and RESTful APIs
Familiarity with both SQL and NoSQL databases
Experience working in Linux/Unix environments and writing scripts
Strong debugging, analytical, and collaboration skills
Exposure to modern JavaScript frameworks (React, Angular, Vue, ExtJS, etc.)
Solid grasp of software architecture, testing strategies, and performance monitoring principles

Nice-to-Haves

Practical experience with Elastic APM, OpenTelemetry, or similar observability tools
Experience building REST APIs using Spring Boot or Node.js
Understanding of performance tuning and system capacity planning
Exposure to testing tools such as Selenium, JUnit, Mockito, and Mocha
Familiarity with Oracle databases, PL/SQL, or servlet-based Java frameworks (Spring MVC, Struts, etc.)
Web server experience with Apache, Tomcat, or Nginx
Experience in SRE-focused roles, including development and maintenance of monitoring platforms
Background in Infrastructure as Code (e.g., Terraform)
Experience working in public cloud environments (AWS, Azure, or GCP)
