
Senior Software Engineer (Observability)

GuruLink

Richmond Hill

Hybrid

CAD 100,000 - 120,000

Full time

29 days ago

Job summary

A leading technology consulting firm in Richmond Hill is seeking a Senior Software Engineer with a strong background in application development and system reliability. This hybrid role involves enhancing monitoring and building observability frameworks for microservices in Node.js and/or Java. Candidates should have a bachelor’s degree and 5+ years of experience, with skills in Docker, Kubernetes, and scripting in Linux/Unix environments. Join a team that emphasizes continuous improvement and proactive problem-solving.

Qualifications

  • 5+ years of hands-on software development experience in Node.js and/or Java.
  • Strong debugging, analytical, and collaboration skills.
  • Experience working in Linux/Unix environments and writing scripts.

Responsibilities

  • Design and implement tools for embedding metrics, logs, and traces into applications.
  • Analyze and improve instrumentation of Node.js and Java services.
  • Build and maintain scalable observability platforms.

Skills

Software development in Node.js and/or Java
Docker
Kubernetes
Object-oriented programming
SQL and NoSQL databases
Linux/Unix environments
JavaScript frameworks
Debugging and analytical skills

Education

Bachelor’s degree in Computer Science, Software Engineering, or related discipline

Tools

Elastic APM
OpenTelemetry
Spring Boot

Job description

Location: Richmond Hill, Ontario

About the Team

Our client’s platform engineering group operates with a Site Reliability Engineering (SRE) mindset, committed to delivering highly reliable, scalable, and performant systems across a public cloud infrastructure. The team specializes in enhancing system transparency, enabling deep diagnostics, and ensuring seamless collaboration between development and operations. Shared ownership, proactive problem-solving, and continuous improvement are at the core of everything they do.

The Opportunity

Our client is looking for a Senior Software Engineer with a strong background in application development and a passion for observability and system reliability. This hybrid role blends hands-on development with reliability engineering. You’ll work closely with existing microservices in Node.js and/or Java to enhance instrumentation and build out scalable observability frameworks that support modern containerized workloads on Kubernetes.

What You’ll Be Doing

  • Create Observability Frameworks: Design and implement tools that make it easier to embed metrics, logs, and traces into applications.
  • Enhance Application Monitoring: Analyze and improve the instrumentation of Node.js and Java services using Elastic APM to capture performance data and operational context.
  • Define and Evangelize SRE Best Practices: Collaborate with engineers to define meaningful SLIs, SLOs, and KPIs, integrating them into ongoing development workflows.
  • Monitoring Systems Architecture: Build and maintain scalable observability platforms using Elastic APM, InfluxDB, and Prometheus.
  • Performance Analysis: Use system metrics, performance test data, and application code insights to diagnose bottlenecks and suggest optimizations.
  • Incident Response & Resolution: Serve as a go-to expert during incidents, leveraging observability tools to identify root causes and propose fixes.
  • Postmortems & Continuous Improvement: Lead structured reviews after incidents, recommending and implementing system improvements to avoid recurrence.
  • Mentorship & Cultural Impact: Promote observability-first thinking across engineering teams by mentoring peers and embedding SRE practices into the development culture.

Must-Have Skills

What You’ll Need to Succeed

  • Bachelor’s degree in Computer Science, Software Engineering, or related discipline
  • 5+ years of hands-on software development experience in Node.js and/or Java
  • Professional experience with Docker and Kubernetes
  • Proficiency in object-oriented programming and a solid understanding of the HTTP protocol and RESTful APIs
  • Familiarity with both SQL and NoSQL databases
  • Experience working in Linux/Unix environments and writing scripts
  • Strong debugging, analytical, and collaboration skills
  • Exposure to modern JavaScript frameworks (React, Angular, Vue, ExtJS, etc.)
  • Solid grasp of software architecture, testing strategies, and performance monitoring principles

Nice-to-Haves

  • Practical experience with Elastic APM, OpenTelemetry, or similar observability tools
  • Experience building REST APIs using Spring Boot or Node.js
  • Understanding of performance tuning and system capacity planning
  • Exposure to testing tools such as Selenium, JUnit, Mockito, Mocha
  • Familiarity with Oracle databases, PL/SQL, or servlet-based Java frameworks (Spring MVC, Struts, etc.)
  • Web server experience with Apache, Tomcat, or Nginx
  • Experience in SRE-focused roles, including development and maintenance of monitoring platforms
  • Background with Infrastructure as Code (Terraform, etc.)
  • Experience working in public cloud environments (AWS, Azure, or GCP)