
Senior Software Engineer - SRE

Veeva Systems

Ottawa

Remote

CAD 110,000 - 270,000

Full time

Today

Job summary

A leading cloud software company in Canada seeks a Senior Site Reliability Engineer to ensure the scalability and reliability of their applications. The ideal candidate will have extensive Java expertise and a proven track record in incident management within enterprise environments. Responsibilities include building cloud infrastructure, leading triage efforts, and automating manual processes. This role offers a competitive salary and comprehensive benefits, including health insurance and retirement programs.

Benefits

Medical, dental, and vision insurance
PTO and paid holidays
Retirement programs
Charitable giving program

Qualifications

  • 5+ years of experience in Java development.
  • Hands-on operational experience in production service environments.
  • Ability to write clean, testable, and maintainable code.

Responsibilities

  • Build cloud infrastructure and ensure reliability.
  • Lead incident management and triage during issues.
  • Develop tools for automation and optimization.

Skills

Java
Open-source technologies
Incident management
Leadership
Effective communication

Tools

Docker
Kubernetes
AWS
Git
MySQL

Job description

The Role

Join our dynamic team as a Senior Site Reliability Engineer on the Vault Platform team, where you’ll ensure the scalability and reliability of our enterprise applications. You’ll tackle complex challenges at a global scale, drawing on your deep expertise in Java and modern open‑source technologies to make a tangible impact on production systems.

What You’ll Do
  • Build Cloud Infrastructure: Rapidly build new cloud infrastructure from scratch, adhering to software development best practices.
  • Drive Reliability & Scalability: Ensure our platform meets the scalability and reliability needs of our hundreds of global customers (across North America, Europe, and Asia).
  • Lead Incident Management: Lead triage and mitigation efforts effectively during incidents. The role may include periodic on-call duty for escalations.
  • Automate & Optimize: Develop tools and automation to eliminate manual work and reduce issue resolution times.
  • Full-Stack Diagnostics: Proactively learn all necessary systems to provide full‑stack diagnostics and determine root causes of production problems.
  • Strategic Engineering Partnership: Strategize with engineering teams on complex problems, offering insights on what will work at scale (supporting 2M+ users) and guiding development decisions before features ship.
  • Influence Design: Participate in engineering design reviews of new features and drive initiatives to improve operational efficiency and platform scalability.
  • Cross-functional Collaboration: Partner effectively with Product Management, Design, and QA to deliver cutting‑edge solutions and direct customer value.
  • Backend Focus: Work across multiple layers of our technology stack, with a primary focus on backend development, and opportunities in frontend and infrastructure.
  • Effective Communication: Communicate clearly with engineering teams and with both technical and non‑technical audiences, describing problems succinctly so that hand-offs during outages are seamless.
  • Mentorship: Actively mentor team members, contributing to a positive and high‑performing team environment.

Requirements
  • Deep Java Expertise: 5+ years of experience in Java development, with a strong preference for experience within enterprise cloud software companies.
  • Operational Experience: Hands‑on operational experience in a high‑volume or critical production service environment, including incident management and root cause analysis.
  • Code Quality: Proven ability to write clean, testable, readable, and maintainable code within a collaborative team setting.
  • Open Source Proficiency: Hands‑on experience with a range of open‑source technologies, such as Spring, MySQL, Hibernate, Solr, Maven, Git, Tomcat, Linux, AWS, Vagrant, Docker, and Kubernetes.
  • Database Mastery: 3+ years of experience in relational databases with expert‑level SQL skills.
  • Scripting Skills: Solid scripting proficiency with languages and tools such as Shell, Bash, Ansible, Python, Go, and Ruby.
  • Leadership & Communication: Demonstrated history of incident management and leadership ability, with effective communication skills across all levels (individual contributors to executives).
  • Mentorship: Proven record of making your team better through mentorship.
  • This role requires a working schedule of Monday - Friday, 6 AM - 2 PM EST, and candidates must be located in the EST or AST time zones to be considered.

Perks & Benefits
  • Medical, dental, vision, and basic life insurance.
  • PTO and company‑paid holidays.
  • Retirement programs.
  • 1% charitable giving program.

Compensation
  • Base pay: $110,000 - $270,000.
  • The salary range listed here has been provided to comply with local regulations and represents a potential base salary range for this role. Please note that actual salaries may fall within, above, or below this range, depending on experience and location. We look at compensation for each individual and base our offer on your unique qualifications, experience, and expected contributions. This position may also be eligible for other types of compensation in addition to base salary, such as a variable bonus and/or stock bonus.