Senior Site Reliability Engineer

Bold Commerce

Winnipeg

Remote

CAD 100,000 - 130,000

Full time

4 days ago

Job summary

A leading ecommerce company is seeking a Senior Site Reliability Engineer (SRE) to design, build, and maintain the systems that support its SaaS infrastructure. The role involves keeping those systems reliable and scalable and working with developers and IT operations to improve service delivery. Candidates should have 7+ years of experience in a SaaS/cloud environment, strong Linux/Unix systems knowledge, and a Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.

Benefits

Competitive compensation
Employer Paid Health & Dental Benefits
Virtual mental health support
Annual Health Benefit
Employee Options
Flexible work hours
Annual Bonus Program
Competitive paid vacation days
Employer Paid Employee & Family Assistance Program

Qualifications

  • 7+ years of experience in SRE or similar role within a SaaS/cloud environment
  • Proficient in at least one programming language
  • Strong collaboration and communication skills

Responsibilities

  • Design and maintain fault-tolerant infrastructure for SaaS products
  • Develop monitoring and incident response processes
  • Contribute to CI/CD pipelines and release management

Skills

Linux/Unix systems knowledge
Shell scripting
GitOps
Python
Go
Ruby
Cloud platforms experience
Container orchestration
Networking knowledge
Monitoring tools (e.g. Prometheus)
Incident management

Education

Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field

Tools

Ansible
Terraform
Docker
Kubernetes

Job description

Who is Bold Commerce?

Bold Commerce powers personalized checkout experiences for leading omnichannel retailers and direct-to-consumer brands.

As a leader in the composable commerce space, Bold makes checkout better, boosting profitability by enabling personalized, customer-specific checkout flows designed to increase the Checkout Power Trio of conversion, average order value (AOV), and lifetime value (LTV), not just conversion. Built with a composable and headless architecture, Bold Checkout fits with any commerce stack, making it easy to overcome platform limitations. Leading omnichannel retailers like Harry Rosen and Staples Canada trust Bold Checkout with their business.

Named one of Built In Austin’s Best Places to Work, Canada’s Top Employers for Young People, and Manitoba’s Top Employers, we're a dynamic team that truly cares about building the future of ecommerce. We live by the BUILDERS Code, a shared set of practices, beliefs, and values that help shape this remote-first company.

Founded in 2012, with team members (Builders) located throughout Canada and the U.S., and backed by investors like OMERS Ventures, WhiteCap Venture Partners, and Round13 Capital, Bold is leading the way to a better, composable ecommerce future.

About the role

Bold is looking for a Senior Site Reliability Engineer (SRE) to design, build, and maintain the systems and tools that support our SaaS infrastructure. You’ll play a key role in ensuring our platforms are reliable, scalable, and performant. Working closely with developers, product managers, and IT operations, you’ll help shape robust solutions that align with our service-level objectives (SLOs) and deliver value to our merchants.

What you’ll do

  • Design and maintain highly available, fault-tolerant infrastructure to support our SaaS products
  • Develop and optimize monitoring, alerting, and incident response processes
  • Improve system performance through capacity planning, load testing, and performance tuning
  • Automate deployment and configuration tasks using infrastructure-as-code practices
  • Partner with development teams to enhance software reliability through efficient CI/CD pipelines and release management
  • Conduct root cause analysis and post-incident reviews to drive continuous improvement
  • Contribute to the architecture of performance monitoring systems and train teams on reliability best practices
  • Organize and manage execution of planned projects
  • Balance speed and stability in product delivery while upholding well-defined SLOs

What we’re looking for

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field
  • 7+ years of experience in SRE or a similar role within a SaaS/cloud environment
  • Strong Linux/Unix systems knowledge and shell scripting
  • Experience with GitOps (ArgoCD is a plus)
  • Proficient in at least one language (e.g. Python, Go, Ruby) and familiar with tools like Ansible, Terraform, or similar
  • Hands-on experience with cloud platforms (GCP preferred; AWS/Azure also relevant) and container orchestration (Docker, Kubernetes)
  • Solid grasp of networking, monitoring (e.g. Prometheus, Grafana, OpenTelemetry), and incident management
  • Strong collaboration and communication skills, with a focus on documentation and cross-functional partnership
  • Trusted team player who builds relationships and cultivates a culture of reliability
  • Flexibility to participate in an on-call rotation and occasional scheduled maintenance

Our investment in YOU!

Benefits designed to support your well-being and happiness:

  • Competitive compensation that reflects your experience and skills
  • Employer Paid Health & Dental Benefits, Virtual Care, & Disability top-up - starting day 1!
  • Virtual mental health and EAP platform for support anytime
  • Annual Health Benefit ($1,000 per year) to help you thrive!
  • Working remotely - anywhere in Canada & the United States!
  • Employee Options to help you grow with us!
  • Flexible work hours
  • Annual Bonus Program aligned to your Job Level
  • Competitive paid vacation days (starting at 3 weeks)
  • Employer Paid Employee & Family Assistance Program (EFAP)