Site Reliability Engineer

SmartSimple Software

Canada

Remote

CAD 60,000 - 100,000

Full time

25 days ago

Job summary

An innovative company is seeking a Site Reliability Engineer to enhance the reliability and performance of their SaaS infrastructure. This role involves collaborating with software engineers and product teams to automate processes and ensure system availability. You'll be at the forefront of driving improvements in a dynamic environment that values flexibility and innovation. Join a team that is committed to making a positive impact through technology, while enjoying the freedom of a remote-first workplace. If you're passionate about optimizing systems and enhancing user experiences, this opportunity is perfect for you.

Benefits

Flexible PTO
Tuition Reimbursement
Lifestyle Reimbursements
Mindfulness Initiatives
Fitness Initiatives
Employee Recognition Programs

Qualifications

  • 3+ years of experience in Site Reliability Engineering or similar roles.
  • Strong experience with AWS and infrastructure-as-code tools.

Responsibilities

  • Ensure high availability and performance of SaaS products.
  • Implement automation for deployments and maintenance.

Skills

Site Reliability Engineering
Cloud Platforms (AWS)
Infrastructure as Code (Terraform, CloudFormation)
Containerization (Docker, Kubernetes)
Application Performance Monitoring (APM)
Programming/Scripting (Python, Bash)
CI/CD Pipelines
Database Management
Networking Fundamentals
Analytical Skills

Education

Bachelor's degree in Computer Science, Engineering or related field

Tools

Terraform
CloudFormation
Docker
Kubernetes
ELK Stack
Datadog
New Relic

Job description

About SmartSimple & Foundant

At SmartSimple and Foundant Technologies, we empower mission-driven organizations to manage their data, workflows, and impact with our comprehensive software solutions. From grant management and community foundations to process automation and data collaboration, our combined expertise supports a diverse range of organizations - from nonprofits and charitable entities to corporations and governments.

With the recent merger of SmartSimple and Foundant Technologies, we’ve created a powerhouse of solutions designed to meet the unique needs of organizations striving to make a difference. Together, we’re setting new standards in innovation, flexibility, and impact management by helping organizations achieve their missions more efficiently and effectively.

Where You’ll Work:

  • As a remote-first workplace, we believe in offering flexibility and the freedom to work where it suits you best, while staying connected through technology. Our global network of talent is supported by physical office hubs and virtual collaboration, fostering a dynamic environment where innovation and growth thrive.
  • With headquarters in Bozeman, Montana (Foundant) and Toronto, Canada (SmartSimple), and our EMEA office in Dublin, Ireland, you’ll be part of a globally connected team. Whether you’re working remotely or from one of our office locations, you’ll be contributing to a vibrant, collaborative culture focused on driving meaningful impact across the world.

What You’ll Do:

The Site Reliability Engineer (SRE) will play a critical role in maintaining and improving the reliability, scalability, and performance of our SaaS infrastructure and products. You will work closely with software engineers, product teams, and other stakeholders to design, build, and maintain systems that can handle the demands of a growing customer base. You will focus on automating processes and continuously improving the availability and performance of our services.

  • Reliability & Availability: Ensure the high availability, reliability, and performance of one or more SaaS products across production and staging environments. Monitor system health, track key performance indicators, and respond to incidents quickly to minimize downtime.
  • Incident Management: Perform incident response, troubleshooting, and post-mortem analysis for production incidents. Work to minimize the impact of incidents and drive improvements based on findings.
  • Automation & Efficiency: Implement automation for routine tasks like deployments, scaling, and maintenance. Develop tools and scripts that improve the operational and cost efficiency of the infrastructure.
  • Change Management: Work closely with engineering, product, and operations teams to design, deploy, and maintain cloud-based infrastructure and applications. Ensure that new releases and updates are deployed smoothly with minimal disruption.
  • Monitoring & Alerting: Build and maintain robust monitoring, alerting, and logging systems to provide real-time visibility into the health of our services. Analyze and act upon monitoring data, including availability, performance, and error logs, to proactively detect and resolve issues.
  • Capacity Planning & Scalability: Monitor system capacity, forecast growth, and ensure that our SaaS platforms scale appropriately to handle increased traffic and load. Design and implement strategies for capacity management.
  • Security & Compliance: Ensure that security best practices are followed for all infrastructure components. Collaborate with security teams to implement security controls, auditing, and compliance measures.
  • Performance Optimization: Continuously optimize the performance of our systems and applications by identifying and addressing bottlenecks and improving overall system throughput.
  • Documentation & Knowledge Sharing: Document systems, processes, and procedures. Foster a culture of knowledge sharing and collaboration across teams to improve operational understanding and best practices.

What You’ll Need:

  • 3+ years of experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role within a SaaS company or cloud environment.
  • Strong experience with cloud platforms (AWS) and infrastructure-as-code tools like Terraform, CloudFormation, or similar.
  • Experience with containerization technologies (Docker, Kubernetes) and orchestration platforms.
  • Experience with application performance monitoring (APM) and log analytics tools (e.g. ELK, Datadog, New Relic, etc.).
  • Proficiency in programming/scripting languages (Python, Bash, etc.).
  • Familiarity with CI/CD pipelines and automation tools.
  • Understanding of web application deployment and hosting fundamentals.
  • Understanding of database management and performance tuning.
  • Knowledge of networking fundamentals and web services (HTTP, DNS, load balancing, web application firewall, etc.).
  • Bachelor's degree in Computer Science, Engineering or a related field, or equivalent experience.
  • Strong analytical and troubleshooting skills with the ability to identify and resolve complex technical issues in distributed systems.
  • Excellent communication skills, with the ability to explain complex technical concepts to both technical and non-technical stakeholders.
  • Must be legally eligible to work in your country of residence, which must be either the continental US or Canada.

Preferred Qualifications:

  • AWS Certified Solutions Architect or similar professional certification.
  • Experience with managing and maintaining large-scale distributed systems.
  • Experience with security best practices in cloud environments and SaaS platforms.

What You’ll Bring to our Team Dynamics:

  • Adaptive Achievement: You continuously learn from your experiences and adjust strategies to meet the evolving needs of the team and the business.
  • Productive Collaboration: You are comfortable working across functional teams—whether it's with engineers, product managers, or leadership. You communicate complex technical concepts in a clear and actionable manner, ensuring everyone is aligned to achieve shared goals.
  • Service Orientation: You are keen on understanding user needs and translating them into technical solutions that drive organizational success.
  • Active Learning: You are always looking for ways to improve processes, systems, and team workflows, ensuring that the work environment evolves as quickly as the technology we employ.

Why You’ll Love Working at SmartSimple + Foundant

  • At the heart of everything we do is a commitment to innovation and making a positive impact. Whether you’re working on projects that empower not-for-profits, community foundations, or corporations, your contributions will help drive real-world change.
  • We offer a competitive salary and benefits, including tuition and lifestyle reimbursements and bespoke mindfulness and fitness initiatives.
  • With our Flexible PTO policy, you’ll have the freedom to manage your time in a way that supports your personal well-being and professional success.
  • We’re committed to your professional and personal development. With our merger, you'll have the chance to collaborate across teams at both SmartSimple and Foundant, giving you exposure to diverse ideas, expertise, and projects that span multiple industries.
  • As part of a larger organization, you’ll have even more opportunities to grow your career. Whether it’s exploring new roles, leadership opportunities, or shifting to a different department, we support internal mobility to help you achieve your career goals.
  • You’ll enjoy autonomy and responsibility, empowering you to approach your work creatively and independently, fostering innovation and independent thought.
  • Employee recognition is a core part of our culture. When you do a great job, we make sure everyone knows about it!

SmartSimple and Foundant are equal opportunity employers, committed to building a diverse workforce that represents the communities we serve. We welcome and encourage applications from all qualified candidates, and will consider all applicants without regard to race, color, citizenship, religion, sex, marital/family status, sexual orientation, gender identity, Indigenous status, age, disability, or need for accommodation.

In accordance with the Ontario Human Rights Code, the Accessibility for Ontarians with Disabilities Act (AODA), and other applicable legislation, SmartSimple and Foundant are also committed to providing accommodations throughout the interview and employment process. Accommodations are available upon request for candidates participating in all aspects of the selection process. If you have accessibility requirements during the recruitment process and require accommodation, please contact hr@smartsimple.com.

Seniority level
  • Associate
Employment type
  • Full-time
Job function
  • Engineering, Information Technology, and Other
Industries
  • Software Development
