Working for a company like Smile Digital Health means supporting our mandate for #BetterGlobalHealth. We strive towards this goal every day, and the results can be seen in the impact of our innovative health data platform and data management solutions, which are used in over 20 countries. We were #19 on Deloitte's Technology Fast 50 Ranking for 2024!
Smile Digital Health makes it easy for healthcare stakeholders to collect and exchange data with our leading FHIR-based data liberation platform.
At its heart, the Smile platform enables people and organizations to better manage healthcare data. We help generate and liberate structured healthcare data to ensure effective delivery across care teams and health systems, bringing #BetterGlobalHealth to patients every day!
Apply today and find plenty of reasons to SMILE!
The Cloud Site Reliability Engineer (SRE) is responsible for ensuring the reliability, scalability, and performance of production‑grade services deployed across multiple cloud vendors and infrastructure platforms for Smile Digital Health, its clients, and partners. This role designs and automates performance testing frameworks, integrates them into CI/CD pipelines, and uses observability tools to proactively detect and resolve bottlenecks. Working closely with engineering, product, and security teams, the SRE ensures systems meet strict SLAs for performance and availability while driving continuous optimization across multiple cloud platforms.
Responsibilities:
- Collaborate with our Security Operations teams to help define and implement best practices around Cloud Service Provider configuration for Azure and other cloud providers.
- Develop, implement and coordinate a multi-tenant approach to service offerings such as databases, the container platform, authentication, certificates, and product registries.
- Design and maintain performance testing strategies, frameworks, and environments in the cloud.
- Develop and maintain cost/utilization tracking and attribution processes for all Cloud Service Providers.
- Create documentation around Cloud Service Provider offerings detailing use cases, best practices, and implementation details.
- Develop and maintain technical relationships with our core Cloud Service Providers.
- Implement and maintain a secure and scalable infrastructure platform for delivering Cloud Services applications.
- Ensure that internal and external SLAs are met or exceeded, and that system-centric KPIs are continuously monitored and improved.
- Create tools for automating deployment, monitoring and operations of the overall platform.
- Participate in an on‑call rotation to provide application support, incident management, and troubleshooting.
- Provide ongoing maintenance and support of internal tools, and improve system health and reliability.
- Assist customers with on‑site deployments when needed.
- Implement and manage observability tools (logging, metrics, tracing) for performance insights; OpenTelemetry (OTel) and the Grafana stack are preferred.
- Ongoing compliance with organizational policies, procedures and practices (such as, but not limited to, security policies) is a requirement of the employment or contractual agreement.
- Accountable for ensuring that all working hours are accurately reported in the time tracking system on a daily or weekly basis, that the majority of (if not all) hours are tracked as billable, and that the project management tool in the time tracking system is properly and fully used.
- Tracking and reporting of billable hours is a critical aspect of project management and delivery to our customers and this is a major area of accountability.
- Comply with the privacy, security and confidentiality policies. Hold all confidential information in trust and strict confidence and ensure that it shall be used only for the purposes required to fulfill employment obligations, and shall not be used for any other purpose, or disclosed to any third party.
Requirements:
- Demonstrated expertise in cloud service providers and best practices around implementation and configuration, preferably managing Azure on behalf of multiple teams for a company that delivers SaaS products.
- Experience with Kubernetes, OpenShift, Kafka, and the Elastic Stack.
- Proven experience working with microservices architecture, with a strong focus on Java‑based services.
- Experience in applying chaos engineering practices to evaluate and enhance system resiliency.
- Skilled in troubleshooting performance issues, including analyzing time consumption, allocating resources, and recommending optimizations.
- Familiar with performance testing methodologies and tools to assess system behavior under load.
- Proven experience with Security and Compliance (SOC2, HIPAA, ISO27001) best practices and how to implement controls that support high‑velocity software delivery teams.
- Proficiency in Terraform, Ansible or Chef.
- Expertise in troubleshooting, support escalation, on‑call process optimization and documenting knowledge.
- Passionate about infrastructure as code, automation, and developing solutions that help developers move quickly and safely.
- Familiarity with infrastructure management and operations lifecycle concepts and ecosystem.
- Experience operating and maintaining production systems in a Linux and public cloud environment.
- Prior experience working with high‑performance or distributed systems is expected, although we strive to hire at a variety of experience levels.
- Working knowledge of industry best practices regarding information security.
- Previous experience building or maintaining a large‑scale Cloud service.
- Proven ability to prioritize and track multiple projects in parallel.
- Proven ability to be highly responsive and customer‑focused.
$100,000 - $120,000 a year
Some of the benefits we offer:
- Remote Work Environment
- Flexible Time Away From Work Policy including PTO, Personal and Sick Days
- Competitive Salary and Health/Medical Benefits
- RRSP/TFSA/401K Employee Contribution
- Life and Disability
- Employee Assistance Program
- FHIR Study Program and Skillsoft Learning
- Super HAPI Fun Club
Smile's core values include respect, inclusion, embracing our differences, and celebrating shared values because our people are the foundation of our success. We are big on creating a sense of belonging and empowering each other to bring our authentic selves to work. We are dedicated to fostering a workplace that values diversity, equity, and inclusion.
We welcome and encourage candidates of all backgrounds to apply. Candidates are encouraged to inform us if they wish to discuss or require accommodations during interviews or while working at Smile.