Site Reliability Engineer
Location: Onsite – Kanata, Ontario
About Our Client
Imagine a startup delivering real-time data insights that empower businesses to make smarter, faster decisions. Backed by one of the world’s top tech groups, we blend cutting-edge technology with deep expertise to help companies stay agile and ahead of the curve. With the strength of a powerhouse behind us, we drive innovation and create transformative solutions for today’s dynamic markets.
Edge Signal provides a full-fledged edge computing platform powering computer-vision applications across Retail, Hospitality and Warehousing. They run entirely on AWS, ingesting and analyzing data from massive fleets of on-premise devices, with Datadog for monitoring.
We’re looking for an experienced Site Reliability Engineer to keep their cloud and edge infrastructure running flawlessly—and to help their customers get up and running smoothly.
This position is based at their head office in Kanata, Ottawa, reporting to the Director of Technology.
Responsibilities
Ensure highly available, fault-tolerant AWS services (auto-scaling, disaster recovery, capacity planning).
Build and maintain Datadog dashboards, monitors and alerts for cloud resources and edge devices; author runbooks and automation scripts for incident response.
Develop tooling to provision, update and health-check thousands of edge devices; ingest device telemetry into Datadog for unified observability.
Automate routine ops tasks (onboarding steps, incident remediation) using shell, Python, etc.
Lead customer installations by configuring IP cameras, NVRs, and Edge Signal agents on-site.
Guide network, security and firmware setups to ensure seamless data flow from device to cloud.
Triage and resolve Freshdesk tickets; conduct root-cause analysis and drive timely closure.
Convert complex issues into Jira epics/stories and collaborate with product teams to ship fixes.
Manage AWS IAM (users, roles, policies, SSO) and enforce security best practices.
Monitor and optimize AWS spend—set budgets, report usage and recommend cost-savings strategies.
Integrate secrets management, vulnerability scanning and other compliance controls.
Qualifications
A Bachelor's degree in Computer Science or a related engineering field is required.
3+ years of experience as an SRE or DevOps engineer supporting production AWS environments.
Proven expertise in Datadog (APM, Infrastructure, Logs, Synthetic checks)
Strong Linux administration skills and proficient scripting ability (Bash, Python, or Go)
Experience with AWS IAM, SSO, Control Tower, cost-management tools, and billing dashboards
Excellent communicator with a bias toward collaboration and customer empathy
Preferred Qualifications
Prior work with edge computing or IoT device fleets
Experience configuring IP cameras, RTSP streams, and NVR systems
Freshdesk and Jira administration experience
AWS DevOps or Solutions Architect certification