Site Reliability Engineer

NinjaOne

United Kingdom

Remote

GBP 70,000 - 90,000

Full time

Today

Job summary

A leading IT solutions provider is seeking a Site Reliability Engineer to join their team. The role involves resolving complex infrastructure issues and ensuring application integrity, with a strong focus on automation and observability. Ideal candidates will have over 5 years of experience and extensive AWS proficiency. This position offers full remote flexibility and aims to enhance IT processes for numerous clients, driving better visibility and security across endpoints.

Benefits

Access to Corporate Benefits Platform
Develop skills through training
Competitive compensation
Inclusive and diverse work environment

Qualifications

  • 5+ years’ experience in Site Reliability Engineer roles.
  • Expert+ level in Linux administration and troubleshooting.
  • Comprehensive experience with AWS and its core capabilities.
  • Hands-on experience with CI/CD and SDLC processes.

Responsibilities

  • Diagnose and resolve complex application and infrastructure issues.
  • Participate in on-call rotation and deployment planning.
  • Perform Root Cause Analysis (RCA) for applications.
  • Ensure best-practice architecture and security.

Skills

Linux administration
Scripting and troubleshooting
Observability tools (Prometheus/Grafana, New Relic, Splunk, DataDog)
AWS experience
Cloud automation and infrastructure-as-code tools
Containers and Kubernetes
CI/CD processes
Effective communication skills

Tools

CloudFormation
Terraform
Helm
Ansible

Job description

About the Role

At NinjaOne we are passionate about building unified IT solutions that simplify the way IT organizations work. We are currently looking for a Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion for automation and observability, ensuring the quality and availability of our services.

Location - We are flexible on remote working if you are based in the UK or Germany. This is a fully remote position with the option to be hybrid if you prefer.

On Call Requirements - Participate in our 24x7 on-call rotation, SCRUM, and deployment planning.

We hire the best software engineers, but experience in our stack can’t hurt: NinjaOne is built on Java, Kotlin, C++, and Postgres, supporting millions of user endpoints and running as a scalable cloud service in AWS. Knowing large-scale datastore bottlenecks, asynchronous application design and client-server architecture will help you.

What you’ll be doing

  • Diagnose and resolve complex application and infrastructure issues
  • Participate in our 24x7 on-call rotation, SCRUM, and deployment planning
  • Perform Root Cause Analysis (RCA) and provide recommendations for application teams
  • Improve availability and reduce customer impact using industry-leading observability tools
  • Ensure best-practice and security-minded architecture by influencing design decisions
  • Create and maintain technical documentation and SOPs
  • Develop software, scripts, or tooling to improve efficiency and reduce delivery time of applications and infrastructure
  • Other duties as needed

About You

  • 5+ years’ experience in Site Reliability Engineer roles
  • Expert+ level Linux administration, scripting, and troubleshooting
  • Demonstrable knowledge of Observability tools (Prometheus/Grafana, New Relic, Splunk, DataDog)
  • Comprehensive experience with AWS (Amazon Web Services) and its core capabilities (VPC, EC2, ECS, Route53, Fargate, ALB/NLB distributions, etc.)
  • Extensive experience with cloud automation and infrastructure-as-code (IaC) toolsets, primarily CloudFormation but also including Terraform, Helm and Ansible. CDK a plus.
  • Good understanding of containers, Fargate, Kubernetes, and overall distributed microservice architectures
  • Passionate about automation, security, and self-service environments/portals
  • Hands-on experience with CI/CD and SDLC (Software Development Life Cycle) processes
  • Effective communication skills, both verbal and written

About Us

NinjaOne automates the hardest parts of IT to deliver visibility, security, and control over all endpoints for more than 30,000 customers. The NinjaOne automated endpoint management platform is proven to increase productivity, reduce security risk, and lower costs for IT teams and managed service providers. NinjaOne is obsessed with customer success and provides free and unlimited onboarding, training, and support. NinjaOne is #1 on G2 in endpoint management, patch management, remote monitoring and management, and mobile device management.

What You’ll Love

Grow personally and professionally with one of the fastest-growing companies.

Access to our Corporate Benefits Platform (with discounts for brands such as Expedia, FitX, Zalando and many more).

Develop your skills through our renowned training platform.

Receive competitive compensation.

Collaborate with a curious, kind, international and intercultural workforce.

This position is NOT eligible for Visa sponsorship.

All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, genetic information, marital status, veteran status, or any other status protected by applicable law. We are committed to providing an inclusive and diverse work environment.
