
SRE Lead

Nexthink

Bengaluru

Hybrid

INR 20,00,000 - 30,00,000

Full time

Today

Job summary

A tech company specializing in cloud solutions is seeking a Lead Site Reliability Engineer to drive the development of modern SRE processes. You will lead a team in managing a microservices-based cloud platform, ensuring reliability, security, and compliance. Ideal candidates have extensive SRE or DevOps experience, proficiency in cloud platforms, and strong leadership skills. This role offers a competitive compensation package and a hybrid work model.

Benefits

Permanent Contract
Competitive compensation package
Flexible Hours
Unlimited vacation
Company-paid volunteer days
Regular team events

Qualifications

  • 5+ years of experience in site reliability engineering, DevOps, or a related role, with at least 2 years in leadership.
  • Proficiency in cloud platforms and cloud-native services.
  • Strong scripting skills in Python, Bash, Go, or similar.

Responsibilities

  • Lead, mentor, and develop a team of Site Reliability Engineers.
  • Oversee the design, deployment, and management of cloud infrastructure.
  • Drive automation of infrastructure provisioning and management.

Skills

Site Reliability Engineering
DevOps
Cloud Platforms (AWS, Azure, GCP)
Python
Terraform
Docker
Kubernetes
CI/CD Pipelines
Network Security

Education

Bachelor’s degree in Computer Science or Engineering

Tools

SIEM
Ansible

Job description

Overview

Nexthink is looking for a Lead Site Reliability Engineer who is passionate about building and running a high-performance cloud platform and enabling best-in-class site reliability and operations practices. This role will support Nexthink operations globally. The candidate will drive the development of modern, cloud-native SRE processes, as well as the management and operation of Nexthink’s multi-tenant, microservices-based cloud platform, which has multiple instances deployed across the globe.

This role involves working closely with cross-functional teams to integrate reliability and security into our systems, ensuring they meet the applicable standards. The ideal candidate will have extensive experience in both software engineering and systems administration, with a strong understanding of SRE concepts, requirements, and security practices.

Responsibilities
  • Lead, mentor, and develop a team of India-based Site Reliability Engineers.
  • Foster a culture of continuous improvement, collaboration, and innovation.
  • Oversee the design, deployment, and management of scalable and secure cloud infrastructure.
  • Drive automation of infrastructure provisioning, configuration, and management using Infrastructure as Code (IaC) tools.
  • Develop and maintain comprehensive monitoring, logging, and alerting systems to ensure high availability and performance.
  • Lead efforts in performance tuning and optimization for applications and infrastructure.
  • Ensure implementation and maintenance of security controls and best practices to achieve compliance with standards and certifications.
  • Conduct and oversee regular security assessments, vulnerability scans, and penetration testing.
  • Collaborate with the compliance team to prepare for and respond to audits.
  • Lead incident management efforts, ensuring rapid resolution and thorough root cause analysis.
  • Develop and implement strategies for improving incident response and minimizing downtime.
  • Work closely with development, operations, and security teams to integrate reliability and security into the software development lifecycle.
  • Communicate effectively with stakeholders, providing regular updates on system performance, reliability, and compliance status.

Qualifications
  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • 5+ years of experience in site reliability engineering, DevOps, or a related role, with at least 2 years in a leadership position.
  • Proficiency in cloud platforms (AWS, Azure, GCP) and cloud-native services.
  • Strong scripting and programming skills (Python, Bash, Go, or similar).
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, CrossPlane, CloudFormation, or Ansible.
  • Knowledge of containerization and orchestration (Docker, Kubernetes).
  • Familiarity with CI/CD pipelines and tools (Jenkins, GitLab, GitHub, etc.).
  • In-depth knowledge of the requirements and best practices of standards such as ISO and SOC 2.
  • Experience with security tools and practices (SIEM, IDS/IPS, firewalls).
  • Understanding of network security, encryption, and secure software development practices.
  • Ability to collaborate and communicate effectively with global, multicultural engineering teams across EU and US time zones.
  • Ability to report to upper engineering management in a timely and effective manner.

Benefits
  • Permanent Contract and a competitive compensation package (including stock options).
  • Hybrid work model balancing office and remote work, with a structured approach for new hires to foster connections and onboarding.
  • Flexible hours and unlimited vacation (employees have unlimited paid time off on top of the 22 days of holidays we offer), plus 3 company-paid volunteer days.
  • Fresh fruit, cookies, and soft drinks.
  • Regular company and team events like Voluntary Days, Pizza talks, Team Building activities, hosting Meetups at the office and more!
  • Bonuses for referring successful hires after three months of continuous employment.

Additional information: Please note that not all the benefits listed above are available for temporary, contract, and internship roles. To ensure you have the most up-to-date information, we recommend checking with your Recruitment Partner.
