SRE (Site Reliability Engineering)

Uplers

Pune District

Remote

INR 9,00,000 - 12,00,000

Full time

Today

Job summary

A reputable technology firm is seeking a Senior Site Reliability Engineer to ensure the reliability, scalability, and performance of systems. This mid-senior role, based in Pune, requires 8+ years of experience in IT, particularly with AWS technologies. You will collaborate with cross-functional teams, define SLOs, and maintain system health through effective monitoring. Apply today for this full-time remote opportunity.

Benefits

Day off on the 3rd Friday of every month
Monthly Wellness Reimbursement Program
Paid paternity and maternity leave

Qualifications

  • 7+ years of proven experience as a Senior Site Reliability Engineer or in a similar role.
  • 5+ years of AWS Cloud experience with certifications.
  • Experience with CI/CD tooling (GitHub Actions, Jenkins).

Responsibilities

  • Ensure the reliability and performance of systems and services.
  • Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • Deploy and manage monitoring tools for system health.

Skills

AWS Cloud experience
Strong problem-solving skills
Bash and Python scripting
Experience with monitoring systems

Education

Bachelor's degree in CS or related field

Tools

Terraform
CloudWatch
GitHub Actions

Job description

Experience: 8+ years • Salary: Confidential (based on experience) • Shift: GMT+05:30 Asia/Kolkata (IST) • Opportunity Type: Remote • Placement Type: Full time Permanent Position

Note: This opportunity is for one of Uplers' clients – Forbes Advisor.

Responsibilities
  • The Site Reliability Engineering (SRE) team is responsible for the reliability, scalability, stability and performance of systems and services.
  • Collaborate with cross-functional teams to design, build, maintain, and troubleshoot systems; bridge gaps between development and operations.
  • Define and monitor Service Level Objectives (SLOs) and Service Level Agreements (SLAs) for critical systems; ensure uptime in line with SLOs/SLAs.
  • Deploy and manage monitoring tools to gain insights on system health and performance.
  • Analyze performance, identify bottlenecks, and implement solutions to improve scalability and latency.
  • Develop scripts, tools, and automation frameworks to reduce manual deployment, monitoring, and scaling efforts.
  • Partner with development teams to implement observability practices (logging, metrics, tracing) and proactively diagnose issues.
  • Create actionable alerts on monitoring systems to ensure rapid response to production incidents.
  • Forecast resource needs and provision for current and future demand.
  • Design and conduct chaos experiments to test system resilience.
  • Own, define, and implement Disaster Recovery (DR) processes; conduct planned and unplanned mock DR drills.
  • Ensure security best practices are followed during design and operations.
  • Maintain documentation of processes, playbooks, and systems; publish KPI reports and health updates to the business.
Requirements
  • Bachelor's degree in CS or a related field, or equivalent experience.
  • 12+ years of overall IT experience.
  • 7+ years of proven work experience as a Senior Site Reliability Engineer or in a similar role.
  • 5+ years of AWS Cloud experience with an AWS certification (e.g., DevOps Engineer, SysOps, or Security).
  • 3+ years’ experience with AWS technologies (EC2, RDS, ELB, S3, VPC, CloudWatch & monitoring tools); emphasis on cloud security.
  • 2+ years of experience in CDN and/or Cache systems (e.g., Fastly, Akamai, CloudFront).
  • Understanding of Cloud deployments (AWS / Docker / Kubernetes).
  • Experience provisioning infrastructure with IaC tools and scripting (Terraform, Chef, Ansible, Shell, Groovy, Python, etc.).
  • Experience with monitoring systems (CloudWatch, New Relic, Datadog, Splunk, ELK stack).
  • Experience managing cloud network resources (AWS preferred) such as CloudWatch, VPC, DNS, proxies, firewalls, etc.
  • CI/CD tooling experience (GitHub Actions, Jenkins, etc.).
  • Experience with JIRA, Bitbucket, Fortify, SonarQube, Nexus/Nexus IQ.
  • Experience with configuration automation tools (Puppet/Ansible/Chef/Salt).
  • Scripting: strong Bash and Python skills; automation focus.
  • Operating Systems: Windows and Linux system administration.
  • Strong problem-solving, communication, and documentation skills.
Good to Have
  • Experience with Terraform/Ansible/Chef/Puppet.
  • Experience with GitHub Actions.
  • Experience with CloudFront, Fastly.
  • Ability to oversee team members performing these functions.
  • Proactive about anticipating problems and future technical needs.
  • Comfortable with a server-side focus, working client-side as needed; awareness of technology trends and best practices.
Perks
  • Day off on the 3rd Friday of every month (one long weekend each month).
  • Monthly Wellness Reimbursement Program to promote health and well-being.
  • Paid paternity and maternity leave.
How to Apply
  • Step 1: Click Apply, then register or log in on our portal.
  • Step 2: Complete the screening form and upload your updated resume.
  • Step 3: Increase your chances of being shortlisted and meet the client for the interview.
About Uplers

Our goal is to make hiring reliable, simple, and fast. We help talent find and apply for relevant opportunities, and we support them through any challenges during the engagement. There are more opportunities on the portal; depending on the assessments you clear, you can apply for those as well. If you are ready for a new challenge and a great work environment, apply today.

Job Details
  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Technology, Information and Internet
