Senior Site Reliability Engineer

Flexera

United States

Remote

USD 80,000 - 100,000

Full time

30+ days ago

Job summary

Join a forward-thinking company as a Senior Site Reliability Engineer, where your expertise will drive the transformation of the software industry. In this role, you will tackle challenges in a fast-paced environment, focusing on building fault-tolerant, scalable systems while collaborating with cross-functional teams. Your contributions will enhance CI/CD pipelines, improve operational efficiency, and ensure the reliability of cloud infrastructure. This innovative firm values passionate professionals eager to grow their skills and make a significant impact. If you're ready to take on exciting challenges in a supportive culture, this opportunity is perfect for you.

Qualifications

  • 4+ years managing mission-critical production systems in AWS or equivalent.
  • Experience in deploying and orchestrating containers.

Responsibilities

  • Automate repetitive operations work to eliminate operational toil.
  • Design, develop, and deploy new features for Flexera products.

Skills

Agile software delivery methodologies
Cloud-based services management (AWS, Azure)
DevOps practices
Infrastructure provisioning
Container orchestration (Docker, Kubernetes)
Linux expertise
Networking fundamentals
GitHub for collaboration
AWS services (EC2, ECS, EKS, S3)
Database exposure (MySQL, Amazon RDS, MongoDB)

Education

Computer Science degree

Tools

Grafana
Prometheus
Docker
Kubernetes

Job description

We’re transforming the software industry. We’re Flexera. With more than 50,000 customers across the world, we’re achieving that goal. But we know we can’t do any of that without our team. Ready to help us re-imagine the industry during a time of substantial growth and ambitious plans? Come and see why we’re consistently recognized by Gartner, Forrester and IDC as a category leader in the marketplace.

Flexera delivers Technology Value Optimization solutions that enable some of the largest companies in the world to inform their IT so they can transform their IT. From on-prem to the cloud, companies can get the IT asset data needed to rightsize, reallocate spend, reduce risk and maximize ROI.

Senior Site Reliability Engineer (SRE)

Flexera is looking for an experienced Site Reliability Engineer to join our SRE team. We're a fast-growing, category-leading organization with ambitious objectives and a positive, inclusive culture. We're looking for passionate professionals who want to grow their talents and achieve great things. If that sounds like you, we want to talk to you about joining our team.

As a Site Reliability Engineer, you will be tasked with everything from helping with product design to diagnosing issues and writing automation scripts that remediate issues in our production systems. You will be driven to build fault-tolerant, scalable systems and to automate away as much operational toil as you can. You align with the goals of the DevOps movement in improving collaboration between the development and operations disciplines.

We are seeking someone with extensive experience working on a SaaS/Cloud product with a microservices architecture.

Responsibilities:

  • Help eliminate operational toil by automating repetitive operations work.
  • Establish and enhance CI/CD pipelines.
  • Create dashboards with Grafana/Prometheus to communicate the metrics for a given product service.
  • Collaborate with other teams.
  • Investigate, debug and provide resolution for customer issues.
  • Mentor team members on cloud computing, infrastructure, and best practices.
  • Ensure the security and reliability of shared infrastructure in the Flexera cloud.
  • Make reliability a first-class citizen.
  • Design, develop and deploy new features for Flexera products/platforms, as defined by goals from the SRE organization.
  • Work with product owners and product engineering teams to perform capacity planning.
  • Work with product engineering teams to understand performance and behavior patterns.
  • Be part of an on-call rotation for alerts that require engineering expertise to diagnose.
  • Help carry out root cause analysis for incidents and design solutions (both software and human processes) to prevent recurrence.

Minimum Qualifications:

  • Computer Science degree, or at least 4 years of related industry experience managing a mission-critical production system in AWS (or an equivalent Azure or Google Cloud environment).

Critical Skills / Competencies:

Required:

  • Agile software delivery methodologies.
  • Experience managing cloud-based services like AWS or Azure at scale.
  • Experience with DevOps.
  • Infrastructure provisioning experience.
  • Experience deploying to and orchestrating containers (Docker, Kubernetes, etc.).
  • Expertise in Linux and a good understanding of its commands.
  • Good networking fundamentals.
  • GitHub for collaboration and change management.
  • Experience with AWS services such as EC2, ECS, EKS, S3.
  • Database exposure, preferably MySQL, Amazon RDS, and MongoDB.

Good to have:

  • Understanding of RESTful APIs and other web-based application concepts.
  • Experience with any scripting language (Ruby is the current language, but comparable experience in Java, Python, Perl, etc. would suffice).
  • Knowledge of Go (Golang).
  • Knowledge of Helm.
