Manager, Site Reliability Engineering

QGenda

Atlanta (GA)

Hybrid

USD 90,000 - 150,000

Full time

24 days ago

Job summary

An established industry player in healthcare technology is seeking a Site Reliability Engineering Manager to lead a talented team. This role offers an exciting opportunity to enhance the reliability and performance of cloud-based systems while fostering a culture of collaboration and continuous improvement. You will be pivotal in ensuring compliance with healthcare regulations, managing infrastructure, and driving operational excellence. Join a dynamic work environment that values innovation and employee contributions, allowing you to make a significant impact in the healthcare sector while enjoying a comprehensive rewards package.

Benefits

Fully company-paid medical, dental, and vision insurance
Generous paid time off (PTO) policy
Paid parental leave
401(k) with company match
Hybrid-working model options
Annual Costco membership
Cell phone stipend
Commuter benefits
In-office perks

Qualifications

  • 5+ years in Site Reliability Engineering or DevOps in cloud environments.
  • 2+ years leading SRE or DevOps teams with a focus on collaboration.

Responsibilities

  • Lead and mentor Site Reliability Engineers for cloud-based systems.
  • Oversee incident management and ensure compliance with regulations.

Skills

Communication Skills
Problem-Solving Skills
Decision-Making Skills
Operational Excellence
Agile Methodologies

Tools

AWS
Infrastructure as Code (IaC)
CI/CD Pipelines

Job description

QGenda is redefining healthcare workforce management everywhere care is delivered. We're on a mission to empower the healthcare industry to better onboard, deploy, and manage their workforce. Over 4,500 healthcare organizations have trusted us to help them make strategic workforce decisions through our unified software platform. With more than 600 employees across the US, we are united in our vision and culture to make a difference for our customers, while enjoying the day-to-day.

At QGenda, we value our employees and their contributions toward the success of the business. We strive to create a dynamic work environment that fosters growth, innovation, and collaboration, where employees can be proud of the work they do and the impact it has on the healthcare industry.

As a Site Reliability Engineering (SRE) Manager, you'll lead and mentor a team of talented engineers to ensure the reliability, scalability, and performance of our cloud-based systems. This role focuses on fostering collaboration, improving automation, and proactively managing infrastructure to support growing demands while staying compliant with key regulations like HIPAA. You'll work cross-functionally with teams to design reliable and efficient applications, oversee incident management, and plan for future capacity. Additionally, you’ll contribute to team development, including recruiting and guiding career growth, while cultivating a culture of accountability and continuous improvement within the organization.

How You’ll Make an Impact
  • Team Leadership: Lead a growing team of Site Reliability Engineers, providing guidance, mentorship, and professional development opportunities.
  • Culture Building: Cultivate a strong culture of collaboration, ownership, and accountability, promoting a proactive approach to problem-solving and continuous learning.
  • Infrastructure Management: Oversee the performance, scalability, and reliability of our cloud-based infrastructure, ensuring it can scale to meet growing customer demands in a secure and cost-effective way.
  • Capacity Planning: Proactively plan and forecast infrastructure needs based on product growth, data trends, and customer requirements.
  • Cross-Team Collaboration: Partner with Product, Development, and QA Teams to ensure applications are designed with reliability, scalability, and performance in mind.
  • Security & Compliance: Ensure the platform complies with relevant healthcare regulations (e.g., HIPAA) while maintaining high standards of security, privacy, and data protection.
  • Automation & Monitoring: Oversee the development and maintenance of monitoring, alerting, and automation tools to improve operational efficiency and minimize manual interventions. Identify opportunities for automation to reduce toil and improve reliability.
  • Incident Management: Direct the response to production incidents, ensuring rapid resolution and root cause analysis. Implement post-mortem processes to improve the system and prevent future incidents.
  • Performance Reviews & Feedback: Conduct regular performance reviews, offer constructive feedback, and guide team members on their career paths.

Who You Are
  • Excellent communication skills, both technical and non-technical, with the ability to articulate complex issues to stakeholders at all levels.
  • Strong decision-making and problem-solving skills, with the ability to balance short-term firefighting with long-term strategic planning.
  • Ability to drive operational excellence and a culture of continuous improvement in a fast-paced, evolving environment.
  • Experience managing sprints in an agile environment that involves iterative planning, execution, and review to deliver value while adapting to changing requirements.
  • Strong experience with AWS cloud infrastructure.
  • Experience with infrastructure as code (IaC) and CI/CD pipelines.

Experience You Bring
  • 5+ years of experience in Site Reliability Engineering, DevOps, or similar roles in cloud-based environments, ideally with an enterprise software company (healthcare software experience is a plus).
  • Minimum of 2 years of experience leading a Site Reliability Engineering or DevOps team.
  • Proven experience managing and mentoring engineering teams, with a focus on building high-performing, collaborative teams.

Applicants for this position must be located in the United States (U.S.) and authorized to work for any employer there. We are unable to sponsor, take over sponsorship of, or hire candidates with an employment visa at this time.

What’s In It For You

We offer a comprehensive total rewards package to support our full-time employees and their families' day-to-day needs, well-being, and major life events, which includes:

  • Fully company-paid options for medical (both in-person and virtual), dental and vision insurance.
  • Generous paid time off (PTO) policy to enjoy periods of uninterrupted rest and relaxation for a healthy work/life balance.
  • Paid parental leave for birth, adoption or permanent placement.
  • 401(k) with company match.
  • Options to work in a hybrid-working model or remotely from home, depending on the position.
  • Annual Costco membership, cell phone stipend, commuter benefits, in-office perks and more.

QGenda delivers technology solutions to improve how healthcare is delivered and increase access - for everyone. We can only succeed by bringing together diverse minds, thoughts, ideas and team members to create better solutions for our customers and make us a better company as a whole. We are committed to creating a culture of embracing diversity, inclusion and equity for all.

QGenda is an Equal Employment Opportunity employer and makes all employment decisions without regard to race, color, religion, creed, gender, sex (including pregnancy), sexual orientation, gender identity or expression, national origin, ancestry, age, marital status, disability or genetic information, military status, status as a disabled or protected veteran, or any other protected status under applicable law.
