Manager, Site Reliability Engineering

QGenda

Atlanta (GA)

Hybrid

USD 132,000 - 178,000

Full time

11 days ago

Job summary

An established industry player is seeking a Site Reliability Engineering Manager to lead a talented team in ensuring the reliability and scalability of cloud-based systems. This role emphasizes collaboration, automation, and proactive infrastructure management to meet growing demands while adhering to healthcare regulations. You will cultivate a culture of continuous improvement and accountability, guiding team development and enhancing operational efficiency. Join a dynamic organization committed to transforming healthcare workforce management and making a meaningful impact in the industry.

Benefits

Fully company-paid medical, dental, and vision insurance
Generous paid time off (PTO) policy
Paid parental leave
401(k) with company match
Hybrid-working model or remote options
Annual Costco membership
Cell phone stipend
Commuter benefits
In-office perks

Qualifications

  • 5+ years of experience in Site Reliability Engineering or DevOps.
  • Minimum of 2 years leading a Site Reliability Engineering or DevOps Team.

Responsibilities

  • Lead a team of Site Reliability Engineers to ensure system reliability.
  • Oversee incident management and capacity planning.

Skills

Site Reliability Engineering
AWS Cloud Infrastructure
Agile Methodologies
Problem-Solving
Team Leadership

Education

Bachelor's Degree in Computer Science or related field

Tools

Infrastructure as Code (IaC)
CI/CD Pipelines
Monitoring Tools

Job description

Who We Are

QGenda is redefining healthcare workforce management everywhere care is delivered. We're on a mission to empower the healthcare industry to better onboard, deploy, and manage their workforce. Over 4,500 healthcare organizations have trusted us to help them make strategic workforce decisions through our unified software platform. With more than 600 employees across the US, we are united in our vision and culture to make a difference for our customers, while enjoying the day-to-day.

At QGenda, we value our employees and their contributions toward the success of the business. We strive to create a dynamic work environment that fosters growth, innovation, and collaboration, where employees can be proud of the work they do and the impact it has on the healthcare industry.

QGenda is headquartered in Atlanta.

To learn more about QGenda, visit us at qgenda.com or follow us on Instagram or LinkedIn.

About Your Role

As a Site Reliability Engineering (SRE) Manager, you'll lead and mentor a team of talented engineers to ensure the reliability, scalability, and performance of our cloud-based systems. This role focuses on fostering collaboration, improving automation, and proactively managing infrastructure to support growing demands while staying compliant with key regulations like HIPAA. You'll work cross-functionally with teams to design reliable and efficient applications, oversee incident management, and plan for future capacity. Additionally, you’ll contribute to team development, including recruiting and guiding career growth, while cultivating a culture of accountability and continuous improvement within the organization.

How You’ll Make An Impact

  • Team Leadership: Lead a growing team of Site Reliability Engineers, providing guidance, mentorship, and professional development opportunities.
  • Culture Building: Cultivate a strong culture of collaboration, ownership, and accountability, promoting a proactive approach to problem-solving and continuous learning.
  • Infrastructure Management: Oversee the performance, scalability, and reliability of our cloud-based infrastructure, ensuring it can scale to meet growing customer demands in a secure and cost-effective way.
  • Capacity Planning: Proactively plan and forecast infrastructure needs based on product growth, data trends, and customer requirements.
  • Cross-Team Collaboration: Partner with Product, Development, and QA Teams to ensure applications are designed with reliability, scalability, and performance in mind.
  • Security & Compliance: Ensure the platform complies with relevant healthcare regulations (e.g., HIPAA) while maintaining high standards of security, privacy, and data protection.
  • Automation & Monitoring: Oversee the development and maintenance of monitoring, alerting, and automation tools to improve operational efficiency and minimize manual interventions. Identify opportunities for automation to reduce toil and improve reliability.
  • Incident Management: Direct the response to production incidents, ensuring rapid resolution and root cause analysis. Implement post-mortem processes to improve the system and prevent future incidents.
  • Performance Reviews & Feedback: Conduct regular performance reviews, offer constructive feedback, and guide team members on their career paths.

Who You Are

  • Excellent communication skills, both technical and non-technical, with the ability to articulate complex issues to stakeholders at all levels.
  • Strong decision-making and problem-solving skills, with the ability to balance short-term firefighting with long-term strategic planning.
  • Ability to drive operational excellence and a culture of continuous improvement in a fast-paced, evolving environment.
  • Experience managing sprints in an agile environment that involves iterative planning, execution, and review to deliver value while adapting to changing requirements.
  • Strong experience with AWS cloud infrastructure.
  • Experience with infrastructure as code (IaC) and CI/CD pipelines.

Experience You Bring

  • 5+ years of experience in Site Reliability Engineering, DevOps, or similar roles in cloud-based environments, ideally with an enterprise software company (healthcare software experience is a plus).
  • Minimum of 2 years of experience leading a Site Reliability Engineering or DevOps Team.
  • Proven experience managing and mentoring engineering teams, with a focus on building high-performing, collaborative teams.

Applicants for this position must be authorized to work for any employer in the United States (U.S.) and must be located in the U.S. We are unable to sponsor, take over sponsorship of, or hire candidates with an employment visa at this time.

What’s In It For You

We offer a comprehensive total rewards package to support our full-time employees and their families' day-to-day needs, well-being, and major life events, which includes:

  • Fully company-paid options for medical (both in-person and virtual), dental and vision insurance
  • Generous paid time off (PTO) policy to enjoy periods of uninterrupted rest and relaxation for a healthy work/life balance
  • Paid parental leave for birth, adoption or permanent placement
  • 401(k) with company match
  • Options to work in a hybrid-working model or remotely from home, depending on the position
  • Annual Costco membership, cell phone stipend, commuter benefits, in-office perks and more

QGenda delivers technology solutions to improve how healthcare is delivered and increase access - for everyone. We can only succeed by bringing together diverse minds, thoughts, ideas and team members to create better solutions for our customers and make us a better company as a whole. We are committed to creating a culture of embracing diversity, inclusion and equity for all.

QGenda is an Equal Employment Opportunity employer and makes all employment decisions without regard to race, color, religion, creed, gender, sex (including pregnancy), sexual orientation, gender identity or expression, national origin, ancestry, age, marital status, disability or genetic information, military status, status as a disabled or protected veteran or any other protected status under applicable law.

If you require accommodations or assistance to complete the online application process, please contact recruiting@qgenda.com and identify the type of accommodation or assistance you are requesting. Do not include any medical or health information in this email. We will respond to your email promptly.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Quality Assurance
  • Industries: IT Services and IT Consulting
