Site Reliability Engineer - Remote

Optum

Basking Ridge (NJ)

Remote

USD 110,000 - 115,000

Full time

Yesterday

Job summary

A leading healthcare organization is seeking a Site Reliability Engineer to enhance system reliability and performance. This remote role involves leading initiatives, collaborating with teams, and driving automation in a dynamic environment. Candidates should have extensive experience in cloud platforms, programming, and leadership within SRE teams. Join us to make a significant impact on health equity and organizational efficiency.

Qualifications

6+ years of site reliability engineering experience.
3+ years in a leadership or technical lead role.
3+ years of programming skills in Python, Go, or Java.

Responsibilities

Design, implement, and maintain scalable infrastructure solutions.
Lead the Site Reliability Engineering team in automating processes.
Manage incident response efforts and conduct root cause analyses.

Skills

Leadership

Automation

Cloud Platforms

Programming

Tools

Docker

Kubernetes

Terraform

Ansible

Prometheus

Grafana

Jenkins

GitLab CI

Optum is a global organization that delivers care, aided by technology, to help millions of people live healthier lives. The work you do with our team will directly improve health outcomes by connecting people with the care, pharmacy benefits, data and resources they need to feel their best. Here, you will find a culture guided by diversity and inclusion, talented peers, comprehensive benefits and career development opportunities. Come make an impact on the communities we serve as you help us advance health equity on a global scale. Join us to start Caring. Connecting. Growing together.

Software engineering is the systematic application of engineering to the design, development, implementation, testing and maintenance of software. The roles in this function cover all primary development activity across all technology functions, ensuring that we deliver high-quality code for our applications, products and services, understand customer needs, and develop product roadmaps.

These roles include, but are not limited to, analysis, design, coding, engineering, testing, debugging, standards, methods, tools analysis, documentation, research and development, maintenance, new development, operations and delivery. As with every role in the company, each position requires building quality into every output. This also includes evaluating new tools, techniques and strategies; automating common tasks; and building common utilities to drive organizational efficiency, with a passion for technology and solutions and thought leadership on future capabilities and opportunities to apply technology in new and innovative ways.

You’ll enjoy the flexibility to work remotely* from anywhere within the U.S. as you take on some tough challenges.

Primary Responsibilities

Lead digital-first initiatives for the UHC Provider Portal to improve customer experience
Design, implement, and maintain scalable, reliable, and secure infrastructure solutions to support application deployment and operational excellence
Develop and manage comprehensive monitoring, alerting, and incident response systems to ensure high availability and optimal performance of services
Lead the Site Reliability Engineering team in automating processes, reducing manual interventions, and enhancing system efficiencies through innovative tooling
Collaborate with development, product management, and architecture teams to integrate reliability and performance best practices into the software development lifecycle
Drive the creation and upkeep of documentation for system architectures, operational procedures, and SRE best practices to ensure knowledge sharing and consistency
Manage incident response efforts, conduct root cause analyses, and implement preventive measures to minimize downtime and enhance system resilience
Mentor and develop team members, fostering a culture of continuous improvement, learning, and professional growth
Align SRE initiatives with organizational goals, ensuring that reliability, security, and performance objectives support overall business strategies
Advocate for and implement security best practices within infrastructure and operational processes to safeguard systems and data

You’ll be rewarded and recognized for your performance in an environment that will challenge you and give you clear direction on what it takes to succeed in your role as well as provide development for other roles you may be interested in.

Required Qualifications

6+ years of site reliability engineering experience, including hands-on management of large-scale, distributed systems
6+ years of experience with public cloud platforms such as AWS, Azure, or Google Cloud, with proficiency in at least two major services within each platform
3+ years in a leadership or technical lead role, overseeing SRE teams and driving reliability-focused initiatives
3+ years of experience in containerization and orchestration technologies, including Docker and Kubernetes, with a minimum of 5 years of relevant experience
3+ years of programming and scripting skills in languages such as Python, Go, or Java, with 5+ years of hands-on development experience

Preferred Qualifications

4+ years of experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack) to ensure system health and performance
4+ years of experience with infrastructure as code (IaC) tools, such as Terraform or Ansible
4+ years of incident management, including root cause analysis and post-incident reviews
3+ years of experience with CI/CD pipelines and automation, utilizing tools like Jenkins, GitLab CI, or similar, with at least 5 years of experience
3+ years of enterprise security best practices, including implementing security measures within SRE processes
3+ years of experience with microservices architecture and deploying microservices at scale

*All employees working remotely will be required to adhere to UnitedHealth Group’s Telecommuter Policy

At UnitedHealth Group, our mission is to help people live healthier lives and make the health system work better for everyone. We believe everyone, of every race, gender, sexuality, age, location and income, deserves the opportunity to live their healthiest life. Today, however, there are still far too many barriers to good health which are disproportionately experienced by people of color, historically marginalized groups and those with lower incomes. We are committed to mitigating our impact on the environment and enabling and delivering equitable care that addresses health disparities and improves health outcomes, an enterprise priority reflected in our mission.

UnitedHealth Group is an Equal Employment Opportunity employer under applicable law and qualified applicants will receive consideration for employment without regard to race, national origin, religion, age, color, sex, sexual orientation, gender identity, disability, or protected veteran status, or any other characteristic protected by local, state, or federal laws, rules, or regulations.

UnitedHealth Group is a drug free workplace. Candidates are required to pass a drug test before beginning employment.

Seniority level
Mid-Senior level

Employment type
Full-time

Job function
Engineering and Information Technology

Industries
Hospitals and Health Care
