Site Reliability Engineer Principal

Thecentermemphis

Memphis (TN)

Remote

USD 120,000 - 150,000

Full time

11 days ago

Job summary

FedEx Dataworks seeks a Site Reliability Engineer Principal responsible for enhancing system reliability and mentoring junior engineers. With a focus on cloud platforms and infrastructure management, this fully remote position offers competitive salary and growth opportunities, contributing to smarter supply chains.

Qualifications

  • 7 years of experience in SRE or equivalent, or 5 years with a Master's degree.
  • Experience with monitoring, incident management, and performance engineering.

Responsibilities

  • Drive improvement initiatives for global reliability of systems.
  • Mentor junior engineers and provide technical solutions.

Skills

Cloud platforms
Infrastructure as code
Reliability engineering
Scripting
DevOps
Monitoring and alerting

Education

Bachelor’s degree in Computer Science or related field
Master’s degree in Computer Science or related field

Tools

Docker
Kubernetes

Job description

Description

About FedEx Dataworks:

Born out of FedEx, a pioneer that ships nearly 20 million packages a day and manages endless threads of information, FedEx Dataworks is an organization rooted in connecting the physical and digital sides of our network to meet today's needs and address tomorrow's challenges.

We are creating opportunities for FedEx, our customers, and the world at large by:

  • Exploring and harnessing data to define and solve true problems
  • Removing barriers between data sets to create new avenues of insight
  • Building and iterating on solutions that generate value
  • Acting as a change agent to advance curiosity and performance

At FedEx Dataworks, we are making supply chains work smarter for everyone.

Company Name: FedEx Dataworks, Inc.

Job Title: Site Reliability Engineer Principal

Location: 3630 Hacks Cross Road, Memphis, TN 38125 (100% Remote)

Job Description: Takes ownership of and responsibility for the end-to-end resolution of complex problems and technical design gaps. Drives improvement initiatives that support the overarching global reliability of the company's systems, including capacity planning, failover strategies, performance improvements, reduction of Mean Time to Awareness and Mean Time to Resolve, and postmortems. Provides technical solutions, including requirements specification, functional decomposition, analysis, development and testing, for current, new and major programs. Leverages critical thinking to improve best practices and provides enterprise-level recommendations that ensure reliability and resiliency. Advises and mentors junior engineers.

Qualifications: Bachelor’s degree or equivalent* in Computer Science, Engineering, Information Systems or related field plus 7 years of experience in the job offered or 7 years of equivalent work experience in an information technology or engineering environment. The employer will alternatively accept a Master’s degree in Computer Science, Engineering, Information Systems or related field plus 5 years of experience in the job offered or 5 years of equivalent work experience in an information technology or engineering environment, in lieu of a Bachelor’s degree plus 7 years of experience.

The position requires experience with:

  • Cloud platforms, e.g. Azure, AWS or GCP
  • Programming and scripting
  • Infrastructure as code
  • Virtualization and containerization, e.g. Docker, Kubernetes
  • Monitoring, alerting and observability
  • Reliability and availability engineering, implementing service level agreements (SLAs) and service level objectives (SLOs)
  • Incident management, troubleshooting, root cause analysis, continuous improvement
  • Software development lifecycle, DevOps, continuous integration and continuous delivery (CI/CD)

A track record of excellence in maintaining system availability of complex environments at scale. Expert knowledge in Application Performance Monitoring, including end user experience measurement, run-time environment and Application Profiling. Ability to identify, debug and propose viable solutions to issues of scale and performance. Proven experience with implementing end-to-end monitoring and alerting. A related advanced degree may offset the experience requirements.

Position can telecommute from home from any location in the U.S.

*Employer will accept one (1) year of directly related experience in lieu of one (1) year of education.

FedEx Dataworks is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Dataworks does not discriminate against qualified individuals with disabilities in regard to job application procedures, hiring, and other terms and conditions of employment. Further, Dataworks is prepared to make reasonable accommodations for the known physical or mental limitations of an otherwise qualified applicant or employee to enable the applicant or employee to be considered for the desired position, to perform the essential functions of the position in question, or to enjoy equal benefits and privileges of employment as are enjoyed by other similarly situated employees without disabilities, unless the accommodation will impose an undue hardship. If a reasonable accommodation is needed, please contact DataworksTalentAcquisition@corp.ds.fedex.com.

Similar jobs

Site Reliability Engineer Principal

FedEx Group

Memphis

Remote

USD 120,000 - 160,000

15 days ago

Site Reliability Engineer Principal

Tnentertainment

Memphis

Remote

USD 120,000 - 170,000

10 days ago

Principal Site Reliability Engineer - Storage

Akamai Technologies GmbH

Remote

USD 148,000 - 308,000

2 days ago
Be an early applicant

Site Reliability Engineer

Great Question, Inc.

Remote

USD 100,000 - 150,000

2 days ago
Be an early applicant

Principal Site Reliability Engineer - Remote

Bright Horizons

Remote

USD 120,000 - 180,000

11 days ago

Systems Safety Engineer

Stratolaunch

Mojave

Remote

USD 117,000 - 200,000

11 days ago

Senior Software Engineer - Platform

BetterComp

Remote

USD 140,000 - 180,000

6 days ago
Be an early applicant

Principal Platform Engineer (Frontend)

Vonage

Remote

USD 130,000 - 180,000

4 days ago
Be an early applicant

Principal Network Site Reliability Engineer - OCI (REMOTE)

Oracle Cloud ERP

Remote

USD 120,000 - 160,000

21 days ago