Site Reliability Engineer Principal

Tnentertainment

Memphis (TN)

Remote

USD 120,000 - 170,000

Full time

10 days ago

Job summary

FedEx Dataworks is looking for a Site Reliability Engineer Principal to enhance system reliability and operational practices. This role involves addressing technical design challenges, managing incident response, and mentoring junior engineers, with telecommuting options from anywhere in the U.S.

Qualifications

  • 7 years of experience in information technology or engineering.
  • Expert knowledge in Application Performance Monitoring.
  • Evidence of maintaining system availability in complex environments.

Responsibilities

  • End-to-end resolution of complex problems and technical design gaps.
  • Drives global reliability initiatives for company systems.
  • Provides technical solutions and mentors junior engineers.

Skills

Cloud platforms
Programming and scripting
Infrastructure as code
Virtualization and containerization
Monitoring, alerting and observability
Incident management
Software development lifecycle
DevOps

Education

Bachelor’s degree in Computer Science
Master’s degree in Computer Science

Tools

Docker
Kubernetes

Job description

About FedEx Dataworks:

Born out of FedEx, a pioneer that ships nearly 20 million packages a day and manages endless threads of information, FedEx Dataworks is an organization rooted in connecting the physical and digital sides of our network to meet today's needs and address tomorrow's challenges.

We are creating opportunities for FedEx, our customers, and the world at large by:

  • Exploring and harnessing data to define and solve true problems
  • Removing barriers between data sets to create new avenues of insight
  • Building and iterating on solutions that generate value
  • Acting as a change agent to advance curiosity and performance

At FedEx Dataworks, we are making supply chains work smarter for everyone.

Company Name: FedEx Dataworks, Inc.

Job Title: Site Reliability Engineer Principal

Location: 3630 Hacks Cross Road, Memphis, TN 38125 (100% Remote)

Job Description: Takes ownership and responsibility in the end-to-end resolution of complex problems and technical design gaps. Drives improvement initiatives that support the overarching global reliability of the company's systems, including capacity planning, failover strategies, performance improvements, reduction of Mean Time to Awareness/Resolve, and postmortems. Provides technical solutions, including requirements specification, functional decomposition, analysis, development and testing for current, new and major programs. Leverages critical thinking to improve best practices and provides enterprise-level recommendations that ensure reliability and resiliency. Advises and mentors junior engineers.

Qualifications: Bachelor’s degree or equivalent* in Computer Science, Engineering, Information Systems or related field plus 7 years of experience in the job offered or 7 years of equivalent work experience in an information technology or engineering environment. The employer will alternatively accept a Master’s degree in Computer Science, Engineering, Information Systems or related field plus 5 years of experience in the job offered or 5 years of equivalent work experience in an information technology or engineering environment, in lieu of a Bachelor’s degree plus 7 years of experience.

The position requires experience with:

  • Cloud platforms, e.g. Azure, AWS or GCP
  • Programming and scripting
  • Infrastructure as code
  • Virtualization and containerization, e.g. Docker, Kubernetes
  • Monitoring, alerting and observability
  • Reliability and availability engineering, implementing service level agreements (SLAs) and service level objectives (SLOs)
  • Incident management, troubleshooting, root cause analysis, continuous improvement
  • Software development lifecycle, DevOps, continuous integration and continuous delivery (CI/CD)

A track record of excellence in maintaining system availability of complex environments at scale. Expert knowledge in Application Performance Monitoring, including end-user experience measurement, run-time environment and application profiling. Ability to identify, debug and propose viable solutions to issues of scale and performance. Proven experience with implementing end-to-end monitoring and alerting. A related advanced degree may offset the experience requirements.

Position can telecommute from home from any location in the U.S.

*Employer will accept one (1) year of directly related experience in lieu of one (1) year of education.

FedEx Dataworks is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Dataworks does not discriminate against qualified individuals with disabilities in regard to job application procedures, hiring, and other terms and conditions of employment. Further, Dataworks is prepared to make reasonable accommodations for the known physical or mental limitations of an otherwise qualified applicant or employee to enable the applicant or employee to be considered for the desired position, to perform the essential functions of the position in question, or to enjoy equal benefits and privileges of employment as are enjoyed by other similarly situated employees without disabilities, unless the accommodation will impose an undue hardship. If a reasonable accommodation is needed, please contact DataworksTalentAcquisition@corp.ds.fedex.com.
