Senior Infrastructure Operations - FEDRAMP

Content Guru Limited

Reston (VA)

On-site

USD 80,000 - 120,000

Full time

30+ days ago

Job summary

An established industry player is seeking a Senior Infrastructure Operations Engineer to join their dynamic team. This role is pivotal in designing, deploying, and maintaining global infrastructure while ensuring service stability for critical applications. You will engage in fault management, capacity planning, and project implementation, working closely with various engineering teams. The position requires a proactive approach, emphasizing speed and efficiency in resolving issues and delivering projects. If you are a US citizen with a passion for infrastructure operations and a commitment to excellence, this is an exciting opportunity to make a significant impact in a fast-paced environment.

Qualifications

  • Experience in designing and maintaining global infrastructure.
  • Ability to manage faults and escalations effectively.

Responsibilities

  • Design and implement capacity monitoring solutions.
  • Act as an escalation point for faults and lead incident investigations.
  • Deliver training and support knowledge transfer among engineers.

Skills

Infrastructure Design
Fault Management
Capacity Management
DevOps Methodology
Project Management
Training and Knowledge Transfer

Education

Bachelor's Degree in Computer Science or related field

Tools

Monitoring Tools
Project Management Software

Job description

You will work within a team responsible for all activities associated with designing, deploying, and maintaining the global infrastructure supporting our key services and applications. The infrastructure role includes acting as an escalation point for engineering departments for customer-facing faults. Engineers are tasked with creating, approving, and executing changes associated with both the initial deployment and life-cycle management of all hardware and software, internal and external, deployed on production platforms. A primary responsibility of the team is the configuration, monitoring, and resolution of all platform alarms, as well as training and assessing rotation engineers. Working with other engineering teams and the Project Management team is crucial to the role, ensuring that customer pipelines are correctly factored into the capacity management of the platforms. As the team supports critical services, including emergency services, there is an element of out-of-hours work, including weekday overnight and weekend on-call shifts, to ensure minimum disruption and best-in-class customer service at all hours. DevOps methodology underpins every aspect of the Infrastructure Operations Engineering role.

There will be an emphasis on supporting Continuous Monitoring (Con-Mon) of the storm FedRAMP infrastructure and clients for the U.S. Federal Government. As such, the successful candidate must be a US citizen.

A successful Engineer recognizes the need for speed in everything they do, from identifying and resolving alarms through to delivering key project objectives, ensuring that all SLAs are met and that products and features are available to clients on the date promised. They will maintain a keen focus on the team's primary goals: service stability and service improvement for all clients. To achieve both of these aims, they will communicate effectively with other departments so that all aspects and perspectives are considered in everything they undertake.

To ensure the continued success of the team, engineers are expected to pass on their knowledge to other engineers, both in formal sessions and on a business-as-usual (BAU) basis, providing a friendly, approachable interface to maximize the effectiveness of knowledge transfer and encourage reciprocity. By extension, engineers will seek to assist other engineers and departments where possible and appropriate, and will work as the task at hand requires, remaining flexible to different approaches and ways of thinking.

Capacity Management

  • Designing capacity monitoring solutions and proposing solutions for resolving capacity constraints.

Faults / Escalations

  • Act as an escalation point for faults, attend/lead investigations into high-profile incidents, and attend wash-ups following high-profile incidents as an infrastructure representative.
  • Propose solutions for preventative measures to mitigate the recurrence of issues.
  • Create, approve, and implement solutions to faults.

Software Deployment

  • Review of application designs and work instructions.
  • Installation of new services and upgrades of existing services.
  • Writing, approving, and executing changes of all risk levels.

Project

  • Working with solution consultants during the pre-sale phase.
  • Leading sprint teams of various sizes to complete projects.
  • Design, plan, implement, and hand over projects of all complexities.

OOH Work

  • Act as a level 1 escalation on a rotational basis and level 2 escalation over the Christmas period.
  • Complete OOH changes for own projects where relevant and attend datacentre sites when required.
  • Complete overnight shifts on a rotational basis or as required.

Product/Supplier/Tool Management

  • Assigned technical point of contact for a business tool.

Training

  • Maintaining, creating, and delivering training.
