Site Reliability Engineer III

JPMorgan Chase & Co.

Glasgow

On-site

GBP 50,000 - 75,000

Full time

30+ days ago

Job summary

Join a dynamic team at JPMorgan Chase as a Site Reliability Engineer III, where you will impact the innovation of our technology systems. In this role, you will ensure the reliability and observability of complex applications, working with programming, cloud infrastructure, and automation. Ideal candidates are solution-oriented, with a passion for technology and a desire to contribute to mission-critical systems.

Qualifications

  • Experience with public cloud platforms like AWS, Azure, or GCP.
  • Proficient in programming languages such as Python, Go, or Java.
  • Strong skills in cloud computing and debugging.

Responsibilities

  • Drive improvement of reliability and monitoring for microservices.
  • Collaborate with development teams to ensure software reliability.
  • Implement metrics and dashboards to monitor application performance.

Skills

Reliability
Automation
Monitoring
Debugging
Collaboration

Education

Formal training in site reliability engineering

Tools

AWS
Kubernetes
Jenkins
Terraform
Dynatrace
Datadog

Job description

There’s nothing more exciting than being at the center of a rapidly growing field in technology and applying your skillsets to drive innovation and modernize the world's most complex and mission-critical systems.

Our team is globally located and focused on ensuring production stability, automation, reliability, and observability. We are looking for solution-oriented, commercially minded, customer-focused individuals who are used to working in an agile environment and want to be part of building something new from the ground up within a diverse and inclusive team.

Culture is important to us, and we are looking for intellectually curious individuals who are passionate about new technology and would like to expand their skills whilst working on an exciting new venture for the firm. Your work will have a massive impact, both on us as a company and on our clients and business partners around the world.

As a Site Reliability Engineer III at JPMorgan Chase within Corporate Technology - Capital Management, you will solve complex and broad business problems with simple and straightforward solutions. Through code and cloud infrastructure, you will configure, maintain, monitor, and optimize applications and their associated infrastructure to independently decompose and iteratively improve on existing solutions. You are a significant contributor to your team by sharing your knowledge of end-to-end operations, availability, reliability, and scalability of your application or platform.

Job responsibilities

  • Drive the continuous improvement of reliability, monitoring, and alerting for our mission-critical microservices.
  • Reduce toil through automation, creating reliable infrastructure and tooling to expedite feature development.
  • Develop and add metrics to microservices, define user-journeys, SLOs and error budgets, and configure dashboards and alerts based on these.
  • Facilitate blameless post-mortems and ensure permanent closure of incidents.
  • Engage with the development team throughout the life cycle to help develop software for reliability and scale, ensuring minimal refactoring or changes, and design self-healing and resiliency patterns.
  • Collaborate and influence across the organization on behalf of your application portfolio.
  • Respond to incidents alongside developers and infrastructure engineers where required, providing support and insight.
  • Collaborate with other software engineers and teams to design and implement deployment approaches using automated continuous integration and continuous delivery pipelines.
  • Implement infrastructure, configuration, and network as code for the applications and platforms in your remit.
  • Understand service level indicators and use service level objectives to proactively resolve issues before they impact customers.
  • Support the adoption of site reliability engineering best practices within your team (metrics, alerting, logging, automation, resiliency, capacity, performance).

Required qualifications, capabilities, and skills

  • Formal training or certification in site reliability engineering concepts and proficient applied experience in a public cloud such as AWS, Azure, or GCP.
  • Proficiency in at least one programming language such as Python, Go, or Java/Spring Boot.
  • Expertise in designing, coding, testing, and delivering software in at least one technology stack.
  • Experience with Kubernetes.
  • Experience in cloud computing (preferably AWS).
  • Proficiency in one or more technology domains; may be a cross-domain expert able to solve complex and mission-critical problems within a business or across the firm.
  • Excellent debugging and troubleshooting skills.
  • Ability to contribute to large, collaborative teams and proactively recognize roadblocks, with a demonstrated interest in learning technology that facilitates innovation.
  • Experience with continuous integration and continuous delivery tools such as Jenkins, GitLab, and Terraform.
  • Experience with at least one observability tool such as Dynatrace, Datadog, New Relic, CloudWatch, AppDynamics, or Splunk.

Preferred qualifications

  • Experience with common SRE toolchains is a plus: Grafana, Prometheus, Elasticsearch, Kibana, Jaeger.