Senior Site Reliability Engineer - AWS Kubernetes

ZipRecruiter

London

On-site

GBP 60,000 - 80,000

Full time

30+ days ago

Job summary

Join a global financial services provider as a Full Stack Infrastructure Engineer. You'll be part of a new team focused on cloud and infrastructure technologies, responsible for designing and managing robust solutions that ensure reliability and performance. This role requires expertise in cloud platforms, networking, and automation, making it an exciting opportunity to shape the future of infrastructure in a dynamic environment.

Qualifications

  • Proven experience managing and optimizing a diverse infrastructure stack.
  • Extensive knowledge of cloud platforms (AWS, Azure, GCP) and infrastructure as code.

Responsibilities

  • Help design, implement, and manage robust infrastructure solutions.
  • Ensure reliability, scalability, and performance of infrastructure.

Skills

Cloud Platforms
Network Protocols
Scripting
DevOps Practices
Disaster Recovery

Tools

Terraform
AWS CloudWatch
Wireshark
Prometheus
Grafana

Job description

A truly unique opportunity to help launch a brand new team within a global financial services provider. This new team of highly skilled Full Stack Infrastructure Engineers will cover Compute, Storage, Network, and Cloud technologies. You will help design, implement, and manage robust infrastructure solutions, ensuring reliability, scalability, and performance.

Requirements:
  1. Proven experience managing and optimizing a diverse infrastructure stack.
  2. Extensive knowledge of cloud platforms (AWS, Azure, GCP) and infrastructure as code (Terraform, CloudFormation).
  3. Familiarity with service mesh technologies (Istio, Linkerd).
  4. Solid understanding of virtualization (VMware, Hyper-V), containerization (Docker, Kubernetes), and orchestration.
  5. Understanding of storage solutions (SAN, NAS, cloud storage) and backup systems.
  6. Strong understanding of network protocols, routing, switching, and firewalls.
  7. Experience with load balancers (F5, HAProxy, Nginx) and network monitoring tools.
  8. Experience in DNS management and troubleshooting.
  9. Experience in network security best practices.
  10. Proficiency in monitoring and observability tools (Prometheus, Grafana, Splunk).
  11. Proficiency in at least one scripting language (Python, Bash) for automation.
  12. Experience with CI/CD pipeline management and DevOps practices.
  13. Strong understanding of disaster recovery and business continuity planning.
  14. Experience with performance tuning and capacity planning.
  15. Understanding of chaos engineering principles and practices.
  16. Skills in cost optimization for cloud infrastructure.

Specific Tools and Techniques:
  • Experience in using cloud monitoring tools like AWS CloudWatch, Azure Monitor, and Google Cloud Operations Suite.
  • Experience with packet capture tools like Wireshark for troubleshooting network issues.
  • Experience in using traceroute utilities and performance analysis tools like perf for identifying and resolving bottlenecks.
  • Familiarity with tools such as ipconfig/ifconfig for viewing network configurations, flushing DNS, and diagnosing network issues.
  • Experience with SNMP-based tools for network device monitoring and performance management.
  • Experience in using NetFlow for network traffic analysis.
  • Experience with tools like iostat, vmstat, and dstat for monitoring storage and system performance.
  • Experience in tools like df, du, lsblk, and fdisk for managing and troubleshooting file systems and disk partitions.
  • Familiarity with tools like Prometheus and Grafana for monitoring and observability.