Site Reliability Engineer

DefenseStorm

Atlanta (GA)

Remote

USD 90,000 - 140,000

Full time

5 days ago

Job summary

Join a forward-thinking company as a Site Reliability Engineer, where you will ensure the reliability and performance of cloud services. This role involves designing scalable infrastructures, leading migrations, and implementing security initiatives to support a growing customer base. You'll work with cutting-edge technologies like AWS, Terraform, and observability tools to enhance cloud infrastructure. If you're passionate about innovation and want to make a significant impact in a dynamic environment, this opportunity is perfect for you.

Qualifications

  • Experience with CI/CD pipelines and AWS cloud infrastructure.
  • Strong understanding of networking principles in cloud environments.

Responsibilities

  • Lead migration of EC2 workloads to ECS and develop DevOps tooling.
  • Design monitoring solutions using Prometheus and Grafana.

Skills

CI/CD pipelines
Networking principles
AWS cloud infrastructure management
Infrastructure as Code
Containerized workloads
Observability tools

Education

Bachelor's degree in computer science

Tools

GitHub Actions
AWS
ECS
Elasticsearch
PostgreSQL
Prometheus
Grafana
Terraform

Job description

Job Overview

As a Site Reliability Engineer at DefenseStorm, you will play a crucial role in ensuring the reliability, scalability, and performance of our cloud-based services. The GRID application handles 250k events/sec, and you will contribute to designing and implementing robust, scalable cloud infrastructures to support our growing customer base.

Location

Atlanta, GA (Remote)

Job Duties and Responsibilities

  1. Lead migration of EC2 workloads to ECS and develop DevOps tooling for containerized applications.
  2. Advance zero trust security initiatives by implementing service mesh architectures such as Istio.
  3. Improve security, scalability, and reliability of AWS cloud infrastructure through continuous innovation.
  4. Design and implement monitoring and alerting solutions using Prometheus, Grafana, and OpsGenie.
  5. Maintain SLAs and SLOs by applying SRE best practices, including incident response and post-mortem analysis.
  6. Build, manage, and scale cloud infrastructure using Infrastructure as Code tools like Terraform.
  7. Support SOC 2 and ISO compliance efforts by promoting security best practices and automating audit processes.
  8. Perform other duties as assigned.

Minimum Requirements

  1. Experience with CI/CD pipelines using tools like GitHub Actions.
  2. Strong understanding of networking principles in cloud and container environments.
  3. Proven experience with AWS cloud infrastructure management.
  4. Expertise in Infrastructure as Code and deployment automation tools.
  5. Experience supporting containerized workloads in production.
  6. Familiarity with observability tooling for monitoring, logging, and tracing.
  7. Knowledge of AWS, ECS, Elasticsearch, PostgreSQL, Prometheus, Grafana, GitHub Actions, and Terraform.

Preferred Qualifications

  1. Bachelor's degree in computer science or equivalent experience.
  2. 3-5 years of experience in cybersecurity.

Additional Information

DefenseStorm is an equal opportunity employer. We prohibit discrimination and harassment of any kind based on race, color, religion, age, sex, national origin, disability, genetics, veteran status, sexual orientation, gender identity, or any other characteristic protected by law.
