[Hiring] Site Reliability Engineer @Blackpoint Cyber

Blackpoint Cyber

United States

Remote

USD 100,000 - 140,000

Full time

2 days ago

Job summary

A leading cybersecurity company is searching for a Site Reliability Engineer to join their remote team. You will design and manage cloud infrastructure, ensuring system reliability and performance. The role requires expertise in AWS, Terraform, and automation tools, and offers a dynamic and collaborative environment.

Benefits

Health Insurance
Dental Insurance
401k plan
Discretionary Time Off

Qualifications

  • 4+ years of experience as a Site Reliability Engineer.
  • Expertise in IaC using Terraform and Terragrunt.
  • Proven ability to troubleshoot complex systems.

Responsibilities

  • Design and maintain scalable infrastructure using Terraform.
  • Manage AWS cloud environments for efficiency and security.
  • Implement monitoring systems to ensure reliability.

Skills

Problem-Solving
Collaboration
Cloud Infrastructure
Automation
Infrastructure as Code (IaC)
Monitoring
Data Streaming
Caching

Tools

Terraform
AWS
Kubernetes
Prometheus
Grafana
Helm
Kafka
Redis

Job description

Jun 15, 2025 - Blackpoint Cyber is hiring a remote Site Reliability Engineer. Location: Australia.

Blackpoint Cyber is the leading provider of world-class cybersecurity threat hunting, detection, and remediation technology. Founded by former National Security Agency (NSA) cyber operations experts who applied their learnings to bring national security-grade technology solutions to commercial customers around the world, Blackpoint Cyber is in hyper-growth mode, fueled by a recent $190M Series C round.

Job Overview:

We’re on the lookout for a passionate and experienced Site Reliability Engineer (SRE) to join our high-impact, fast-moving team. In this role, you’ll take the lead in designing, building, and scaling robust infrastructure, CI/CD pipelines, and build systems that power our products. You’ll work together with cross-functional teams to drive system reliability, performance, and automation, all while championing a culture of innovation, collaboration, and continuous improvement.

Key Responsibilities:

Infrastructure & Cloud Management

  • Design, build, and maintain highly scalable infrastructure using Terraform and Terragrunt to automate cloud resource provisioning.

  • Manage and optimize AWS cloud environments for cost-efficiency, security, and high availability.

  • Continuously improve infrastructure automation tools and methodologies to support scalability and maintainability.

Platform & System Reliability

  • Manage and scale Kafka and Confluent Cloud platforms for real-time data streaming.

  • Deploy and maintain Redis instances to support caching and real-time data processing workloads.

  • Implement and maintain robust monitoring and alerting systems using Prometheus, Grafana, Alertmanager, and Opsgenie to ensure system reliability and visibility.

  • Troubleshoot and resolve complex system issues, ensuring optimal performance and uptime.

Deployment & Release Engineering

  • Manage Kubernetes clusters using tools like Helm, ArgoCD, Istio, and Kustomize to support modern infrastructure-as-code and continuous delivery practices.

  • Enable feature flag management and safe, controlled rollouts using LaunchDarkly.

Collaboration & Continuous Improvement

  • Work closely with development teams to seamlessly integrate new features and services into the infrastructure.

  • Foster a culture of continuous improvement by regularly evaluating and adopting emerging SRE tools, technologies, and best practices.

Skills & Qualifications:

  • 4+ years of proven experience as an SRE or in a similar role, with a strong focus on cloud infrastructure and automation.

  • Excellent problem-solving skills with the ability to troubleshoot complex systems in production.

  • Strong communication and collaboration skills, with experience working in agile environments.

  • Expertise in Infrastructure as Code (IaC) using Terraform and Terragrunt.

  • Deep knowledge of AWS cloud services and best practices for designing secure and scalable architectures.

  • Hands-on experience with Confluent Cloud and Kafka for distributed data streaming.

  • Strong experience with Redis for caching and RDS for data storage.

  • Strong experience with OpenSearch, Elasticsearch, or ChaosSearch.

  • Proficiency in monitoring and alerting using Prometheus, Grafana, and Alertmanager.

  • Extensive experience managing Kubernetes clusters, including package management with Helm, deployment with ArgoCD, and service mesh configurations using Istio.

  • Familiarity with Kustomize for Kubernetes resource configuration.

  • Development experience in Node.js, Python, or Go.

Nice to Have:

  • Experience with multi-cloud environments (e.g., GCP, Azure).

  • Familiarity with security and compliance best practices in cloud and containerized environments.

  • Knowledge of serverless architectures and CI/CD tools such as Jenkins and/or GitHub Actions.

Time Zones: Australian Eastern Standard Time (AEST) or Australian Eastern Daylight Time (AEDT).

Blackpoint Cyber welcomes and encourages applications from qualified individuals of all races, colors, religions, sex, sexual orientation, gender identity or expression, national origin, age, marital status, or any other legally protected status. We are committed to equality of opportunity in all aspects of employment. For eligible employees in the US, Blackpoint offers competitive Health, Vision, Dental, and Life Insurance plans, a robust 401k plan, Discretionary Time Off, and other minor perks.