[Hiring] Site Reliability Engineer @Blackpoint Cyber

Blackpoint Cyber

United States

Remote

USD 100,000 - 140,000

Full time

2 days ago

Job summary

A leading cybersecurity company is searching for a Site Reliability Engineer to join their remote team. You will design and manage cloud infrastructure, ensuring system reliability and performance. The role requires expertise in AWS, Terraform, and automation tools, and offers a dynamic and collaborative environment.

Benefits

Health Insurance

Dental Insurance

401k plan

Discretionary Time Off

Qualifications

4+ years of experience as a Site Reliability Engineer.
Expertise in IaC using Terraform and Terragrunt.
Proven ability to troubleshoot complex systems.

Responsibilities

Design and maintain scalable infrastructure using Terraform.
Manage AWS cloud environments for efficiency and security.
Implement monitoring systems to ensure reliability.

Skills

Problem-Solving

Collaboration

Cloud Infrastructure

Automation

Infrastructure as Code (IaC)

Monitoring

Data Streaming

Caching

Tools

Terraform

AWS

Kubernetes

Prometheus

Grafana

Helm

Kafka

Redis

Jun 15, 2025 - Blackpoint Cyber is hiring a remote Site Reliability Engineer. Location: Australia.

Blackpoint Cyber is the leading provider of world-class cybersecurity threat hunting, detection, and remediation technology. Founded by former National Security Agency (NSA) cyber operations experts who applied their learnings to bring national security-grade technology solutions to commercial customers around the world, Blackpoint Cyber is in hyper-growth mode, fueled by a recent $190M Series C round.

Job Overview:

We’re on the lookout for a passionate and experienced Site Reliability Engineer (SRE) to join our high-impact, fast-moving team. In this role, you’ll take the lead in designing, building, and scaling robust infrastructure, CI/CD pipelines, and build systems that power our products. You’ll work together with cross-functional teams to drive system reliability, performance, and automation, all while championing a culture of innovation, collaboration, and continuous improvement.

Key Responsibilities:

Infrastructure & Cloud Management

Design, build, and maintain highly scalable infrastructure using Terraform and Terragrunt to automate cloud resource provisioning.

Manage and optimize AWS cloud environments for cost-efficiency, security, and high availability.

Continuously improve infrastructure automation tools and methodologies to support scalability and maintainability.

Platform & System Reliability

Manage and scale Kafka and Confluent Cloud platforms for real-time data streaming.

Deploy and maintain Redis instances to support caching and real-time data processing workloads.

Implement and maintain robust monitoring and alerting systems using Prometheus, Grafana, Alertmanager, and Opsgenie to ensure system reliability and visibility.

Troubleshoot and resolve complex system issues, ensuring optimal performance and uptime.

Deployment & Release Engineering

Manage Kubernetes clusters using tools like Helm, ArgoCD, Istio, and Kustomize to support modern infrastructure-as-code and continuous delivery practices.

Enable feature flag management and safe, controlled rollouts using LaunchDarkly.

Collaboration & Continuous Improvement

Work closely with development teams to seamlessly integrate new features and services into the infrastructure.

Foster a culture of continuous improvement by regularly evaluating and adopting emerging SRE tools, technologies, and best practices.

Skills & Qualifications:

4+ years of proven experience as an SRE or in a similar role, with a strong focus on cloud infrastructure and automation.

Excellent problem-solving skills with the ability to troubleshoot complex systems in production.

Strong communication and collaboration skills, with experience working in agile environments.

Expertise in Infrastructure as Code (IaC) using Terraform and Terragrunt.

Deep knowledge of AWS cloud services and best practices for designing secure and scalable architectures.

Hands-on experience with Confluent Cloud and Kafka for distributed data streaming.

Strong experience with Redis for caching and RDS for data storage.

Strong experience with OpenSearch/Elasticsearch/ChaosSearch.

Proficiency in monitoring and alerting using Prometheus, Grafana, and Alertmanager.

Extensive experience managing Kubernetes clusters, including package management with Helm, deployment with ArgoCD, and service mesh configurations using Istio.

Familiarity with Kustomize for Kubernetes resource configuration.

Development experience in Node.js/Python/Go.

Nice to Have:

Experience with multi-cloud environments (e.g., GCP, Azure).

Familiarity with security and compliance best practices in cloud and containerized environments.

Knowledge of serverless architectures and CI/CD tools such as Jenkins and/or GitHub Actions.

Time Zones: Australian Eastern Standard Time (AEST) or Australian Eastern Daylight Time (AEDT).

Blackpoint Cyber welcomes and encourages applications from qualified individuals of all races, colors, religions, sex, sexual orientation, gender identity or expression, national origin, age, marital status, or any other legally protected status. We are committed to equality of opportunity in all aspects of employment. For eligible employees in the US, Blackpoint offers competitive Health, Vision, Dental, and Life Insurance plans, a robust 401k plan, Discretionary Time Off, and other minor perks.
