Senior Site Reliability Engineer

Juniper Square, Inc.

United States

Remote

USD 140,000 - 185,000

Full time

4 days ago

Job summary

Juniper Square is seeking a Senior Site Reliability Engineer to enhance and maintain its cloud infrastructure. The role involves automating infrastructure management, working with modern technologies like Kubernetes and AWS, and ensuring the reliability and scalability of its services. Ideal candidates have a strong problem-solving mindset and take the initiative to drive improvements and overcome challenges.

Benefits

Health, dental, and vision care

Life insurance

Mental wellness coverage

Flex Time Off

Annual professional development stipend

Qualifications

5+ years experience in SRE, DevOps, or Infrastructure Engineering required.
Proficiency in AWS services such as RDS and Aurora.
Strong understanding of Kubernetes security practices.

Responsibilities

Design and manage scalable cloud infrastructure focusing on AWS and Kubernetes.
Improve system reliability through automation and incident response.
Participate in on-call rotation and conduct post-incident reviews.

Skills

Kubernetes

AWS

CI/CD

Infrastructure as Code

Python

Tools

GitHub Actions

Datadog

Terraform

Helm

Crossplane

About Juniper Square

Our mission is to unlock the full potential of private markets. Privately owned assets like commercial real estate, private equity, and venture capital make up half of our financial ecosystem yet remain inaccessible to most people. We are digitizing these markets, and as a result, bringing efficiency, transparency, and access to one of the most productive corners of our financial ecosystem. If you care about making the world a better place by making markets work better through technology – all while contributing as a member of a values-driven organization – we want to hear from you.

Juniper Square offers employees a variety of ways to work, ranging from a fully remote experience to working full-time in one of our physical offices. We invest heavily in digital-first operations, allowing our teams to collaborate effectively across 27 U.S. states, 2 Canadian provinces, India, Luxembourg, and England. We also have physical offices in San Francisco, New York City, and Bangalore for employees who prefer to work in an office some or all of the time.

About your role

We are looking for a Senior Site Reliability Engineer (SRE) to join our team and help scale, secure, and improve our cloud infrastructure. In this role, you will work with modern cloud-native technologies, automate infrastructure management, and enhance system reliability. You will collaborate closely with software engineers and the platform team to build and maintain self-service tools that empower development teams while ensuring the reliability and scalability of our services.

This role requires a high degree of ownership, a bias for action, and a problem-solving mindset. If you are someone who naturally seeks out inefficiencies, takes the initiative to fix them, and enjoys building scalable systems, we want to hear from you.

What you’ll do

Own reliability and scalability initiatives—identify, prioritize, and implement solutions before issues escalate.
Participate in an on-call rotation, responding to incidents, performing root cause analysis, and driving long-term fixes.
Design, deploy, and manage Kubernetes clusters using Helm charts, Cilium, and Karpenter to optimize performance and cost.
Architect and maintain AWS infrastructure with a focus on RDS/Aurora PostgreSQL, networking, and scaling best practices.
Implement GitHub Actions CI/CD pipelines, integrating security best practices and automation.
Define and enforce policy-based security for Kubernetes using Kyverno.
Automate infrastructure provisioning with Crossplane and Terraform to ensure consistency and scalability.
Enhance observability and monitoring using Datadog to proactively detect and resolve issues.
Improve security and reliability by identifying risks in CI/CD, cloud environments, and Kubernetes, then implementing necessary safeguards.
Lead post-incident reviews, drive lessons learned into long-term improvements, and document best practices in Confluence.

Qualifications

Technical Skills

5+ years of experience in SRE, DevOps, or Infrastructure Engineering with a proven track record of ownership and initiative.
Strong experience with Kubernetes, Helm, and CNIs, including networking and security.
Proficiency in AWS services such as RDS, Aurora, IAM, VPC, EKS, and EC2.
Experience in PostgreSQL administration, including performance tuning and high availability in RDS/Aurora.
Hands-on experience with GitHub Actions and ArgoCD for secure and scalable CI/CD automation.
Strong background in Infrastructure as Code (IaC) with Crossplane and Terraform.
Deep understanding of observability and monitoring with Datadog.
Experience with Kyverno for Kubernetes policy-based security enforcement.
Proficiency in Python and Bash scripting for automation and system management.
Strong understanding of CI/CD security best practices and ability to implement controls for securing deployments.

Soft Skills

Self-starter mentality—actively seeks out and fixes problems without waiting for assignments.
High ownership and accountability—takes initiative in driving improvements and following through to resolution.
Strong problem-solving mindset—identifies bottlenecks, inefficiencies, and risks, then delivers scalable solutions.
Excellent communication skills—documents processes in Confluence, collaborates cross-functionally, and influences engineering teams toward operational excellence.

Preferred Qualifications

Deep experience with GitHub Actions for CI/CD automation, with a focus on security best practices.
Extensive knowledge of Helm charts for managing Kubernetes applications.
Strong experience in PostgreSQL, including optimization and high availability in RDS/Aurora.
Experience with NoSQL databases and best practices for scaling and performance.
Proven ability to influence engineering culture toward automation, self-service, and operational excellence.
Experience with Karpenter for Kubernetes autoscaling.
Previous experience with cost optimization strategies in AWS environments.
Experience with Atlassian tools (Jira, Confluence) for tracking incidents and documentation.
Strong experience with and a passion for expanding AI into the SRE and DevOps world.

Compensation

Compensation for this position includes a base salary, equity, and a variety of benefits. The U.S. base salary range for this role is $140,000 - $185,000 USD. Actual base salaries will be based on candidate-specific factors, including experience, skillset, and location, and local minimum pay requirements as applicable.

Benefits include:

Health, dental, and vision care for you and your family
Life insurance
Mental wellness coverage
Fertility and growing family support
Flex Time Off in addition to company paid holidays
Paid family leave, medical leave, and bereavement leave policies
Retirement saving plans
Allowance to customize your work and technology setup at home
Annual professional development stipend

Your recruiter can provide additional details about compensation and benefits.
