Principal Site Reliability Engineer (WildFire Cloud Infrastructure)

ZipRecruiter

Santa Clara (CA)

On-site

USD 160,000 - 225,000

Full time

30+ days ago

Job summary

An innovative company seeks a Site Reliability Engineer to enhance its hybrid infrastructure. In this role, you will focus on automation, performance, and reliability, while collaborating with a diverse team of experts. You will design and operate secure cloud infrastructure, ensuring applications are production-ready and scalable. This position offers a dynamic environment where you can grow your skills in cutting-edge technologies like Kubernetes and GCP, contributing to the mission of safeguarding digital life. Join a forward-thinking team committed to continuous learning and impactful innovation.

Benefits

Stock options

Bonuses

Flexible working hours

Health insurance

Mental health resources

Financial health resources

Personalized learning opportunities

Qualifications

Experience with configuration management tools like Ansible and Terraform.
Proficiency in Python and/or Go with a focus on cloud infrastructure.

Responsibilities

Design, build, and operate reliable, secure cloud infrastructure.
Automate deployment of services and implement monitoring.

Skills

Python

Kubernetes

Ansible

Terraform

Linux Administration

Network Troubleshooting

CI/CD Pipelines

GitLab

Production Engineering

Education

Bachelor's in Computer Science

Master's in Computer Science

Tools

Docker

GCP

AWS

GitHub

Job Description

Company Description

Our Mission

At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on challenging and disrupting the way things are done, and we’re looking for innovators who are as committed to shaping the future of cybersecurity as we are.

Who We Are

We take our mission of protecting the digital way of life seriously. We are relentless in protecting our customers and believe that the unique ideas of every team member contribute to our success. Our values, crowdsourced by employees, are reflected in our everyday actions—from disruptive innovation and collaboration to integrity and inclusivity.

As part of our team, you will shape the future of cybersecurity. We work fast, value ongoing learning, and respect each employee as a unique individual. Our development and wellbeing programs, including FLEXBenefits, mental and financial health resources, and personalized learning opportunities, are designed to support you.

We believe in collaboration and in-person interactions. Our employees typically work full-time from our offices with flexibility where needed, fostering casual conversations, problem-solving, and trusted relationships.

Job Description

Your Career

Palo Alto Networks manages a large hybrid infrastructure and is a major GCP customer. As a Site Reliability Engineer, you will support services on this infrastructure, focusing on automation, architecture, performance, metrics, troubleshooting, security, and reliability.

Our stack includes Kubernetes, Docker, GCP, AWS, Ansible, Terraform, Vault, GitLab, Spinnaker, Pub/Sub, Bigtable, Memorystore, BigQuery, RabbitMQ, Kafka, MySQL, Python, and Go. You are not expected to know all of these technologies from the start, but you should be eager to learn those relevant to your role.

Your Impact

Contribute to SRE and DevOps success
Develop expertise in new technologies
Collaborate with developers, researchers, data scientists, and security experts
Design, build, and operate reliable, secure cloud infrastructure
Ensure applications are production-ready, scalable, and reliable
Develop tools and automation frameworks
Automate deployment of services
Implement monitoring and alerting
Participate in on-call rotations
Lead root cause analysis of critical issues
Mentor and promote SRE culture
Participate in design reviews

Qualifications

Your Experience

BSc or MSc in Computer Science or a related field, or equivalent experience
Experience with configuration management and infrastructure tooling such as Ansible, Terraform, Helm, and Kubernetes
Proficiency in Python and/or Go
Experience managing applications in Kubernetes with autoscaling
Background in Production Engineering, DevOps, or SRE
Expertise in GCP or AWS, with GCP preferred
Strong Linux administration and network troubleshooting skills
Proficiency in scripting languages such as Python, Go, and shell
Experience with CI/CD pipelines, GitLab, and GitHub
Ability to troubleshoot complex distributed systems handling high-volume transactions
Excellent communication skills
Self-disciplined, motivated, and proactive
Passion for infrastructure as code and monitoring
Quick learner of new technology stacks

Additional Information

The Team

Our engineering team is central to our mission of preventing cyberattacks. We innovate continuously, challenge industry norms, and build products to solve unprecedented problems. We value individuals comfortable with ambiguity, eager for challenges, and motivated by the risks and opportunities of a secure digital environment.

Compensation Disclosure

The salary range for this role is $160,000 - $225,000 per year, depending on experience and location. Compensation may include stock options and bonuses. More details about benefits are available in our employee benefits section.

Our Commitment

We value diversity and inclusion, believing that diverse teams drive innovation. We are committed to providing accommodations for qualified individuals with disabilities. If you need assistance, please contact us at accommodations@paloaltonetworks.com.

Palo Alto Networks is an equal opportunity employer. We consider all qualified applicants without regard to race, ethnicity, gender, age, sexual orientation, disability, or other protected characteristics. All information will be kept confidential according to EEO guidelines.
