Principal Site Reliability Engineer (Wildfire Cloud Infrastructure)

Palo Alto Networks

Santa Clara (CA)

On-site

USD 160,000 - 225,000

Full time

9 days ago

Job summary

Palo Alto Networks seeks a Principal Site Reliability Engineer to enhance its hybrid infrastructure. This role encompasses automation, security, and performance in cloud environments, engaging with diverse teams to ensure system reliability. Candidates should have strong expertise in Python, Go, and cloud management tools.

Qualifications

Experience in Production Engineering, DevOps, or Site Reliability.
Ability to diagnose complex distributed systems.
Passion for infrastructure and monitoring as code.

Responsibilities

Design, build, and operate reliable Cloud infrastructure.
Ensure applications are production-ready and scalable.
Participate in on-call rotation with SRE teams.

Skills

Python
Linux administration
Network troubleshooting
CI/CD pipelines

Education

BS or MS in Computer Science

Tools

Ansible
Terraform
Kubernetes

Our Mission

At Palo Alto Networks everything starts and ends with our mission:

Being the cybersecurity partner of choice, protecting our digital way of life.

Our vision is a world where each day is safer and more secure than the one before. We are a company built on the foundation of challenging and disrupting the way things are done, and we’re looking for innovators who are as committed to shaping the future of cybersecurity as we are.

Who We Are

We take our mission of protecting the digital way of life seriously. We are relentless in protecting our customers, and we believe that the unique ideas of every member of our team contribute to our collective success. Our values were crowdsourced by employees and are brought to life through each of us every day, from disruptive innovation and collaboration to execution, and from showing up for each other with integrity to creating an environment where we all feel included.

As a member of our team, you will be shaping the future of cybersecurity. We work fast, value ongoing learning, and we respect each employee as a unique individual. Knowing we all have different needs, our development and personal wellbeing programs are designed to give you choice in how you are supported. This includes our FLEXBenefits wellbeing spending account with over 1,000 eligible items selected by employees, our mental and financial health resources, and our personalized learning opportunities - just to name a few!

At Palo Alto Networks, we believe in the power of collaboration and value in-person interactions. This is why our employees generally work full time from our office with flexibility offered where needed. This setup fosters casual conversations, problem-solving, and trusted relationships. Our goal is to create an environment where we all win with precision.

Job Description

Your Career

Palo Alto Networks runs a large hybrid infrastructure and is one of the largest GCP customers. As a Site Reliability Engineer, you will be part of a team supporting the services running on this infrastructure. This includes automation, architecture, performance, metrics, troubleshooting, security, and reliability.

Our stack includes Kubernetes, Docker, GCP, AWS, Ansible, Terraform, Vault, GitLab, Spinnaker, Pub/Sub, Bigtable, Memorystore, BigQuery, RabbitMQ, Kafka, MySQL, Python, and Go. We don’t expect you to know all of these, but we do expect you to learn the ones needed for this role.

Your Impact

Contribute to the success of SRE and DevOps
Develop expertise in new technologies
Work with developers, researchers, data scientists, and security experts
Design, build, and operate reliable, secure Cloud infrastructure
Ensure that applications are production-ready, scalable, and reliable
Develop tools and automation frameworks
Automate the deployment of robust services
Orchestrate end-to-end monitoring and alerting
Participate with SRE and Dev teams in the on-call rotation
Lead root cause analysis of critical business and production issues
Mentor and champion SRE culture
Participate in design reviews

Qualifications

Your Experience

BS or MS in Computer Science or a related field, or equivalent professional or military experience
Expertise in configuration management with frameworks such as Ansible, Terraform, Helm, or Kubernetes
Proficient in Python and/or Go
Expertise in managing applications in Kubernetes clusters with autoscaling enabled
Experience in Production Engineering, DevOps, or Site Reliability
Expertise in the public cloud (GCP or AWS), especially in GCP
Strong Linux administration, internals, and network troubleshooting
Proficiency with programming languages like Python, Golang, and shell scripting to automate tasks
Experience with CI/CD pipelines, GitLab, and GitHub preferred
Ability to diagnose and troubleshoot complex distributed systems handling high-volume transactions
Excellent written and verbal communication, able to collaborate and rally support
Self-disciplined, self-managed, and self-motivated, with a strong sense of ownership, urgency, and drive
Passion for infrastructure and monitoring as code
Ready to understand and dissect new technology stacks quickly

Additional Information

The Team

Our engineering team is at the core of our products – connected directly to the mission of preventing cyberattacks. We are constantly innovating – challenging how we and the industry think about cybersecurity. Our engineers don’t shy away from building products to solve problems no one has pursued before.

We define the industry instead of waiting for direction. We need individuals who are comfortable with ambiguity, excited by the prospect of a challenge, and motivated by the unknown risks facing everyday lives that depend on a secure digital environment.

Compensation Disclosure

The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary plus commission target (for sales/commissioned roles) is expected to be between $160,000 and $225,000 per year. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here.

Our Commitment

We’re problem solvers who take risks and challenge cybersecurity’s status quo. It’s simple: we can’t accomplish our mission without diverse teams innovating together.

We are committed to providing reasonable accommodations for all qualified individuals with a disability. If you require assistance or accommodation due to a disability or special need, please contact us at accommodations@paloaltonetworks.com.

Palo Alto Networks is an equal opportunity employer. We celebrate diversity in our workplace, and all qualified applicants will receive consideration for employment without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, physical or mental disability, political affiliation, protected veteran status, race, religion, sex (including pregnancy), sexual orientation, or other legally protected characteristics.

All your information will be kept confidential according to EEO guidelines.

Seniority level
Associate

Employment type
Full-time

Job function

Industries
Computer and Network Security
