Principal Site Reliability Engineer (WildFire Cloud Infrastructure)

ZipRecruiter

Santa Clara (CA)

On-site

USD 160,000 - 225,000

Full time

30+ days ago

Job summary

An innovative company seeks a Site Reliability Engineer to enhance its hybrid infrastructure. In this role, you will focus on automation, performance, and reliability, while collaborating with a diverse team of experts. You will design and operate secure cloud infrastructure, ensuring applications are production-ready and scalable. This position offers a dynamic environment where you can grow your skills in cutting-edge technologies like Kubernetes and GCP, contributing to the mission of safeguarding digital life. Join a forward-thinking team committed to continuous learning and impactful innovation.

Benefits

Stock options
Bonuses
Flexible working hours
Health insurance
Mental health resources
Financial health resources
Personalized learning opportunities

Qualifications

  • Experience with configuration management tools like Ansible and Terraform.
  • Proficiency in Python and/or Go with a focus on cloud infrastructure.

Responsibilities

  • Design, build, and operate reliable, secure cloud infrastructure.
  • Automate deployment of services and implement monitoring.

Skills

Python
Go
Kubernetes
Ansible
Terraform
Linux Administration
Network Troubleshooting
CI/CD Pipelines
GitLab
Production Engineering

Education

Bachelor's in Computer Science
Master's in Computer Science

Tools

Docker
GCP
AWS
GitHub

Job description

Company Description

Our Mission

At Palo Alto Networks, everything starts and ends with our mission: being the cybersecurity partner of choice, protecting our digital way of life. Our vision is a world where each day is safer and more secure than the one before. We are a company built on challenging and disrupting the way things are done, and we’re looking for innovators who are as committed to shaping the future of cybersecurity as we are.

Who We Are

We take our mission of protecting the digital way of life seriously. We are relentless in protecting our customers and believe that the unique ideas of every team member contribute to our success. Our values, crowdsourced by employees, are reflected in our everyday actions—from disruptive innovation and collaboration to integrity and inclusivity.

As part of our team, you will shape the future of cybersecurity. We work fast, value ongoing learning, and respect each employee as a unique individual. Our development and wellbeing programs, including FLEXBenefits, mental and financial health resources, and personalized learning opportunities, are designed to support you.

We believe in collaboration and in-person interactions. Our employees typically work full-time from our offices with flexibility where needed, fostering casual conversations, problem-solving, and trusted relationships.

Job Description

Your Career

Palo Alto Networks manages a large hybrid infrastructure and is a major GCP customer. As a Site Reliability Engineer, you will support services on this infrastructure, focusing on automation, architecture, performance, metrics, troubleshooting, security, and reliability.

Our stack includes Kubernetes, Docker, GCP, AWS, Ansible, Terraform, Vault, GitLab, Spinnaker, Pub/Sub, Bigtable, Memorystore, BigQuery, RabbitMQ, Kafka, MySQL, Python, and Go. You are not expected to know all of these technologies from the start, but you should be eager to learn those relevant to your role.

Your Impact

  • Contribute to SRE and DevOps success
  • Develop expertise in new technologies
  • Collaborate with developers, researchers, data scientists, and security experts
  • Design, build, and operate reliable, secure cloud infrastructure
  • Ensure applications are production-ready, scalable, and reliable
  • Develop tools and automation frameworks
  • Automate deployment of services
  • Implement monitoring and alerting
  • Participate in on-call rotations
  • Lead root cause analysis of critical issues
  • Mentor and promote SRE culture
  • Participate in design reviews

Qualifications

Your Experience

  • BSc or MSc in Computer Science, related field, or equivalent experience
  • Experience with configuration management and orchestration tools such as Ansible, Terraform, Helm, and Kubernetes
  • Proficiency in Python and/or Go
  • Experience managing applications in Kubernetes with autoscaling
  • Background in Production Engineering, DevOps, or SRE
  • Expertise in GCP or AWS, especially GCP
  • Strong Linux administration and network troubleshooting skills
  • Proficiency in scripting with languages such as Python, Go, and shell
  • Experience with CI/CD pipelines, GitLab, and GitHub
  • Ability to troubleshoot complex distributed systems handling high-volume transactions
  • Excellent communication skills
  • Self-disciplined, motivated, and proactive
  • Passion for infrastructure as code and monitoring
  • Quick learner of new technology stacks

Additional Information

The Team

Our engineering team is central to our mission of preventing cyberattacks. We innovate continuously, challenge industry norms, and build products to solve unprecedented problems. We value individuals comfortable with ambiguity, eager for challenges, and motivated by the risks and opportunities of a secure digital environment.

Compensation Disclosure

The salary range for this role is $160,000 - $225,000 per year, depending on experience and location. Compensation may include stock options and bonuses. More details about benefits are available in our employee benefits section.

Our Commitment

We value diversity and inclusion, believing that diverse teams drive innovation. We are committed to providing accommodations for qualified individuals with disabilities. If you need assistance, please contact us at accommodations@paloaltonetworks.com.

Palo Alto Networks is an equal opportunity employer. We consider all qualified applicants without regard to race, ethnicity, gender, age, sexual orientation, disability, or other protected characteristics. All information will be kept confidential according to EEO guidelines.
