Staff Cloud Availability Platform Engineer

Crusoe Energy Systems LLC

San Francisco (CA)

Hybrid

USD 210,000 - 250,000

Full time

5 days ago

Job summary

A leading technology company is seeking a Staff Engineer to lead the architectural design and implementation of core infrastructure for AI applications. Ideal candidates will have a strong background in distributed systems, Kubernetes, and backend engineering, with a commitment to driving impactful architectural decisions.

Benefits

Hybrid work schedule
Industry competitive pay
Restricted Stock Units
Health insurance options
Employer HSA contributions
Paid Parental Leave
Short-term and long-term disability coverage
Paid time off and holidays
Tuition reimbursement
Cell phone reimbursement

Qualifications

  • Experience in designing Kubernetes clusters with a focus on performance and reliability.
  • Expertise in strongly typed languages like Go, Rust, Java, or C++.
  • Knowledge of Linux networking and event-driven systems.

Responsibilities

  • Lead the design and implementation of Kubernetes clusters.
  • Drive end-to-end ownership of the networking stack.
  • Design and scale event-driven systems for real-time applications.

Skills

Kubernetes
Networking
Scalable APIs
Linux Host/Container Networking
Event-Driven Systems
Developer-facing APIs
Infrastructure as Code

Experience

8+ years of experience in platform or backend engineering
Staff level experience (2+ years)

Tools

Terraform
Helm

Job description

Crusoe is building the World’s Favorite AI-first Cloud infrastructure company. We’re pioneering vertically integrated, purpose-built AI infrastructure solutions trusted by Fortune 500 companies to power their most advanced AI applications. Crusoe is redefining AI cloud infrastructure, with a mission to align the future of computing with the future of the climate. Our AI platform is recognized as the "gold standard" for reliability and performance. Our data centers are optimized for AI workloads and are powered by clean, renewable energy.

Be part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.

About This Role:
We are looking for a Staff Engineer to lead the design and implementation of core infrastructure and platform services at scale. This role is ideal for engineers who thrive in complex distributed systems, are passionate about Kubernetes, networking, and scalable APIs, and want to drive architectural decisions that impact the entire engineering organization.

What You’ll Be Working On:

  • Leading the architectural design and implementation of Kubernetes clusters with a focus on performance, reliability, and multi-tenancy.

  • Driving end-to-end ownership of the host networking stack, including CNI integration, custom network policies, and observability/security features.

  • Designing and scaling event-driven systems (e.g., Kafka, NATS, SNS/SQS) to support real-time applications and control planes.

  • Defining and implementing robust developer-facing APIs (REST/gRPC) for platform services and infrastructure tooling.

  • Establishing platform-wide patterns and best practices for scalable service integration and system interoperability.

  • Contributing to the evolution of observability, service discovery, and performance tooling across the platform.

  • Leading design reviews and architecture discussions, offering deep domain expertise and mentorship to peers.

  • Collaborating with security, compliance, and governance teams to ensure platform auditability and robustness.

What You’ll Bring to the Team:

  • 8+ years of experience in platform, infrastructure, or backend engineering, with 2+ years at the Staff level.

  • Proven expertise in strongly typed languages such as Go, Rust, Java, or C++.

  • Deep knowledge of Kubernetes architecture, including service discovery, networking, and load balancing.

  • Expertise in Linux host/container networking, service meshes, and L3/L4 traffic debugging.

  • Experience architecting and operating event-driven systems with high availability and low latency.

  • Strong background in building developer-facing APIs with a focus on usability and performance.

  • Familiarity with infrastructure-as-code tools like Terraform and Helm, and GitOps/CI/CD workflows.

  • Experience working in multi-cloud or hybrid environments, particularly with AWS.

Bonus Points:

  • Experience contributing to open-source infrastructure or Kubernetes projects.

  • Familiarity with compliance frameworks (e.g., SOC 2, ISO 27001).

  • Exposure to zero-trust networking or advanced access control models.

  • Past leadership of cross-functional or platform-wide initiatives.

Benefits:

  • Hybrid work schedule

  • Industry competitive pay

  • Restricted Stock Units in a fast-growing, well-funded technology company

  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents

  • Employer contributions to HSA accounts

  • Paid Parental Leave

  • Paid life insurance, short-term and long-term disability

  • Teladoc

  • 401(k) with a 100% match up to 4% of salary

  • Generous paid time off and holiday schedule

  • Cell phone reimbursement

  • Tuition reimbursement

  • Subscription to the Calm app

  • MetLife Legal

  • Company-paid commuter benefit: $200/month

Compensation Range

Compensation will be paid in the range of $210,000 - $250,000 + Bonus. Restricted Stock Units are included in all offers. Compensation will be determined by the applicant's knowledge, education, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.
