Site Reliability Engineer (SRE) - Platform Infrastructure team (100% Remote - USA)
Join Hopper as a Senior Site Reliability Engineer in our Platform Infrastructure team. This team builds and maintains the cloud foundation that powers products used by millions of travelers worldwide.
Our mission is to empower engineers to ship fast, stay resilient, and scale effortlessly. If you are passionate about automation, scalability, and improving developer experience, this role offers tangible impact at a growing travel tech company.
You will work on evolving a large-scale, multi-region infrastructure in Google Cloud, supporting hundreds of engineers and product teams. Your contributions will include building automated, self-service platform tools that are secure, reliable, cost-efficient, and user-friendly.
Responsibilities include:
- Enhancing platform tooling to support expanding services and teams.
- Designing infrastructure workflows that are simple, consistent, and scalable.
- Driving automation to reduce manual work and improve reliability.
- Scaling infrastructure offerings to meet team needs while keeping the platform cohesive.
- Participating in incident response as part of a globally distributed on-call rotation.
- Supporting engineering teams by troubleshooting and reviewing core systems.
- Collaborating with a high-impact team focused on operational excellence and developer experience.
Ideal candidates will have:
- Experience in SRE, DevOps, Software, or Systems Engineering with a focus on reliable, scalable infrastructure.
- Strong troubleshooting skills in distributed, cloud-native environments.
- Solid system design skills emphasizing simplicity and maintainability.
- Effective communication and collaboration skills across teams.
Required expertise includes:
- Hands-on experience with cloud platforms, preferably Google Cloud Platform (GCP).
- Proficiency with Infrastructure as Code, ideally Terraform.
- Experience with containers and Kubernetes, including tooling such as Helm or Kustomize.
- Knowledge of service mesh technologies such as Istio.
Networking & Security knowledge:
- Understanding of networking fundamentals such as DNS, TLS, certificates, and ingress controllers.
- Best practices in cloud security, IAM, RBAC, and network segmentation.
- Familiarity with authentication and authorization protocols.
Observability & Tooling experience:
- Experience with logs, metrics, tracing, and APM tools such as Datadog.
- Knowledge of CI/CD pipelines and deployment automation.
- Familiarity with SQL and NoSQL databases.
Scripting & Automation skills:
- Ability to write scripts in Bash, Python, or similar languages for automation.
Perks and Benefits:
- Competitive salary and pre-IPO equity.
- Unlimited PTO, travel stipends, and flexible workspaces.
- Generous parental leave and comprehensive health coverage.
- Dynamic, impact-driven teams and open communication.
About Hopper:
Hopper is a leading travel platform leveraging data and AI to revolutionize travel planning and fintech solutions. Serving hundreds of millions globally, Hopper continues to grow through innovative products and strategic partnerships.