Altinity is the leading enterprise provider for ClickHouse, the fastest open-source analytic database on the planet. We deliver scalable, real-time analytics through Altinity.Cloud, a managed ClickHouse service that runs securely on AWS, GCP, Azure, and hybrid infrastructure. Altinity’s global, remote-first team helps businesses across industries solve data-intensive problems at scale.
We’re hiring a Cloud Support Engineer with strong SRE/DevOps expertise and hands-on experience with Kubernetes Operators, including the Altinity Kubernetes Operator for ClickHouse. You’ll play a central role in supporting ClickHouse workloads running in Altinity.Cloud, helping customers deploy and manage distributed analytic clusters across multiple cloud providers.
This role is ideal for someone who is passionate about automation, infrastructure as code, and scaling mission-critical systems through Operators and platform engineering best practices.
Your responsibilities will be divided as follows:
- Support & Operations (70%)
  - Provide expert guidance to customers deploying ClickHouse with the Altinity Kubernetes Operator.
  - Troubleshoot issues in managed ClickHouse environments running on AWS, GCP, Azure, and on-prem Kubernetes.
  - Diagnose Kubernetes-level issues including networking, storage, pod scheduling, autoscaling, and Operator logic.
  - Automate recurring operational tasks and reduce support load through self-healing, observability, and CI/CD improvements.
  - Maintain monitoring, alerting, and logging pipelines for ClickHouse clusters (Prometheus, Grafana, Loki, Fluent Bit).
  - Contribute to playbooks, runbooks, knowledge base articles, and incident response workflows.
- R&D and Platform Improvement (30%)
  - Collaborate with Altinity engineers to improve the ClickHouse Operator and Altinity.Cloud control plane.
  - Contribute to Helm chart improvements, custom resource templates, and automation of cluster lifecycle tasks.
  - Evaluate new Kubernetes features, clickhouse-operator enhancements, and cloud-native best practices.
  - Write internal and external content on operating and supporting ClickHouse in Kubernetes.
What a Day Looks Like in This Role:
- A customer reports slow pod rescheduling after node failure—you diagnose a cloud zone imbalance and patch the affinity settings in the Operator CR.
- You update a Helm chart to support a new Altinity.Cloud feature and deploy it to a staging cluster.
- You join a team sync to discuss extending auto-scaling support in the ClickHouse Operator.
- You write a tutorial for deploying ClickHouse with multi-volume storage class support in Kubernetes.
- You help debug a query timeout that turns out to be a Kubernetes network policy misconfiguration.
Candidates need to meet the following qualifications:
- 3+ years in SRE, DevOps, Platform Engineering, or Cloud Infrastructure roles.
- Hands-on experience with Kubernetes Operators (installing, managing CRDs, troubleshooting reconciliation loops).
- Experience with the Altinity Kubernetes Operator, or a similar custom resource controller (e.g., Strimzi, Argo CD, Crossplane, Prometheus Operator).
- Familiarity with cloud-native infrastructure on AWS, GCP, or Azure.
- Strong scripting skills (bash, Python, or Go).
- Experience with containers and Kubernetes packaging (Docker, Helm), CI/CD pipelines, and GitOps workflows.
- Solid understanding of Linux and distributed systems.
- Strong English communication (written and verbal).
The following additional qualifications are a significant plus:
- Experience running ClickHouse in production (but deep expertise is not required—training will be provided).
- Knowledge of Helm, Terraform, Ansible, or other infrastructure-as-code tools.
- Exposure to large-scale observability stacks: Prometheus, Grafana, ELK, Loki, Thanos, etc.
- Understanding of SRE principles: SLIs/SLOs, incident management, capacity planning.
- Experience with data pipelines (Kafka, Flink, Airflow, dbt) or real-time data processing.
We’re looking for engineers who love solving infrastructure challenges and want to help customers get the most from ClickHouse and Kubernetes. You do not need to be a ClickHouse expert, but curiosity, systems thinking, and an automation mindset are key.
Our Benefits:
- Work from Anywhere, Anytime: We are not your typical nine-to-five shop! Enjoy the flexibility of working from literally wherever and whenever. Create a schedule that works for you and your family or lifestyle. Recharge your batteries with our open vacation policies.
- Cultural Diversity: We love that we get to work with passionate people from around the world. Currently, our team is made up of professionals from 20 different countries!
- Career Development and Learning Culture: We provide opportunities to learn new technologies and try out new roles, including learning ClickHouse deeply, contributing to open source, and growing into senior or platform roles. We also offer access to training on leading-edge technologies, plus flexible work schedules for external education.
- Technical Depth: Be part of a team that’s building the infrastructure for real-time analytics at global scale. Grow your skills in Kubernetes, distributed databases, observability, and automation.
- US Employee Benefits: We offer comprehensive PPO health care plans for our US-based employees that are flexible enough to meet the needs of individuals or families. A 401(k) with company match is also available.
- Company Travel: We come together in person two to three times per year in locations across the globe. During non-pandemic times, of course! We also fund travel to conferences and encourage presentations as well as contributions to open source communities.
Apply now and tell us how your Operator, DevOps, or SRE background can help scale the future of cloud-native analytics.
This is a full-time position and includes equity in the company.
Altinity is venture-funded and financially stable.
We are 100% remote. You may work from anywhere you hold a valid work permit.
We are eager to meet you!