
A leading technology firm in Aachen seeks an experienced engineer to join the SRE Cloud Infrastructure team. You will help build automated self-service solutions that ensure the reliability of cloud services. The ideal candidate has strong Kubernetes and cloud experience and a developer mindset to write clean, maintainable code. This role allows for flexible working arrangements and emphasizes personal development alongside competitive benefits.
Do stuff that matters: become a part of gridX and help us digitalise the energy industry, making renewable energies accessible and affordable everywhere. #getshitdone
At gridX, we are building the digital brain for the energy transition. We are looking for an engineer who wants their code to have a tangible impact on a sustainable future.
As part of the SRE Cloud Infrastructure team, you will join a culture defined by a single principle: Reliability First. However, for us, reliability isn't about fixing broken things or keeping the lights on manually. It's about enablement. We engineer the automated self-service solutions that empower our engineering teams to own their services from development to production.
You are a builder: an experienced, autonomous engineer who is ready to evolve our systems, engineer away complexity, and champion a culture where reliability is built in by design.
Take Ownership: You actively evolve our multi-tenant cloud and container infrastructure. You take end-to-end ownership of various components, ensuring they are secure, scalable, observable, and cost-efficient.
Engineer Infrastructure as Software: You bring a developer's mindset to operations. You solve complexity by writing high-quality code and automation, ensuring our platform is managed strictly via declarative code.
Drive Observability: You mature our observability platform, ensuring we aren't just collecting data but providing the insights teams need to drive architectural decisions, improve performance, and establish meaningful SLOs.
Architect for Resilience: You proactively identify bottlenecks before they become incidents, and when things do break, you lead the resolution and drive post-mortems to ensure we learn.
Empower Others: You build self-service capabilities that allow engineering teams to own their full lifecycle. You also drive the adoption of best practices through code or architecture reviews and technical deep-dives, and share your expertise through high-quality documentation and operational runbooks.
This is how you and your application stand out
You have solid experience in an SRE or Platform role, building and managing distributed systems in production environments. You are comfortable working with a high degree of autonomy, navigating ambiguity, and driving technical initiatives end-to-end.
You have strong hands-on experience with a major public cloud provider. You understand the architectural foundations of cloud infrastructure (Compute, Storage, Networking, and IAM) and are fluent in managing them as code.
You apply a pragmatic software engineering mindset to operations. You write clean, maintainable code and scripts, prioritizing long-term stability and quality.
You have operational experience with Kubernetes at scale, understanding how to manage upgrades, security, and resource allocation in a production cluster.
You embody a Reliability First mindset, understanding incident lifecycle management and the importance of psychological safety in engineering.
What sets you apart
You have hands-on expertise in the AWS services we use heavily, such as EKS, EC2, VPC, RDS, Lambda, S3, Kinesis, DynamoDB, SNS, and SQS.
You go beyond usage and understand the internal components of Kubernetes (scheduling, API server, controllers, RBAC). Experience writing custom Controllers or Operators is a significant plus.
You have strong skills in at least one modern programming language (e.g. Go, TypeScript, Java, Python, Rust), are willing to work with Go, which is our core language for tooling and automation, and embrace AI-assisted workflows to accelerate development.
You have expertise in modern observability stacks (e.g. Grafana LGTM, Thanos, VictoriaMetrics). You can operate and tune the platform at scale while guiding teams on effective instrumentation and alerting strategies.
You have deep technical expertise in Release Engineering and GitOps, as well as in maintaining infrastructure that enables developers to release their software securely and reliably.
You have deep knowledge of the TCP/IP, DNS, and HTTP protocols, and you understand the intricacies of container networking.
We believe in a future where all DERs are connected and optimized to efficiently power the new energy age!
Our passionate and interdisciplinary team at our offices in Aachen and Munich is ready to face the digital transformation of various industries: with our interoperable IoT platform, we bring connectivity, analytics, and intelligent control into decentralized energy systems.
Whether it's maximizing the self-sufficiency of buildings, intelligent charging strategies for EVs, cross-sector optimization of branches and districts, control of virtual power plants, or completely new business models, the gridX platform enables our partners to keep their customers one step ahead of the competition and continuously create added value.
Join us in disrupting the international energy sector with our cutting-edge IoT platform!
Employment Type: Full-Time
Experience: years
Vacancy: 1