Senior Site Reliability Engineer

Omilia

Poland

On-site

PLN 100,000 - 130,000

Full time

Today

Job summary

A leading technology company in Poland is seeking a Senior Site Reliability Engineer to ensure platform reliability and availability across production environments. The ideal candidate will have experience with Kubernetes, AWS, and ELK tools. A Bachelor's degree in Engineering or equivalent is required, along with strong scripting skills and excellent communication abilities. This role offers fixed compensation and long-term employment in an innovative environment.

Benefits

Fixed compensation

Long-term employment

Professional development (courses, training, etc.)

Apple gear

Qualifications

Experience developing or maintaining software for production services at scale.
Versatility working with agile/lean methods.
Ability to think critically and anticipate challenges.

Responsibilities

Ensure platform reliability and availability through monitoring and automation.
Participate in on-call rotations and improve alert quality.
Collaborate with engineering to develop runbooks and operational documentation.

Skills

Scripting skills (Bash, Python, or Go)

Experience with ELK

Experience with AWS

Strong communication skills

Experience in operating Kubernetes or Docker Swarm

Experience with Grafana/Prometheus stack

Education

Bachelor's Degree or MS in Engineering or equivalent

Tools

Kubernetes

Docker Swarm

Terraform

Ansible

We are looking for a Senior Site Reliability Engineer with cloud platform experience. This individual will be part of a team responsible for operating and maintaining production clusters and developing our observability solutions. They will collaborate with team members on automation strategies, monitoring and alerting, and overall platform reliability. Your goal will be to become an integral part of the team, treating every platform challenge as your own and solving it accordingly.

Responsibilities

Ensure platform reliability and availability across production and pre‑production environments through proactive monitoring, alerting, and automation
Act as first responder for incidents and contribute to problem management and root‑cause analysis
Support the development team's reliability efforts and build a solid reliability culture within the development lifecycle
Develop troubleshooting documentation for production support resources
Collaborate with Engineering teams to develop optimised and productive runbooks, operational documentation and automation of operational tasks
Collaborate with development and cloud engineering teams to embed reliability and performance into the software delivery lifecycle
Design, implement, and evolve observability solutions (metrics, logs, traces, dashboards) using tools such as Prometheus, Grafana, and ELK
Participate in on‑call rotations and continuously improve alert quality and response processes
Champion a culture of reliability, performance, and continuous improvement across teams

Requirements

Bachelor's Degree or MS in Engineering or equivalent
Experience in operating at least one container orchestration cluster (Kubernetes, Docker Swarm)
Experience developing or maintaining software for production services at scale
Experience with ELK
Experience with AWS
Experience with Grafana/Prometheus stack
Strong scripting skills (Bash, Python, or Go)
Excellent communication skills
Thinking outside the box and anticipating challenges. It is imperative that we are not simply reactive; we must expect challenges and question the technologies, procedures, and thinking already in place. You will be expected to constantly review and challenge at all levels
Versatility. We work with agile/lean methods. We'd much rather iterate and learn than assume we know all the answers
Being a team player. You don't (always) work in isolation and are excited by the thought of working with your team while involving product, experience design, engineering, and more in the process

Considered a plus

Telephony knowledge (SIP, VoIP)
Experience in Linux Administration (RedHat, CentOS, AL)
Working knowledge in Configuration Management tools (Terraform, Ansible)
Experience with TCP/IP and general networking concepts
RDBMS knowledge (MySQL, Postgres)
NoSQL knowledge (Redis)

Benefits

Fixed compensation
Long‑term employment with vacation counted in working days
Professional development (courses, training, etc.)
Being part of successful cutting‑edge technology products that are making a global impact in the service industry
Proficient and fun‑to‑work‑with colleagues
Apple gear

Omilia is proud to be an equal opportunity employer and is dedicated to fostering a diverse and inclusive workplace. We believe that embracing diversity in all its forms enriches our workplace and drives our collective success. We are committed to creating an environment where everyone feels welcomed, valued, and empowered to contribute their unique perspectives. All eligible candidates will be given consideration for employment without regard to factors such as race, color, religion, gender, gender identity or expression, sexual orientation, national origin, heredity, disability, age, or veteran status.
