Senior Site Reliability Engineer

Zandkasteel FZCO

Germany

Remote

EUR 70,000 - 90,000

Full-time

Today

Summary

A leading SMS technology provider is seeking a Senior Site Reliability Engineer to ensure platform stability and optimize back-end systems. You will work with Python/Flask applications and manage Linux servers to deliver a reliable SMS marketing service. This fully remote role offers growth opportunities in a scale-up environment.

Benefits

Endless growth opportunities

Flexible working schedule

Performance bonuses

Fully remote setup

Qualifications

5+ years of experience in a relevant engineering role, ideally as a Python Developer.
Experience managing Python/Flask applications in production environments.
Proficient in Linux server admin (Debian/Ubuntu) and network analysis.

Responsibilities

Manage and optimize Linux-based servers to ensure stability and performance.
Continuously monitor system health and address bottlenecks.
Troubleshoot complex issues with diagnostic tools.

Skills

Python development

Linux server administration

Network analysis tools

Monitoring systems

Automation tools

Troubleshooting skills

Tools

Git

Wireshark

Prometheus

Grafana

About the Company

Our client is one of the leading SMS providers for marketing teams in the US. Their advanced dashboard and queueing mechanisms help their clients scale campaigns to the next level. With a global team, they’re in scale-up mode and looking for strong problem solvers who thrive on building reliable systems.

About the Role

We are looking for a Senior Site Reliability Engineer (SRE) with strong infrastructure experience to help ensure platform stability and optimize back-end systems in Python. You will play a key role in keeping their SMS marketing platform fast, reliable, and scalable. This is a highly technical position at the intersection of backend engineering and infrastructure. You’ll be working hands-on with a Python/Flask application, Linux servers, and the networking stack to make sure millions of SMS messages are delivered without delay or downtime.

This is a full-time remote role.

Requirements

5+ years of experience as a Site Reliability Engineer, System Engineer, Infrastructure Engineer, Platform Engineer, Backend Systems Engineer, or in a similar role, ideally with a background as a Python Developer.
Experience running and maintaining Python/Flask applications in production.
Advanced Python development skills, particularly with Python libraries/frameworks.
In-depth knowledge of Linux server administration (Debian/Ubuntu).
Proficiency with network analysis tools: intercepting proxies, packet captures (Wireshark, mitmproxy, tcpdump, etc.).
Familiarity with distributed systems, scaling strategies, and performance tuning.
Strong understanding of monitoring and logging systems (e.g., Prometheus, Grafana, ELK, Datadog).
Experience with version control (Git) and CI/CD workflows.
Comfort with automation tools and scripting for infrastructure management.
Excellent troubleshooting and analytical skills.
Strong sense of ownership and accountability for uptime, stability, and performance.

Your responsibilities

Maintain and optimize infrastructure: Manage Linux-based (Debian/Ubuntu) servers running Python/Flask applications, ensuring stability and performance.
Ensure high uptime: Continuously monitor system health and proactively address bottlenecks or weak points to maximize reliability of SMS send-outs.
Troubleshoot complex issues: Use intercepting proxies, packet captures, and diagnostic tools to identify, analyze, and resolve traffic or delivery issues.
Optimize backend workflows: Work with Python/Flask async frameworks to streamline message queuing, delivery, and scaling mechanisms.
Implement monitoring and alerting: Set up dashboards, logs, and alerts that provide visibility into system health and performance.
Automate infrastructure tasks: Build tools/scripts to reduce manual work and ensure consistency in deployments and optimizations.
Own decision-making: Take initiative in addressing infrastructure needs and make competent technical decisions without requiring constant supervision.

Growth Opportunities/Perks

Endless growth opportunities as they’re in a scale-up phase.
Potential to move into a more elaborate R&D or leadership role.
Flexible working schedule as long as deadlines and quality are met.
Work alongside highly skilled developers in a unique and challenging industry.
Performance bonuses as the company grows.
Fully remote setup.

This Position Is Perfect For You If…

You’re a fast learner. You won’t be expected to know everything from the start, but you’ll need to be motivated and quick to learn new tools, technologies, and patterns in a complex infrastructure environment.

You’re detail-oriented. You notice flaws in systems before they become problems, and you enjoy digging into logs, metrics, or packet captures until you find the root cause.

You’re reliable under pressure. When systems break, you don’t panic — you troubleshoot calmly, take action, and make the right call to stabilize the platform.

Our hiring process is made up of four parts, so please be aware that you will need to dedicate time for a questionnaire, a video, and two 1-on-1 interviews.

Thank you for taking the time to consider this position. I look forward to hearing from you soon!
