Software Engineer, Production Engineering (DBPE)

TN United Kingdom

London

Remote

GBP 40,000 - 70,000

Full time

Yesterday

Job summary

A leading company is seeking a Software Engineer for Production Engineering in London. The role involves building and maintaining distributed data systems, focusing on automation and reliability. Ideal candidates will have programming experience and a solid understanding of software engineering practices. Join a diverse team that values varied experiences and backgrounds.

Qualifications

  • Practical experience in at least one programming language (e.g., Java, Python).
  • Understanding of distributed systems and software architecture.
  • Knowledge of CI/CD and observability practices.

Responsibilities

  • Ensure reliability, scalability, security, and maintainability of systems.
  • Respond to customer escalations and automated alerts, from initial triage to resolution.
  • Develop automations to reduce operational toil.

Skills

Analytical thinking
Communication
Adaptability
Software engineering practices
Distributed systems design

Tools

Kubernetes
Terraform
AWS
GCP
Azure
Prometheus
Grafana
Splunk

Job description

Client: DataStax

Location: London, United Kingdom

Job Category: Other

EU work permit required: Yes

Job Reference: 55cc34a2578f

Posted: 23.05.2025

Expiry Date: 07.07.2025

Job Description:

As a database production engineer at DataStax, you will build, operate, and maintain distributed data systems to help leading enterprises manage their complex data needs. You will work on automation, monitoring, alerting, enhancements, and bug fixes to ensure an excellent experience for our developers and enterprises.

What you will do:
  1. Ensure reliability, scalability, security, and maintainability of the systems you own.
  2. Respond to customer escalations and automated alerts, from initial triage to resolution.
  3. Participate in blameless post-mortem analyses to learn from mistakes.
  4. Perform manual operational tasks (toil).
  5. Develop automations to reduce toil.
  6. Improve monitoring and alerting to reduce incident detection time.
  7. Work with technologies such as Kubernetes, Helm, ArgoCD, Terraform, Cassandra, Java, Python, Go, AWS, GCP, Azure, Prometheus, Grafana, and the Splunk ecosystem.

Your experience should include:
  • Practical experience in at least one programming language (e.g., Java, Python).
  • Strong analytical thinking, especially for triaging issues.
  • Ability to communicate clearly in writing.
  • Ability to learn and adapt quickly.
  • Knowledge of software engineering practices (version control, refactoring, automated testing, CI/CD, observability).
  • Understanding of distributed systems design and software architecture.
  • Fundamentals of computer science and operating systems.
  • Bonus: database fundamentals, especially Apache Cassandra (C*).
  • Bonus: experience with Linux containers and orchestration (e.g., Kubernetes).

Not sure if you qualify? Apply anyway! We value diverse experiences and backgrounds, whether you're new to the corporate world, returning after a gap, or transitioning careers. We look forward to connecting with you.
