Test Environment Manager

isepglobal

Greater London

Hybrid

GBP 90,000 - 120,000

Full time

Yesterday

Job summary

A technology services company in Greater London seeks a Test Environment Manager to improve system reliability and automation in non-production environments. Candidates should have 15+ years of experience, expertise in managing cloud and on-prem environments, and proficiency with IaC tools. The role emphasizes incident management, observability, and promoting a reliability culture across teams. Strong scripting skills in Python or Bash are required, along with excellent communication and problem-solving abilities.

Qualifications

  • 15+ years of experience in a relevant field.
  • Strong analytical and debugging skills under pressure.
  • Experience with containerization and serverless computing.

Responsibilities

  • Automate environment lifecycle with Infrastructure as Code.
  • Define and measure service level objectives for environments.
  • Use observability tools to monitor environment health.
  • Manage incident response and conduct blameless post-mortems.
  • Identify opportunities for continuous improvement.

Skills

Observability
Cloud/on-prem management
Infrastructure as Code (IaC)
DevOps exposure
Monitoring and logging tools (Prometheus, Grafana)
CI/CD platforms (Jenkins, GitLab CI)
Configuration management and provisioning (Ansible, Terraform)
Scripting (Python, Bash)
Linux systems
Networking concepts

Job description

Role: Test Environment Manager

Location: London

Work Mode: Hybrid (3 days from office)

Contract role

Experience Level: 15+ years

Job Description:

The Test Environment Manager (TEM) is an engineering-focused role responsible for transforming the SDLC environment, with an emphasis on system reliability, automation, and performance in non-production settings.

Mandatory Skills

The primary technical skills required are observability, management of cloud and on-prem environments, and IaC automation with DevOps exposure, together with the soft skills described in this posting.

Operational Responsibilities
  • Automate environment lifecycle: Develop Infrastructure as Code (IaC) to automate the provisioning, teardown, and configuration of test environments, integrating them with the CI/CD pipeline.
  • Establish service level objectives (SLOs): Define and measure service level indicators (SLIs) for test environments, such as availability and provisioning time, and set objectives against them so that environments meet the needs of development and testing teams.
  • Monitor environment health and performance: Use observability tools such as Prometheus and Grafana to track the health of test environments, identify bottlenecks, and resolve issues proactively rather than reactively.
  • Manage incident response: Lead the incident management process for test environment issues, conducting blameless post-mortems to understand root causes and implement lasting fixes.
  • Minimize toil: Automate manual, repetitive tasks associated with test environments to free up engineering time for more strategic work.

Strategic and Cultural Responsibilities
  • Drive continuous improvement: Analyze environment performance data, incident reports, and post-mortems to identify opportunities for improvement and innovation.
  • Balance reliability and speed: Apply an "error budget" to test environments. While environments stay within their reliability targets, teams can spend the remaining budget on faster feature development; when reliability falls below target, the focus shifts to improving stability.
  • Instill a reliability culture: Promote a blameless culture around test environment incidents, encouraging shared ownership and collaboration between development, QA, and SRE teams.
  • Capacity planning: Anticipate future resource needs of test environments by analysing usage patterns and project forecasts. Ensure the infrastructure can scale to meet demand.
  • Advance test data management: Work with Test Data Managers to ensure that test data is not only readily available but also consistent, compliant, and automatically provisioned with the environments.

Technical Skills
  • Expertise in tooling: Proficiency with monitoring and logging tools (e.g., Prometheus, Splunk, Grafana), CI/CD platforms (e.g., Jenkins, GitLab CI), and configuration management and provisioning tools (e.g., Ansible, Terraform).
  • Cloud infrastructure knowledge: Deep understanding of cloud platforms such as AWS, including experience with containerization technologies (Docker, Kubernetes) and serverless computing.
  • Scripting and programming: Strong scripting skills in languages such as Python or Bash to automate environment management tasks.
  • Systems and networking knowledge: Solid understanding of Linux systems, networking concepts, and database management.

Soft Skills
  • Leadership and influence: The ability to champion SRE practices and influence technical and business stakeholders across different teams.
  • Problem solving: Strong analytical and debugging skills for investigating and resolving complex environment issues under pressure.
  • Communication: Excellent communication and collaboration skills to bridge the gap between development, QA, and operations teams.
  • Adaptability: A proactive and adaptable mindset to keep pace with evolving technology and development methodologies.