Site Reliability Engineer

ZipRecruiter

Plano (TX)

Hybrid

USD 90,000 - 130,000

Full time

10 days ago

Job summary

A leading technology company is seeking an IT Operations Engineer (SRE) for a hybrid role in Texas. The ideal candidate will have a robust background in Site Reliability Engineering, cloud platforms, and automation. You will be responsible for ensuring system reliability, developing automated solutions, and collaborating with various teams to enhance performance and cost efficiency. This position is ideal for a proactive individual with a passion for continuous improvement in IT systems.

Qualifications

3+ years in a Site Reliability Engineering, Systems Engineering, or similar role.
Strong experience with cloud platforms.
Familiarity with automation and continuous improvement.

Responsibilities

Ensure reliability and uptime of production systems through monitoring and incident response.
Develop and maintain automated solutions for configuration and deployment.
Collaborate with teams to design resilient and scalable systems.

Skills

Cloud platforms
Scripting
Site Reliability Engineering
Linux systems
Networking
Performance tuning
Monitoring and observability
Security best practices
Containerization

Education

Bachelor's degree in Computer Science, Engineering, or a related field

Tools

Azure
AWS
GCP
Terraform
Ansible
Docker
Kubernetes
Power Automate
PowerApps

Job Description

Title: IT Operations Engineer (SRE)
Job Type: Contract
Location: Hybrid – Daytona Beach, Florida OR Plano, TX

Job Summary
The ideal candidate has experience leading root cause analysis in an enterprise environment and knows the main areas of IT systems: networking, infrastructure (on-prem, hybrid, cloud), endpoints, data, and modern workplace platforms. They should have managed endpoints at an enterprise level, including policy management, patching, vulnerability management, observability, and related strategies. Familiarity with Site Reliability Engineering best practices, automation, and continuous improvement is essential.

Qualifications

Bachelor's degree in Computer Science, Engineering, or a related field (or equivalent experience).
3+ years in a Site Reliability Engineering, Systems Engineering, or similar role.
Strong experience with cloud platforms such as Azure, AWS, or GCP.
Proficient in scripting or programming languages such as Python, Go, Bash, or PowerShell.
Experience with Power Automate and PowerApps.
Experience with infrastructure as code tools such as Terraform or Ansible.
Strong understanding of Linux systems, networking, and performance tuning.
Experience with monitoring and observability tools such as Azure Monitor, Zabbix, Grafana, Datadog, Dynatrace, LogicMonitor, or ControlUp.
Familiarity with ITIL/ITSM processes and incident/change management systems.
Knowledge of security best practices such as least privilege access, secure configurations, and patching.
Experience supporting large-scale or distributed systems in production.
Knowledge of FinOps or cloud cost optimization.
Hands-on experience with containerization and orchestration tools such as Docker or Kubernetes.
Systems administration experience, including applying best practices, optimization, and vendor management.

Description and Responsibilities

Ensure reliability and uptime of production systems through monitoring, incident response, and capacity planning.
Develop and maintain automated solutions for configuration, deployment, monitoring, and alerting/self-healing.
Collaborate with application and infrastructure teams to design resilient and scalable systems.
Participate in on-call rotations, respond to incidents, and perform root cause analysis.
Define and track SLIs, SLOs, and SLAs, using data to inform operational decisions.
Continuously improve system performance, cost efficiency, and observability.
Work with developers to integrate reliability and security best practices into the software development lifecycle.
Document processes, runbooks, and architectural decisions.

Eligibility: All applicants authorized to live and work in the United States on a permanent basis are welcome to apply. Residency in the US is required. Sponsorship is not available for this position.

Wright Technical Services and our client are Equal Opportunity Employers. We are committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration without regard to race, color, religion, sex, national origin, age, disability, or veteran status.
