Telecoms: Senior Manager, Technology Operations & SRE (Reliability)

American Workforce Solutions

United States

Remote

USD 110,000 - 250,000

Full time

2 days ago

Job summary

A leading company is seeking a Senior Manager, Technology Operations & SRE, to oversee the stability of a cloud-based telephony platform. The role demands technical depth in AWS and SRE practices, along with strong leadership skills to guide a diverse team. Key duties include ensuring platform reliability, managing incidents, and advancing automation efforts. Candidates should have 8+ years of relevant experience and take a proactive approach to problem-solving in a dynamic work environment.

Benefits

Medical insurance
Vision insurance
401(k)
Child care support

Qualifications

  • 8+ years in Technology Operations, DevOps, or SRE.
  • Experience in AWS cloud-native environments (EC2, EKS).
  • Familiarity with VoIP protocols and telephony software is a plus.

Responsibilities

  • Ensure the reliability and performance of the cloud-based telephony platform.
  • Serve as Incident Commander during outages and resolve issues swiftly.
  • Manage and mentor a team of ~20 engineers.

Skills

Cloud-native environments
Proactive communication
Automation
Problem-solving

Tools

AWS services
NetScout
OpsGenie
Datadog
Prometheus
Grafana
Python
Bash
Terraform
Jenkins
GitLab CI

Job description

Telecoms: Senior Manager, Technology Operations & SRE (Reliability)

The Senior Manager, Technology Operations & SRE (Reliability) is a critical leadership role, responsible for ensuring the stability, performance, and resilience of our AWS-hosted open-source telephony platform (SIP, Kamailio, Asterisk, FreeSWITCH, etc.). This role combines hands-on technical expertise in cloud-native environments with strategic people management, serving as a key Incident Commander during critical incidents and driving Site Reliability Engineering (SRE) best practices. You will lead a team of ~20 engineers (SRE and augmented DevOps, including offshore resources), foster cross-functional collaboration, and champion a culture of passion, tenacity, and proactive communication to maintain a highly available platform in a fast-paced, mission-driven organization.

Key Responsibilities:

  • Platform Reliability: Ensure the availability, performance, and resilience of our client’s cloud-based telephony platform, leveraging AWS services (EC2, EKS, Route 53, CloudWatch) and monitoring tools (NetScout, OpsGenie, Datadog) to support real-time communication.
  • Incident Management: Act as Incident Commander, leading rapid response to outages, call quality issues, or captioning delays, using tools like PagerDuty and Amazon CloudWatch to minimize customer impact and provide proactive updates to senior leadership.
  • Root Cause Analysis: Conduct thorough RCAs for incidents, implementing corrective actions and refining runbooks to prevent recurrence, with a focus on reducing escalations through effective triage (targeting 80% resolution without escalation).
  • Automation & Observability: Develop automation scripts (Python, Bash) and enhance observability with tools like Prometheus, Grafana, and Datadog to monitor WebRTC metrics, captioning accuracy, and infrastructure health, enabling proactive issue detection.
  • AWS Expertise: Optimize AWS infrastructure (EC2, EKS, S3, Lambda, Route 53) and Kubernetes clusters for scalability, fault tolerance, and low-latency workloads, mentoring the team to improve platform reliability.
  • SRE Best Practices: Drive SRE principles (SLOs, SLIs, error budgets) and SDLC processes, transitioning the team from a disbanded NOC model to a mature SRE framework, focusing on production support and reducing project distractions.
  • Team Leadership: Manage and mentor a team of ~20 (SRE and augmented DevOps), fostering a culture of passion, adaptability, and collaboration. Navigate team morale during leadership transitions, winning trust while maintaining objective decision-making.
  • Proactive Communication: Provide high-level updates to senior leadership during major platform changes, educating stakeholders on monitoring and outcomes to preempt inquiries and align with organizational goals.
  • Cross-Functional Collaboration: Partner with engineering, product, and compliance teams to address reliability gaps, optimize captioning performance, and ensure compliance.
  • Continuous Improvement: Stay informed on industry trends (e.g., WebRTC, AI-driven transcription) to enhance telephony architecture and captioning workflows, leveraging analytical skills to interpret platform data.

Qualifications:

Technical Skills

  • Experience: 8+ years in Technology Operations, DevOps, or SRE, with strong expertise in AWS cloud-native environments (EC2, EKS, S3, Lambda, CloudWatch, Route 53).
  • Observability Tools: Proficiency with NetScout, OpsGenie, Datadog, Prometheus, or Grafana for monitoring infrastructure and application metrics.
  • Automation: Strong coding knowledge in Python or Bash for automating workflows and processes.
  • DevOps Tools: Experience with Terraform, Jenkins, or GitLab CI for Infrastructure as Code and CI/CD pipelines.
  • SRE & SDLC: Deep knowledge of SRE principles (SLOs, SLIs, blameless postmortems) and SDLC processes, with experience building or transitioning teams to SRE models.
  • Telephony Knowledge (Preferred): Familiarity with VoIP protocols (SIP, RTP, WebRTC) and open-source telephony software (Asterisk, Kamailio, FreeSWITCH) is a huge plus.
  • Networking: Basic understanding of network troubleshooting (e.g., Wireshark) and QoS optimization for low-latency communication.
  • Captioning (Optional): Experience with real-time transcription systems (e.g., Amazon Transcribe) or caption formats (WebVTT, SRT) is a plus.

Leadership & Soft Skills

  • Leadership: Proven ability to lead and mentor diverse technical teams in a remote, high-stakes environment, with experience managing morale during transitions.
  • Tenacity & Passion: A proactive, “adapt and overcome” mindset, thriving in a 24/7 support environment with a passion for mission-driven work.
  • Communication: Exceptional verbal and written skills for proactive stakeholder updates, cross-functional collaboration, and presenting to non-technical audiences, including compliance teams.
  • Problem-Solving: Strong analytical skills to diagnose complex issues under pressure and interpret platform data for decision-making.
  • Culture Fit: Ability to align with the fast-paced, collaborative culture.
  • Time Management: Adept at prioritizing tasks and managing high-stakes responsibilities in a dynamic setting.
  • Confidentiality: Commitment to handling sensitive customer data in compliance with regulations.
  • 100% Remote: Work from home, with flexibility to collaborate across US time zones.

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering, Information Technology, and Management
  • Industries: Telecommunications
