Head of AI Operations

Greater Giving, Inc.

Columbus (GA)

Remote

USD 120,000 - 180,000

Full time

25 days ago

Job summary

An established industry player is seeking a Senior AI Leader to drive innovative Generative AI and Machine Learning initiatives. This pivotal role involves overseeing a dedicated team responsible for the reliability and performance of AI systems in production. The ideal candidate will possess a robust background in AI/ML platform management and software engineering, with a proven ability to lead technical teams in a fast-paced environment. Join a company committed to fostering a culture of inclusion and innovation, where your contributions will directly impact global strategies and drive business outcomes. Embrace the opportunity to work in a dynamic environment that values creativity and proactive leadership.

Benefits

Medical, dental, and vision care
EAP programs
Paid time off
Recognition programs
Retirement and investment options
Charitable gift matching programs
Worldwide days of service

Qualifications

  • 10+ years of experience in software and support engineering for AI systems.
  • Strong command of AWS and GCP with hands-on experience in AI workloads.
  • Expertise in designing reliable, scalable, and observable production systems.

Responsibilities

  • Lead SRE and Support for AI production systems ensuring high availability.
  • Build observability frameworks to monitor model performance and reliability.
  • Partner with teams to ensure reliability in every stage of AI development.

Skills

AI/ML platform management
Software engineering
Incident management
Stakeholder management
Project management
Critical thinking
Flexibility

Education

Bachelor’s degree in Computer Science
Master’s degree in AI or related area

Tools

AWS
GCP
AWS SageMaker
Google VertexAI
Snowflake Cortex
Fiddler AI
Weights & Biases

Job description

This senior AI leadership role is pivotal in delivering strategic Generative AI (GenAI) and Machine Learning (ML) initiatives that will transform Global Payments. You will be responsible for ensuring the reliability, scalability, and performance of our production AI systems and services. This role oversees a dedicated team of SREs and Support Engineers responsible for monitoring, incident response, and the stability of AI services in production. You will play a critical role in ensuring our GenAI workloads, from foundational models to fully integrated inference pipelines, run reliably and at scale across cloud and hybrid environments. The ideal candidate will have a robust background in AI/ML platform management and software engineering, and a proven track record of leading technical teams in a fast-paced environment. You will need to foster a culture of innovation and ensure ongoing alignment of initiatives with evolving business strategies at a global level.

RESPONSIBILITIES
  • Lead the SRE and Support function for AI production systems, including LLM inference services and monitoring, vector databases, orchestration platforms and AI agent frameworks.
  • Ensure high availability, low-latency performance, and secure operation of GenAI APIs and applications.
  • Build and scale observability frameworks to monitor model drift, hallucination, bias, performance degradation, latency spikes, and bottlenecks.
  • Define and enforce SLAs, SLOs, and error tolerance tailored to AI/ML workloads, covering batch, real-time, and on-demand use cases.
  • Lead incident management and root cause analysis across AI pipelines, including model serving, feature stores, and data flows.
  • Partner with the AI Engineering, MLOps and Platform teams to ensure reliability is baked right into every stage of AI development and deployment.
  • Work closely with the Platform teams to implement and support auto-scaling, failover, and self-healing strategies for AI workloads in multi-cloud and hybrid environments.
  • Develop and manage on-call strategies, escalation procedures, and global support rotations for critical AI services.
  • Be passionate about the customer success of what your teams build: measure and monitor that it is used and that it is useful in driving business outcomes.
  • Ensure compliance with industry standards and best practices in AI solutions monitoring, including security protocols and data governance policies.
  • Stay abreast of emerging technologies and trends in GenAI and Machine Learning to drive continuous improvement and innovation.
  • Inspire and motivate your team, and foster a positive and productive work environment consistent with Global Payments’ values.
MUST HAVES:
  • Bachelor’s or Master’s degree in Computer Science, Math, AI, or a related area.
  • Strong command of AWS and GCP with experience managing AI workloads.
  • Hands-on experience with AWS SageMaker, AWS Bedrock, Google VertexAI, and Snowflake Cortex.
  • At least 10 years of experience in software and support engineering for enterprise-grade, cloud-based AI systems.
  • Deep knowledge of LLMs, inference pipelines, vector databases, RAG and agentic architectures.
  • Expertise in designing and running reliable, scalable, and observable production systems.
  • Proven ability to lead high-severity incident response and drive root cause analysis postmortems.
  • Hands-on experience with observability platforms (e.g., Fiddler AI, Arize, Weights & Biases).
  • Deep understanding of containerization and designing ephemeral solutions.
  • Ability to define, monitor, and enforce service-level objectives tailored to GenAI workloads.
  • Expertise in industry trends and a range of LLMs, including commercial foundation models from OpenAI, Anthropic, Cohere, and Google, as well as open-source models available on those platforms, such as Mistral and Llama.
  • Passionate engineering leader with experience building high performance teams.
  • Proficiency in stakeholder management to effectively communicate and manage expectations of those linked to the work outside your team.
  • Proficiency in project management and resource allocation to ensure timely, efficient and successful delivery of outcomes.
  • Experience in strategic planning and execution with strong decision-making skills to align initiatives with business goals and make informed choices that benefit the organization.
  • Some experience in handling compliance and regulatory requirements to ensure engineering practices adhere to relevant laws and regulations.
BONUS ATTRIBUTES:
  • Familiarity with MLOps workflows, data versioning and model lifecycle management.
  • Familiarity with Machine Learning model development.
  • Knowledge of Salesforce AI offerings.
ABILITIES:
  • Ability to work proactively with a high level of initiative and accuracy.
  • Ability to manage multiple assignments effectively and meet established deadlines.
  • Strong interpersonal skills to interact professionally with staff and stakeholders.
  • Excellent organizational skills and attention to detail.
  • Critical thinking ability across tasks ranging from moderately to highly complex.
  • Flexibility in adapting to changing business needs and priorities.
  • Ability to work creatively and independently with minimal supervision.
  • Ability to utilize experience and judgment in accomplishing goals.
  • Experience in navigating organizational structures and collaborating across teams.

At Global Payments our vision is to be “Champions of Inclusion.” We are fully committed and focused on creating a better tomorrow in the communities in which we live and work. We aspire to ensure fair treatment, access, opportunity and advancement for all team members. We believe all team members should be able to bring their true, authentic selves to the workplace and feel accepted, engaged and understood.

Global Payments offers a comprehensive benefits package to all of our team members, including medical, dental and vision care, EAP programs, paid time off, recognition programs, retirement and investment options, charitable gift matching programs, and worldwide days of service. To learn more, review our Benefits page at: https://jobs.globalpayments.com/en/why-global-payments/benefits/.

This position is eligible to be considered for remote hiring anywhere in the USA.
