AI Operations Specialist

Africonology Solutions

Johannesburg

On-site

ZAR 600 000 - 800 000

Full time

30+ days ago

Job summary

A technology solutions provider is looking for an AI Operations Specialist in Johannesburg to ensure the stability and performance of AI systems in production. The role involves overseeing deployments, monitoring system health, conducting testing, and ensuring a seamless customer experience. Candidates should have a Bachelor's degree in Computer Science or a related field and at least 3 years of experience in operations, DevOps, or AI/ML support roles. Proficiency in Machine Learning practices and test automation is required.

Qualifications

  • 3+ years of experience in operations, DevOps, or AI/ML support roles.
  • Strong knowledge of Machine Learning practices.
  • C1 English proficiency.

Responsibilities

  • Monitor AI systems to ensure uptime and performance.
  • Design, develop, and maintain automated test scripts.
  • Manage version control and deployment of AI models and bots.
  • Implement guardrails and ensure compliance with data privacy policies.
  • Collaborate with developers and data scientists.

Skills

Machine Learning Operations (MLOps)
System Monitoring
Quality Assurance
Test Automation
Scripting in Python
API Testing
Collaboration

Education

Bachelor's degree in Computer Science or related field

Tools

Jira
Azure DevOps
Selenium
Postman
Git

Job description

AI Operations Specialist

Responsible for ensuring the stability, reliability, and performance of AI systems in production. This role blends Machine Learning Operations (MLOps), Bot Operations, and Quality Assurance (QA) practices, supporting both backend models and customer-facing bots.

The specialist will manage deployments, monitor system health, conduct testing, and validate quality to guarantee seamless AI operations and a consistent customer experience.

Key Responsibilities
  • System Monitoring & Reliability
    • Monitor AI systems to ensure uptime, performance, and error-free operation.
    • Implement automated monitoring and alerting systems for AI performance.
    • Define SLOs and error budgets for AI features.
  • Testing & Quality Assurance
    • Design, develop, and maintain automated test scripts for web, mobile, and API testing.
    • Execute manual test cases for exploratory and non-automatable scenarios.
    • Identify, log, and track bugs to closure using Jira, Azure DevOps, or equivalent tools.
    • Ensure adherence to QA methodologies, tools, and best practices.
  • AI & Bot Operations
    • Manage version control, deployment, and release cycles of AI models and bots.
    • Conduct offline/online evaluations and A/B tests for prompts, models, and policies.
    • Track and troubleshoot production issues (latency, escalations, fallback errors).
    • Design and orchestrate conversations using Voiceflow and related tools.
  • Observability, Security & Compliance
    • Work with observability/evaluation tools (Langfuse, Arize/Phoenix, W&B, Prometheus, Grafana).
    • Implement guardrails, safety red-teaming, and prompt-injection defenses.
    • Manage PII handling, content safety filters, and data loss prevention (DLP).
    • Document incidents, perform RCA reporting, and ensure compliance with data privacy and security policies.
  • Collaboration & Continuous Improvement
    • Collaborate with developers, data scientists, and business teams to resolve operational issues.
    • Participate in incident on-call rotations, maintain runbooks, and conduct disaster recovery (DR) tests.
    • Recommend improvements to enhance system resilience, reduce downtime, and optimize costs.
Requirements
  • Bachelor's degree in Computer Science, Engineering, or related field (or equivalent experience).
  • 3+ years of experience in operations, DevOps, or AI/ML support roles.
  • Proven track record managing large, complex, multi-stakeholder projects.
  • Strong knowledge of Machine Learning practices (model monitoring, retraining, pipelines).
  • Familiarity with conversational AI platforms (Amazon Lex, Salesforce Einstein Bots, ElevenLabs).
  • Integration experience with Amazon Connect, Genesys Cloud CX, NICE CXone, or similar.
  • Proficiency with test automation tools (Selenium, Playwright, Cypress, Appium, or equivalent).
  • Experience with API testing tools (Postman, RestAssured, Karate).
  • Strong scripting and automation skills (Python, Bash, CI/CD pipelines).
  • C1 English proficiency.
Nice to Have
  • Experience with performance/load testing tools (JMeter, Locust, Gatling).
  • Knowledge of cloud platforms (AWS, Azure, GCP).
  • Familiarity with Git or other version control systems.
  • QA certification (ISTQB or equivalent).