Senior Member of Technical Staff - AI/ML Infrastructure Engineer 3

Oracle

Santa Clara (CA)

On-site

USD 79,800 - 178,100

Full time

2 days ago

Job summary

Oracle seeks a Senior Member of Technical Staff specializing in AI/ML infrastructure engineering. You will design and maintain the infrastructure that supports AI and machine learning initiatives, collaborating with cross-functional teams to optimize performance and security. The ideal candidate has expertise in containerization, scripting, and Linux systems, along with experience managing AI/ML workloads.

Benefits

Comprehensive health coverage
401(k)
Paid time off
Potential bonuses and equity

Qualifications

  • Experience scripting and automating with Ansible, Terraform, and Kubernetes.
  • Solid Linux skills with hands-on experience in various distributions.
  • Strong understanding of security principles and best practices.

Responsibilities

  • Design, deploy, and manage infrastructure components for AI/ML workflows.
  • Implement automation for provisioning, configuration, and monitoring.
  • Troubleshoot and resolve infrastructure issues.

Skills

Scripting and automation
Containerization
Networking
Troubleshooting
Communication
Documentation
Linux

Tools

Ansible
Terraform
Kubernetes
Docker
Jenkins
GitLab
Prometheus

Job description

As an AI/ML Infrastructure Engineer on the GPU Strategic Customers Engineering team, you will play a critical role in designing, implementing, and maintaining infrastructure supporting our AI and machine learning initiatives. You will collaborate with data scientists, software engineers, and IT professionals to ensure efficient, secure, and scalable deployment of AI/ML models. Your expertise will help optimize infrastructure performance, reliability, and cost-efficiency.

Qualifications

  • Experience scripting and automating with tools like Ansible, Terraform, and Kubernetes.
  • Experience with containerization (Docker, Kubernetes) and orchestration for distributed systems.
  • Strong understanding of networking, security principles, and best practices.
  • Excellent troubleshooting skills for complex issues in fast-paced environments.
  • Effective communication and collaboration skills for cross-functional teams.
  • Proven documentation skills for infrastructure design, configuration, and troubleshooting.
  • Solid Linux skills with hands-on experience in Oracle Linux/RHEL/CentOS, Ubuntu, and Debian, including system administration and performance tuning.

Preferred Qualifications

  • Proficiency in programming languages such as Python, Rust, Go, Java, or Scala.
  • Experience designing and managing infrastructure for AI/ML or HPC workloads.
  • Knowledge of machine learning frameworks like TensorFlow, PyTorch, or scikit-learn in production.
  • Familiarity with DevOps tools for CI/CD and monitoring (Jenkins, GitLab, Prometheus).
  • Experience with High-Performance Computing systems.

Responsibilities

  • Own problems and develop solutions.
  • Design, deploy, and manage infrastructure components for AI/ML workflows.
  • Collaborate to understand infrastructure needs for training, testing, and deployment.
  • Implement automation for provisioning, configuration, and monitoring.
  • Optimize infrastructure performance and resource utilization.
  • Ensure security and compliance standards are met.
  • Troubleshoot and resolve infrastructure issues.
  • Stay updated on emerging AI/ML infrastructure technologies.
  • Document designs, configurations, and procedures.

Additional Information

This role is open for applications for at least three days from the posting date. The US hiring range is $79,800 - $178,100 annually, with potential bonuses and equity. Benefits include comprehensive health coverage, 401(k), paid time off, and more.

About Us

Oracle is a global leader in cloud solutions, committed to innovation, inclusion, and empowering employees. We support a diverse and inclusive workforce and provide accessible accommodations as needed.

