Senior DevOps Engineer

Together AI

San Francisco (CA)

On-site

USD 160,000 - 230,000

Full time

4 days ago

Job summary

Together AI, an artificial intelligence company in San Francisco, is seeking a Senior DevOps Engineer to automate cloud infrastructure and manage GPU workload orchestration. The role centers on building tools and CI/CD infrastructure that improve the reliability and efficiency of its services.

Benefits

Startup equity
Health insurance
Competitive benefits

Qualifications

  • 5+ years of relevant experience in DevOps and cloud computing.
  • Programming experience in at least one of Go, Python, Java, or C++.
  • Strong sense of ownership and desire to build tools for efficiency.

Responsibilities

  • Design, build, and maintain CI/CD infrastructure.
  • Automate services and improve operability.
  • Collaborate with internal teams to enhance system performance.

Skills

Infrastructure as Code
Automation
Collaboration
Troubleshooting

Tools

Terraform
Ansible
Kubernetes
CI/CD

Job description

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

We are hiring a talented Senior DevOps Engineer to develop the software and processes for orchestration of AI workloads over large fleets of distributed GPU hardware. In this role, you'll be part of a cloud engineering organization that aims to automate everything and build failure-resistant and horizontally scalable cloud infrastructure for GPU-resident applications.

As a Senior DevOps Engineer, you'll build a deep understanding of Together AI’s services and use that knowledge to optimize and evolve our infrastructure's reliability, availability, serviceability, and profitability.

The best applicants for this role are deeply technical, enthusiastic, great collaborators, and intrinsically motivated to deliver high-quality infrastructure. You have experience practicing infrastructure-as-code, including the use of tools like Terraform and Ansible. You also have strong software development fundamentals, systems knowledge, troubleshooting abilities, and a deep sense of responsibility.

Requirements

  • Minimum of 5 years of relevant experience in DevOps, cloud computing, data center operations, and Linux systems administration
  • Programming experience in at least one of the following languages: Go, Python, Java, or C++
  • Experience designing and building advanced CI/CD pipeline frameworks
  • Experience with cloud computing toolsets like Terraform, Vault, and Packer
  • Experience with configuration management tools like Ansible, Pulumi, Chef and Puppet
  • Experience with Kubernetes and containerization
  • Strong sense of ownership and desire to build great tools for others

Responsibilities

  • Introduce tools to facilitate greater automation and operability of services
  • Design, build, and maintain CI/CD infrastructure
  • Architect, deploy, and scale observability infrastructure
  • Create runtime tools/processes that optimize cloud triaging and limit downtime
  • Define best practices to make our systems and services measurable
  • Work closely with internal teams to ensure best practices are appropriately applied
  • Build tools to help engineering and research teams measure and improve their velocity
  • Analyze and decompose complex software systems
  • Collaborate with and influence others to improve the overall design

Compensation

We offer competitive compensation, startup equity, health insurance and other competitive benefits. The US base salary range for this full-time position is: $160,000 - $230,000 + equity + benefits. Our salary ranges are determined by location, level and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
