DevOps Engineer - Remote / Telecommute
Cynet Systems Inc
Atlanta (GA)
Remote
USD 100,000 - 130,000
Full time
Job summary
A cutting-edge technology company in Atlanta is seeking an experienced cloud engineer to develop and maintain scalable inference platforms for large language models. Responsibilities include managing cloud engineering projects and implementing distributed inference optimization techniques. Candidates should have experience with modern distributed environments and proficiency in relevant programming languages. A Bachelor’s or Master’s degree in a related field is preferred.
Qualifications
- Deep experience building services in cloud and distributed environments.
- Experience with Large Language Models (LLMs).
- Strong communication skills for technical documentation.
- Hands-on experience with benchmarking tools.
- Familiarity with LLM performance metrics.
- Experience with inference engines like vLLM or NVIDIA Dynamo.
- Knowledge of distributed inference techniques.
Responsibilities
- Develop and maintain inference platforms for LLMs.
- Manage end-to-end cloud engineering projects.
- Improve tools and systems for performance monitoring.
- Design frameworks for benchmarking model performance.
- Implement optimization techniques for distributed inference.
Skills
Cloud infrastructure expertise
AI model inference understanding
Proficiency in Python
C++ programming ability
Problem-solving skills
Analytical skills
Debugging skills
Collaboration ability
Education
Bachelor’s or Master’s degree in Computer Science
Experience with AI infrastructure
Tools
Kubernetes
Docker
CI/CD
APIs
CUDA
ROCm
AITER
NCCL
Job Description
- Develop and maintain scalable inference platforms for serving LLMs optimized for NVIDIA and client GPUs.
- Manage end-to-end cloud engineering projects from ideation and prototyping to deployment and operations.
- Build and improve tooling and observability systems to monitor performance and system health.
- Design benchmarking frameworks to test and evaluate model serving performance across models, engines, and GPU configurations (a rough measurement sketch follows this list).
- Implement distributed inference optimization techniques, including tensor/data parallelism, KV cache optimizations, and intelligent routing.
- Build cross-platform inference support for diverse model architectures.
- Contribute to open-source inference engines to enhance performance and efficiency.
- Collaborate closely with cloud infrastructure, AI, and DevOps teams to ensure efficient deployment and scaling.
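
As a rough illustration of the benchmarking work described above, the sketch below measures time-to-first-token (TTFT) and time-per-output-token (TPOT) against an OpenAI-compatible streaming completions endpoint, such as the one vLLM exposes by default. The endpoint URL, model name, and prompt are placeholder assumptions, not details from this posting, and each streamed chunk is counted as one token as an approximation.

```python
import json
import time

import requests

ENDPOINT = "http://localhost:8000/v1/completions"  # placeholder; vLLM's default OpenAI-compatible port
MODEL = "my-model"                                  # placeholder model name
PROMPT = "Explain KV caching in one paragraph."


def measure_latency(prompt: str, max_tokens: int = 128):
    """Return (ttft_seconds, tpot_seconds, n_tokens) for one streamed request."""
    start = time.perf_counter()
    first_token_at = None
    n_tokens = 0

    resp = requests.post(
        ENDPOINT,
        json={"model": MODEL, "prompt": prompt,
              "max_tokens": max_tokens, "stream": True},
        stream=True,
        timeout=120,
    )
    resp.raise_for_status()

    # The server streams server-sent events: lines of the form "data: {...}".
    for line in resp.iter_lines():
        if not line or not line.startswith(b"data: "):
            continue
        payload = line[len(b"data: "):]
        if payload == b"[DONE]":
            break
        chunk = json.loads(payload)
        if chunk.get("choices"):
            n_tokens += 1
            if first_token_at is None:
                first_token_at = time.perf_counter()  # first token arrived

    end = time.perf_counter()
    ttft = (first_token_at or end) - start
    # TPOT: decode time spread over the remaining output tokens.
    tpot = (end - (first_token_at or end)) / max(n_tokens - 1, 1)
    return ttft, tpot, n_tokens


if __name__ == "__main__":
    ttft, tpot, n = measure_latency(PROMPT)
    print(f"TTFT: {ttft * 1000:.1f} ms  TPOT: {tpot * 1000:.1f} ms  tokens: {n}")
```

Prefill and decode throughput can be derived from the same measurements by dividing prompt tokens by TTFT and output tokens by the decode interval.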
- Requirements / Must Have:
- Deep experience building services in modern cloud and distributed environments (Kubernetes, Docker, CI/CD, APIs, data storage, monitoring, logging, and alerting).
- Experience hosting and running inference on Large Language Models (LLMs).
- Strong communication skills with the ability to write detailed technical documentation.
- Hands-on experience building or using benchmarking tools for evaluating LLM inference.
- Familiarity with LLM performance metrics (prefill throughput, decode throughput, TPOT, TTFT).
- Experience with inference engines such as vLLM, SGLang, or Modular Max (see the serving sketch after this list).
- Familiarity with distributed inference serving frameworks (llm-d, NVIDIA Dynamo, Ray Serve, etc.).
- Proficiency with client and NVIDIA GPU software such as CUDA, ROCm, AITER, NCCL, or Client.
- Knowledge of distributed inference optimization techniques and GPU tuning strategies.
- Expertise in cloud infrastructure, containerization, and microservices.
- Strong understanding of AI model inference and GPU acceleration.
- Proficiency in Python, C++, or related programming languages.
- Excellent problem-solving, analytical, and debugging skills.
- Ability to collaborate in a dynamic and fast-paced environment.
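
As a minimal sketch of the distributed inference techniques referenced above, the example below uses vLLM's offline Python API with tensor parallelism across two GPUs on one node. The model name and sampling settings are placeholders, and exact constructor arguments may vary between vLLM versions.

```python
from vllm import LLM, SamplingParams

# Placeholder model id; any model the cluster can host would work here.
MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"

# tensor_parallel_size=2 shards each layer's weights across two GPUs,
# one of the distributed inference optimizations named in this posting.
llm = LLM(model=MODEL, tensor_parallel_size=2)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Summarize KV cache reuse in LLM serving."], params)

for out in outputs:
    print(out.outputs[0].text)
```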
- Qualifications and Education:
- Bachelor’s or Master’s degree in Computer Science, Artificial Intelligence, Electrical Engineering, or a related field.
- Experience with AI infrastructure or LLM deployment platforms is highly preferred.