AI Solutions Specialist

DataDirect Networks

United States

Remote

USD 90,000 - 150,000

Full time

6 days ago

Job summary

Join a pioneering company in AI and data storage innovation, where you'll lead the development of cutting-edge GPU virtualization and HPC solutions. As an AI Solutions Specialist, your expertise will drive AI inferencing and optimize high-performance computing environments. Collaborate with cross-functional teams to innovate and enhance the scalability of AI solutions. This role offers a unique opportunity to make a significant impact in the evolving landscape of AI and data management, working with a passionate team dedicated to excellence and customer success. If you're ready to challenge yourself and contribute to groundbreaking projects, this is the place for you.

Qualifications

  • Extensive experience in optimizing AI inference and GPU workloads.
  • Proven track record of managing large-scale HPC clusters.
  • Expertise in deploying cloud-based solutions across hybrid environments.

Responsibilities

  • Lead technical development of AI and HPC solutions.
  • Optimize storage and cluster environments for AI workloads.
  • Collaborate with teams to drive strategic partnerships.

Skills

AI Inference Optimization
GPU Virtualization
CUDA Programming
TensorFlow
PyTorch
Cluster Management
Cloud Solutions (AWS, Azure, Google Cloud)
Performance Tuning

Education

BS or MS in Computer Science

Tools

NVIDIA vGPU
Docker
Kubernetes

Job description

AI Solutions Specialist

Job Locations: US-Remote
Job ID: 2025-5161
Country: United States
City: Remote
Worker Type: Regular Full-Time Employee

Overview

This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

"DDN's A3I solutions are transforming the landscape of AI infrastructure." - IDC

"The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments" - Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence.

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management.

Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage.

Job Description

As an AI Solutions Specialist at DDN, you will lead the technical development of cutting-edge AI, GPU virtualization, and high-performance computing (HPC) solutions. You will play a critical role in optimizing our storage and cluster environments to drive AI inferencing, GPU computing, and large-scale HPC systems to new heights. You will leverage your deep technical expertise in AI inference, GPU virtualization, and infrastructure optimization to enable seamless integration of our storage products with modern computing stacks.

Your work will impact our customers' ability to run AI-driven workloads and maximize performance across hybrid on-premises and cloud environments. You'll collaborate with cross-functional teams to innovate, drive strategic partnerships, and ensure the scalability and efficiency of AI and HPC solutions for some of the most demanding applications in the world.

Key Responsibilities:

  • Lead Innovation in GPU Virtualization & AI Workloads:
    Design, optimize, and implement advanced GPU virtualization solutions, including GPUDirect Storage integration, to enhance performance for AI inferencing and HPC workloads.
  • Optimize Large-Scale AI & HPC Infrastructures:
    Develop and deploy solutions that improve cluster utilization and optimize performance for AI and GPU-driven systems. Manage GPU clusters and related infrastructure to maximize availability, scalability, and efficiency.
  • AI Inference & Model Optimization:
    Drive the optimization of AI inference workloads using frameworks such as TensorFlow, PyTorch, and other industry-leading tools. Leverage expertise in CUDA to tune and accelerate AI models and workloads.
  • Hybrid Cloud Infrastructure Strategy:
    Architect, deploy, and optimize cloud-based and hybrid on-premises solutions for AI and HPC workloads. Ensure integration with cloud providers and bare-metal systems to deliver high-performance, scalable, and cost-effective solutions.
  • Drive Performance Improvement:
    Continually assess and optimize system configurations for AI inference and HPC workloads, driving significant performance improvements through specialized technologies such as RDMA, InfiniBand, and high-bandwidth interconnects.
  • Strategic Planning & Partnerships:
    Build and maintain relationships with key stakeholders, including cloud service providers, hardware manufacturers (e.g., NVIDIA), and customers, to stay ahead of industry trends and integrate best-in-class technologies into DDN's offerings.

Required Skills and Experience:

  • Extensive experience in optimizing AI inference and GPU-based workloads using frameworks such as TensorFlow, PyTorch, and CUDA. Strong understanding of GPU virtualization, including integration of technologies such as NVIDIA vGPU and GPUDirect.
  • Proven track record of managing large-scale HPC clusters, optimizing performance, and scaling workloads. Proficient in cluster management tools and optimizing infrastructure for AI-driven applications.
  • Expertise in deploying cloud-based solutions across hybrid environments (AWS, Azure, Google Cloud, etc.). Experience in managing and optimizing cloud-native infrastructure for real-time AI and HPC workloads.
  • Knowledge of RDMA, InfiniBand, high-bandwidth interconnects, and their impact on performance in distributed systems.
  • Extensive experience working with NVIDIA GPUs. Familiarity with NVIDIA's software stack and optimizations for AI/ML workloads.
  • Proficiency in programming (e.g., Python, C++, CUDA) and performance tuning for large-scale, complex systems. Experience optimizing large language model (LLM) training and inference workloads.

Preferred Qualifications:

  • 5+ years of experience
  • BS or MS degree in Computer Science, Engineering, or related technical field.
  • Experience with distributed systems, containerization (Docker, Kubernetes), and orchestration.
  • Familiarity with machine learning and AI frameworks, and the ability to work with data science teams to optimize models.

DDN

Join our dynamic and driven team, where engineering excellence is at the heart of everything we do. We seek individuals who love to challenge themselves and are fueled by curiosity. Here, you'll have the opportunity to work across various areas of the company, thanks to our flat organizational structure that encourages hands-on involvement and direct contributions to our mission. Leadership is earned by those who take initiative and consistently deliver outstanding results, both in their work ethic and deliverables, making strong prioritization skills essential. Additionally, we value strong communication skills in all our engineers and researchers, as they are crucial for the success of our teams and the company as a whole.

Interview Process: After submitting your application, one of our recruiters will review your resume. If your application passes this stage, you will be invited to a 30-minute interview during which a member of our team will ask some basic questions. If you clear the interview, you will enter the main process, which can consist of up to four interviews in total:

  • Systems design: Translate high-level requirements into a scalable, fault-tolerant service (depending on role).
  • Real-time problem-solving: Demonstrate practical skills in a live problem-solving session.
  • Meet and greet with the wider team.

Our goal is to finish the main process in 2-3 weeks at most.

DataDirect Networks, Inc. is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity, gender expression, transgender, sex stereotyping, sexual orientation, national origin, disability, protected Veteran Status, or any other characteristic protected by applicable federal, state, or local law.
