AI Solutions Specialist

DataDirect Networks

United States

Remote

USD 90,000 - 150,000

Full time

5 days ago

Job summary

Join a forward-thinking company at the forefront of AI and data storage innovation. As an AI Solutions Specialist, you'll lead the development of cutting-edge AI and HPC solutions, optimizing storage environments for maximum performance. This role offers the chance to work with advanced GPU virtualization and AI workloads, collaborating with cross-functional teams to drive strategic partnerships. The company values innovation and customer success, providing a dynamic environment for professionals eager to make a significant impact in the AI landscape. If you're passionate about technology and looking for a rewarding challenge, this opportunity is for you.

Benefits

Flexible Work Hours
Health Insurance
401(k) Plan
Professional Development
Remote Work Opportunities

Qualifications

  • Extensive experience in optimizing AI inference and GPU workloads.
  • Proven track record of managing large-scale HPC clusters.
  • Expertise in deploying cloud-based solutions across hybrid environments.

Responsibilities

  • Lead the technical development of AI and HPC solutions.
  • Optimize storage and cluster environments for AI workloads.
  • Collaborate with teams to ensure scalability and efficiency.

Skills

AI Inference Optimization
GPU Virtualization
CUDA Programming
TensorFlow
PyTorch
Cluster Management
Cloud Solutions (AWS, Azure, Google Cloud)
Performance Tuning

Education

Bachelor's Degree in Computer Science
Master's Degree in Engineering

Tools

Docker
Kubernetes
NVIDIA vGPU
RDMA
InfiniBand

Job description

Overview

This is an opportunity to join a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader that powers many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

"DDN's A3I solutions are transforming the landscape of AI infrastructure." – IDC

"The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments" – Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence.

Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project. This is a chance to make a significant impact at a company that is shaping the future of AI and data management.

Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage.

Job Description

As an AI Solutions Specialist at DDN, you will lead the technical development of cutting-edge AI, GPU virtualization, and high-performance computing (HPC) solutions. You will play a critical role in optimizing our storage and cluster environments to drive AI inferencing, GPU computing, and large-scale HPC systems to new heights. You will leverage your deep technical expertise in AI inference, GPU virtualization, and infrastructure optimization to enable seamless integration of our storage products with modern computing stacks.

Your work will impact our customers' ability to run AI-driven workloads and maximize performance across hybrid on-premise and cloud environments. You’ll collaborate with cross-functional teams to innovate, drive strategic partnerships, and ensure the scalability and efficiency of AI and HPC solutions for some of the most demanding applications in the world.

Key Responsibilities:

  • Lead Innovation in GPU Virtualization & AI Workloads: Design, optimize, and implement advanced GPU virtualization solutions, including GPUDirect Storage integration, to enhance performance for AI inferencing and HPC workloads.
  • Optimize Large-Scale AI & HPC Infrastructures: Develop and deploy solutions that improve cluster utilization and optimize performance for AI and GPU-driven systems. Manage GPU clusters and related infrastructure to maximize availability, scalability, and efficiency.
  • AI Inference & Model Optimization: Drive the optimization of AI inference workloads using frameworks such as TensorFlow, PyTorch, and other industry-leading tools. Leverage expertise in CUDA to tune and accelerate AI models and workloads.
  • Hybrid Cloud Infrastructure Strategy: Architect, deploy, and optimize cloud-based and hybrid on-premises solutions for AI and HPC workloads. Ensure integration with cloud providers and bare-metal systems to deliver high-performance, scalable, and cost-effective solutions.
  • Drive Performance Improvement: Continually assess and optimize system configurations for AI inference and HPC workloads, driving significant performance improvements through specialized technologies such as RDMA, InfiniBand, and high-bandwidth interconnects.
  • Strategic Planning & Partnerships: Build and maintain relationships with key stakeholders, including cloud service providers, hardware manufacturers (e.g., NVIDIA), and customers, to stay ahead of industry trends and integrate best-in-class technologies into DDN's offerings.

Required Skills and Experience:

  • Extensive experience in optimizing AI inference and GPU-based workloads using frameworks such as TensorFlow, PyTorch, and CUDA. Strong understanding of GPU virtualization, including integration of technologies such as NVIDIA vGPU and GPUDirect.
  • Proven track record of managing large-scale HPC clusters, optimizing performance, and scaling workloads. Proficient in cluster management tools and optimizing infrastructure for AI-driven applications.
  • Expertise in deploying cloud-based solutions across hybrid environments (AWS, Azure, Google Cloud, etc.). Experience in managing and optimizing cloud-native infrastructure for real-time AI and HPC workloads.
  • Knowledge of RDMA, InfiniBand, high-bandwidth interconnects, and their impact on performance in distributed systems.
  • Extensive experience working with NVIDIA GPUs. Familiarity with NVIDIA's software stack and optimizations for AI/ML workloads.
  • Proficiency in programming (e.g., Python, C++, CUDA) and performance tuning for large-scale, complex systems. Experience optimizing large language model (LLM) training and inference workloads.

Preferred Qualifications:

  • 5+ years of experience
  • BS or MS degree in Computer Science, Engineering, or related technical field.
  • Experience with distributed systems, containerization (Docker, Kubernetes), and orchestration.
  • Familiarity with machine learning and AI frameworks, and the ability to work with data science teams to optimize models.

DDN

Join our dynamic and driven team, where engineering excellence is at the heart of everything we do. We seek individuals who love to challenge themselves and are fueled by curiosity. Here, you'll have the opportunity to work across various areas of the company, thanks to our flat organizational structure that encourages hands-on involvement and direct contributions to our mission. Leadership is earned by those who take initiative and consistently deliver outstanding results, both in their work ethic and deliverables, making strong prioritization skills essential. Additionally, we value strong communication skills in all our engineers and researchers, as they are crucial for the success of our teams and the company as a whole.

Interview Process:

After submitting your application, one of our recruiters will review your resume. If your application passes this stage, you will be invited to a 30-minute interview during which a member of our team will ask some basic questions. If you clear the interview, you will enter the main process, which can consist of up to four interviews in total:

  • Systems design: Translate high-level requirements into a scalable, fault-tolerant service (depending on role).
  • Real-time problem-solving: Demonstrate practical skills in a live problem-solving session.
  • Meet and greet with the wider team.

Our goal is to finish the main process in 2-3 weeks at most.

DataDirect Networks, Inc. is an Equal Opportunity/Affirmative Action employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity, gender expression, transgender, sex stereotyping, sexual orientation, national origin, disability, protected Veteran Status, or any other characteristic protected by applicable federal, state, or local law.
