AI Software Engineer, Inference

Nexus

San Francisco (CA)

On-site

USD 135,000 - 200,000

Full time

18 days ago

Job summary

Nexus is seeking an AI Software Engineer focused on Inference to build systems that deliver real-time predictions efficiently. This role involves collaborating with researchers and engineers to optimize ML workloads and ensure high performance in production environments. The ideal candidate will have strong programming skills and experience with containerization and cloud services.

Benefits

  • Competitive salary and generous equity compensation
  • Health insurance for employees and their dependents
  • Daily lunch and dinner provided at SF headquarters
  • Company-paid travel to events and conferences

Qualifications

  • 3+ years of experience in software engineering, preferably with exposure to ML systems in production.
  • Strong skills in Python, Go, or Java, and a solid understanding of system performance fundamentals.
  • Comfort with performance tuning and profiling of ML model execution.

Responsibilities

  • Design and optimize lightning-fast inference pipelines for both real-time and batch predictions.
  • Deploy and scale machine learning models in production across cloud and containerized environments.
  • Monitor performance in the wild — build tools to track model behavior, latency, and reliability.

Skills

Python
Go
Java
Containerization
Machine Learning

Tools

Docker
Kubernetes
AWS
GCP
Azure

Job description

About Nexus

Nexus is building a world supercomputer by leveraging the latest advancements in AI, cryptography, engineering, and science. Our team of world-leading experts is developing and deploying the Nexus Layer 1 blockchain and Nexus zkVM (zero-knowledge virtual machine) in support of our mission to enable the Verifiable Internet.

Nexus raised $25M in Series A funding from Lightspeed, Pantera, Dragonfly, SV Angel, and more.

We are headquartered in San Francisco, and this role will be in-person with the rest of the Nexus team.

AI Software Engineer, Inference

We’re looking for an AI Software Engineer focused on Inference to help us bring powerful AI models to life quickly, efficiently, and at scale. This role is all about building the systems that deliver real-time predictions, keeping latency low and performance high. If you love optimizing ML workloads and making complex systems run smoothly in production, this one's for you.

At our startup, speed matters — not just in how our models perform, but in how quickly we learn, ship, and grow. You’ll be a core part of the engineering team, collaborating closely with researchers and product engineers to build scalable inference systems that support everything we do.

Responsibilities

  • Design and optimize lightning-fast inference pipelines for both real-time and batch predictions
  • Deploy and scale machine learning models in production across cloud and containerized environments
  • Leverage frameworks like TensorFlow Serving, TorchServe, or Triton to serve models at scale
  • Monitor performance in the wild — build tools to track model behavior, latency, and reliability
  • Work with researchers to productionize models, implement model compression, and make inference as efficient as possible
  • Solve problems fast — whether it’s a scaling bottleneck, a failed deployment, or a rogue latency spike
  • Build internal tools that streamline how we deploy and monitor inference workloads

Requirements

  • 3+ years of experience in software engineering, preferably with exposure to ML systems in production
  • Strong skills in Python, Go, or Java, and a solid understanding of system performance fundamentals
  • Experience with containerization (Docker, Kubernetes) and deploying services in the cloud (AWS, GCP, or Azure)
  • Solid understanding of model serving architectures and techniques for optimizing latency and throughput
  • Comfort with performance tuning and profiling of ML model execution
  • A practical mindset and eagerness to own production systems from build to run
  • Willingness to embrace AI as a core part of how you work, think, and build

Bonus Points

  • Experience with hardware acceleration for inference (GPUs, TPUs, etc.)
  • Familiarity with real-time data processing and streaming tools
  • Hands-on with edge deployment (mobile, embedded, etc.)
  • Contributions to open-source projects in model serving or ML infrastructure

Benefits

  • Competitive salary and generous equity compensation
  • Health insurance for employees and their dependents
  • Daily lunch and dinner provided at SF headquarters
  • Company-paid travel to events and conferences

Nexus is committed to diversity in our workforce and is proud to be an Equal Opportunity Employer (EEO).

  • Seniority level: Mid-Senior level
  • Employment type: Full-time
  • Job function: Engineering and Information Technology
  • Industries: Software Development
