Senior Machine Learning Engineer

Bonfy.AI

Mountain View (CA)

Hybrid

USD 90,000 - 150,000

Full time

4 days ago

Job summary

An innovative firm is seeking a passionate engineer to enhance the safety and accountability of AI systems. This role involves designing tools to evaluate LLM behavior and developing metrics that ensure trust in AI applications. You will collaborate across teams to create a cohesive content safety experience while working on cutting-edge technology. With a flexible hybrid schedule and a mission-driven environment, this opportunity allows you to make a meaningful impact in the evolving landscape of AI. Join a team that values clarity and respect, and help shape the future of responsible AI.

Benefits

Generous equity
Flexible hybrid schedule
Health coverage
Vision coverage
Dental coverage

Qualifications

  • Hands-on experience with modern NLP systems in real-world contexts.
  • Comfortable moving from prototype to production in Python.

Responsibilities

  • Design and build tools to evaluate and improve LLM behavior.
  • Define and evolve trust metrics beyond accuracy.

Skills

NLP systems
Python programming
debugging skills
evaluation frameworks
model interpretability

Job description

Bonfy.AI | Mountain View, CA | Hybrid

Security for the Age of AI

About Us

At Bonfy.AI, we’re building the trust layer for generative AI. Our Adaptive Content Security platform detects and mitigates subtle risks baked into large language model (LLM) outputs—before they make it to the user. From hallucinations to hidden data leaks, we help enterprises use GenAI without compromising truth, privacy, or reputation.

We’re model-agnostic, outcome-focused, and unapologetically rigorous. Our customers include Fortune 500 teams deploying LLMs in high-stakes domains—where trust isn't optional.

Why This Role Matters

We’re looking for an engineer who wants to go deeper than metrics—someone who can analyze model behavior, identify subtle failure modes, and build real-time systems that make AI safer to use. You won’t be tuning models for leaderboard glory; you’ll be making them safer, traceable, and accountable. This is a chance to shape the foundation of how the world trusts AI.

What You’ll Do
  • Design and build tools that evaluate and improve LLM behavior across diverse use cases.
  • Define and evolve trust metrics that go beyond accuracy, including traceability, robustness under edge cases, and interpretability of model decisions.
  • Work across teams—infra, product, security—to embed ML insights into a cohesive content safety experience.
What We’re Looking For
  • Hands-on experience working with modern NLP systems in real-world contexts (LLMs, embeddings, transformers, etc.).
  • Comfort moving from prototype to production in Python—outside the notebook.
  • Experience building or working with evaluation frameworks and pipelines.
  • Practical thinking, sharp debugging skills, and an appetite for ambiguity.
Bonus Points For
  • Experience using or building tools that evaluate the behavior of large language models (LLMs).
  • Background in environments where trust, safety, or compliance is critical—even if outside traditional “regulated” industries.
  • Hands-on experience testing AI systems for edge cases, failure modes, or unexpected behavior.
Why Join Us
  • You’ll have technical autonomy and direct exposure to customer use cases.
  • We’re early-stage, well-funded, and mission-driven—your code will shape our trajectory.
  • We believe in clarity, urgency, and respect. We value what you ship, not how loud you are.
  • You’ll work with a sharp, kind, high-trust team that knows what’s at stake.
Compensation & Benefits

Competitive salary. Generous equity. Flexible hybrid schedule. Health, vision, and dental coverage. And most importantly: a chance to build something meaningful during the most critical phase of AI’s evolution.

Apply If...
  • You believe safety isn’t just an add-on—it’s essential to how AI is built.
  • You understand that trust in AI must be demonstrated through evidence, not assumed by design.
  • You’re willing to question conventional approaches when they fall short.
  • You want to contribute meaningfully to the evolution of responsible AI, not just follow established paths.

Bonfy.AI — Truth. Security. Intelligence.
