Member of Technical Staff - Applied AI Software Engineer, Health

Microsoft

City of London

On-site

GBP 60,000 - 80,000

Full time

Today

Job summary

A leading technology company is looking for skilled software engineers to join its AI health team. Your role will involve designing evaluation systems for LLMs, collaborating closely with product teams, and running experiments to assess performance in healthcare contexts. Ideal candidates should have a strong background in Python programming and machine learning, alongside experience in cross-functional teams. Join us to help make a difference in user health management.

Qualifications

  • Significant experience in Python programming and machine learning research.
  • Experience building with and around LLMs.
  • Ability to collaborate in cross-functional teams.

Responsibilities

  • Contribute to translating research into benchmarks.
  • Design evaluation systems for LLM capabilities in healthcare.
  • Run experiments to test prompting techniques.

Skills

Python programming
Machine learning research
Evaluation systems design
Cross-functional collaboration
LLM performance analysis

Education

Bachelor's or Higher Degree in Computer Science

Tools

ML evaluation tools
Data engineering tools

Job description

Overview

At Microsoft AI, our health team is on a mission to help millions of users better understand and proactively manage their health and wellbeing. We’re responsible for ensuring that Microsoft AI’s models and services are useful, trusted and safe across diverse customer health journeys.

We’re assembling a world-class team of builders with backgrounds in healthcare, big tech, and frontier AI research labs. Our goal is to translate cutting-edge research—like MAI-DxO (microsoft.ai/new/the-path-to-medical-superintelligence)—into transformative products for millions of users across copilot.com and Microsoft’s consumer ecosystem.

We are looking for software engineers with experience designing and building LLM-based products, and taking them through to production. You will be a key bridge between research and product, and play a pivotal role in establishing Copilot as the leader in safe, informative, trustworthy and useful health information. We are specifically looking for engineers with experience designing, building and running evaluations for LLMs. You will be expected to build eval pipelines, curate and synthesise datasets, design automated analyses and explain results to internal stakeholders.

Microsoft’s mission is to empower every person and every organisation on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realise our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.

Starting January 26, 2026, MAI employees are expected to work from a designated Microsoft office at least four days a week if they live within 50 miles (U.S.) or 25 miles (non-U.S., country-specific) of that location. This expectation is subject to local law and may vary by jurisdiction.

Responsibilities

  • Be a key cross-functional team member and a close collaborator with product teams. Make significant contributions, from translating research into benchmarks to co-authoring the product roadmap.
  • Design and build evaluation systems that test LLM capabilities in the healthcare domain, and interpret and communicate results.
  • Run experiments to determine how different prompting techniques affect results on internal and industry benchmarks.
  • Improve the internal tooling used to implement, run and analyse evaluations.
  • Design internal benchmarking and regression testing capabilities that capture model accuracy, safety and utility.

Qualifications

Required Qualifications:

  • Bachelor's or higher degree in Computer Science or a related technical discipline, AND significant experience in Python programming / machine learning research.
  • Deep experience building with and around LLMs, and experience building tools for analysing and understanding their performance, including, but not limited to, prompt / context engineering.
  • Experience collaborating in cross-functional teams, working through ambiguity to deliver high-quality results.
  • 0-to-1 experience, with a bias towards shipping and learning while maintaining a high quality bar.
  • Proven ability to collaborate and contribute to a positive, inclusive work environment, fostering knowledge sharing and growth within the team.

Preferred Qualifications:

  • Experience in healthcare technology, or experience in the health domain.
  • Experience with data engineering - handling text dataset sourcing, curation, and processing tasks at scale.
  • Passion for conversational AI and its deployment.
  • Demonstrated written and verbal communication skills with the ability to work closely with cross-functional teams, including product managers, designers, and other engineers.
  • Passion for learning new technologies and staying up to date with industry trends, best practices, and emerging technologies and patterns in AI.