This is a fully remote, full-time role and is not open to contract work.
About b.well
b.well is transforming the healthcare experience by empowering individuals to access and act on their complete health record. We unify data across health systems, payers, labs, pharmacies, and wearable devices using modern APIs and interoperability standards (FHIR, HL7, CCDA), helping people make better health decisions for themselves and their families.
We work with health systems, employers, payers, retail pharmacies, and platform partners to deliver a deeply personalized and actionable health experience. Join us as we modernize one of the most complex, fragmented, and mission-critical industries in the world.
Our platform is already driving impact at scale, with integrations that include:
- Walgreens – reaching over 100 million users
- Samsung Health – serving more than 60 million users
Who We're Looking For
We’re hiring a Principal Data Engineer to help build the future of personalized healthcare at scale. This role sits at the intersection of distributed data systems, real-time ML/AI applications, and next-generation platform infrastructure. You will lead the design and implementation of our high-scale data and AI platforms, help embed LLM-driven intelligence across workflows, and enable product teams to ship faster with confidence.
You’ll work across our technical stack—from data ingestion and transformation to model orchestration, DevOps, and API development. If you're excited about combining the best of data engineering, ML infrastructure, and LLM agents to rewire healthcare from the inside out, we want to meet you.
What You’ll Do
- Architect, build, and maintain scalable, secure, and observable data pipelines across batch, streaming, and real-time use cases.
- Drive the integration of Large Language Models (LLMs) and AI workflows into product experiences, including agent-based architectures and retrieval-augmented generation (RAG) systems.
- Build data services and microservices using FastAPI, MongoDB, Kafka, and DuckDB, supporting both product-facing APIs and backend systems.
- Design and optimize pipelines using Apache Spark, Databricks, Prefect, and Pandas for complex healthcare data workflows.
- Implement observability, logging, alerting, and data quality checks across distributed systems.
- Champion DevOps practices: infrastructure-as-code (Terraform), CI/CD (GitHub Actions, ArgoCD), containerization and orchestration (Docker, Kubernetes), and environment automation.
- Collaborate with ML engineers and researchers to productionize models, enable low-latency inference, and support continuous retraining workflows.
- Work with platform and security teams to ensure HIPAA-compliant infrastructure, secure APIs, and protected data pipelines.
- Mentor engineers, guide architectural direction, and shape best practices in platform scalability, resiliency, and experimentation.
What You Bring
- 10+ years of experience in software and data engineering, including 5+ years working with distributed data systems.
- Strong experience building in Python, with deep familiarity with Pandas, PySpark, and Databricks.
- Expertise in cloud-native and containerized environments using Kubernetes, Docker, and AWS.
- Proven ability to architect and build real-time systems using Kafka, DuckDB, and event-driven design.
- Strong grasp of modern DevOps tooling (Terraform, GitHub Actions, Prometheus, Datadog).
- Hands-on experience with FastAPI for building scalable, high-performance APIs.
- Experience working with LLM technologies and deploying ML/AI agents into real-world production use cases.
- Deep understanding of healthcare interoperability standards (FHIR, HL7, CCDA) and regulated data practices.
Nice to Have
- Experience deploying and fine-tuning LLMs or building semantic search and vector DB-based RAG systems.
- Experience working with Prefect or Airflow for orchestrating large-scale workflows.
- Familiarity with HIPAA, HITECH, HITRUST, and secure data architecture principles.
- Previous experience in healthcare or healthtech startup environments.
- Contributions to open-source or a visible GitHub/Stack Overflow presence.
This is a full-time, remote role open to candidates across the U.S. We also offer hybrid or in-office options in Baltimore.
- Meaningful equity in a rapidly growing company
- Full medical, dental, and vision benefits
- Flexible PTO and remote-first culture
- 401(k)
- Professional development and internal mentorship
The target salary range for this position is $175,000 - $210,000 and is part of a competitive total rewards package including stock options, benefits, and incentive pay for eligible roles. Individual pay may vary from the target range and is determined by a number of factors, including experience, location, internal pay equity, and other relevant business considerations. We review all employee pay and compensation programs at least annually to ensure competitive and fair pay.
Data shows that women, people of color, and other underrepresented groups may be less likely to apply for jobs unless they believe they are a perfect match. But b.well holds diversity amongst its key values, and we have a strong commitment to building our workforce and products through that lens.
You don't have to check every box in this job description to be a great fit for the role! If you're excited about this position and the prospect of working for b.well, please apply. If it turns out this role isn't for you, there may be other openings that could align with your experience and expertise!
We are committed to an inclusive and diverse b.well. We are an equal opportunity employer. We do not discriminate based on race, ethnicity, color, ancestry, national origin, religion, sex, sexual orientation, gender identity, age, disability, veteran status, genetic information, marital status, or any other legally protected status.