Software Data Engineer

Iambic Therapeutics

San Diego (CA)

Remote

USD 90,000 - 150,000

Full time

25 days ago

Job summary

Join a pioneering company at the forefront of AI-driven drug discovery! In this remote role, you'll be instrumental in building and optimizing data pipelines that transform complex datasets into high-quality training inputs for innovative AI models. Collaborate with a talented team of ML scientists and software developers to enhance Python-based workflows and ensure the reliability of our data storage infrastructure. This is a unique opportunity to contribute to groundbreaking research and development in therapeutics, all while enjoying a flexible work environment and a commitment to diversity and inclusion. Elevate your career with a forward-thinking organization that values your expertise and fosters a culture of collaboration and innovation.

Benefits

Company Paid Healthcare
401K Matching
Uncapped Vacation
Onsite Gym
Dining Facilities

Qualifications

  • 8+ years of experience in data engineering or related fields.
  • Strong Python skills and experience with ETL systems design.
  • Familiarity with data lake technologies and orchestration tools.

Responsibilities

  • Design and improve data pipelines for AI model training.
  • Collaborate with ML engineers to enhance data processing workflows.
  • Conduct code reviews and maintain software best practices.

Skills

Python
ETL Systems Design
Data Pipeline Orchestration
Data Processing at Scale
Communication Skills
Problem-Solving

Education

Bachelor's Degree
Master's Degree
PhD

Tools

Prefect
Airflow
Argo
Databricks
Spark
AWS
Kubernetes

Job description

JOB SUMMARY

In this role, you’ll play a pivotal part in building and optimizing data pipelines that transform large, multi-modal datasets into high-quality training inputs for cutting-edge AI models for drug discovery. You’ll help evolve our data pipeline and storage infrastructure to support faster, more reliable turnarounds for research and development of new models.

You’ll join a multidisciplinary team, collaborating closely with ML scientists, software developers and DevOps engineers to improve the performance and reliability of Python-based workflows. As a key contributor, you’ll participate in the design, testing, and maintenance of core software systems, conduct thoughtful code reviews, and champion engineering best practices—including version control, testing, and documentation.

This role is remote, with a preference for candidates on the East Coast or in the UK.

KEY RESPONSIBILITIES

  1. Design and improve data pipelines that process large, multi-modal datasets from a variety of internal and external sources into training datasets for AI models.
  2. Evolve our data storage layer to support analytics, schema evolution, reproducibility, and efficient data access.
  3. Collaborate with ML engineers to improve the performance and reliability of Python-based data processing workflows.
  4. Collaborate on the creation, testing and maintenance of software systems.
  5. Review pull requests in adjoining areas.
  6. Maintain, and mentor others in, software best practices, including version control, testing, and documentation.
  7. Present work clearly in meetings and company demos, at a level suited to the audience.

QUALIFICATIONS

  1. Minimum of 8 years of related experience with a Bachelor’s degree; or 6 years and a Master’s degree; or a PhD with 3 years of experience; or equivalent experience.
  2. Proven ability to design flexible, maintainable ETL systems.
  3. Experience with data pipeline orchestration tools such as Prefect, Airflow, Argo, Databricks, or Spark.
  4. Understanding of the ML model lifecycle; prior work with scientific or ML workflows is a plus.
  5. Hands-on experience with multi-terabyte scale data processing.
  6. Familiarity with AWS; Kubernetes experience is a bonus.
  7. Knowledge of data lake technologies such as Parquet, Iceberg, and AWS Glue.
  8. Strong Python software engineering skills.
  9. Pragmatic mindset: able to evaluate tradeoffs and find solutions that empower ML researchers to move quickly.
  10. Background in bioinformatics or chemistry is a plus.

ABOUT IAMBIC THERAPEUTICS

Founded in 2019 and headquartered in San Diego, California, Iambic Therapeutics is disrupting the therapeutics landscape with its unique AI-driven drug-discovery platform. Iambic has assembled a world-class team that unites pioneering AI experts and experienced drug hunters with strong track records of success in delivering clinically validated therapeutics. The Iambic platform has been demonstrated to deliver high-quality, differentiated therapeutics to clinical stage with unprecedented speed and across multiple target classes and mechanisms of action. The Iambic team is advancing an internal pipeline of clinical assets to address urgent unmet patient needs. Learn more about the Iambic team, platform, and pipeline at iambic.ai.

MISSION & CORE VALUES

The culture and work at Iambic Therapeutics are profoundly strengthened by the diversity of our people and our differences in background, culture, national origin, religion, sexual orientation, and life experiences. We are committed to building an inclusive environment where a diverse group of talented humans work together to discover therapeutics and create technologies.

PAY AND BENEFITS

We offer industry-leading, competitive pay, company-paid healthcare, flexible spending accounts, voluntary life insurance, 401K matching, and uncapped vacation to our team. We are in a brand-new, state-of-the-art facility in beautiful San Diego with an onsite gym, dining, and easy access to great places to live and play.

Similar jobs

MTS 2, Software Engineer, Data Contracts

eBay

Remote

USD 80,000 - 130,000

Yesterday

Software Engineer, Data Hub

StackAdapt Inc.

Remote

USD 80,000 - 120,000

7 days ago

Software Engineer, Data

Argon

Remote

USD 80,000 - 110,000

5 days ago

Software Data Engineer (UK)

Iambic Therapeutics, Inc.

California

Remote

USD 90,000 - 150,000

5 days ago

Data Engineer

Millennium Health

San Diego

Remote

USD 121,000 - 149,000

4 days ago

Software Data Engineer (US)

Iambic Therapeutics, Inc.

San Diego

Hybrid

USD 120,000 - 160,000

5 days ago

Python and Kubernetes Software Engineer - Data, Workflows, AI/ML & Analytics

Canonical

Columbus

Remote

USD 115,000 - 185,000

9 days ago

Python and Kubernetes Software Engineer - Data, AI/ML & Analytics

Canonical

San Francisco

Remote

USD 100,000 - 720,000

9 days ago

Python and Kubernetes Software Engineer - Data, AI/ML & Analytics

Canonical

San Jose

Remote

USD 90,000 - 140,000

8 days ago