Software Engineer, Data Infrastructure

EvolutionaryScale

San Francisco, New York (CA, NY)

Hybrid

USD 120,000 - 160,000

Full time

3 days ago

Job summary

A leading company in AI-driven biological research seeks a Data Infrastructure Engineer in San Francisco or New York. The role focuses on developing and maintaining scalable data processing pipelines for biology datasets. Candidates should have deep knowledge of data engineering principles and experience with technologies such as Spark, Ray, and Hadoop, and will contribute to new solutions for biological design.

Qualifications

  • 5+ years of experience preferred in data infrastructure or engineering roles.
  • Proven experience with large-scale distributed data systems.
  • Experience with cloud providers such as AWS, GCP, or Azure is a plus.

Responsibilities

  • Design and maintain data processing pipelines for biology datasets.
  • Manage data infrastructure for robust operations.
  • Optimize data ingestion, storage, and retrieval processes.

Skills

Large-scale data processing
Problem-solving
Understanding of data processing principles

Tools

Spark
Ray
Hadoop
Kafka Streams
Spark Streaming
Flink
AWS
GCP
Azure

Job description

Who we are

EvolutionaryScale’s mission is to develop artificial intelligence to understand biology for the benefit of human health and society, through open, safe, and responsible research, and in partnership with the scientific community. Over the next ten years AI will transform biological design, making molecules and entire cells programmable. We will develop the foundation models for biology that enable this.

The EvolutionaryScale team is based in San Francisco and New York. We believe in flexibility around work schedules and locations, but expect team members to work from one of our offices at least half of the days in most weeks.

What you’ll do

As a Data Infrastructure Engineer, you will work closely with bioinformatics and research teams to ensure our data jobs are reliable, efficient, and scalable. You'll implement best practices for handling large-scale data processing, select and integrate the right technologies, and drive continuous improvements in the performance and quality of our datasets.

The role

  • Design, develop, and maintain large-scale batch processing pipelines for acquiring biology datasets, using tools like Spark and Ray.
  • Manage data infrastructure components to ensure robust and fault-tolerant operations.
  • Optimize data ingestion, storage, and retrieval processes for acquiring large and growing biology datasets, and for efficient pre- and post-training data ingestion.
  • Create systems for easy and reproducible data evaluation and experiments.
  • Integrate modern ML-based data curation technologies with data processing pipelines.
  • Work with researchers and other engineering teams to understand data needs and create solutions that meet modeling requirements.

Preferred qualifications

Apply even if you don’t meet all of these!

  • Staff-level engineers with 5+ years of experience are highly preferred.
  • Proven experience with large-scale data processing systems using technologies such as Hadoop, Spark, or Ray.
  • Knowledge of streaming data frameworks like Kafka Streams, Spark Streaming, or Flink.
  • Understanding of data processing principles and best practices.
  • Strong problem-solving skills, including the ability to research, debug, and resolve complex technical problems.
  • Experience with major cloud providers (AWS, GCP, or Azure), including familiarity with data warehousing tools, is a plus.
  • Knowledge of biology and biology datasets is a big plus but not required.
  • Experience with large-scale distributed systems or machine learning is also a plus but not required.