Software Engineer, Data Infrastructure & Acquisition

Speechify

Portland (TX)

Remote

USD 140,000 - 200,000

Full time

Job summary

A tech company advancing AI audio solutions is looking for a Software Engineer, Data Infrastructure & Acquisition. This key role focuses on data collection for model training and involves working with scientists and leadership to improve data-driven products. Ideal candidates have a BS, MS, or PhD in Computer Science or a related field, 5+ years of industry software development experience, and strong scripting and cloud infrastructure skills. The salary range is $140,000-$200,000 plus bonus and equity.

Benefits

Competitive salaries with equity
Autonomy and focus-friendly culture
Opportunity to impact AI technologies

Qualifications

  • 5+ years of industry software development experience.
  • Proficiency with bash/Python scripting in Linux environments.
  • Experience with large-scale data processing workflows is a plus.

Responsibilities

  • Find new sources of audio data for the ingestion pipeline.
  • Operate and extend the cloud infrastructure for the ingestion pipeline.
  • Collaborate with Scientists to optimize data quality for models.

Skills

BS/MS/PhD in Computer Science or related field
5+ years of industry software development experience
Proficiency with bash/Python scripting in Linux environments
Proficiency with Docker and Infrastructure-as-Code
Experience with web crawlers
Strong written and verbal communication skills

Education

BS/MS/PhD in Computer Science or related field

Tools

Docker
GCP

Job description

Overview

The mission of Speechify is to make sure that reading is never a barrier to learning. Speechify’s products help millions turn text into audio and are available on multiple platforms. We are a distributed team with a focus on building great user experiences and impactful AI-enabled products.

Software Engineer, Data Infrastructure & Acquisition: This role sits on the Data side of Speechify’s AI team and is responsible for all aspects of data collection that support model training. We build high-quality datasets at petabyte scale and at low cost through integrated infrastructure, engineering, and research.

This is a key role for someone who thinks strategically, thrives in fast-paced environments, and enjoys shaping data-driven product decisions.

What You’ll Do

  • Be scrappy in finding new sources of audio data and bringing them into our ingestion pipeline
  • Operate and extend the cloud infrastructure for our ingestion pipeline, currently running on GCP and managed with Terraform
  • Collaborate with Scientists to optimize cost, throughput, and data quality for next-generation models
  • Work with the AI Team and Speechify Leadership to craft the dataset roadmap supporting consumer and enterprise products

An Ideal Candidate Should Have

  • BS/MS/PhD in Computer Science or related field
  • 5+ years of industry software development experience
  • Proficiency with bash/Python scripting in Linux environments
  • Proficiency with Docker and Infrastructure-as-Code; experience with at least one major Cloud Provider (GCP preferred)
  • Experience with web crawlers and large-scale data processing workflows is a plus
  • Ability to handle multiple tasks and adapt to changing priorities
  • Strong written and verbal communication skills

What We Offer

  • A fast-growing environment where you can shape the company and product
  • An entrepreneurial-minded team that supports risk-taking and hustle
  • Autonomy and a focus-friendly culture
  • Opportunity to make a meaningful impact in AI and audio technologies
  • Competitive salaries with equity and an asynchronous work culture
  • Work on a life-changing product used by millions, including people with learning differences
  • US-based salary range for this full-time role: $140,000-$200,000 + bonus + equity, depending on experience

How to Apply

Think you’re a good fit? Tell us about yourself and why you’re interested in the role when you apply. Include links to your portfolio and LinkedIn.

Equal Opportunity

Speechify is committed to a diverse and inclusive workplace. Speechify does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status.
