
Software Engineer, Data Infrastructure & Acquisition - Frankfurt, Germany

Speechify

Frankfurt

On-site

EUR 70,000 - 90,000

Full-time

Posted today

Summary

A tech company focusing on AI in Frankfurt is looking for a Software Engineer for Data Infrastructure & Acquisition. The role involves managing data collection for training operations and building high-quality datasets. Ideal candidates will have 5+ years in software development, proficiency with Bash and Python, and experience on GCP. This position offers a competitive salary and a culture that values technical excellence and collaboration.

Benefits

Competitive salaries
Autonomy in work culture
Entrepreneurial-minded team

Qualifications

  • 5+ years of industry software development experience.
  • Proficiency in Linux environments.
  • Experience with web crawlers is a plus.

Responsibilities

  • Find new sources of audio data and ingest it into the data pipeline.
  • Operate and extend the cloud infrastructure for ingestion.
  • Collaborate with scientists to improve dataset quality and cost.

Skills

Bash scripting
Python scripting
Docker
Infrastructure-as-Code
Data processing workflows
Adaptability
Communication skills

Education

BS/MS/PhD in Computer Science or related field

Tools

GCP

Job description

The mission of Speechify is to make sure that reading is never a barrier to learning.

Speechify’s products convert text to speech across platforms (iOS, Android, Mac, Chrome Extension, Web App). Speechify has a 100% distributed workforce and a culture focused on leadership, technical excellence, and delivering results.

Overview

The Data Infrastructure & Acquisition role sits on the AI/Data side of Speechify. This role is responsible for all aspects of data collection to support model training operations. You will help build high-quality datasets at petabyte-scale and low cost through close collaboration between infrastructure, engineering, and research.

What You’ll Do

  • Be scrappy in finding new sources of audio data and ingesting them into the data pipeline.
  • Operate and extend the cloud infrastructure for ingestion, currently on GCP and managed with Terraform.
  • Collaborate with Scientists to improve cost, throughput, and quality to power next-generation models.
  • Partner with the AI Team and Speechify leadership to craft the dataset roadmap for consumer and enterprise products.

An Ideal Candidate Should Have

  • BS/MS/PhD in Computer Science or a related field.
  • 5+ years of industry software development experience.
  • Proficiency with Bash and Python scripting in Linux environments.
  • Proficiency in Docker and Infrastructure-as-Code concepts; professional experience with a major cloud provider (GCP preferred).
  • Experience with web crawlers and large-scale data processing workflows is a plus.
  • Ability to manage multiple tasks and adapt to changing priorities.
  • Strong written and verbal communication skills.

What We Offer

  • A fast-growing environment where you can help shape the company and product.
  • An entrepreneurial-minded team that supports risk-taking and hustle.
  • Autonomy and a focus-friendly work culture.
  • Opportunity to impact a transformative industry and work on a life-changing product.
  • Competitive salaries and a culture that values asynchronous collaboration.
  • Impactful work at the intersection of artificial intelligence and audio.

Think you’re a good fit for this job?

Tell us about yourself and why you’re interested in the role when you apply, and include links to your portfolio and LinkedIn.
