Language Engineer, Artificial General Intelligence - Data Services

Amazon.com, Inc

Hartford

On-site

GBP 50,000 - 70,000

Full time

Today

Job summary

A leading technology company is searching for a Language Engineer to support complex multimodal dataset development. Candidates should hold a Master's or PhD in a relevant field, with at least 2 years of experience in computational linguistics or language data processing. Expertise in Python and familiarity with multimodal data is essential. This role involves collaborating with engineers and data creators to evaluate AI models and design data creation tasks.

Qualifications

  • Master's or higher degree in a relevant field.
  • 2+ years of experience in computational linguistics or AI data creation.
  • Experience with language data annotation systems.
  • Proficient with scripting languages, especially Python.
  • Excellent communication and organizational skills.

Responsibilities

  • Design and conduct complex data creation tasks.
  • Analyze and extract insights from large datasets.
  • Build tools for data analysis using Python.
  • Collaborate with team members to evaluate AI model performance.
  • Coordinate data collection efforts with human participants.

Skills

Computational Linguistics expertise
Python proficiency
Experience with multimodal data
Strong communication skills
Data annotation experience
Analytical skills
Machine Learning knowledge

Education

Master's or higher degree in relevant field
PhD in Computational Linguistics

Tools

SQL
R
Matlab

Job description

The Amazon Artificial General Intelligence (AGI) Data Services organization is responsible for developing diverse datasets to train and evaluate Amazon's AI models. We are looking for Language Engineers to join our science and engineering team to support the development of complex, multimodal datasets, using a range of approaches including synthetic data generation, model-supported data generation, and human-in-the-loop data collection.

Responsibilities
  • Design and conduct complex data creation tasks using synthetic and model-based data generation methods, following state-of-the-art approaches.
  • Analyze and extract insights from large amounts of data.
  • Build tools or tool prototypes for data analysis or data creation, using Python or another scripting language.
  • Use modeling tools to bootstrap or test new AI functionalities.
  • Collaborate with scientists, software engineers, and other data creators to evaluate performance of AI models.
  • Design complex data collections with human participants in response to science needs: author instructions, define and implement quality targets and mechanisms, provide day‑to‑day coordination of data collection efforts (including planning, scheduling, and reporting), and be responsible for the final deliverables.

About the team

Amazon strives to be the world's most customer‑centric company, where customers can research and purchase anything they might want online or offline. We set big goals and are looking for people who can help us reach and exceed them. The AGI organization provides AI capabilities for a variety of Amazon products and services. We provide secure, flexible, cost‑effective, and high‑quality data development services to our customers, enabling them to build advanced ML models.

Qualifications
  • Master's or higher degree in a relevant field (Computational Linguistics or equivalent field with computational analysis).
  • 2+ years of experience in computational linguistics, language data processing, or AI data creation.
  • Experience with language data annotation systems and other forms of data markup.
  • Proficient with scripting languages, such as Python.
  • Experience working with speech, text, and multimodal data in multiple languages.
  • Excellent communication and organizational skills, with strong attention to detail.
  • Comfortable working in a fast-paced, highly collaborative, dynamic work environment.
  • PhD in Computational Linguistics (or equivalent field with computational emphasis).
  • Expertise in bootstrapping AI data collections for quickly evolving requirements.
  • Extensive experience working with speech, text, and multimodal data in multiple languages.
  • Experience in data creation for complex agentic workflows.
  • Practical experience with Machine Learning.
  • Familiarity with technical concepts such as APIs.
  • Practical knowledge of version control and agile development.
  • Familiarity with database queries and data analysis processes (SQL, R, Matlab, etc.).
  • Willingness to support several projects at one time, and to accept reprioritization as necessary.
  • Able to think creatively, with strong analytical and problem-solving skills.

Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our success. We make recruiting decisions based on your experience and skills. We value your passion to discover, invent, simplify and build.

Protecting your privacy and the security of your data is a longstanding top priority for Amazon. Please consult our Privacy Notice (https://www.amazon.jobs/en/privacy_page) to learn more about how we collect, use, and transfer the personal data of our candidates.
