Senior Research Engineer - Multimodal & Video Foundation Model

Tether.io

Pakistan

On-site

PKR 2,000,000 - 2,750,000

Full time

Today

Job summary

A leading fintech AI company seeks a Senior Research Engineer to drive innovation in multimodal systems and video generation models. The role involves designing AI architectures, optimizing large-scale data pipelines, and collaborating with cross-functional teams to develop generative AI applications. Ideal candidates will have a Bachelor's in Computer Science and expertise in Python and PyTorch, along with experience in multimodal data processing.

Qualifications

  • Deep expertise in video generation model architectures.
  • Hands-on experience in developing generative AI models.
  • Experience working with large-scale datasets.

Responsibilities

  • Pioneer multimodal and video-centric research and development.
  • Design and implement novel AI architectures for multimodal models.
  • Collaborate cross-functionally with research and engineering teams.

Skills

Python
PyTorch
Large-scale text data handling
Multimodal data processing

Education

Bachelor’s degree in Computer Science or related field

Job description

Overview

Senior Research Engineer - Multimodal & Video Foundation Model at Tether.io. Join a global team driving innovation in AI for fintech and multimodal systems.

About the Job

As a member of the AI model team, you will drive innovation in architecture development for cutting-edge models at various scales, from small to large, as well as multimodal systems. Your work will enhance intelligence, improve efficiency, and introduce new capabilities to advance the field.

You will have deep expertise in video generation model architectures and a hands-on, research-driven approach. Your mission is to explore and implement novel techniques and algorithms that lead to groundbreaking advancements, including curating data, strengthening baselines, and identifying and resolving pre-training bottlenecks to push the limits of model performance.

Responsibilities
  • Pioneer multimodal and video-centric research that moves fast and breaks ground, contributing directly to usable prototypes and scalable systems.
  • Design and implement novel AI architectures for multimodal language models, integrating text, visual, and audio modalities.
  • Engineer scalable training and inference pipelines optimized for large-scale multimodal datasets and distributed systems spanning thousands of GPUs.
  • Optimize systems and algorithms for efficient data processing, model execution, and pipeline throughput.
  • Build modular tools for preprocessing, analyzing, and managing multimodal data assets (e.g., images, video, text).
  • Collaborate cross-functionally with research and engineering teams to translate cutting-edge model innovations into production-grade solutions.
  • Prototype generative AI applications showcasing new capabilities of multimodal foundation models in real-world products.
  • Develop benchmarking tools to rigorously evaluate model performance across diverse multimodal tasks.
Qualifications
  • Bachelor’s degree in Computer Science, Computer Engineering, or a related technical field, or equivalent practical experience.
  • Expertise in Python & PyTorch, including experience with the full development pipeline from data processing and loading to training, inference, and optimization.
  • Experience working with large-scale text data, or (bonus) interleaved data spanning audio, video, image, and/or text.
  • Hands-on experience developing or benchmarking at least one of the following: LLMs, Vision-Language Models, Audio-Language Models, generative video models.

Nice-to-have skills:

  • PhD in Computer Vision, Machine Learning, NLP, Computer Science, Applied Statistics, or a closely related field.
  • Demonstrated expertise in computer vision, video generation foundation models and/or multimodal research.
  • First-author publications at leading AI conferences (e.g., CVPR, ICCV, ECCV, ICML, ICLR, NeurIPS).
Important information for candidates
Recruitment scams have become increasingly common. To protect yourself, please keep the following in mind when applying for roles:
  • Apply only through our official channels. All open roles are listed on our official careers page: https://tether.recruitee.com/
  • Verify the recruiter’s identity. All our recruiters have verified LinkedIn profiles. If you’re unsure, confirm their identity via their profile or through our website.
  • Avoid unusual communication methods. We do not conduct interviews over WhatsApp, Telegram, or SMS. All communication is through official company emails and platforms.
  • Double-check email addresses. Communications from us will come from emails ending in @tether.to or @tether.io.
  • We will never request payment or financial details. If someone asks for personal financial information, report it immediately.
Employment details
  • Seniority level: Not Applicable
  • Employment type: Full-time
  • Job function: Information Technology
  • Industries: Technology, Information and Internet