AI Research Engineer (Model Serving & Inference)

Tether Operations Limited

London

Remote

GBP 60,000 - 85,000

Full time

3 days ago

Job summary

A leading company in digital finance seeks an innovative professional for its AI model team in London. The role focuses on optimizing inference architectures for advanced AI systems across various environments. Candidates should have a strong background in Computer Science, ideally with a PhD, and proven experience in kernel optimization for mobile devices. The role offers the chance to collaborate with top talent and help set industry standards in fintech.

Qualifications

  • Strong publication record in AI R&D.
  • Experience in mobile device optimization with measurable improvements.
  • Deep understanding of low-latency techniques and resource management.

Responsibilities

  • Design and deploy efficient model serving architectures.
  • Set and monitor performance targets like latency and memory usage.
  • Conduct inference testing in various environments.

Skills

Kernel optimization
Inference optimization
Performance evaluation
Memory management
Collaboration

Education

Degree in Computer Science or related field
PhD preferred

Job description

Join Tether and Shape the Future of Digital Finance

At Tether, we’re not just building products; we’re pioneering a global financial revolution. Our solutions enable businesses to seamlessly integrate reserve-backed tokens across blockchains, leveraging blockchain technology for secure, instant, and cost-effective digital transactions. Transparency and trust are fundamental to our mission.

Innovate with Tether

Tether Finance: Home to the trusted stablecoin USDT and pioneering digital asset tokenization services.

Tether Power: Promoting sustainable growth through eco-friendly energy solutions for Bitcoin mining in diverse, geo-distributed facilities.

Tether Data: Advancing AI and peer-to-peer communication with innovative solutions like KEET, our secure data sharing app.

Tether Education: Providing access to top digital learning resources to empower individuals in the digital economy.

Tether Evolution: Blending technology and human potential to push innovation boundaries for a better future.

Why Join Us?

Our global team works remotely and is passionate about transforming fintech. Join us to collaborate with top talent, innovate, and set industry standards. We value excellent English communication skills and a drive to contribute to our cutting-edge platform.

About the job:

As part of our AI model team, you will focus on developing and optimizing model serving and inference architectures for advanced AI systems, ensuring high responsiveness, efficiency, and scalability across various environments, including resource-limited devices and complex multi-modal systems.

Your responsibilities include designing robust inference pipelines, establishing performance metrics, and troubleshooting bottlenecks to achieve low-latency, low-memory AI performance in real-world applications.
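
To illustrate the kind of performance measurement this work involves, below is a minimal sketch, not taken from the posting, of benchmarking a single-request inference path for latency percentiles, throughput, and peak memory using only the Python standard library. The run_inference stub is a hypothetical placeholder for a real model call.

    import time
    import statistics
    import tracemalloc

    def run_inference(request: list[float]) -> list[float]:
        # Hypothetical stub standing in for a real model forward pass.
        return [x * 0.5 for x in request]

    def benchmark(n_requests: int = 1000, warmup: int = 50) -> dict:
        request = [0.1] * 512  # fixed-size dummy input

        # Warm-up so one-off startup costs do not skew the latency numbers.
        for _ in range(warmup):
            run_inference(request)

        latencies_ms = []
        tracemalloc.start()  # traces Python-level allocations only
        for _ in range(n_requests):
            start = time.perf_counter()
            run_inference(request)
            latencies_ms.append((time.perf_counter() - start) * 1000)
        _, peak_bytes = tracemalloc.get_traced_memory()
        tracemalloc.stop()

        latencies_ms.sort()
        return {
            "p50_ms": statistics.median(latencies_ms),
            "p95_ms": latencies_ms[int(0.95 * len(latencies_ms)) - 1],
            "throughput_rps": n_requests / (sum(latencies_ms) / 1000.0),
            "peak_mem_kb": peak_bytes / 1024,
        }

    if __name__ == "__main__":
        print(benchmark())

In practice, measurements like these would be collected per target device and tracked against the latency and memory budgets the role is expected to define.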

Responsibilities:

  • Design and deploy efficient model serving architectures optimized for diverse environments, including resource-constrained devices.
  • Set and monitor performance targets such as latency, throughput, and memory usage.
  • Conduct inference testing in simulated and live environments, tracking key performance indicators and documenting results.
  • Prepare high-quality datasets and scenarios for real-world deployment testing, focusing on low-resource devices.
  • Analyze pipeline efficiency, diagnose bottlenecks, and optimize for scalability and reliability.
  • Collaborate with cross-functional teams to integrate optimized frameworks into production, ensuring continuous improvement.

Minimum qualifications:

  • Degree in Computer Science or related field; PhD preferred, with a strong publication record in AI R&D.
  • Proven experience in kernel and inference optimization on mobile devices, with measurable improvements in latency and memory footprint.
  • Deep understanding of model serving architectures, low-latency techniques, and memory management in resource-constrained environments.
  • Expertise in CPU/GPU kernel development for mobile platforms and deploying inference pipelines on resource-limited devices.
  • Ability to apply empirical research to overcome system challenges, with a focus on performance evaluation and iterative optimization.

Similar jobs

  • AI Research Engineer (Model Evaluation) | Tether Operations Limited | London (Remote) | GBP 70,000 - 110,000 | Yesterday
  • AI Research Engineer (Model Evaluation) | Tether Operations Limited | London (Remote) | GBP 70,000 - 100,000 | 2 days ago
  • AI Research Engineer (Fine-tuning) | Tether Operations Limited | London (Remote) | GBP 60,000 - 100,000 | 2 days ago
  • AI Research Engineer (Fine-tuning - 100% Remote UK) | Tether Operations Limited | London (Remote) | GBP 60,000 - 100,000 | 4 days ago
  • REMOTE Freelance Career Coach wanted USD 99 per client | Futures.Works | London (Remote) | GBP 60,000 - 80,000 | 2 days ago
  • Digital Workplace Coach - M365, Copilot, ChatGPT, Asana, Webex | JR United Kingdom | London (Remote) | GBP 40,000 - 70,000 | 11 days ago
  • Japanese Document Reviewer | JR United Kingdom | Basingstoke (Remote) | GBP 60,000 - 80,000 | 9 days ago
  • Japanese Document Reviewer | JR United Kingdom | Milton Keynes (Remote) | GBP 60,000 - 80,000 | 10 days ago
  • Japanese Document Reviewer | JR United Kingdom | Luton (Remote) | GBP 60,000 - 80,000 | 11 days ago