
A leading design platform in Greater London is seeking a Senior Research Scientist to advance their work on reinforcement learning and agentic systems. You will drive research initiatives and participate in the development of innovative solutions that enhance product experiences. This position demands deep expertise in reinforcement learning, Python, and teamwork across multiple domains. The role offers a flexible work environment supported by a variety of benefits designed to foster success and well-being.
Company Description
Hiya, g’day, mabuhay, kia ora, 你好, hallo, vítejte!
Thanks for stopping by. We know job hunting can be a little time-consuming and you’re probably keen to find out what’s on offer, so we’ll get straight to the point.
The buzzing Canva London campus features several buildings around beautiful leafy Hoxton Square in Shoreditch. While our global headquarters is in Sydney, Australia, London is our HQ for Europe, with all kinds of teams based here, plus event spaces to gather our team and communities.
You’ll experience a warm welcome from our Vibe team at front of house, amazing home-cooked food from our Head Chef, and a variety of workspaces to hang out with your teammates or get solo work done. That said, we trust our Canvanauts to choose the balance that empowers them and their team to achieve their goals, so you have a choice in where and how you work.
At Canva, our mission is to empower the world to design. We’re building AI that feels magical and lands real impact for millions of people – helping anyone create with confidence. We’re looking for a senior research scientist who lives and breathes reinforcement learning and agentic systems to push the frontier of reasoning, tool use, and reliability – and ship it to users.
We are a cutting-edge post-training team developing new multimodal agentic systems. We work across multimodal modelling, post-training, and design agents: we explore multimodal agentic architectures, build scalable training and evaluation loops, and partner closely with product and platform teams to turn breakthroughs into delightful product features. We are looking for a person with experience in post-training and reinforcement learning (RL) to join our team.
You’ll drive research directions and play a leading role in hands‑on work across the agent stack, from dataset construction, reward design, and policy optimization to planning, memory, and tool orchestration, as well as the development of novel post-training approaches. You’ll design tight experiments, iterate quickly, and land trustworthy conclusions. Most importantly, you’ll help convert research into reliable, safe, and high‑quality product experiences.
Achieving our crazy big goals motivates us to work hard – and we do – but you’ll experience lots of moments of magic, connectivity and fun woven throughout life at Canva, too. We also offer a range of benefits to set you up for every success in and outside of work.
Check out lifeatcanva.com for more info.
We make hiring decisions based on your experience, skills and passion, as well as how you can enhance Canva and our culture. When you apply, please tell us the pronouns you use and any reasonable adjustments you may need during the interview process.
We celebrate all types of skills and backgrounds at Canva so even if you don’t feel like your skills quite match what’s listed above – we still want to hear from you!
Please note that interviews are conducted virtually.