Performance Engineer - Akamai Inference Cloud

Akamai Technologies GmbH

Remote

PLN 120,000 - 180,000

Full time

Today
Job summary

A leading technology company is looking for a Performance Engineer to optimize AI inference platforms. Responsibilities include benchmarking AI models, implementing optimization techniques, and collaborating with engineering teams. Successful candidates will have expertise in AI/ML performance and GPU optimization. This role also offers flexible working options, allowing you to work from home or the office in Poland.

Benefits

Flexible working options
Health benefits
Family support options

Qualifications

  • Experience in performance engineering with AI/ML model optimization and inference tuning.
  • Knowledge of inference optimization techniques including quantization and model compilation.
  • Proficiency with GPU optimization and understanding of memory hierarchies.

Responsibilities

  • Benchmarking and profiling AI models to measure key performance metrics.
  • Researching and implementing model optimization techniques.
  • Collaborating with engineering teams for optimization recommendations.

Skills

AI/ML performance optimization
Inference frameworks
GPU optimization
Problem-solving skills
Benchmarking and profiling tools

Tools

TensorRT
Triton
TorchServe

Job description

Poland (Remote)

Do you thrive on optimizing AI systems for peak performance?

Are you ready to push the boundaries of inference speed and efficiency?

Join the Akamai Inference Cloud Team!

The Akamai Inference Cloud team is part of Akamai's Cloud Technology Group. We build AI platforms for efficient, compliant, and high-performing applications. These platforms support customers in running inference models and empower developers to create advanced AI solutions effectively. #AIC

Partner with the best

The Performance Engineer is responsible for benchmarking, tuning, and optimizing an AI inference platform. Responsibilities include applying advanced optimization techniques to increase throughput, reduce latency, and improve resource efficiency. The role spans models, hardware accelerators, and infrastructure. Expertise in AI/ML performance optimization, proficiency with inference frameworks, and a passion for maximizing hardware and software performance are essential.

As a Performance Engineer, you will be responsible for:

  • Benchmarking and profiling AI models and inference workloads across different hardware configurations, measuring latency, throughput, and resource utilization.
  • Researching and implementing model optimization techniques including quantization, pruning, distillation, and hardware-specific optimizations.
  • Optimizing inference frameworks and infrastructure to maximize performance, working with TensorRT, vLLM, TorchServe, Triton, and other serving platforms.
  • Establishing performance baselines and monitoring for the platform, identifying and addressing performance regressions.
  • Collaborating with engineering teams to identify bottlenecks, recommend optimizations, and validate performance improvements.
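The first responsibility above — measuring latency and throughput for inference workloads — can be sketched as a minimal, stdlib-only harness. `dummy_model`, the parameter names, and the percentile choices are illustrative assumptions, not Akamai's actual tooling:

```python
import time
import statistics

def benchmark(infer_fn, batch, warmup=5, iters=50):
    """Measure per-call latency (ms) and throughput (items/s) for infer_fn."""
    for _ in range(warmup):  # warm up caches/JIT before timing
        infer_fn(batch)
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        infer_fn(batch)
        samples.append((time.perf_counter() - t0) * 1000.0)  # ms
    p50 = statistics.median(samples)
    p99 = sorted(samples)[max(0, int(len(samples) * 0.99) - 1)]
    throughput = len(batch) / (p50 / 1000.0)  # items/s at median latency
    return {"p50_ms": p50, "p99_ms": p99, "items_per_s": throughput}

# Hypothetical stand-in for a real inference call (e.g. a Triton or vLLM client).
def dummy_model(batch):
    return [x * 2 for x in batch]

stats = benchmark(dummy_model, batch=list(range(32)))
```

In practice the same harness would be run across hardware configurations (GPU types, batch sizes, precisions) and the resulting numbers fed into baseline tracking and regression detection.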

Do what you love

To be successful in this role you will:

  • Have experience in performance engineering with hands‑on expertise in AI/ML model optimization and inference performance tuning.
  • Demonstrate solid knowledge of inference optimization techniques including quantization (INT8, FP16), model compilation, hardware acceleration, and familiarity with compiler optimizations and ML compilers.
  • Show proficiency with GPU optimization and understanding of memory hierarchies and techniques to maximize hardware utilization.
  • Have experience with profiling and benchmarking tools for AI workloads, identifying performance bottlenecks in distributed systems.
  • Demonstrate problem‑solving skills with the ability to analyze performance data, communicate insights clearly, and drive optimization efforts.
  • Possess knowledge of distributed inference and model parallelism techniques.
  • Have experience with cost optimization for compute‑intensive workloads.

Work in a way that works for you

FlexBase, Akamai's Global Flexible Working Program, is based on the principles that are helping us create the best workplace in the world. When our colleagues said that flexible working was important to them, we listened. We also know flexible working is important to many of the incredible people considering joining Akamai. FlexBase gives 95% of employees the choice to work from their home, their office, or both (in the country advertised). This permanent workplace flexibility program is consistent and fair globally, to help us find incredible talent, virtually anywhere. We are happy to discuss working options for this role and encourage you to speak with your recruiter in more detail when you apply.

Benefits include:

  • Your health
  • Your finances
  • Your family
  • Your time at work
  • Your time pursuing other endeavors