Overview
Our Toronto-based client is building the next generation of high-performance, energy-efficient computing platforms. This role sits at the intersection of silicon, firmware, and operating systems, and is ideal for senior engineers who thrive in complex, performance-critical environments. You’ll design and optimize custom kernel-level solutions, create powerful telemetry and debugging tools, and collaborate with world-class teams to ensure that no performance is left untapped.
Key Responsibilities
- Build and optimize custom Linux kernel drivers that integrate with low-level firmware to expose and control advanced CPU features.
- Develop performance and telemetry tooling to capture, analyze, and visualize CPU behavior, enabling deep insights into system efficiency and scalability.
- Debug and profile across the silicon–firmware–OS boundary, working directly with performance counters, schedulers, and power management subsystems.
- Collaborate with leading hardware and software engineers to push the boundaries of efficiency, performance, and innovation in next-generation datacenter and edge platforms.
- Contribute to system bring-up, kernel porting, and board support packages for new CPU architectures.
- Leverage open-source tools and communities (perf, ftrace, eBPF, etc.) and contribute improvements upstream.
- Investigate and resolve the most challenging performance bottlenecks, spanning compiler output, kernel scheduling, cache/memory, and interconnect.
- Drive performance benchmarking methodology and automation for large-scale workloads, from microbenchmarks to real application scenarios.
- Provide technical leadership, mentoring, and guidance for cross-functional teams working on hardware bring-up, firmware integration, and OS performance tuning.
Preferred Qualifications
- Expertise in Linux kernel internals, including schedulers, memory management, interrupts, and boot flows.
- Strong background in device driver development (custom drivers, PCIe, I/O, networking, storage, or accelerators).
- Hands-on experience with performance counters and profiling tools (perf, ftrace, eBPF, VTune, oprofile, or custom frameworks).
- Familiarity with power and performance management concepts such as DVFS, CPU idle states, clock/power domains, and thermal throttling.
- Ability to debug at multiple layers: firmware, kernel, virtualization, and user-space applications.
- Exposure to SoC bring-up, BSP development, and low-level board initialization.
- Proficiency in systems programming languages (C, C++, Python, assembly), with emphasis on writing performant, maintainable low-level code.
- Knowledge of CPU microarchitecture concepts (pipelines, caches, MMU/virtual memory, coherency, interconnects).
- Experience working with or contributing to open-source kernel communities.
- Comfort navigating ambiguous performance issues, using telemetry and measurement to drive root-cause analysis.
How to Apply
Qualified and interested applicants can apply directly to Gord Marriage by emailing a resume to gord.marriage@talentlab.com, or apply on our website at www.talentlab.com. We thank all applicants for their interest; however, only those under consideration will be contacted.