Overview
About the Role
We are seeking a highly experienced and visionary Sr. Staff or Principal Engineer to join our Customers organization. This role is crucial to the evolution and scalability of our core Catalog and data-intensive systems, and plays a pivotal part in advancing our Machine Learning serving capabilities and serving infrastructure. This position will not only impact core business functions and drive significant revenue but also shape the future of our personalized, real-time ML-driven experiences. The ideal candidate will possess deep expertise in distributed systems, stream processing, data-intensive applications, and, in particular, the deployment, scaling, and optimization of Machine Learning models in production.
This is a unique opportunity to join a dynamic and innovative team and to make a significant impact on the future of our platform by advancing both our core data infrastructure and our Machine Learning capabilities. If you are a highly motivated and experienced engineer with a passion for solving complex technical challenges across distributed systems, data engineering, and ML serving, we encourage you to apply.
About the Team
Join a dynamic team at the heart of Instacart's success, leading the core shopping experience that millions of users rely on. We are obsessive about perfecting every aspect of the customer shopping journey on the app, encompassing UX formats, feeds, algorithms, personalization, recommender engines, and ranking systems to deliver an exceptional experience. Our team thrives on collaborative problem-solving in a fast-paced environment.
About the Job
- Provide architectural leadership for Catalog, streaming, and data-intensive systems, emphasizing ML serving infrastructure and best practices, and drive the technical roadmap.
- Design, build, and scale reliable, efficient, and adaptable solutions to address changing business and ML needs.
- Lead the development and optimization of ML serving endpoints, ensuring high availability, low latency, and robust performance; implement fail-fast input validation and track metrics using Datadog.
- Centralize ML serving logic and decouple it from product applications to improve debugging, manageability, and system performance.
- Drive and contribute to company-wide transformational initiatives that impact key business metrics such as revenue, personalization, and operational efficiency, and influence the direction of ML infrastructure, including real-time inference.
- Serve as a subject matter expert for Catalog, streaming, data-intensive, and ML serving technologies, providing guidance and mentorship to engineering and data science teams.
- Identify and implement innovative solutions to optimize system performance, reduce costs, and improve data processing and ML serving latency.
- Collaborate with cross-functional teams, including Product, Retailer, IC App, Ads, ML Infrastructure, and Data Science, to deliver integrated ML-driven solutions, and lead incident response and resolution for high-severity issues.
About You
Minimum Qualifications
- Extensive experience in software engineering, with a focus on distributed systems, stream processing (e.g., Flink), data-intensive applications, and, in particular, Machine Learning serving and deployment.
- Proven track record of designing, implementing, and scaling large-scale, high-performance systems, including ML serving infrastructure.
- Deep understanding of database technologies, data modeling, data pipelines, and ML model deployment patterns.
- Strong architectural skills and the ability to design and evaluate complex technical solutions across diverse technology domains, including Catalog, streaming, and Machine Learning.
- Excellent problem-solving and debugging skills, with specific experience in addressing issues related to ML model serving, data quality, and infrastructure stability.
- Strong communication and collaboration skills, with the ability to effectively work across teams, influence stakeholders, and mentor junior engineers.
- Experience with cloud platforms and related technologies, including ML serving platforms (e.g., SageMaker).
- Ability to quantify and demonstrate the impact of technical contributions on business results (e.g., revenue, efficiency, cost savings, and ML model performance).
- Familiarity with challenges related to the ML lifecycle, data flow, and best practices.
Preferred Qualifications
- Experience working with large-scale catalog systems or similar data-intensive platforms.
- Significant experience in designing and implementing high-throughput, low-latency ML serving systems.
- Contributions to open-source projects or technical publications related to distributed systems, data engineering, or Machine Learning serving.
- Experience in a high-growth, fast-paced environment, particularly in the context of scaling ML initiatives.