AI Infrastructure Engineer

BayRockLabs

Mexico

Remote

MXN 1,491,000 - 1,865,000

Full-time

4 days ago

Vacancy description

A leading product engineering firm is seeking a Senior AI Infrastructure Engineer to architect and maintain cloud-based MLOps pipelines. This role requires 7+ years of experience in infrastructure engineering, strong coding skills in TypeScript and Python, and a solid understanding of AI/ML practices. Join us to drive innovation in a supportive and collaborative environment.

Qualifications

  • 7+ years of professional experience in software engineering and infrastructure engineering.
  • Experience in building and maintaining AI/ML infrastructure in production.
  • Strong coding skills in TypeScript and Python.

Responsibilities

  • Design, implement, and maintain cloud-native infrastructure for AI workloads.
  • Build and manage scalable data pipelines for ML and analytics.
  • Collaborate with teams to improve performance and reliability.

Skills

Software engineering
Infrastructure engineering
Cloud-native infrastructure
AI/ML infrastructure
TypeScript
Python
AWS
Collaboration

Tools

Databricks
AWS Bedrock
CloudFormation
MLflow
Docker
CI/CD pipelines
ECS

Job description

About BayRock Labs

At BayRock Labs, we pioneer innovative tech solutions that drive business transformation. As a leading product engineering firm based in Silicon Valley, we provide full-cycle product development, leveraging cutting-edge technologies in AI, ML, and data analytics. Our collaborative, inclusive culture fosters professional growth and work-life balance. Join us to work on ground-breaking projects and be part of a team that values excellence, integrity, and innovation. Together, let's redefine what's possible in technology.

What You Will Do

We’re looking for a Senior AI Infrastructure Engineer to help design, build, and scale our AI and data infrastructure. In this role, you’ll focus on architecting and maintaining cloud-based MLOps pipelines to enable scalable, reliable, and production-grade AI/ML workflows, working closely with AI engineers, data engineers, and platform teams. Your expertise in building and operating modern cloud-native infrastructure will help enable world-class AI capabilities across the organization.

If you are passionate about building robust AI infrastructure, enabling rapid experimentation, and supporting production-scale AI workloads, we’d love to talk to you.

  • Design, implement, and maintain cloud-native infrastructure to support AI and data workloads, with a focus on AI and data platforms such as Databricks and AWS Bedrock.
  • Build and manage scalable data pipelines to ingest, transform, and serve data for ML and analytics.
  • Develop infrastructure as code using tools such as CloudFormation and the AWS CDK to ensure repeatable, secure deployments.
  • Collaborate with AI engineers, data engineers, and platform teams to improve the performance, reliability, and cost-efficiency of AI models in production.
  • Drive best practices for observability, including monitoring, alerting, and logging for AI platforms.
  • Contribute to the design and evolution of our AI platform to support new ML frameworks, workflows, and data types.
  • Stay current with new tools and technologies to recommend improvements to architecture and operations.
  • Integrate AI models and large language models (LLMs) into production systems to enable use cases using architectures like retrieval-augmented generation (RAG).
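
The last responsibility, retrieval-augmented generation, can be illustrated with a minimal, dependency-free sketch of the retrieval step: rank documents against a query, then prepend the best matches to the model's prompt. The word-overlap scoring, sample documents, and function names below are invented for illustration; a production system would use an embedding model and a vector store instead.

```python
import re
from collections import Counter

def tokens(text: str) -> Counter:
    """Lowercase word counts, ignoring punctuation (toy tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by word overlap with the query and keep the top k."""
    q = tokens(query)
    overlap = lambda d: sum(min(q[w], tokens(d)[w]) for w in q)
    return sorted(docs, key=overlap, reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the augmented prompt an LLM would receive in a RAG setup."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Databricks jobs run scheduled data pipelines.",
    "AWS Bedrock hosts foundation models behind an API.",
    "Docker images package services for ECS deployment.",
]
print(build_prompt("Which service hosts foundation models?", docs))
```

Swapping the overlap score for embedding similarity, and the in-memory list for a vector index, turns this skeleton into the RAG architecture the bullet describes.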

Requirements

  • 7+ years of professional experience in software engineering and infrastructure engineering.
  • Extensive experience building and maintaining AI/ML infrastructure in production, including model deployment and lifecycle management.
  • Strong knowledge of AWS and infrastructure-as-code frameworks, ideally with CDK.
  • Expert-level coding skills in TypeScript and Python, building robust APIs and backend services.
  • Production-level experience with MLflow on Databricks, including model registration, versioning, asset bundles, and model serving workflows.
  • Expert-level understanding of containerization (Docker) and hands-on experience with CI/CD pipelines; experience with orchestration tools (e.g., ECS) is a plus.
  • Proven ability to design reliable, secure, and scalable infrastructure for both real-time and batch ML workloads.
  • Ability to articulate ideas clearly, present findings persuasively, and build rapport with clients and team members.
  • Strong collaboration skills and the ability to partner effectively with cross-functional teams.

Nice to Have

  • Familiarity with emerging LLM frameworks such as DSPy for advanced prompt orchestration and programmatic LLM pipelines.
  • Understanding of LLM cost monitoring, latency optimization, and usage analytics in production environments.
  • Knowledge of vector databases / embeddings stores (e.g., OpenSearch) to support semantic search and RAG.
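
As a toy illustration of the embedding-store idea in the last bullet, the sketch below ranks stored vectors by cosine similarity to a query vector. The three-dimensional vectors and document ids are invented for the example; a real deployment would store model-generated embeddings in a service such as OpenSearch's k-NN index.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings; real ones come from an embedding model.
store = {
    "doc-pipelines": [0.9, 0.1, 0.0],
    "doc-models":    [0.1, 0.9, 0.2],
    "doc-infra":     [0.2, 0.2, 0.9],
}

def nearest(query_vec: list[float], k: int = 1) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(store, key=lambda d: cosine(query_vec, store[d]), reverse=True)
    return ranked[:k]
```

The same nearest-neighbor lookup is what powers the semantic-search and RAG retrieval mentioned above, just performed at scale by the vector database.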