

Lead Data Engineer (Remote)

UP.Labs

Remote

MXN 800,000 - 1,100,000

Full-time

Posted 2 days ago

Job Description

A dynamic venture studio is seeking a Lead Data Engineer to spearhead the design and evolution of its data infrastructure. You will collaborate with Data Scientists and ML Engineers to ensure high-quality data for analytics and AI projects. The role involves building data pipelines using Databricks, Snowflake, and Python while also providing technical leadership within the team. Ideal candidates have extensive experience in data engineering, with significant expertise in cloud environments and data processing technologies.

Overview

UP.Labs is a dynamic venture studio dedicated to building innovative startup companies from the ground up. Our team thrives on solving complex problems, driving technological advancements, and creating impactful digital products. We're seeking a highly skilled professional to join our growing team and contribute to our mission of launching the next wave of successful startups. As a Lead Data Engineer, you will own the design and evolution of end-to-end data infrastructure, enabling reliable, scalable, and high-quality data for analytics and AI initiatives. You will work closely with Data Scientists, ML Engineers, and product teams to ensure data platforms are production-ready, well-governed, and optimized for performance. In addition to hands-on technical work, you will act as a technical leader: setting standards, guiding architectural decisions, and mentoring other data engineers.

Responsibilities
  • Design, build, and maintain scalable data pipelines and data platforms using Databricks as the core processing environment
  • Develop and manage data models and transformations using dbt and SQL, ensuring analytics‑ready datasets
  • Implement and optimize data workflows across Snowflake and relational databases such as PostgreSQL
  • Build robust ETL/ELT pipelines using Python and Apache Spark for batch and streaming workloads
  • Design and operate cloud‑native data solutions primarily on Azure, with exposure to AWS when required by specific ventures
  • Implement and manage streaming data pipelines using Kafka for real‑time and near‑real‑time use cases
  • Support GenAI and ML use cases by providing high‑quality, well‑structured data (e.g., feature stores, embeddings pipelines, analytical datasets)
  • Implement CI/CD best practices for data workflows to ensure reliable and repeatable deployments
  • Monitor, troubleshoot, and optimize data pipelines for performance, reliability, and cost efficiency
  • Collaborate cross‑functionally with Data Scientists, ML Engineers, Product Managers, and business stakeholders to translate requirements into scalable data solutions
  • Provide technical leadership and mentorship to other Data Engineers, promoting best practices and continuous improvement
  • Organize work for the team, unblock others, and drive progress through daily standups
  • Contribute to strategic decisions around data architecture, tooling, and platform evolution
Required Skills and Expertise
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field (or equivalent practical experience)
  • 5+ years of experience building and maintaining data pipelines and platforms in production
  • Demonstrated ability to lead technical initiatives and mentor other engineers
  • Hands‑on expertise with Databricks for large‑scale data processing and analytics
  • Experience working with Snowflake as a cloud data warehouse
  • Advanced SQL skills for complex querying, modeling, and performance optimization
  • Strong proficiency in Python for data engineering workflows
  • Practical experience with dbt for analytics engineering and data transformations
  • Experience building data platforms on Azure
  • Solid knowledge of relational databases, particularly PostgreSQL
  • Familiarity with AWS in multi‑cloud or venture‑specific environments
  • Experience with Kafka or similar messaging/streaming platforms
  • Strong understanding of Apache Spark for distributed data processing
  • Exposure to GenAI‑related data use cases, such as supporting RAG pipelines, embeddings, or ML feature datasets
  • Proven experience collaborating with Data Science and ML teams
  • Strong communication skills and ability to lead technical discussions with both technical and non‑technical stakeholders