Overview
UPLabs is a dynamic venture studio dedicated to building innovative startup companies from the ground up. Our team thrives on solving complex problems, driving technological advancements, and creating impactful digital products. We’re seeking a Lead Data Engineer to join our growing team and contribute to our mission of launching the next wave of successful startups.

In this role, you will own the design and evolution of end-to-end data infrastructure, delivering reliable, high-quality data at scale for analytics and AI initiatives. You will work closely with Data Scientists, ML Engineers, and product teams to ensure data platforms are production-ready, well-governed, and optimized for performance. In addition to hands‑on technical work, you will act as a technical leader: setting standards, guiding architectural decisions, and mentoring other data engineers.
Responsibilities
- Design, build, and maintain scalable data pipelines and data platforms using Databricks as the core processing environment
- Develop and manage data models and transformations using dbt and SQL, ensuring analytics‑ready datasets
- Implement and optimize data workflows across Snowflake and relational databases such as PostgreSQL
- Build robust ETL/ELT pipelines using Python and Apache Spark for batch and streaming workloads
- Design and operate cloud‑native data solutions primarily on Azure, with exposure to AWS when required by specific ventures
- Implement and manage streaming data pipelines using Kafka for real‑time and near‑real‑time use cases
- Support GenAI and ML use cases by providing high‑quality, well‑structured data (e.g., feature stores, embeddings pipelines, analytical datasets)
- Implement CI/CD best practices for data workflows to ensure reliable and repeatable deployments
- Monitor, troubleshoot, and optimize data pipelines for performance, reliability, and cost efficiency
- Collaborate cross‑functionally with Data Scientists, ML Engineers, Product Managers, and business stakeholders to translate requirements into scalable data solutions
- Provide technical leadership and mentorship to other Data Engineers, promoting best practices and continuous improvement
- Organize the team’s work, unblock others, and drive progress through daily standups
- Contribute to strategic decisions around data architecture, tooling, and platform evolution
Required Skills and Expertise
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field (or equivalent practical experience)
- 5+ years of experience building and maintaining data pipelines and platforms in production
- Demonstrated ability to lead technical initiatives and mentor other engineers
- Hands‑on expertise with Databricks for large‑scale data processing and analytics
- Experience working with Snowflake as a cloud data warehouse
- Advanced SQL skills for complex querying, modeling, and performance optimization
- Strong proficiency in Python for data engineering workflows
- Practical experience with dbt for analytics engineering and data transformations
- Experience building data platforms on Azure
- Solid knowledge of relational databases, particularly PostgreSQL
- Familiarity with AWS in multi‑cloud or venture‑specific environments
- Experience with Kafka or similar messaging/streaming platforms
- Strong understanding of Apache Spark for distributed data processing
- Exposure to GenAI‑related data use cases, such as supporting RAG pipelines, embeddings, or ML feature datasets
- Proven experience collaborating with Data Science and ML teams
- Strong communication skills and ability to lead technical discussions with both technical and non‑technical stakeholders