The Lead Data Engineer is responsible for building data engineering solutions using next-generation data techniques on both on-premises and public cloud infrastructure. The role requires knowledge of data engineering solutions, Kubernetes, public cloud platforms (AWS), and Snowflake, and involves working directly with product owners, customers, and technologists to deliver data products and solutions in a collaborative, agile environment.
Responsibilities:
- Design and develop big data solutions, partnering with domain experts, product managers, analysts, and data scientists to build data pipelines in Hadoop or Snowflake; deliver a data-as-a-service framework.
- Migrate legacy workloads to the cloud platform.
- Automate delivery through CI/CD across both cloud and on-premises platforms.
- Research and assess open-source technologies and cloud stack components (AWS/GCP), recommending and integrating them into designs and implementations.
- Serve as a technical expert and mentor team members on big data and cloud tech stacks.
- Define needs around maintainability, testability, performance, security, quality, and usability for data platforms.
- Drive implementation, patterns, reusable components, and coding standards for data engineering processes.
- Tune big data applications on Hadoop and non-Hadoop platforms for optimal performance.
- Evaluate new IT developments and evolving business requirements to recommend system enhancements.
- Contribute to the objectives of the entire function, integrating data analytics expertise.
- Produce detailed analysis of complex issues and recommend actions.
- Handle day-to-day staff management, including resource management, work allocation, mentoring, and coaching.
- Assess risk when making business decisions, ensuring compliance and safeguarding the firm's reputation, clients, and assets.
Qualifications:
- 6-10+ years of total IT experience.
- 8+ years of experience with Hadoop (Cloudera)/big data technologies.
- 5+ years of experience in public cloud infrastructure (AWS or GCP).
- Experience with Kubernetes and cloud-native technologies.
- Experience with DevOps practices.
- Hands-on experience with the Hadoop ecosystem (HDFS, MapReduce, Hive, Pig, Impala, Spark, Kafka, Kudu, Solr).
- Experience designing and developing data pipelines in Java, Scala, or Python.
- Experience with Spark programming (PySpark, Scala, Java).
- Familiarity with core cloud provider services from AWS, Azure, or GCP.
- Proficiency with Python/PySpark/Scala and basic ML libraries.
- Understanding of data structures, algorithms, distributed storage & compute.
- Strong problem-solving, interpersonal, and teamwork skills.
- Team management experience leading data engineers and analysts.
- Experience with Snowflake or Delta Lake is a plus.
- Basic knowledge of the Linux operating system and networking internals.
Education:
- Bachelor’s degree or equivalent experience; Master’s degree preferred.