Overview
We are seeking an experienced Platform Engineer with a strong background in data engineering. The ideal candidate has a comprehensive understanding of both data platforms and software engineering and can integrate the platform effectively within the wider IT ecosystem.
Responsibilities
- Manage and optimize data platforms (Databricks, Palantir).
- Ensure high availability, security, and performance of data systems.
- Provide insights into data platform usage.
- Optimize computing and storage for large-scale data processing.
- Design and maintain system libraries (Python) used in ETL pipelines and platform governance.
- Enhance and tune existing ETL processes for better performance, scalability, and reliability.
Qualifications
- Minimum 10 years of experience in IT/data.
- Minimum 5 years of experience as a Data Platform Engineer/Data Engineer.
- Bachelor's degree in IT or a related field.
- Infrastructure & Cloud: Azure, AWS (expertise in storage, networking, compute).
- Data Platform Tools: At least one of Palantir, Databricks, or Snowflake.
- Programming: Proficiency in PySpark for distributed computing and Python for ETL development.
- SQL: Expertise in writing and optimizing SQL queries, preferably with experience in databases such as PostgreSQL, MySQL, Oracle, or Snowflake.
- Data Warehousing: Experience working with data warehousing concepts and platforms, ideally Databricks.
- ETL Tools: Familiarity with ETL tools and processes.
- Data Modelling: Experience with dimensional modelling, normalization/denormalization, and schema design.
- Version Control: Proficiency with version control tools like Git to manage codebases and collaborate on development.
- Data Pipeline Monitoring: Familiarity with monitoring tools (e.g., Prometheus, Grafana, or custom monitoring scripts) to track pipeline performance.
- Data Quality Tools: Experience implementing data validation, cleaning, and quality frameworks, ideally Monte Carlo.
Nice to have
- Containerization & Orchestration: Docker, Kubernetes.
- Infrastructure as Code (IaC): Terraform.
- Understanding of the investment data domain.