A technology-driven company is looking for a Data Engineer to design and maintain their Data Lake infrastructure. The role includes architecting ingestion pipelines, optimizing storage strategies, and collaborating with various teams. Applicants should have significant experience in data systems, expert-level Python skills, and familiarity with tools like Airflow and Kafka. This is a full-time remote position offering professional growth opportunities and flexibility.
The Data Engineering team is responsible for designing, building, and maintaining the Data Lake infrastructure, including ingestion pipelines, storage systems, and internal tooling for reliable, scalable access to market data.
Key Responsibilities
Ingestion & Pipelines: Architect batch and stream pipelines (Airflow, Kafka, dbt) for diverse structured and unstructured market data. Provide reusable SDKs in Python and Go for internal data producers.
Storage & Modeling: Implement and tune S3, column-oriented, and time-series data storage for petabyte-scale analytics; own partitioning, compression, TTL, versioning, and cost optimisation.
Tooling & Libraries: Develop internal libraries for schema management, data contracts, validation, and lineage; contribute to shared libraries and services for internal data consumers for research, backtesting, and real-time trading purposes.
Reliability & Observability: Embed monitoring, alerting, SLAs, SLOs, and CI/CD; champion automated testing, data quality dashboards, and incident runbooks.
Collaboration: Partner with Data Science, Quant Research, Backend, and DevOps to translate requirements into platform capabilities and evangelise best practices.
Qualifications:
Additional Information:
What we offer:
Remote Work: Yes
Employment Type: Full-time