DeepLightAI is a specialist AI and data consultancy with extensive experience implementing intelligent enterprise systems across multiple industries, with particular depth in financial services and banking.
Our team combines deep expertise in data science, statistical modelling, AI/ML technologies, workflow automation, and systems integration with a practical understanding of complex business operations.
The Data Engineer is responsible for designing, implementing, and optimising data pipelines and infrastructure to support our cutting-edge AI systems. The role involves close collaboration with our multidisciplinary team to ensure the efficient collection, storage, processing, and analysis of large-scale data, enabling us to unlock valuable insights and drive innovation across various domains.
Responsibilities
- Design, build, and optimise scalable data solutions, primarily utilising the Lakehouse architecture to unify data warehousing and data lake capabilities.
- Advise stakeholders on the strategic choice between Data Warehouse, Data Lake, and Lakehouse architectures based on specific business needs, cost, and latency requirements.
- Design, develop, and maintain scalable and reliable data pipelines to ingest, transform, and load diverse datasets from various sources, including structured and unstructured data, streaming data, and real-time feeds.
- Implement standards and tooling to ensure ACID properties, schema evolution, and high data quality within the Lakehouse environment.
- Implement robust data governance frameworks (security, privacy, integrity, compliance, auditing).
- Continuously optimise data storage, compute resources, and query performance across the data platform to reduce costs and improve latency for both BI and ML workloads, leveraging techniques such as indexing, partitioning, and parallel processing.
- Develop and maintain CI/CD pipelines to automate the entire machine learning lifecycle, from data validation and model training to deployment and infrastructure provisioning.
- Deploy, manage, and scale machine learning models in production environments, utilising MLOps principles for reliable and repeatable operations.
- Establish and manage monitoring systems to track model performance metrics, detect data drift (changes in input data), and identify model decay (degradation in prediction accuracy).
- Ensure rigorous version control and tracking for all components: code, datasets, and trained model artifacts (using tools like MLflow or similar).
- Create comprehensive documentation, including technical specifications, data flow diagrams, and operational procedures, to facilitate understanding, collaboration, and knowledge sharing.
Benefits & Growth Opportunities
- Competitive salary and performance bonuses
- Comprehensive health insurance
- Professional development and certification support
- Opportunity to work on cutting-edge AI projects
- Flexible working arrangements
- Career advancement opportunities in a rapidly growing AI company
This position offers a unique opportunity to shape the future of AI implementation while working with a talented team of professionals at the forefront of technological innovation.
The successful candidate will play a crucial role in driving our company's success in delivering transformative AI solutions to our clients.
Qualifications
- Proven practical experience in designing, building, and optimising solutions using Data Lakehouse architectures (e.g., Databricks, Delta Lake).
- Strong hands-on experience managing data ingestion, schema enforcement, and ACID properties, and utilising big data technologies/frameworks such as Spark and Kafka.
- Expertise in data modelling, ETL/ELT processes, and data warehousing concepts.
- Proficiency in SQL and scripting languages (e.g., Python, Scala).
- Demonstrated practical experience implementing MLOps pipelines for production systems, including a solid understanding of and hands-on experience with MLOps principles: automation, governance, and monitoring of ML models throughout the entire lifecycle.
- Experience with CI/CD tools, containerisation/orchestration technologies (e.g., Docker, Kubernetes), model serving frameworks (e.g., TensorFlow Serving, SageMaker), and experiment tracking (e.g., MLflow).
- Experience with production monitoring tools to detect data drift or model decay.
- Strong hands-on experience with major cloud platforms (e.g., AWS, Azure, GCP) and familiarity with DevOps practices.
- Excellent analytical, problem-solving, and communication skills, with the ability to translate complex technical concepts into clear and actionable insights.
- Proven ability to work effectively in a fast-paced, collaborative environment, with a passion for innovation and continuous learning.